
[flax/mistral] support sliding_window: null in config #37402

Merged: 1 commit, Jun 2, 2025

Conversation

yiding
Contributor

@yiding yiding commented Apr 9, 2025

What does this PR do?

Adds support for sliding_window: null in the Flax Mistral LLM models. This is needed for models such as Mistral-small. The corresponding PyTorch model code already supports this.
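
For illustration only, here is a minimal JAX sketch of what handling a null sliding window can look like. It is not the code from this PR: the function name make_attention_mask and the exact window convention (the current token plus the previous sliding_window - 1 tokens) are assumptions made for the example. A JSON value of null in the config arrives in Python as None, and in that case the mask falls back to plain causal attention.

```python
from typing import Optional

import jax.numpy as jnp


def make_attention_mask(seq_len: int, sliding_window: Optional[int] = None) -> jnp.ndarray:
    """Return a boolean (seq_len, seq_len) mask; True means query i may attend to key j.

    Hypothetical helper for illustration, not the library's implementation.
    """
    q = jnp.arange(seq_len)[:, None]  # query positions as a column
    k = jnp.arange(seq_len)[None, :]  # key positions as a row
    mask = k <= q  # causal: a query never sees future keys
    if sliding_window is not None:
        # Assumed convention: keep only the latest `sliding_window` keys,
        # counting the current position.
        mask = mask & (k > q - sliding_window)
    return mask


# sliding_window: null in the JSON config becomes None here.
full_mask = make_attention_mask(4, sliding_window=None)
windowed_mask = make_attention_mask(4, sliding_window=2)
```

Guarding on `is not None` rather than on truthiness keeps the two cases explicit and avoids arithmetic on None, which would otherwise raise a TypeError.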

Before Submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker @kiansierra

@github-actions github-actions bot marked this pull request as draft April 9, 2025 20:26
Contributor

github-actions bot commented Apr 9, 2025

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@yiding yiding force-pushed the sliding-window branch 2 times, most recently from 042bdf9 to f242390 Compare April 9, 2025 20:38
@yiding yiding marked this pull request as ready for review April 24, 2025 04:10
@github-actions github-actions bot requested a review from ArthurZucker April 24, 2025 04:10
@yiding
Contributor Author

yiding commented May 8, 2025

@ArthurZucker friendly ping.

I have some more changes for flax + mistral after this as well if this review process goes smoothly :)

Collaborator

@ArthurZucker ArthurZucker left a comment

Sure, we are probably gonna deprecate flax as usage is so low, but happy to merge in the meantime!

@ArthurZucker ArthurZucker enabled auto-merge (squash) June 2, 2025 14:44
@ArthurZucker ArthurZucker disabled auto-merge June 2, 2025 14:44
@ArthurZucker ArthurZucker merged commit cceab97 into huggingface:main Jun 2, 2025
6 checks passed
3 participants