More accurate RPM limit enforcement on keys #10037
Conversation
- Refactors the parallel request handler to use the base routing strategy, allowing better Redis / internal memory cache usage
- Uses Redis increment cache logic, ensuring the TPM/RPM logic works correctly across instances
- Reduces spillover (from 66 to 2 at 10k+ requests in 1 min.)
When will this issue be fixed? It is reliably reproducible in version 1.65.4, and multi-instance rate limiting is failing. @krrishdholakia
Hey @harvardfly, acknowledging this. I hope to have this done over the next 2 weeks. Need to do:
Follow up PR:
Got it, thank you very much @krrishdholakia. I hope you can fix it in the stable version as soon as possible, as the TPM and RPM limits are crucial features.
Closing as this is now on main -
@ScGPS we do sync the Redis value to the in-memory cache (handled by the BaseRoutingStrategy class). We do this periodically (every 0.01s) to avoid calling Redis on each request.
Title
More accurate RPM limit enforcement on keys
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
- I have added testing in the `tests/litellm/` directory; adding at least 1 test is a hard requirement - [see details](https://docs.litellm.ai/docs/extras/contributing_code)
- My PR passes the unit tests run via `make test-unit`
Type
🆕 New Feature
🐛 Bug Fix
Changes