[Core] feat: Implement Priority Scheduling in V1 Engine #18700
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small, essential subset of tests runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run the full CI to test the changes comprehensively before merging. 🚀
Commits dbdfa5b to 8b54316: the feature commits plus a series of merges from upstream main into feat/v1-priority-scheduling (picking up vllm-project#18987, vllm-project#18968, and vllm-project#18992, among others), all signed off by amit <[email protected]>.
This pull request has merge conflicts that must be resolved before it can be merged.
The commit history is a mess; can you clean it up? Maybe open another PR?
Re-submitted as #19057.
This commit introduces priority scheduling capabilities to the V1 LLM engine. Key changes include:

- **`EngineCoreRequest` and `Request` updates:** Added a `priority` field to the `EngineCoreRequest` and `Request` classes to carry priority information.
- **`Processor` update:** `Processor.process_inputs` now accepts the `priority` and passes it through to the `EngineCoreRequest`.
- **V1 `Scheduler` modifications:** Added a `--scheduling-policy` argument. With `policy="priority"`, `self.waiting` is managed as a min-heap that orders requests by their assigned priority value (lower value means higher priority) and then by arrival time (FCFS); see the sketch after this list. The default remains `policy="fcfs"`.
- **Documentation:** Updated `docs/usage/v1_guide.md` and `docs/serving/openai_compatible_server.md` to reflect the V1 engine's support for priority scheduling.
- **Unit tests:** Added scheduler priority tests in `tests/v1/core/test_scheduler.py`.
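The queueing discipline described in the scheduler bullet above can be illustrated with a small, self-contained sketch. This is not the PR's actual implementation: `WaitingEntry` and `ToyWaitingQueue` are hypothetical names, and an arrival sequence number stands in for the arrival timestamp so the FCFS tie-break is deterministic.

```python
# Illustrative sketch only, not the actual vLLM V1 scheduler code. With
# policy="priority", the waiting queue is a min-heap ordered by
# (priority, arrival order): a lower priority value is scheduled first,
# and ties fall back to first-come-first-served.
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class WaitingEntry:
    priority: int       # lower value == higher scheduling priority
    arrival_order: int  # stand-in for arrival time; breaks ties FCFS
    request_id: str = field(compare=False)


class ToyWaitingQueue:
    """Minimal stand-in for the scheduler's `self.waiting` structure."""

    def __init__(self, policy: str = "fcfs") -> None:
        assert policy in ("fcfs", "priority")
        self.policy = policy
        self._entries: list[WaitingEntry] = []
        self._counter = itertools.count()

    def add(self, request_id: str, priority: int = 0) -> None:
        entry = WaitingEntry(priority, next(self._counter), request_id)
        if self.policy == "priority":
            heapq.heappush(self._entries, entry)  # min-heap on (priority, arrival)
        else:
            self._entries.append(entry)           # plain FIFO

    def pop_next(self) -> str:
        if self.policy == "priority":
            return heapq.heappop(self._entries).request_id
        return self._entries.pop(0).request_id


q = ToyWaitingQueue(policy="priority")
q.add("req-a", priority=2)
q.add("req-b", priority=0)
q.add("req-c", priority=0)
# Lower value wins; equal priorities keep arrival (FCFS) order.
assert [q.pop_next() for _ in range(3)] == ["req-b", "req-c", "req-a"]
```

A heap keeps both insertion and pop at O(log n) under the priority policy, while the FCFS path stays a simple queue.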
This allows you to influence the order of request processing in the V1 engine by assigning priorities, which is particularly useful when requests vary in importance.
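For end-to-end context, here is a hedged usage sketch. It assumes the offline `LLM` entrypoint forwards a `scheduling_policy` engine argument mirroring the `--scheduling-policy` CLI flag mentioned above, and that `generate` accepts a per-request `priority` list as it does for V0 priority scheduling; the exact V1 surface may differ from this PR.

```python
from vllm import LLM, SamplingParams

# Assumed kwarg mirroring the --scheduling-policy CLI flag from this PR.
llm = LLM(model="facebook/opt-125m", scheduling_policy="priority")

prompts = ["Summarize this quarterly report...", "Ping?"]
params = SamplingParams(max_tokens=32)

# One priority per prompt; the interactive "Ping?" request (priority 0)
# should be scheduled ahead of the batch job (priority 10).
outputs = llm.generate(prompts, params, priority=[10, 0])
```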
FIX #14002