[Core] feat: Implement Priority Scheduling in V1 Engine #18700

amitm02 · 2025-05-26T07:51:16Z

This commit introduces priority scheduling capabilities to the V1 LLM engine.

Key changes include:

EngineCoreRequest and Request updates:
- Added a priority field to EngineCoreRequest and Request classes to carry priority information.
Processor update:
- Modified Processor.process_inputs to accept and pass the priority to EngineCoreRequest.
V1 Scheduler modifications:
- The scheduler now respects the --scheduling-policy argument.
- When policy="priority", self.waiting is managed as a min-heap, prioritizing requests by their assigned priority value (lower value means higher priority) and then by arrival time (FCFS).
- Preemption logic now correctly identifies and preempts the actual lowest-priority running request when space is needed for higher-priority or new requests.
- FCFS behavior is maintained when policy="fcfs".
Documentation:
- Updated docs/usage/v1_guide.md and docs/serving/openai_compatible_server.md to reflect the V1 engine's support for priority scheduling.
Unit Tests:
- Added a new test suite in tests/v1/core/test_scheduler.py.
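
For readers who want to see the waiting-queue ordering described under "V1 Scheduler modifications", here is a minimal, self-contained sketch. It is not the code from this PR: `WaitingRequest` and `pop_order` are hypothetical names used only for illustration, and the sketch assumes the heap key is the pair (priority, arrival time) as the description states.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class WaitingRequest:
    # Hypothetical stand-in for a request; not the real V1 Request class.
    # Comparison uses (priority, arrival_time); request_id is excluded.
    priority: int
    arrival_time: float
    request_id: str = field(compare=False)


def pop_order(requests):
    """Return request IDs in the order a min-heap on (priority, arrival_time) yields them."""
    heap = list(requests)
    heapq.heapify(heap)
    order = []
    while heap:
        order.append(heapq.heappop(heap).request_id)
    return order


waiting = [
    WaitingRequest(priority=5, arrival_time=1.0, request_id="a"),
    WaitingRequest(priority=0, arrival_time=3.0, request_id="b"),
    WaitingRequest(priority=5, arrival_time=0.5, request_id="c"),
    WaitingRequest(priority=0, arrival_time=2.0, request_id="d"),
]

# Lower priority value is served first; equal priorities fall back to FCFS.
print(pop_order(waiting))  # ['d', 'b', 'c', 'a']
```

Keeping the arrival time as the second key is what preserves FCFS behavior among requests that share the same priority value.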

This lets users influence the order in which the V1 engine processes requests by assigning priorities, which is particularly useful when requests vary in importance.
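
The preemption rule mentioned in the scheduler changes can be sketched in a similar way. This is an illustration under stated assumptions, not the PR's implementation: `RunningRequest` and `pick_preemption_victim` are hypothetical names, and breaking ties by preempting the most recently arrived request is an assumption of this sketch rather than something the description specifies.

```python
from typing import NamedTuple


class RunningRequest(NamedTuple):
    # Hypothetical stand-in for a running request.
    request_id: str
    priority: int
    arrival_time: float


def pick_preemption_victim(running):
    """Pick the lowest-priority running request.

    Lower value means higher priority, so the victim has the largest
    priority value. Ties go to the latest arrival (assumption of this sketch).
    """
    return max(running, key=lambda r: (r.priority, r.arrival_time))


running = [
    RunningRequest("x", priority=0, arrival_time=1.0),
    RunningRequest("y", priority=3, arrival_time=2.0),
    RunningRequest("z", priority=3, arrival_time=4.0),
]

victim = pick_preemption_victim(running)
print(victim.request_id)  # z
```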

FIX #14002

github-actions · 2025-05-26T07:51:25Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

mergify · 2025-06-01T14:57:28Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @amitm02.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

youkaichao · 2025-06-02T08:12:07Z

the commit history is in a mess, can you clean it up? maybe open another PR?

amitm02 · 2025-06-03T07:06:18Z

> the commit history is in a mess, can you clean it up? maybe open another PR?

Re-submitted as #19057

amitm02 requested review from hmellor, WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners May 26, 2025 07:51

mergify bot added documentation Improvements or additions to documentation v1 labels May 26, 2025

This was referenced May 26, 2025

[Feature]: Implement Priority Scheduling In V1 Engine #14002

Closed

[RFC]: Deprecating vLLM V0 #18571

Open

amitm02 force-pushed the feat/v1-priority-scheduling branch from dbdfa5b to 8b54316 Compare May 27, 2025 12:08

amitm02 requested review from jeejeelee, DarkLight1337, tlrmchlsmth, simon-mo, mgoin, russellb, zhuohan123 and youkaichao as code owners May 27, 2025 12:08

mergify bot added ci/build frontend multi-modality Related to multi-modality (#4194) structured-output tpu Related to Google TPUs labels May 27, 2025

github-project-automation bot added this to Structured Output May 27, 2025

mergify bot added the tool-calling label May 27, 2025

github-project-automation bot added this to Tool Calling May 27, 2025

mergify bot removed the tpu Related to Google TPUs label May 27, 2025

njhill and others added 14 commits June 1, 2025 17:57

[BugFix] Fix multi-node offline data-parallel (vllm-project#18981)

21db17a

Signed-off-by: Nick Hill <[email protected]> Co-authored-by: Yizhou Liu <[email protected]> Signed-off-by: amit <[email protected]>

[Misc] add return token strs for tokenize (vllm-project#18941)

63f4c59

Signed-off-by: reidliu41 <[email protected]> Co-authored-by: reidliu41 <[email protected]> Signed-off-by: amit <[email protected]>

[Misc][Benchmark] Add support for CustomDataset (vllm-project#18511)

8882440

Signed-off-by: amit <[email protected]>

[Bugfix] Fix EAGLE3 broken logits (vllm-project#18909)

2334fe9

Signed-off-by: Benjamin Chislett <[email protected]> Signed-off-by: amit <[email protected]>

[Core] Rework dtype resolution (vllm-project#18751)

5087dcc

Signed-off-by: DarkLight1337 <[email protected]> Signed-off-by: amit <[email protected]>

[LoRA] Support dynamically initialize packed_modules_mapping for VL…

122b00a

…M with arbitrary components (vllm-project#18987) Signed-off-by: isotr0py <[email protected]> Signed-off-by: Isotr0py <[email protected]> Signed-off-by: amit <[email protected]>

[doc] small fix - mkdocs (vllm-project#18996)

9311b97

Signed-off-by: reidliu41 <[email protected]> Co-authored-by: reidliu41 <[email protected]> Signed-off-by: amit <[email protected]>

Let max_num_batched_tokens use human_readable_int for large numbers (v…

6c27d82

…llm-project#18968) Signed-off-by: mgoin <[email protected]> Signed-off-by: amit <[email protected]>

[BugFix] fix data parallel construct ipv6 url addres (vllm-project#18991

694899e

) Signed-off-by: rongfu.leng <[email protected]> Signed-off-by: amit <[email protected]>

[BugFix] Fix incorrect metrics shutdown error log message (vllm-proje…

4e0185b

…ct#18992) Signed-off-by: Nick Hill <[email protected]> Signed-off-by: amit <[email protected]>

[doc] wrong output (vllm-project#19000)

5eab7d7

Signed-off-by: reidliu41 <[email protected]> Co-authored-by: reidliu41 <[email protected]> Signed-off-by: amit <[email protected]>

Update request.py

9228cd1

Signed-off-by: amit <[email protected]>

Update request.py

c062ede

Signed-off-by: amit <[email protected]>

Merge branch 'feat/v1-priority-scheduling' of https://github.com/amit…

ee3ab04

…m02/vllm into feat/v1-priority-scheduling

mergify bot added speculative-decoding tpu Related to Google TPUs labels Jun 1, 2025

mergify bot added the needs-rebase label Jun 1, 2025

Merge remote-tracking branch 'upstream/main' into feat/v1-priority-sc…

fffe3a8

…heduling

mergify bot removed tpu Related to Google TPUs needs-rebase labels Jun 1, 2025

amitm02 added 3 commits June 1, 2025 18:08

minor

c289981

Signed-off-by: amit <[email protected]>

line too long

5e8b804

Signed-off-by: amit <[email protected]>

line too long

70ca4b2

Signed-off-by: amit <[email protected]>

Merge remote-tracking branch 'upstream/main' into feat/v1-priority-sc…

be4d052

…heduling

amitm02 closed this Jun 3, 2025

github-project-automation bot moved this to Done in Tool Calling Jun 3, 2025

github-project-automation bot moved this to Done in Structured Output Jun 3, 2025
