Pull requests: vllm-project/vllm
[MISC] Use less CPU when message queue has been empty for some time
#16226
opened Apr 8, 2025 by
p12tic
[Bugfix] Do not skip "empty" parts of chats that are parsable
frontend
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
#16219
opened Apr 7, 2025 by
mgoin
[Docs] Add Slides from Singapore Meetup
documentation
Improvements or additions to documentation
#16213
opened Apr 7, 2025 by
simon-mo
[Hardware][Google] Track TPU usages in vLLM's data dashboards
ready
#16211
opened Apr 7, 2025 by
dyli-google
[BugFix] Remove 224 from FA supported list
ready
v1
#16206
opened Apr 7, 2025 by
LucasWilkinson
[Model] use AutoWeightsLoader for phimoe,qwen2_moe,qwen3_moe
#16203
opened Apr 7, 2025 by
lengrongfu
[Bugfix] Fix profiling.py
documentation
#16202
opened Apr 7, 2025 by
hhy3
[Bug] [ROCm] Fix Llama 4 Enablement Bug on ROCm: V0 ROCmFlashAttentionImpl and Triton Fused MoE bugs
#16198
opened Apr 7, 2025 by
tjtanaa
[Bugfix] Fix and reorganize broken GGUF tests and bump gguf version
ci/build
ready
#16194
opened Apr 7, 2025 by
Isotr0py
[WIP]: Support embedding models in V1
frontend
tpu
Related to Google TPUs
v1
#16188
opened Apr 7, 2025 by
maxdebayser
Draft
[TPU][V1] Disable per-request seed/Generator
ci/build
tpu
v1
#16172
opened Apr 7, 2025 by
NickLucche
[V1][Structured Output] Add validate_grammar() method to StructuredOutputBackend
v1
#16171
opened Apr 7, 2025 by
shen-shanshan
[model] fix load_weights function for dpsk fusedmoe class
#16170
opened Apr 7, 2025 by
BearBiscuit05
[Feature] Estimate max-model-len use available KV cache memory
v1
#16168
opened Apr 7, 2025 by
lengrongfu
[Bugfix] fix use-ep bug to enable ep by dp/tp size > 1
ready
#16161
opened Apr 7, 2025 by
zxfan-cpu
[WIP] Proper input validation for multi-modal encoder-decoder models
v1
#16156
opened Apr 7, 2025 by
DarkLight1337
Draft