[Feature]: Support Pipeline Parallelism on Llama-4-Maverick-17B-128E #16231
Comments
I think trunk should already have PP support. @zhewenl, could you help with verification?
@Edwinhr716 rebase to the latest main and try again; it should be supported: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/mllama4.py#L668-L669
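For reference, one quick way to verify this on an installed build is to check whether the registered Llama 4 model class declares PP support. This is only a sketch: the module path, the class name `Llama4ForConditionalGeneration`, and the `supports_pp` helper are assumptions based on the linked file and may differ between vLLM versions.

```bash
# Hedged sketch: should print True if the installed vLLM build marks the
# Llama 4 implementation as pipeline-parallel capable. Module, class, and
# helper names are assumptions and may vary between vLLM versions.
python -c "
from vllm.model_executor.models.interfaces import supports_pp
from vllm.model_executor.models.mllama4 import Llama4ForConditionalGeneration
print(supports_pp(Llama4ForConditionalGeneration))
"
```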
Sweet, so it should be available in the 0.8.4 release?
It will. Before it's ready, feel free to try our nightly: https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html#pre-built-wheels
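For completeness, installing a nightly build typically looks something like the sketch below. The `--extra-index-url` value is an assumption based on the linked pre-built-wheels docs and should be double-checked there, since it may change.

```bash
# Hedged sketch: install the latest nightly vLLM wheel (per the linked
# pre-built-wheels docs). Verify the index URL against the docs page,
# as it may change between releases.
pip install -U vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```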
Okay, I will close this issue; feel free to re-open if anything else is needed :-)
🚀 The feature, motivation and pitch
I'm attempting to deploy Llama-4-Maverick-17B-128E across 16 H100s on two nodes, running this command:
I got this message saying that pipeline parallelism (PP) isn't supported.
Llama-4-Maverick-17B-128E is a large LLM that most people will be running across multiple GPU nodes.
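For context, a two-node launch of this kind (not the reporter's exact command, which is not captured above) would typically combine tensor parallelism within each node and pipeline parallelism across nodes, e.g. TP=8 and PP=2 on 16 H100s over a Ray cluster. A hedged sketch, with the model path, addresses, and port as illustrative placeholders:

```bash
# Hedged sketch of a 2-node deployment (8 GPUs per node): TP=8 within a node,
# PP=2 across nodes, coordinated through a Ray cluster. Model path, addresses,
# and port are illustrative placeholders, not values from the original report.

# On the head node:
ray start --head --port=6379

# On the second node:
ray start --address=<head-node-ip>:6379

# Then, on the head node:
vllm serve meta-llama/Llama-4-Maverick-17B-128E \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 2 \
    --distributed-executor-backend ray
```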
Alternatives
No response
Additional context
No response