[Feature]: Composite model loading using AutoWeightsLoader for all models #15697
Comments
@DarkLight1337 I can try to process several models.

Can you indicate which models you are working on, to avoid others duplicating your work?

I haven't started yet; do you have any recommendations for simpler models?

Most language models should work in pretty much the same way (except SSMs like Mamba, I guess). You can go in alphabetical order.

Thanks ~
Tip: these are the unimplemented models. I used a shell script to count them; this is the approximate result:
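(For illustration, a rough Python equivalent of such a count; the commenter used a shell script, and the exact criterion they counted by is not stated. This sketch assumes it is run from the vLLM repository root and that already-migrated models mention `AutoWeightsLoader` somewhere in their source file.)

```python
from pathlib import Path

# List model files that do not yet reference AutoWeightsLoader.
# Run from the vLLM repository root.
MODELS_DIR = Path("vllm/model_executor/models")

pending = [
    path.name
    for path in sorted(MODELS_DIR.glob("*.py"))
    if "AutoWeightsLoader" not in path.read_text(encoding="utf-8")
]
print(f"{len(pending)} model files still lack AutoWeightsLoader:")
print("\n".join(pending))
```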
The multi-modal models don't need this change; can you remove them from the list?

Hi, I could take on a few models, e.g. baichuan, gpt_neox, and mpt.

I’ll take on a few more models next week.
@DarkLight1337 I can add two new skip fields, e.g. in `vllm/model_executor/models/utils.py` (line 87 at `e9528f6`).

Unless you also need

Issue record: #16548 (comment)
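(For context, a minimal sketch of the existing skip mechanism this proposal builds on; the two new fields are defined in #16548 and are not shown here, since their exact names live in that issue.)

```python
from torch import nn

from vllm.model_executor.models.utils import AutoWeightsLoader

# Stand-in for a real vLLM model class; AutoWeightsLoader only needs an
# nn.Module plus the skip configuration.
model = nn.Linear(8, 8)

# Weights whose checkpoint names start with "lm_head." are ignored.
loader = AutoWeightsLoader(model, skip_prefixes=["lm_head."])
```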
🚀 The feature, motivation and pitch
#9160 first introduced `AutoWeightsLoader` to recursively call `load_weights` on sub-modules. This lets composite models (most notably multi-modal models) use language backbones (`*Model` classes such as `LlamaModel`) without having to repeat their weight loading logic.

Currently, `load_weights` is only implemented in a few language backbones. It would be great to standardize this approach and apply it to all language backbones in vLLM. The steps to do this are pretty straightforward:

1. Move the `load_weights` function from `*ForCausalLM` to `*Model`.
2. Add a `load_weights` function in `*ForCausalLM` that loads the weights using `AutoWeightsLoader`.
3. Move any logic in `*Model.load_weights` that only applies to `*ForCausalLM` back to `*ForCausalLM.load_weights`. Usually, this involves `lm_head`.

For reference, you can look at the implementation for models such as Llama, Gemma2/3, Qwen2 and ChatGLM. A condensed sketch of these steps is shown below.
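A minimal sketch of the pattern, using a hypothetical `FooModel`/`FooForCausalLM` pair with toy layers; exact signatures (e.g. whether `load_weights` returns the set of loaded parameter names) vary slightly between vLLM versions:

```python
from typing import Iterable, Set, Tuple

import torch
from torch import nn

from vllm.model_executor.models.utils import AutoWeightsLoader


class FooModel(nn.Module):
    """Hypothetical language backbone (step 1: weight loading lives here)."""

    def __init__(self) -> None:
        super().__init__()
        self.embed_tokens = nn.Embedding(1000, 64)
        self.layers = nn.ModuleList(nn.Linear(64, 64) for _ in range(2))

    def load_weights(self,
                     weights: Iterable[Tuple[str, torch.Tensor]]) -> Set[str]:
        # Backbone-specific logic (stacked-parameter mapping, etc.) goes
        # here; this sketch just copies tensors into matching parameters.
        # AutoWeightsLoader strips the "model." prefix before calling this.
        params_dict = dict(self.named_parameters())
        loaded: Set[str] = set()
        for name, weight in weights:
            params_dict[name].data.copy_(weight)
            loaded.add(name)
        return loaded


class FooForCausalLM(nn.Module):
    """Steps 2 and 3: delegate to AutoWeightsLoader and keep
    *ForCausalLM-only concerns (lm_head) at this level."""

    def __init__(self, tie_word_embeddings: bool = False) -> None:
        super().__init__()
        self.tie_word_embeddings = tie_word_embeddings
        self.model = FooModel()
        self.lm_head = nn.Linear(64, 1000, bias=False)

    def load_weights(self,
                     weights: Iterable[Tuple[str, torch.Tensor]]) -> Set[str]:
        # AutoWeightsLoader recursively dispatches to sub-modules that
        # define load_weights (self.model here). With tied embeddings there
        # is no separate lm_head weight in the checkpoint, so skip it.
        loader = AutoWeightsLoader(
            self,
            skip_prefixes=(["lm_head."]
                           if self.tie_word_embeddings else None),
        )
        return loader.load_weights(weights)
```

Keeping the `lm_head` handling out of `FooModel.load_weights` is the point of step 3: it lets a multi-modal wrapper reuse the backbone without dragging in decoder-only concerns.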
To avoid scope creep, I suggest opening a PR for updating only a few models at a time.
Alternatives
No response
Additional context
No response