Add Llama4 Multi-Modal Support #382
Conversation
def __qeff_init__(self):
    self.rotary_embedding = QEffLlama4VisionRotaryEmbedding(config=self.config)

def forward(
Do we need this forward? Remove it.
    17,  # constants.INTERN_NUM_PATCHES,
    14,  # constants.INTERN_FEATURE_SIZE,
inputs_shapes["vision_embeds"] = (
    1,  # constants.INTERN_NUM_PATCHES,
Move these to constants, or compute them based on the code.
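One possible way to compute them instead of hard-coding, assuming the usual ViT-style relation between image size and patch size (the exact formula depends on the vision config, so this is only a sketch):

    # Illustrative sketch: derive the vision input shapes from the config
    # rather than hard-coding magic numbers.
    num_patches = (config.image_size // config.patch_size) ** 2
    inputs_shapes["vision_embeds"] = (1, num_patches, config.hidden_size)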
Force-pushed from 75242b2 to 1b3dd96
pyproject.toml
Outdated
@@ -19,8 +19,8 @@ classifiers = [
]
requires-python = ">=3.8,<3.11"
dependencies = [
    "transformers==4.50.0",
    "huggingface-hub==0.27.0",
    "transformers==4.51.0",
We have moved transformers to 4.51.3 in vLLM.
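If we align with vLLM, the dependency line here would presumably become:

    "transformers==4.51.3",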
QEfficient/base/modeling_qeff.py
Outdated
@@ -46,6 +46,7 @@ class QEFFBaseModel(ABC):
    def _transform_names(cls) -> List[str]:
        return [x.__name__ for x in cls._pytorch_transforms + cls._onnx_transforms]

@append_tranform
Instead of adding a decorator to the constructor, define a method for applying the PyTorch transforms and pass it a copy of cls._pytorch_transforms. You can add the decorator to that method.
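A rough sketch of what that could look like (the method name is illustrative, not an existing API):

    # Hypothetical sketch: apply PyTorch transforms via a dedicated method
    # instead of decorating the constructor.
    @classmethod
    @append_tranform  # decorator moves onto the method, not __init__
    def apply_pytorch_transforms(cls, model):
        # Iterate over a copy so subclasses don't mutate the shared class list.
        for transform in list(cls._pytorch_transforms):
            model, _ = transform.apply(model)
        return model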
real, imag = torch.cos(freqs), torch.sin(freqs)
freqs_cis = torch.stack([real, imag], dim=-1)

# freqs_cis = torch.view_as_complex(freqs_cis.contiguous())
Remove the commented lines.
final = shared_out + expert_out  # restore [B,S,H]
return final.view(B, S, H), router_logits

# ------------------- Gather based, weights as activation approach ---------------
Do we need this?
I guess not; will remove the other commented approaches.
@@ -1,13 +1,14 @@
# -----------------------------------------------------------------------------
#
# Copyright (c) 2025 Qualcomm Innovation Center, Inc. All rights reserved.
# Copyright (c) 2024 Qualcomm Innovation Center, Inc. All rights reserved.
Please update the year.
@@ -49,29 +50,28 @@
    GraniteModel,
    GraniteRMSNorm,
)
Why is GraniteMoe removed?
    QEffGraniteMoeParallelExperts,
    QEffGraniteMoeRotaryEmbedding,
    QEffGraniteMoeTopKGating,
from QEfficient.transformers.models.internvl.modeling_internvl import (
Same as above
@@ -401,6 +404,8 @@ class KVCacheTransform(ModuleMappingTransform):
@classmethod
def apply(cls, model: nn.Module) -> Tuple[nn.Module, bool]:
    model, transformed = super().apply(model)
    # FIXME: see if we can merge into _module_mapping dict
    transformers.cache_utils.DynamicCache.update = QEffDynamicCache.update
Do we need it here, given that we will be going with HybridChunkCache?
@mohiso22 Please rebase the PR, add a test, and add the model to the validated-models list.
Force-pushed from 5d3870d to 77afbbc
lang_inputs = {k: v for k, v in inputs.items() if k not in vision_inputs}
lang_inputs["position_ids"] = np.where(
    lang_inputs.pop("attention_mask"), np.arange(padded_len), -1
)  # Need to use -1 as position_ids for invalid tokens

NOT_MLLAMA = hasattr(self.model.config, "model_type") and self.model.config.model_type != "mllama"
if NOT_MLLAMA:
    lang_inputs["index"] = np.array([[0]])
@mohiso22 Can we rename index to something more readable, such as image_idx?
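For example, the line in the quoted diff would become:

    lang_inputs["image_idx"] = np.array([[0]])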
self.language_model = self.model.language_model
self.config = self.model.config

def forward(self, input_ids, vision_embeds, position_ids, index, past_key_values):
- Can you add a detailed docstring describing the interleaving logic we are using here? (See the sketch below.)
- Please rename the index variable to something more descriptive, such as vision_start_index; the current name index is too generic.
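A rough sketch of both points together (the docstring wording and the vision_start_index name are illustrative, not final):

    def forward(self, input_ids, vision_embeds, position_ids, vision_start_index, past_key_values):
        """
        Run one decode step of the Llama4 language model with interleaved
        vision embeddings.

        Interleaving logic: positions where input_ids equals the image
        placeholder token (config.image_token_index) are replaced with the
        corresponding rows of vision_embeds, consumed in order starting
        from vision_start_index; all other positions keep their ordinary
        text token embeddings.
        """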
def forward(self, input_ids, vision_embeds, position_ids, index, past_key_values):
    inputs_embeds = self.model.language_model.get_input_embeddings()(input_ids)
    selected = input_ids == self.model.config.image_token_index
    indices1 = selected.to(torch.int64).cumsum(1) - 1
Instead of indices0 and indices1, please use more descriptive variable names.
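For example, something along these lines (the names are only suggestions):

    # `selected` -> which positions hold the image placeholder token
    image_token_mask = input_ids == self.model.config.image_token_index
    # `indices1` -> running index into vision_embeds for each image token
    image_embed_positions = image_token_mask.to(torch.int64).cumsum(1) - 1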
def get_qeff_language_decoder(self):
    return QEffLlama4DecoderWrapper(self)

def forward(self, input_ids, position_ids, pixel_values, index, past_key_values):
Please use a more descriptive variable name for index.
vision_flat = image_features.view(-1, image_features.size(-1))
projected_vision_flat = self.multi_modal_projector(vision_flat)
selected = input_ids == self.config.image_token_index
indices1 = selected.to(torch.int64).cumsum(1) - 1
Same as above
lang_inputs = {k: v for k, v in inputs.items() if k not in vision_inputs}
lang_inputs["position_ids"] = np.where(
    lang_inputs.pop("attention_mask"), np.arange(padded_len), -1
)  # Need to use -1 as position_ids for invalid tokens

NOT_MLLAMA = hasattr(self.model.config, "model_type") and self.model.config.model_type != "mllama"
It's not a constant; why are we using uppercase here?
OK, will change it to lowercase.
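That is, the line from the quoted diff becomes:

    not_mllama = hasattr(self.model.config, "model_type") and self.model.config.model_type != "mllama"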
examples/llama4_lm_example.py
Outdated
@@ -0,0 +1,50 @@
# -----------------------------------------------------------------------------
Remove this file. Let's maintain only one example covering the different inputs: text only, text + image, and text + multiple images.
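A rough sketch of what the single consolidated example could cover. The checkpoint name and image URLs are placeholders, the message format is the standard transformers multimodal chat template, and the final generate call is elided since it depends on the QEfficient API in this PR:

    # Hypothetical consolidated example: one script, three input modes.
    from transformers import AutoProcessor

    model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"  # illustrative checkpoint
    processor = AutoProcessor.from_pretrained(model_id)

    text_only = [{"role": "user", "content": [
        {"type": "text", "text": "Describe QEfficient in one sentence."},
    ]}]
    text_image = [{"role": "user", "content": [
        {"type": "image", "url": "https://example.com/cat.png"},
        {"type": "text", "text": "What is in this image?"},
    ]}]
    text_multi_image = [{"role": "user", "content": [
        {"type": "image", "url": "https://example.com/a.png"},
        {"type": "image", "url": "https://example.com/b.png"},
        {"type": "text", "text": "Compare these two images."},
    ]}]

    for messages in (text_only, text_image, text_multi_image):
        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt",
        )
        # ... run `inputs` through the compiled QEff model here ...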
OK, sure.
Force-pushed from ee23c3a to 1066b4f
No description provided.