Feat: Onboard PlamoForCausalLM Architecture #351
base: main
quic-shagun commented on Apr 9, 2025
- Add support for models based on the PlamoForCausalLM architecture.
- Tested with batch size > 1.
- Because the model is not yet available in HF Transformers, ModuleMethodMapperTransform is used to map methods.
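To illustrate the general idea behind mapping methods by class name, here is a minimal sketch. It is not the actual ModuleMethodMapperTransform implementation; `RemoteAttention`, `patched_forward`, `METHOD_MAP`, and `apply_method_map` are hypothetical names introduced only for this example. The assumption is that, when a model class cannot be imported from a library, replacement methods can be keyed by the class name and bound onto existing instances.

```python
from types import MethodType


class RemoteAttention:
    """Hypothetical stand-in for a class that ships with the checkpoint's own code."""

    def forward(self, x):
        return f"original({x})"


def patched_forward(self, x):
    # Hypothetical replacement; a real one would hold the export-friendly logic.
    return f"patched({x})"


# Keyed by class name (a string) because the class itself cannot be imported.
METHOD_MAP = {"RemoteAttention": {"forward": patched_forward}}


def apply_method_map(obj, method_map):
    """Bind replacement methods onto obj if its class name appears in the map.

    Returns True when at least one method was replaced.
    """
    replacements = method_map.get(type(obj).__name__, {})
    for name, fn in replacements.items():
        setattr(obj, name, MethodType(fn, obj))
    return bool(replacements)


attn = RemoteAttention()
before = attn.forward("q")
changed = apply_method_map(attn, METHOD_MAP)
after = attn.forward("q")
```

Matching on the class name rather than the class object is what makes this kind of mapping usable when the modeling code is loaded dynamically.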
Signed-off-by: quic-shagun <[email protected]>
Please refer to PR #373 and make the corresponding changes in the modeling file.
Updated
@quic-shagun Do we have legal approval to merge this architecture?
Since this is a text-to-text model, please run the perplexity script (torch vs AIC) and paste the results here.
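For readers unfamiliar with the comparison being requested, the following is a minimal sketch of how perplexity can be computed from per-token log-probabilities and compared across two backends. It is not the project's perplexity script; the function name and the log-probability values are made up for illustration, and a real run would obtain them from the torch and AIC outputs on the same evaluation text.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the mean negative log-likelihood (natural log) over tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Made-up per-token log-probabilities, for illustration only.
torch_log_probs = [-1.20, -0.35, -2.10, -0.80]
aic_log_probs = [-1.21, -0.36, -2.08, -0.81]

ppl_torch = perplexity(torch_log_probs)
ppl_aic = perplexity(aic_log_probs)

# A small relative difference suggests the two backends agree numerically.
rel_diff = abs(ppl_torch - ppl_aic) / ppl_torch
```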
Remove all training code and unneeded methods from the classes.
num_key_value_heads: Optional[int] = None,
max_position_embeddings: int = 2048,
initializer_range: float = 0.02,
rms_norm_eps: float = 1e-6,
Are you doing anything new here? If not, this is not needed.
Config is being used
)


def _rotate_half(x: torch.Tensor) -> torch.Tensor:
Please remove
This is being used in the attention block
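As background on why an attention block typically needs such a helper, here is a minimal plain-Python sketch of the common rotate-half convention for rotary position embeddings. The body of `_rotate_half` is not shown in this conversation, so this assumes it follows that usual convention; the list-based `rotate_half` and `apply_rotary` below are illustrative stand-ins, not the PR's tensor code.

```python
import math


def rotate_half(x):
    """Split x into halves [x1, x2] and return [-x2, x1]."""
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])


def apply_rotary(x, position, base=10000.0):
    """Rotate an even-length vector x by position-dependent angles."""
    half = len(x) // 2
    # One angle per pair (i, i + half); the list is repeated to cover both halves.
    angles = [position * base ** (-i / half) for i in range(half)] * 2
    rotated = rotate_half(x)
    return [
        v * math.cos(a) + r * math.sin(a)
        for v, r, a in zip(x, rotated, angles)
    ]


query = [1.0, 2.0, 3.0, 4.0]
query_at_pos_5 = apply_rotary(query, position=5)
```

Because each pair of elements undergoes a plain 2D rotation, the vector's norm is unchanged; only its direction encodes the position.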
Signed-off-by: quic-shagun <[email protected]>
Done