FP8 tensors not saved correctly #37250

Open
Rocketknight1 opened this issue Apr 3, 2025 · 2 comments
Rocketknight1 (Member) commented Apr 3, 2025

I tried making a "mini-Deepseek" for testing but encountered some issues. This works fine:

from transformers import AutoConfig, AutoModelForCausalLM

# Shrink the DeepSeek-V3 config down to a single small layer for testing
config = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-V3-0324")
config.num_hidden_layers = 1
config.intermediate_size = 1024

# Instantiate from the modified config and save the checkpoint
model = AutoModelForCausalLM.from_config(config)
model.save_pretrained("test_save")

However, when I try to reload the model, I get the following:

>>> AutoModelForCausalLM.from_pretrained("test_save")
  File "/home/matt/PycharmProjects/transformers/src/transformers/modeling_utils.py", line 806, in _load_state_dict_into_meta_model
    not hf_quantizer.check_quantized_param(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/matt/PycharmProjects/transformers/src/transformers/quantizers/quantizer_finegrained_fp8.py", line 155, in check_quantized_param
    raise ValueError("Expect quantized weights but got an unquantized weight")
ValueError: Expect quantized weights but got an unquantized weight

It seems that even though we support FP8 loading after #36828, we may not be saving FP8 tensors correctly? cc @kylesayrs
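For context on what the quantizer check expects: a fine-grained FP8 checkpoint stores each weight in low precision together with per-block inverse scales, and the load path presumably fails here because `save_pretrained` wrote full-precision weights without those companions. The sketch below illustrates the block-scaling idea only; the function names and the plain-float simulation are illustrative (the real implementation in `quantizer_finegrained_fp8.py` operates on torch float8 tensors), and `448.0` is the largest finite value of the `float8_e4m3fn` format.

```python
# Illustrative sketch of block-wise quantization as used by fine-grained FP8.
# Each block of a weight gets one scale; the checkpoint stores the scaled
# values plus the inverse scale ("scale_inv") needed to dequantize on load.
FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def quantize_block(block):
    """Scale a block of floats into the FP8 range; return values + scale_inv."""
    amax = max(abs(x) for x in block) or 1.0
    scale = FP8_E4M3_MAX / amax
    q = [x * scale for x in block]  # a real kernel would cast these to float8
    return q, 1.0 / scale           # scale_inv is saved alongside the weight

def dequantize_block(q, scale_inv):
    """Recover (approximately) the original values from q and scale_inv."""
    return [x * scale_inv for x in q]

w = [0.5, -2.0, 1.0, 4.0]
q, scale_inv = quantize_block(w)
restored = dequantize_block(q, scale_inv)
```

A checkpoint saved this way contains both the quantized tensor and its `scale_inv`; a checker like `check_quantized_param` can then reject a state dict whose weights arrive in full precision with no scales attached, which matches the traceback above.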

Rocketknight1 (Member, Author) commented

cc @MekkCyber who worked on #36026 as well

MekkCyber (Contributor) commented Apr 3, 2025

I think it's related to some changes that were made for the DeepSeek v3 integration! I'll look into it.
