push_to_hub() for Llama 3.1 8B doesn't save lm_head.weight tensor #37303

Closed
wizeng23 opened this issue Apr 5, 2025 · 4 comments

wizeng23 commented Apr 5, 2025

System Info

  • transformers version: 4.49.0
  • Platform: Linux-6.8.0-1015-gcp-x86_64-with-glibc2.35
  • Python version: 3.10.13
  • Huggingface_hub version: 0.30.1
  • Safetensors version: 0.5.3
  • Accelerate version: 1.2.1
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (GPU?): 2.5.1+cu124 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:
  • Using GPU in script?:
  • GPU type: NVIDIA A100-SXM4-40GB

Who can help?

@ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import torch
import transformers
model = transformers.AutoModel.from_pretrained("meta-llama/Llama-3.1-8B", torch_dtype=torch.bfloat16)
model.push_to_hub('wizeng23/Llama-test')
tokenizer = transformers.AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
tokenizer.push_to_hub('wizeng23/Llama-test')

Expected behavior

I'd expect the model weights to be completely unchanged when saving the model. However, it seems lm_head.weight is not saved at all. model-00004-of-00004.safetensors for Llama 3.1 8B is 1.17GB, while in the saved model it's 117MB: https://huggingface.co/wizeng23/Llama-test/tree/main. I checked the saved tensor file, and the only difference is the missing lm_head tensor (shape [128256, 4096]); that's roughly 525M params, which seems to fully account for the missing size.
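
(A quick back-of-the-envelope check of the missing tensor's size, added for reference; the numbers come from the shape quoted above, not from re-running the save.)

# Rough size of the missing lm_head tensor in bfloat16 (2 bytes per parameter).
vocab_size, hidden_size = 128256, 4096
num_params = vocab_size * hidden_size   # 525,336,576 parameters (~525M)
size_bytes = num_params * 2             # bfloat16
print(f"{num_params:,} params, ~{size_bytes / 1e9:.2f} GB")  # ~1.05 GB, i.e. roughly 1.17GB - 117MB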

@wizeng23 wizeng23 added the bug label Apr 5, 2025
@zucchini-nlp
Member

If the weights are tied, we don't save the lm_head. When loading the model, the embed_tokens weight is used for both the embedding and the head. Does loading the model back raise warnings like "Some weights are not initialized from ckpt"?
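
(A minimal sketch of how to check whether the weights are actually tied, assuming the standard transformers API; not part of the original comment.)

# Cheap check: does the checkpoint config declare tied word embeddings?
import transformers

config = transformers.AutoConfig.from_pretrained("meta-llama/Llama-3.1-8B")
print(config.tie_word_embeddings)  # expected to be False for Llama 3.1 8B, so tying shouldn't drop the head here

# With the causal-LM class loaded, tied weights would share the same storage:
model = transformers.AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
tied = model.get_input_embeddings().weight.data_ptr() == model.get_output_embeddings().weight.data_ptr()
print(tied)  # True only if embed_tokens and lm_head point at the same tensor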

@Zephyr271828
Contributor

Hi @wizeng23! As @zucchini-nlp mentioned, tied weights are sometimes the reason lm_head cannot be found in model.state_dict().keys(). You may check this comment.

However, for your issue, tied weights are not the cause. You are loading the Llama 3.1 8B model with AutoModel.from_pretrained. As a result, the type of your model is <class 'transformers.models.llama.modeling_llama.LlamaModel'>, which does not have an lm_head. You can verify this with the following code:

import os
import torch
import transformers

model_path = "meta-llama/Llama-3.1-8B"

llama_model = transformers.AutoModel.from_pretrained(model_path, torch_dtype=torch.bfloat16)

# check if lm_head is in the named_parameters of LlamaModel
print(any("lm_head" in name for name, _ in llama_model.named_parameters()))
print(type(llama_model))

# save LlamaModel and check the size of the safetensors
llama_model.save_pretrained('./ckpt/llama_model')
print('model-00004-of-00004.safetensors size: {:.2e} bytes'.format(
    os.path.getsize('./ckpt/llama_model/model-00004-of-00004.safetensors')
))

llama_for_causal_lm = transformers.AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)

# check if lm_head is in the named_parameters of LlamaForCausalLM
print(any("lm_head" in name for name, _ in llama_for_causal_lm.named_parameters()))
print([name for name, _ in llama_for_causal_lm.named_parameters() if "lm_head" in name])

# check if lm_head is in the state_dict of LlamaForCausalLM
print(any("lm_head" in key for key in llama_for_causal_lm.state_dict().keys()))
print([key for key in llama_for_causal_lm.state_dict().keys() if "lm_head" in key])

# save LlamaForCausalLM and check the size of the safetensors
llama_for_causal_lm.save_pretrained('./ckpt/llama_for_causal_lm')
print('model-00004-of-00004.safetensors size: {:.2e} bytes'.format(
    os.path.getsize('./ckpt/llama_for_causal_lm/model-00004-of-00004.safetensors')
))

# load both classes back from the LlamaModel checkpoint; only LlamaForCausalLM should warn about lm_head
llama_model = transformers.AutoModel.from_pretrained('./ckpt/llama_model', torch_dtype=torch.bfloat16)
llama_for_causal_lm = transformers.AutoModelForCausalLM.from_pretrained('./ckpt/llama_model', torch_dtype=torch.bfloat16)

Expected Results

LlamaModel

  • size of model-00004-of-00004.safetensors is 117MB
  • does not raise a warning when loading from './ckpt/llama_model'

LlamaForCausalLM

  • size of model-00004-of-00004.safetensors is 1.17GB
  • raises a warning when loading from './ckpt/llama_model': Some weights of LlamaForCausalLM were not initialized from the model checkpoint at ./ckpt/llama_model and are newly initialized: ['lm_head.weight']

@wizeng23
Author

wizeng23 commented Apr 7, 2025

Thanks for your answer, @Zephyr271828! To summarize, this is user error on my part, and I should be using AutoModelForCausalLM instead of AutoModel.
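
(For completeness, a corrected version of the original reproduction; a sketch assuming the same target repo name as above.)

import torch
import transformers

# AutoModelForCausalLM loads LlamaForCausalLM, which includes lm_head.weight,
# so the tensor is part of the checkpoint pushed to the Hub.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B", torch_dtype=torch.bfloat16
)
model.push_to_hub('wizeng23/Llama-test')
tokenizer = transformers.AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
tokenizer.push_to_hub('wizeng23/Llama-test')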

@wizeng23 wizeng23 closed this as completed Apr 7, 2025
@Zephyr271828
Contributor

Thanks for your answer, @Zephyr271828! To summarize, this is user error on my part, and I should be using AutoModelForCausalLM instead of AutoModel.

Glad it helps!
