
The output logits of Qwen-7B under 8-bit quantization contain NaN #1504


Open
Kairong-Han opened this issue Feb 9, 2025 · 2 comments

@Kairong-Han

My environment versions are as follows:

torch: 2.4.1
torchaudio: 2.4.1
torchvision: 0.19.1
cuda: 12.4
bitsandbytes: 0.42.0

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Oct_29_23:50:19_PDT_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0

When I use Qwen-7B, for example:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

device = 'cuda:0'
tokenizer = AutoTokenizer.from_pretrained(model_name)  # model_name: the Qwen-7B checkpoint path
model = AutoModelForCausalLM.from_pretrained(model_name, load_in_8bit=True)
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM, inference_mode=False, r=32, lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["gate_proj", "up_proj", "down_proj", "q_proj", "k_proj", "v_proj"],
)
model = get_peft_model(model, peft_config)

model.to(device='cuda:0')

prompt = tokenizer.encode('1+1=', return_tensors="pt", padding='max_length',
                          max_length=100, add_special_tokens=True).to('cuda:0')
labels = prompt
outputs = model(prompt.to(device), labels=labels.to(device), output_attentions=True)
```

The output of outputs.logits[0, :10, :10]:

```
tensor([[    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [ 7.0352, -1.2031,  1.0703,  0.5190,  2.5098,  6.9844,  0.9600,  2.6797,
          2.8594,  6.7031],
        [    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [    nan,     nan,     nan,     nan,     nan,     nan,     nan,     nan,
             nan,     nan],
        [ 3.9023,  5.9258, 11.9141,  4.4336,  4.9883,  2.0195,  3.9336,  5.5039,
         -0.5679,  4.5508],
        [ 3.8711,  4.6523, 12.2188,  3.4648,  5.2227,  1.4297,  3.0352,  4.8828,
         -1.4443,  4.4258]], device='cuda:0', dtype=torch.float16,
       grad_fn=<...>)
```
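For reference, a quick way to confirm which positions produce NaNs (a minimal diagnostic sketch; `outputs` is the model output from the snippet above):

```python
import torch

# Boolean mask over sequence positions: True where any logit at that
# position is NaN. Shape: (batch, seq_len).
nan_positions = torch.isnan(outputs.logits).any(dim=-1)
print(nan_positions.nonzero())       # indices of the affected positions
print(nan_positions.float().mean())  # fraction of positions that are NaN
```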

How can I solve this problem?

@matthewdouglas
Member

Please upgrade to the latest bitsandbytes. Additionally, you may wish to try with torch_dtype=torch.bfloat16.
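
For reference, a minimal sketch of both suggestions combined, assuming a recent transformers release where 8-bit loading goes through BitsAndBytesConfig (`model_name` as in the original snippet):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Passing load_in_8bit directly to from_pretrained is deprecated in
# recent transformers; BitsAndBytesConfig is the supported path.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    model_name,                      # same checkpoint as above
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,      # keep non-quantized modules in bf16
    device_map="auto",               # dispatch to GPU at load time
)
```

Note that with a quantization config the model is placed on the GPU at load time, and recent transformers versions refuse `.to()` on 8-bit bitsandbytes models, so the explicit `model.to(device='cuda:0')` from the original snippet should be dropped.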

@TimDettmers
Collaborator

What is likely happening is that quantization degrades quality somewhere in the model, and that then turns into NaN values. However, I use Qwen 2.5 7B regularly and do not see this problem, so something else might be wrong. We would appreciate more information if upgrading bitsandbytes does not resolve it.
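
One way to gather that information is to locate the first module whose output turns to NaN, e.g. with forward hooks (a minimal sketch; the `register_nan_hooks` helper is hypothetical, not part of any library):

```python
import torch

def register_nan_hooks(model):
    """Print the name of every module whose output contains NaN."""
    handles = []
    for name, module in model.named_modules():
        def hook(mod, inputs, output, name=name):
            # Module outputs may be tensors or tuples of tensors.
            out = output[0] if isinstance(output, tuple) else output
            if torch.is_tensor(out) and torch.isnan(out).any():
                print(f"NaN in output of: {name}")
        handles.append(module.register_forward_hook(hook))
    return handles

handles = register_nan_hooks(model)
outputs = model(prompt.to(device), labels=labels.to(device))
for h in handles:
    h.remove()  # clean up the hooks afterwards
```

The first module printed in forward order is where the NaNs originate.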
