qwen2-vl multi-image SFT fails and cannot be used #372

yasi0124 · 2025-02-28T03:17:37Z

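I followed the tutorial 04-Qwen2-VL-2B Lora 微调.md to run single-image SFT, and that worked fine.
For my own needs, I then wanted to test whether multi-image SFT is possible. I prepared the training data in the following format.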
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": f"{file_path}",
                "resized_height": 280,
                "resized_width": 280,
            },
            {
                "type": "image",
                "image": f"{file_path}",
                "resized_height": 280,
                "resized_width": 280,
            },
            {"type": "text", "text": "COCO Yes:"},
        ],
    }
]
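For reference, the same structure can be generated for any number of images. This is only a sketch: `build_messages` and the file names are hypothetical, and the fields simply mirror the message shown above.

```python
def build_messages(image_paths, prompt, size=280):
    """Build a single-turn user message with one image entry per path.

    Hypothetical helper: it only reproduces the dict layout shown above.
    """
    content = [
        {
            "type": "image",
            "image": str(path),
            "resized_height": size,
            "resized_width": size,
        }
        for path in image_paths
    ]
    # The text prompt goes after all image entries, as in the original message.
    content.append({"type": "text", "text": prompt})
    return [{"role": "user", "content": content}]


# Placeholder paths, for illustration only.
messages = build_messages(["a.jpg", "b.jpg"], "COCO Yes:")
```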
In my tests, inference works with this message format, but training raises an error.
The error is as follows:

I tracked it down: the dimensions of the image_grid_thw tensor have changed. (In a normal training run the tensor shape is [2,3].)
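To make the shape question concrete, here is a rough pure-Python sketch of how the rows of image_grid_thw could be predicted. It assumes a patch size of 14 and a spatial merge size of 2 (which I believe are the Qwen2-VL image processor defaults) and one [t, h, w] row per image. The function names are hypothetical and are not part of any library.

```python
# Assumed Qwen2-VL vision settings; check the processor config to confirm.
PATCH_SIZE = 14
MERGE_SIZE = 2


def expected_image_grid_thw(image_sizes):
    """Return one [t, h, w] row per image for (height, width) pairs in pixels."""
    factor = PATCH_SIZE * MERGE_SIZE
    rows = []
    for height, width in image_sizes:
        if height % factor or width % factor:
            raise ValueError(f"{height}x{width} is not a multiple of {factor}")
        # A still image has a temporal grid size of 1.
        rows.append([1, height // PATCH_SIZE, width // PATCH_SIZE])
    return rows


def expected_image_tokens(grid):
    """Count image placeholder tokens after the spatial merge."""
    return sum(t * h * w // (MERGE_SIZE ** 2) for t, h, w in grid)


single = expected_image_grid_thw([(280, 280)])
double = expected_image_grid_thw([(280, 280), (280, 280)])

print(len(single), single)  # one row per image
print(len(double), double)
print(expected_image_tokens(double))
```

Under these assumptions a single 280x280 image yields one row and two images yield two rows, so any preprocessing step that assumes a fixed number of rows (for example, squeezing a leading dimension of size 1) would behave differently on multi-image samples.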

Zeyi-Lin · 2025-03-01T06:47:31Z

Let me take a look.

Zeyi-Lin · 2025-03-01T07:21:06Z

Multi-image SFT: what scenario is this for? We could consider opening a new tutorial document to cover it.

yasi0124 · 2025-03-03T01:40:17Z

> Multi-image SFT: what scenario is this for? We could consider opening a new tutorial document to cover it.

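DeepSeek's VL model has an inference scenario like this. I tried it and found its ability in this setting limited, so I would like to try SFT for this case.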
