Qwen2-VL multi-image SFT fails and cannot be used #372

Open
yasi0124 opened this issue Feb 28, 2025 · 3 comments

Comments

@yasi0124

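I followed the tutorial 04-Qwen2-VL-2B Lora 微调.md to run single-image SFT, and that worked fine.
For my own needs, I then wanted to test whether multi-image SFT is possible. I prepared the training data in the following format.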
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": f"{file_path}",
                "resized_height": 280,
                "resized_width": 280,
            },
            {
                "type": "image",
                "image": f"{file_path}",
                "resized_height": 280,
                "resized_width": 280,
            },
            {"type": "text", "text": "COCO Yes:"},
        ],
    }
]
Result: with this message format, inference works, but training raises an error.
The error is as follows:

Image
I tracked it down: the dimensions of the image_grid_thw tensor have changed. (In normal training the tensor shape is [2,3].)

Image
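Editor's sketch, not part of the original report: the snippet below estimates what image_grid_thw should look like for the message format above, to make the shape change easier to reason about. It assumes the Qwen2-VL defaults of a 14-pixel patch and a 2x2 patch merge, and it assumes the resized sizes are used as given (280 is a multiple of 28, so no rounding is expected). The helpers `expected_image_grid_thw`, `num_image_tokens`, and `image_entry`, and the file names, are hypothetical and exist only for illustration.

```python
# Sketch only: PATCH_SIZE and MERGE_SIZE are assumed Qwen2-VL defaults,
# and all helper functions here are hypothetical, not part of any library.
PATCH_SIZE = 14  # assumed side length of one vision patch, in pixels
MERGE_SIZE = 2   # assumed spatial merge factor (2x2 patches -> 1 token)


def expected_image_grid_thw(messages):
    """Return one [t, h, w] row per image entry found in `messages`."""
    rows = []
    for message in messages:
        content = message["content"]
        if isinstance(content, str):
            continue  # plain-text turns carry no images
        for item in content:
            if item.get("type") != "image":
                continue
            grid_h = item["resized_height"] // PATCH_SIZE
            grid_w = item["resized_width"] // PATCH_SIZE
            rows.append([1, grid_h, grid_w])  # t is 1 for a still image
    return rows


def num_image_tokens(grid_thw):
    """Count image placeholder tokens after the assumed 2x2 patch merge."""
    return sum(t * h * w // (MERGE_SIZE ** 2) for t, h, w in grid_thw)


def image_entry(path):
    """Build an image entry in the same format as the issue's messages."""
    return {
        "type": "image",
        "image": path,
        "resized_height": 280,
        "resized_width": 280,
    }


text_entry = {"type": "text", "text": "COCO Yes:"}
single = [{"role": "user", "content": [image_entry("a.jpg"), text_entry]}]
double = [{"role": "user",
           "content": [image_entry("a.jpg"), image_entry("b.jpg"), text_entry]}]

grid_single = expected_image_grid_thw(single)
grid_double = expected_image_grid_thw(double)

print(len(grid_single), grid_single)  # 1 [[1, 20, 20]]
print(len(grid_double), grid_double)  # 2 [[1, 20, 20], [1, 20, 20]]
print(num_image_tokens(grid_single), num_image_tokens(grid_double))  # 100 200
```

Under these assumptions the leading dimension of image_grid_thw equals the number of images in the sample, so a one-image sample gives a [1, 3] grid and a two-image sample gives [2, 3]. One possible cause of the training error, which this thread does not confirm, is a preprocessing or collate step written for exactly one image per sample, for example one that drops the leading dimension or sizes the labels for a single block of image tokens.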

@Zeyi-Lin
Contributor

Zeyi-Lin commented Mar 1, 2025

Let me take a look.

@Zeyi-Lin
Contributor

Zeyi-Lin commented Mar 1, 2025

Multi-image SFT: what is the use case? We could consider adding a new tutorial document to cover this scenario.

@yasi0124
Author

yasi0124 commented Mar 3, 2025

Multi-image SFT: what is the use case? We could consider adding a new tutorial document to cover this scenario.

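DeepSeek's VL model has an inference scenario like the one shown below. When I tried it, the model's ability in this kind of inference was limited, so I would like to try SFT for this case.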

Image
