FEAT: ovis2 #3170

Draft
wants to merge 4 commits into
base: main

Conversation

Minamiyama
Collaborator

@Minamiyama Minamiyama commented Apr 1, 2025

fix #3151

@XprobeBot XprobeBot added this to the v1.x milestone Apr 1, 2025
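Adds an implementation of the Ovis multimodal model, including a visual tokenizer, dataset processing, training callbacks, and related modules. The main changes are:
1. Adds the visual tokenizer (VisualTokenizer) and its configuration class, with support for several vision models (such as CLIP, SigLIP, and AIMv2).
2. Adds a multimodal dataset processing module that preprocesses image and video data and handles conversation-format data.
3. Adds a training callback module for monitoring model training state and adjusting parameters.
4. Adds the Ovis model configuration class and implementation, supporting the processing of multimodal inputs and generation.
5. Adds utilities, including constant definitions, log printing, and data preprocessing.

Together, these changes provide complete support for training and inference with the multimodal model.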

Successfully merging this pull request may close these issues.

[NEW MODEL][VL]Ovis2
2 participants