We are hiring

We are a team committed to language-barrier-free exploration. Contact us: [email protected]

Computer Vision Intern/Engineer

We are looking for engineers and interns who are passionate about multimodal language models, OCR, and PDF document translation to join our team.

Responsibilities:
Develop and optimize multimodal language models (AI models combining image and text)
Develop and improve OCR (Optical Character Recognition) technology to enhance text recognition accuracy across various document types
Design and implement PDF document translation and processing systems
Collaborate with team members to solve technical challenges and drive product innovation
Stay current with the latest technical developments in the industry and apply them to practical projects

Requirements:
Proficiency with deep learning frameworks such as PyTorch
Fundamental knowledge of computer vision and natural language processing
Understanding of OCR technology; prior project experience preferred
Familiarity with multimodal models such as CLIP or GPT-4V; prior project experience preferred

Preferred qualifications:
Research or project experience in multimodal models, document processing, or OCR
Contributions to open-source projects or GitHub repositories
Experience with large-scale AI model training or fine-tuning
Knowledge of PDF document structure and processing techniques
