[Model][2/N] Automatic conversion of CrossEncoding model #19978
Conversation
Summary of Changes
Hello @noooop, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request is part of a series aimed at improving the automatic conversion of models, specifically focusing on cross-encoding and scoring capabilities. It involves a significant refactor of the classification adapter, unifying the 'score' task under 'classify' for cross-encoder models, and enhancing the underlying pooling logic. These changes contribute to a more robust and consistent framework for handling various pooling-based model tasks.
Highlights
- Refactoring and Renaming: The `as_classification_model` adapter function has been renamed to `as_seq_cls_model` to better reflect its broader applicability to both sequence classification and scoring tasks. This change is reflected across documentation, tests, and core logic.
- Task Unification and Re-mapping: The `score` task is being unified under the `classify` task for cross-encoder models. For other models, if `score` is specified, it will now resolve to `embed`. This streamlines task handling within the system's configuration.
- Enhanced Adapter Logic: The `as_seq_cls_model` adapter has been significantly refactored. It now includes a dedicated `pooler` method that centralizes the logic for applying the classification/scoring head and handles different pooling types. The adapted models are also now explicitly marked as implementing `SupportsCrossEncoding`. (A minimal illustrative sketch of the adapter pattern follows this list.)
- Qwen2 Integration: The Qwen2 model is now explicitly integrated with the new `as_seq_cls_model` adapter, allowing it to be automatically converted for sequence classification tasks.
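For intuition, here is a minimal, hypothetical sketch of what such an adapter pattern can look like; the wrapper class, last-token pooling, and head wiring below are illustrative only, not vLLM's actual implementation:

```python
import torch
import torch.nn as nn


def as_seq_cls_model_sketch(base_model: nn.Module,
                            hidden_size: int,
                            num_labels: int) -> nn.Module:
    """Illustrative only: wrap a decoder-only LM for classification/scoring."""

    class SeqClsWrapper(nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.base_model = base_model
            # Classification/scoring head on top of the pooled hidden state.
            self.score = nn.Linear(hidden_size, num_labels, bias=False)

        def pooler(self, hidden_states: torch.Tensor) -> torch.Tensor:
            # Pool the hidden state of the last token, apply the head,
            # and return class probabilities.
            last_hidden = hidden_states[:, -1, :]   # [batch, hidden]
            logits = self.score(last_hidden)        # [batch, num_labels]
            return torch.softmax(logits, dim=-1)

        def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
            # Assumes base_model returns [batch, seq, hidden] hidden states.
            hidden_states = self.base_model(input_ids)
            return self.pooler(hidden_states)

    return SeqClsWrapper()
```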
Does this work for you?

> Score should be an API rather than a task, so let's get rid of the score task. When task_option == "embed" or *ForSequenceClassification & num_labels == 1, allow users to use the score API. For compatibility, we still allow users to use --task score.
I am OK with this change. How about @maxdebayser @22quinn?
I think it makes total sense to enable the score API for embedding models. But are we going to disable the …
This PR addresses some complex issues found while working on #19675.
If strings are properly concatenated, the score and classify APIs can produce exactly the same results, so we should allow users to use the classify API, especially for LLM-as-reranker models, where the classify API is more flexible. In fact, the architectures of cross-encoder (reranker) models and classification models are exactly the same, both being *ForSequenceClassification; we cannot distinguish them. A sketch of this equivalence follows below.
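As an illustration of that equivalence, here is a sketch assuming vLLM's offline `LLM.score` and `LLM.classify` entry points; the checkpoint name and the manual concatenation are placeholders, since the real pairing depends on the model's tokenizer, and whether one engine instance exposes both entry points depends on the vLLM version:

```python
from vllm import LLM

# A *ForSequenceClassification cross-encoder with num_labels == 1 (placeholder).
llm = LLM(model="BAAI/bge-reranker-v2-m3", task="classify")

query = "What is the capital of France?"
document = "Paris is the capital and largest city of France."

# The score API pairs query and document internally.
score_output = llm.score(query, document)

# If the strings are concatenated the same way the tokenizer pairs them
# (separator tokens included), the classify API yields the same single score.
classify_output = llm.classify([query + " " + document])
```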
Not in the case of …
Thank you for pointing out that position_ids may differ between the cross-encoder model and the classification model. I'll see if I can find a flag to make that judgment. However, many models use the same base model for fine-tuning both the cross-encoder and the classification model, so it might really be impossible to tell them apart.
Agreed, perhaps we'll just have to trust that the users know what they are doing.
Perhaps we should choose a more appropriate task name than "classify". But such a change may be hard to make backward compatible.
Signed-off-by: wang.yuqi <[email protected]>
Ready for review.
LGTM
This feels like the right direction. I think we can actually consolidate all of these:
The different tasks "embed", "classify", and "reward" do not intersect with one another. The main issue arises from the overlap between "score" and both "embed" and "classify".
Score should be an API rather than a task, so let's get rid of the score task.
score task
See vllm/vllm/outputs.py, lines 482 to 500, at commit 5111642.
score task == embed task + BertLikeModelForSequenceClassification & num_labels == 1 ?
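Put differently, the two halves of that equation can be sketched in plain PyTorch/Transformers terms; this is illustrative pseudocode, with `embed`, `model`, and `tokenizer` assumed to be supplied by the caller:

```python
import torch
import torch.nn.functional as F


def score_via_embeddings(embed, query: str, document: str) -> float:
    # "score == embed + similarity": embed both texts independently,
    # then compare the two vectors.
    q, d = embed(query), embed(document)      # each: [hidden]
    return F.cosine_similarity(q, d, dim=-1).item()


def score_via_seq_cls(model, tokenizer, query: str, document: str) -> float:
    # "score == *ForSequenceClassification with num_labels == 1":
    # encode the pair together; the head emits a single logit.
    inputs = tokenizer(query, document, return_tensors="pt")
    logit = model(**inputs).logits.squeeze()  # scalar
    return torch.sigmoid(logit).item()
```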
classify task
If your model is not in the above list, we will try to automatically convert the model using `as_classification_model`. By default, the class probabilities are extracted from the softmaxed hidden state corresponding to the last token.
e.g. Qwen2ForSequenceClassification: jason9693/Qwen2.5-1.5B-apeach
classify task == *ForSequenceClassification?
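For example, running the checkpoint above through the classify path might look like this (a sketch assuming vLLM's offline `LLM.classify` API; the output field name may differ by version):

```python
from vllm import LLM

# Qwen2ForSequenceClassification checkpoint mentioned above; the adapter
# converts the decoder-only Qwen2 model for classification automatically.
llm = LLM(model="jason9693/Qwen2.5-1.5B-apeach", task="classify")

outputs = llm.classify(["vLLM is a high-throughput inference engine."])
for output in outputs:
    print(output.outputs.probs)  # per-class probabilities
```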
More confusing error messages
This makes users wonder whether BertLikeModelForSequenceClassification with num_labels == 1 can only be used for the score task.
score api
The score API (which outputs similarity scores between sentence pairs) is very useful; we should keep it.
Score should be an API rather than a task, so let's get rid of the score task.
When task_option == "embed", or for *ForSequenceClassification models with num_labels == 1, allow users to use the score API.
For compatibility, we still allow users to use --task score.
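A hypothetical sketch of that resolution rule; the function and argument names here are illustrative, not the actual vLLM config code:

```python
def score_api_allowed(task_option: str, architecture: str, num_labels: int) -> bool:
    """Illustrative only: should the score API be exposed for this model?"""
    if task_option == "embed":
        # Embedding models: score a pair via embedding similarity.
        return True
    if architecture.endswith("ForSequenceClassification") and num_labels == 1:
        # Single-label cross-encoders: the head directly emits a pair score.
        return True
    return False


def remap_score_task(is_cross_encoder: bool) -> str:
    # For compatibility, "--task score" is still accepted and re-mapped.
    return "classify" if is_cross_encoder else "embed"
```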
Essential Elements of an Effective PR Description Checklist
- Update supported_models.md and examples for a new model.

Purpose
This PR addresses some complex issues found when dealing with #19675. (It turned out that the discrepancy there was caused by PyStemmer producing slightly different stop words, leading to slightly different BM25 results.)
Test Plan
pytest tests/test_config.py::test_auto_task
pytest tests/test_config.py::test_score_task
Test Result
Managed to keep all tests passing.
(Optional) Documentation Update
Document everything when the journey ends.
Known issues