[Model] support modernbert #16648
Conversation
Signed-off-by: 唯勤 <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI, which starts running only a small and essential subset of CI tests to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge. 🚀
Signed-off-by: 唯勤 <[email protected]>
Thanks for adding this model! To get the model to pass CI, please add this model to the test files as mentioned here: https://docs.vllm.ai/en/latest/contributing/model/tests.html
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
@DarkLight1337, I do not know why the …
Retrying
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]>
Thanks for adding this model!
In fact, this commit supports the common architectures of …
@xsank @lionelvillard I fine-tuned answerdotai/ModernBERT-base as a binary text classifier; does that fall under ModernBertForSequenceClassification? And when you support the "ModernBertForMaskedLM" architecture, will you need to call softmax on the returned logits?
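On the softmax question above: here is a minimal sketch, in plain PyTorch rather than anything vLLM-specific, of turning raw sequence-classification logits into probabilities and a class; the tensor values are made up for illustration.

```python
# Minimal sketch: softmax over binary-classifier logits (values made up).
import torch

logits = torch.tensor([[-0.3, 1.7]])    # raw scores for [class 0, class 1]
probs = torch.softmax(logits, dim=-1)   # normalize into probabilities
predicted = int(probs.argmax(dim=-1))   # pick the most likely class
print(probs, predicted)
```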
Signed-off-by: 唯勤 <[email protected]> Co-authored-by: 唯勤 <[email protected]>
It seems that the ModernBert series models have some tiny differences; let me see how to support all the features in the same class. Finally, I feel so sorry that I made a mistake: I only supported a part of them, and there is still some work to do.
@xsank thanks for your reply. Could you also add it for the architecture ModernBertForMaskedLM? That would be amazing.
Signed-off-by: 唯勤 <[email protected]> Co-authored-by: 唯勤 <[email protected]> Signed-off-by: Yang Wang <[email protected]>
@xsank What about the /pooling endpoint for classification?
Signed-off-by: 唯勤 <[email protected]> Co-authored-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]> Co-authored-by: 唯勤 <[email protected]>
Signed-off-by: 唯勤 <[email protected]> Co-authored-by: 唯勤 <[email protected]> Signed-off-by: Agata Dobrzyniewicz <[email protected]>
Signed-off-by: 唯勤 <[email protected]> Co-authored-by: 唯勤 <[email protected]> Signed-off-by: Mu Huai <[email protected]>
Hi @xsank, can you also add ModernBertForMaskedLM?
Hi @mhillebrand, the pooling is needed to support /classify and has not been added to V0 and V1?
Yes, …
You might need to set …
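Since the suggestion above is truncated, here is a minimal offline sketch of exercising classification through vLLM's Python API; it assumes the LLM.classify method and the task="classify" option from recent vLLM releases, and the checkpoint path is a hypothetical fine-tuned classifier, not a real model in this thread.

```python
# Hedged sketch: offline classification with vLLM. Assumes a recent vLLM
# where LLM accepts task="classify" and exposes .classify(); the model
# path below is a hypothetical fine-tuned ModernBERT classifier.
from vllm import LLM

llm = LLM(model="/path/to/finetuned-modernbert-classifier", task="classify")
(output,) = llm.classify(["an example sentence to classify"])
print(output.outputs.probs)  # per-class probabilities
```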
@DarkLight1337 yes, /classify works now for online serving when using the OpenAI vLLM container from the nightly build. But when I try to use /score I get this error; any idea?
Server reply to the client:
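For reference, here is a hedged sketch of what a /score request against a vLLM OpenAI-compatible server can look like; the host, port, and the text_1/text_2 payload shape follow the vLLM online-serving docs as I understand them and may differ between versions.

```python
# Hedged sketch: calling the /score endpoint of a running vLLM server.
# Host/port are assumptions; the text_1/text_2 payload shape follows the
# vLLM online-serving docs and may vary across versions.
import requests

resp = requests.post(
    "http://localhost:8000/score",
    json={
        "model": "Alibaba-NLP/gte-reranker-modernbert-base",
        "text_1": "ping",
        "text_2": ["pong", "pongpong"],
    },
)
print(resp.status_code, resp.json())
```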
Can you show the command you used to serve the model?
@DarkLight1337
Client:
Full server logs (including startup):
@xsank can you help look into this?
@DarkLight1337 when using /classify, is there a way to add a post-processing step using a threshold before deciding the class?
No, to achieve that I suggest getting the logits/probabilities and processing them yourself.
You mean using the probs field from the classify output?
Yes
@DarkLight1337 An alternative would be to add a def classifyAndApplyThreshold to vLLM and rebuild the container...
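In the spirit of the suggestion to post-process the probabilities yourself, here is a minimal client-side sketch; the threshold value and positive-class index are made-up assumptions you would tune for your classifier.

```python
# Hedged sketch of client-side thresholding on classify output probs.
# Threshold and positive-class index are made-up assumptions to tune.
def apply_threshold(probs, positive_idx=1, threshold=0.7):
    """Return the positive class only if its probability clears the cutoff."""
    return positive_idx if probs[positive_idx] >= threshold else 1 - positive_idx

probs = [0.35, 0.65]           # e.g. taken from output.outputs.probs
print(apply_threshold(probs))  # -> 0, since 0.65 < 0.7
```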
The results of the Alibaba-NLP/gte-reranker-modernbert-base model seem to differ slightly from sentence-transformers.

```python
import pytest
import torch
from sentence_transformers import CrossEncoder
from vllm import LLM

model_name = "Alibaba-NLP/gte-reranker-modernbert-base"
st_model = CrossEncoder(model_name, model_kwargs={"torch_dtype": torch.float32})
vllm_model = LLM(model_name, task="score", dtype="float32")

sentences = [
    ("ping", "pong"),
    ("ping", "pong" * 16),
    ("ping", "pong" * 24),
    ("ping", "pong" * 32),
    ("ping", "pong" * 48),
    ("ping", "pong" * 64),
    ("ping", "pong" * 128),
]

st_scores = st_model.predict(sentences)

texts_1 = [x[0] for x in sentences]
texts_2 = [x[1] for x in sentences]
outputs = vllm_model.score(texts_1, texts_2)
vllm_scores = [output.outputs.score for output in outputs]

def test_close(s1, s2):
    return float(s1) == pytest.approx(float(s2), rel=0.01)

print([test_close(st_scores[i], vllm_scores[i]) for i in range(len(st_scores))])
```

Output:
Support ModernBert; test passed on Alibaba-NLP/gte-reranker-modernbert-base.
FIX #11347
e.g.:
Signed-off-by: xsank [email protected]