
[Frontend] Vendor exported templates to vllm.tools #18094


Open · wants to merge 4 commits into base: main

Conversation

aarnphm
Collaborator

@aarnphm aarnphm commented May 13, 2025

This PR vendors all exported tool-calling / chat templates from examples into vllm/tools, so end users no longer have to clone the repo to use the templates.

The format is as follows:

vllm serve <model> --chat-template tool_chat_template_hermes

Previous functionality is still preserved if users want to use a custom template path.
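For illustration, the lookup described above (vendored name first, custom path as a fallback) could be sketched like this; the function name and directory argument are assumptions for the sketch, not the PR's actual implementation:

```python
from pathlib import Path


def resolve_chat_template(value: str, vendored_dir: Path) -> Path:
    """Resolve a --chat-template value: try a vendored template name
    first, then fall back to treating the value as a filesystem path."""
    vendored = vendored_dir / f"{value}.jinja"
    if vendored.is_file():
        return vendored
    # Preserve previous behaviour: the value may be a custom template path.
    custom = Path(value)
    if custom.is_file():
        return custom
    raise FileNotFoundError(f"unknown chat template: {value!r}")
```

With this scheme, `vllm serve <model> --chat-template tool_chat_template_hermes` resolves against the vendored directory, while an explicit path keeps working unchanged.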

Signed-off-by: Aaron Pham [email protected]

@aarnphm aarnphm requested a review from DarkLight1337 May 13, 2025 18:45

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@aarnphm aarnphm requested review from russellb and mgoin May 13, 2025 18:45
@mergify mergify bot added documentation Improvements or additions to documentation frontend labels May 13, 2025
@aarnphm aarnphm added this to the v0.9.0 milestone May 13, 2025
Signed-off-by: Aaron Pham <[email protected]>
@DarkLight1337
Member

I have actually moved a bunch of chat templates to vllm/transformers_utils/chat_templates recently, perhaps we can keep using that directory?

@aarnphm
Collaborator Author

aarnphm commented May 14, 2025

I plan to move some of the tools items here as well, and chat_templates should be one of them imo

@aarnphm
Collaborator Author

aarnphm commented May 14, 2025

I guess the purpose of vllm/transformers_utils/chat_templates is to automatically apply chat templates based on model type?

I would prefer to set this explicitly rather than have it applied automatically, because that could lead to unexpected behaviour imo

@DarkLight1337
Member

I guess the purpose of vllm/transformers_utils/chat_templates is to automatically apply chat templates based on model type?

I would prefer to set this explicitly rather than have it applied automatically, because that could lead to unexpected behaviour imo

Only registry.py is responsible for doing that. Not all of the chat templates in that directory have to be applied automatically.

@aarnphm
Collaborator Author

aarnphm commented May 14, 2025

so how would we use this feature?

vllm serve <model> --chat-template deepseek_vl_2

@russellb
Member

Looking at your example in the PR description:

vllm serve <model> --chat-template tool_chat_template_hermes

What would you think about cleaning up the UX by adjusting the naming scheme where possible?

In this case, it would be really nice if the command was --chat-template hermes, for example.

It would also be nice to have a way to list the available templates, though that can be a different PR.

@aarnphm
Collaborator Author

aarnphm commented May 14, 2025

Looking at your example in the PR description:

vllm serve <model> --chat-template tool_chat_template_hermes

What would you think about cleaning up the UX by adjusting the naming scheme where possible?

Yeah I think we can clean up the naming here. Initially I was thinking (very much inspired by nix):

--chat-template hermes # alpaca | deepseek_r1 | etc.
--chat-template tool/hermes # with tool_call for hermes, tool/deepseek
--chat-template hf:org/new_model#tool_call_template.jinja <-- this can live in the HF model repo
--chat-template github:vllm-project/vllm/main#path/to/template.jinja <-- TODO, maybe

By default, all of the templates known to us would be the "officially" supported ones. Model makers could then also host their own, to reduce the maintenance burden on our side.
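The nix-style specifiers sketched above could be parsed along these lines; the scheme names come from the comment, but the parser itself is a hypothetical sketch, not an agreed-upon design:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemplateSpec:
    scheme: str           # "builtin", "tool", "hf", or "github"
    repo: Optional[str]   # org/model (hf:) or org/repo/ref (github:)
    name: str             # template name, or file path within the repo


def parse_template_spec(value: str) -> TemplateSpec:
    """Parse specifiers like 'hermes', 'tool/hermes',
    'hf:org/new_model#tool_call_template.jinja',
    'github:vllm-project/vllm/main#path/to/template.jinja'."""
    for scheme in ("hf", "github"):
        if value.startswith(scheme + ":"):
            repo, _, path = value[len(scheme) + 1:].partition("#")
            return TemplateSpec(scheme, repo, path)
    if value.startswith("tool/"):
        return TemplateSpec("tool", None, value[len("tool/"):])
    return TemplateSpec("builtin", None, value)
```

One nice property of a scheme prefix like hf: or github: is that it keeps bare names unambiguous, so plain `--chat-template hermes` never collides with a remote source.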

It would also be nice to have a way to list the available templates, though that can be a different PR.

This is actually included in the CLI description, though I think I can also add it to the EngineArgs docstring so it shows up in the docs.

@DarkLight1337
Member

DarkLight1337 commented May 15, 2025

so how would we use this feature?

vllm serve <model> --chat-template deepseek_vl_2

registry.py currently only does #17805. It is only applied when the user doesn't provide a chat template and no chat template is available from the model.
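The resolution order described here (explicit user template first, registry default only as a last resort) might look roughly like this; the function and argument names are illustrative, not the actual registry.py API:

```python
from typing import Optional


def pick_chat_template(
    user_template: Optional[str],
    model_template: Optional[str],
    registry_default: Optional[str],
) -> Optional[str]:
    """Resolution order as described above: an explicit user template
    always wins; the model's own template is used next; the registry's
    model-type default is only applied when neither is available."""
    if user_template is not None:
        return user_template
    if model_template is not None:
        return model_template
    return registry_default
```

This keeps the auto-apply behaviour strictly a fallback, which matches the concern about automatic application leading to unexpected behaviour.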

@aarnphm aarnphm removed this from the v0.9.0 milestone May 15, 2025
@chaunceyjiang
Contributor

Yeah I think we can cleanup the naming here. Initially I was thinking (very much inspired by nix):

--chat-template hermes # alpaca | deepseek_r1 | etc.
--chat-template tool/hermes # with tool_call for hermes, tool/deepseek
--chat-template hf:org/new_model#tool_call_template.jinja <-- this can live in the HF model repo
--chat-template github:vllm-project/vllm/main#path/to/template.jinja <-- TODO, maybe

Perhaps this is also a solution:
--chat-template https://xxxxx.template_vlm2vec.jinja
which would allow using a file from the web.
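Fetching a template from a URL as suggested above could be done with the standard library alone; this is only a sketch of the idea under assumed names (the cache directory and function are not vLLM's implementation):

```python
import tempfile
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Optional


def fetch_remote_template(url: str, cache_dir: Optional[Path] = None) -> Path:
    """Download a chat template from a URL into a local cache directory
    and return the local path, so the rest of the loader can treat
    remote and local templates uniformly."""
    cache_dir = cache_dir or Path(tempfile.gettempdir()) / "chat_template_cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Derive a local filename from the URL path component.
    name = Path(urllib.parse.urlparse(url).path).name or "template.jinja"
    dest = cache_dir / name
    with urllib.request.urlopen(url) as resp:
        dest.write_bytes(resp.read())
    return dest
```

Caching to a deterministic local path also means the downloaded file can be handed to the existing custom-template-path code path without further changes.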

Labels
documentation Improvements or additions to documentation frontend tool-calling