Support Vertex custom chat + embedding + rerank models #16273
alliecatowo started this conversation in Suggestion
Self Checks
1. Is this request related to a challenge you're experiencing? Tell me about your story.
Currently, it appears that self-hosted models in Vertex are unsupported. The official Vertex plugin simply has a hardcoded list of available models. I suspect most people using Vertex are running either custom models or models deployed from the Model Garden.
Now that Vertex can one-click deploy Hugging Face models, this is an even more attractive feature. Building on that, Vertex supports the Hugging Face Text Embeddings Inference (TEI) API as well. The Vertex plugin should be able to handle chat, embedding, rerank, vision, TTS, and speech-to-text modalities.
The TEI model provider isn't good enough on its own, since Vertex has some quirks with its endpoints, and that plugin expects an API key rather than reading from a service account or Application Default Credentials when running in a Google environment, which is a big benefit of the Vertex plugin. Not to mention the trouble of adding 5 or 10 hosted models one by one and pasting in the same API key each time, compared to having your custom models prepopulated in the available model list.
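To illustrate the credentials point, here is a rough sketch of calling a custom Vertex endpoint with Application Default Credentials instead of an API key. It assumes the standard shared `:predict` REST route and the `google-auth` package (plus `requests` for its transport); the helper names (`predict_url`, `adc_access_token`, `build_request`) are made up for this example, and the shape of each item in `instances` depends on the container you deployed, so treat the payload as a placeholder. Dedicated endpoints use a different hostname, which this sketch does not cover.

```python
import json
import urllib.request


def predict_url(project: str, region: str, endpoint_id: str) -> str:
    """Build the REST URL for online prediction on a shared Vertex AI endpoint."""
    host = f"{region}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )


def adc_access_token() -> str:
    """Get an OAuth access token from Application Default Credentials (no API key)."""
    # Imported lazily so the other helpers work without google-auth installed.
    import google.auth
    from google.auth.transport.requests import Request

    credentials, _project = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(Request())  # populates credentials.token
    return credentials.token


def build_request(url: str, token: str, instances: list) -> urllib.request.Request:
    """Wrap the payload in the {"instances": [...]} envelope used by :predict."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

On a GCE, GKE, or Cloud Run host, something like `build_request(predict_url(project, region, endpoint_id), adc_access_token(), instances)` followed by `urllib.request.urlopen(...)` would send the call using the attached service account, with no key to paste in.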
Hopefully there are some other Vertex users out there. If anyone has had success using custom embedding and reranking models with Vertex, feel free to share too!
2. Additional context or comments
No response
3. Can you help us with this feature?