
feature: add support for custom model provider #77

Open

pansusu wants to merge 1 commit into main

Conversation

@pansusu (Contributor) commented Apr 15, 2025

This change adds a new provider parameter to the agent configuration, allowing users to specify custom model providers. This is particularly useful when working with third-party cloud platforms that support OpenAI-compatible APIs but use different model naming conventions.

Key changes:

  • Added provider parameter to model factory function
  • Support provider specification through CLI and agent config
  • Enable custom provider override for model selection

This enhancement makes the agent more flexible and compatible with various OpenAI-compatible API providers, such as:

  • Third-party cloud platforms
  • Self-hosted model services
  • Custom model deployments

Example usage:

@agent(model="qwen-vl-max-latest", provider="openai")
async def my_agent():
    # Agent implementation
    ...

CLI: uv run web.py --provider openai

@pansusu (Contributor, Author) commented Apr 15, 2025

If the third-party platform already supports one of the existing providers' API formats, there is no need to add a duplicate provider. This approach fits my needs well, and the same should apply to others.

@evalstate (Owner)

Thank you. Quick question (as I am working on the API key management right this second)... how do you manage the API key? Do you just use the platform-specific setup (so if it's Azure, you use the Azure provider and the OpenAI config)?

@pansusu (Contributor, Author) commented Apr 15, 2025

> Thank you. Quick question (as I am working on the API key management right this second)... how do you manage the API key? Do you just use the platform-specific setup (so if it's Azure, you use the Azure provider and the OpenAI config)?

Currently, I use OpenAI's configuration directly and just change the base_url.

@evalstate (Owner)

Hi @pansusu, sorry it has taken me a while to get to this; there has been quite a lot of change in this area recently. Having a quick look at the code, I'm not sure how this works. At the moment the provider name comes from the leftmost part of the model string, which identifies the appropriate factory for building the LLM. Without that mapping, what's used?

@pansusu (Contributor, Author) commented Apr 21, 2025

src/mcp_agent/llm/model_factory.py

Here is the core code.

if provider_name is not None:
    provider = cls.PROVIDER_MAP.get(provider_name.lower())
    if provider:
        return ModelConfig(provider=provider, model_name=model_string, reasoning_effort=None)

First, if a provider can be resolved from the user-supplied parameter, the user-provided model_string is used directly as the model name and a ModelConfig is returned, without going through the model mapping. Otherwise, the original logic runs.
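
To show how the two paths fit together, here is a rough, self-contained sketch. Only the override branch mirrors the snippet above; the PROVIDER_MAP contents, the parse_model name, and the dot-splitting fallback are simplified illustrations rather than the actual model_factory.py code:

from dataclasses import dataclass
from enum import Enum

# Illustrative stand-ins for the real classes in model_factory.py.
class Provider(Enum):
    OPENAI = "openai"
    ANTHROPIC = "anthropic"

PROVIDER_MAP = {"openai": Provider.OPENAI, "anthropic": Provider.ANTHROPIC}

@dataclass
class ModelConfig:
    provider: Provider
    model_name: str
    reasoning_effort: str | None = None

def parse_model(model_string: str, provider_name: str | None = None) -> ModelConfig:
    # New path: an explicit provider override keeps the model string untouched.
    if provider_name is not None:
        provider = PROVIDER_MAP.get(provider_name.lower())
        if provider:
            return ModelConfig(provider=provider, model_name=model_string)

    # Existing path (simplified): the leftmost segment names the provider,
    # e.g. "openai.gpt-4o" -> Provider.OPENAI with model name "gpt-4o".
    prefix, _, remainder = model_string.partition(".")
    provider = PROVIDER_MAP.get(prefix.lower())
    if provider and remainder:
        return ModelConfig(provider=provider, model_name=remainder)

    raise ValueError(f"Cannot determine provider for '{model_string}'")

# parse_model("qwen-vl-max-latest", provider_name="openai")
#   -> ModelConfig(provider=Provider.OPENAI, model_name="qwen-vl-max-latest")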

Additionally, some model providers support OpenAI's request/response format, so the OpenAI tooling can be used directly for requests and responses; the model names then follow the provider's naming convention rather than OpenAI's original model names.
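
For example (just a sketch; the endpoint URL and key are placeholders, and qwen-vl-max-latest stands in for whatever model the platform exposes), the standard OpenAI client can talk to such a platform once base_url points at it:

from openai import OpenAI

# Placeholder endpoint and key: substitute the third-party platform's values.
client = OpenAI(
    base_url="https://example-provider.com/v1",
    api_key="YOUR_PROVIDER_API_KEY",
)

# The model name follows the provider's own naming, not OpenAI's.
response = client.chat.completions.create(
    model="qwen-vl-max-latest",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)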

@evalstate (Owner)

Thanks @pansusu -- how about the base_url and API key that are supposed to be used in this case? That's where I'm stuck, I think. We also sometimes set different default parameters (e.g. parallel_tool_calls has a different default between the Google and DeepSeek endpoints); I'm going to do some rework in this area to have better provider/model defaults.
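
One possible shape for that, purely as a sketch (the boolean values below are placeholders, not the real defaults):

# Illustrative only: per-provider request defaults that explicit arguments override.
PROVIDER_DEFAULTS: dict[str, dict] = {
    "openai":   {"parallel_tool_calls": True},   # placeholder value
    "google":   {"parallel_tool_calls": False},  # placeholder value
    "deepseek": {"parallel_tool_calls": True},   # placeholder value
}

def build_request_args(provider: str, **overrides) -> dict:
    args = dict(PROVIDER_DEFAULTS.get(provider, {}))
    args.update(overrides)  # explicit caller settings always win
    return args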
