[Story]: Allow for fallback models #1221

Open
JAORMX opened this issue Mar 5, 2025 · 3 comments

@JAORMX
Contributor

JAORMX commented Mar 5, 2025

Description

The idea is to provide a way for CodeGate to fall back to another provider and model when there's an issue with the original one CodeGate is proxying.

Additional Context

When actively working with a model, one might start hitting issues such as rate limits or actual downtime from the provider itself. In this case, I was trying Claude 3.7 and started hitting rate limit issues. I'd like to fall back to an equivalent model to seamlessly continue working.

@JAORMX
Contributor Author

JAORMX commented Mar 5, 2025

@aponcedeleonch and I have been discussing this and came up with the concept of "pseudo-providers": fake provider endpoints within CodeGate that encapsulate more complex logic. We could use this concept to implement fallback as requested in this issue. We could also use it to start implementing A/B testing or other more complex workflows.

@aponcedeleonch
Contributor

I've been thinking, and it would probably make sense to add fallback rules or conditions. That way we would only fall back if a certain condition is met. At least these two come to mind:

  • Error: Fall back if an error occurs in the primary model, like the rate limit example you pointed out above.
  • Balance: A 50/50 split between the models. This would be useful for A/B testing, for example.
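
To make the two rules concrete, here is a rough sketch of how they could be expressed. Everything in it is an assumption for illustration: `RuleKind`, `FallbackRule`, `use_secondary`, and the choice of HTTP 429 and 5xx as "error" signals are hypothetical names and defaults, not existing CodeGate code.

```python
import random
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class RuleKind(Enum):
    """Hypothetical rule types, mirroring the two bullets above."""
    ERROR = "error"
    BALANCE = "balance"


@dataclass(frozen=True)
class FallbackRule:
    kind: RuleKind
    # Only used by BALANCE: fraction of requests routed to the secondary model.
    secondary_share: float = 0.5


def use_secondary(
    rule: FallbackRule,
    primary_status: Optional[int] = None,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Decide whether a request should go to the secondary model.

    ERROR: only when the primary answered with a rate limit (429) or a
    server error (5xx). Treating exactly these codes as errors is an assumption.
    BALANCE: a random split, decided without looking at the primary at all.
    """
    if rule.kind is RuleKind.ERROR:
        if primary_status is None:
            return False
        return primary_status == 429 or primary_status >= 500
    return rng() < rule.secondary_share
```

Passing `rng` in as a parameter keeps the balance rule deterministic in tests; in real use the default `random.random` would apply.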

@JAORMX
Contributor Author

JAORMX commented Mar 5, 2025

@aponcedeleonch adding rules makes sense; that's similar to the fallback mechanism we had in Minder for REST endpoints and data sources.

Balancing does not fall within the fallback provider implementation IMO and should be something else. It could use the same mechanism ("pseudo-providers"), but it would be another implementation.

Think of pseudo-providers as the abstract base class, and fallback as the implementation. Balancing and A/B testing would also be separate implementations.
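
A minimal sketch of that split, to show the shape rather than a real design. All names here (`PseudoProvider`, `FallbackProvider`, `ProviderError`, the `complete` method, and the stand-in backends) are hypothetical, and backends are modeled as plain callables instead of actual CodeGate providers.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error for a failing backend (rate limit, downtime)."""


# Assumption: a backend is just a function from prompt to completion text.
Backend = Callable[[str], str]


class PseudoProvider(ABC):
    """Abstract base: looks like a provider, but wraps more complex logic."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...


class FallbackProvider(PseudoProvider):
    """Tries each backend in order and returns the first successful answer."""

    def __init__(self, backends: Sequence[Backend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)

    def complete(self, prompt: str) -> str:
        last_error: Optional[ProviderError] = None
        for backend in self._backends:
            try:
                return backend(prompt)
            except ProviderError as exc:
                # Remember the failure and move on to the next backend.
                last_error = exc
        raise ProviderError("all backends failed") from last_error


# Stand-in backends for illustration only.
def rate_limited_primary(prompt: str) -> str:
    raise ProviderError("429: rate limited")


def healthy_secondary(prompt: str) -> str:
    return f"secondary: {prompt}"
```

With these stand-ins, `FallbackProvider([rate_limited_primary, healthy_secondary]).complete("hi")` skips the failing primary and answers from the secondary. A balancing or A/B testing pseudo-provider would be a sibling subclass of `PseudoProvider` with its own `complete`.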
