Apex AI Proxy is a free, personal AI Gateway that runs on Cloudflare Workers. It aggregates multiple AI service providers behind a unified OpenAI-compatible API, allowing you to overcome rate limits and take advantage of free quotas from different providers.
Why you'll care:
- Completely Free: Runs entirely on Cloudflare Workers' free plan
- Load Balancing: Distributes requests across multiple providers to overcome rate limits
- Maximize Free Quotas: Take advantage of free tiers from different AI providers
- Multiple API Keys: Register multiple keys for the same service provider
- OpenAI Client Compatible: Works with any library that speaks OpenAI's API format
2025-04 Update
Apex AI Proxy now supports the new OpenAI `/v1/responses`-style API, the latest standard for OpenAI-compatible services. This update is crucial for:
- Ecosystem Compatibility: Seamless integration with the latest OpenAI tools (e.g., Codex) and clients that require the `/v1/responses` API.
- Future-Proofing: Ensures your proxy remains compatible with evolving OpenAI standards.

What changed:
- `/v1/responses` API Support: You can now use the new response-based endpoints, unlocking compatibility with next-gen OpenAI clients and tools.
- Response ID-based Endpoints: Some endpoints now operate on a `response_id`. To support this, a new `kv_namespaces` configuration is required for caching and managing response data.
- Configuration Change: Add the `kv_namespaces` field to your configuration (see below) to enable proper response caching and retrieval.
module.exports = {
// ...existing config...
kv_namespaces: [
{ binding: 'RESPONSE_KV', id: 'your-kv-namespace-id' }
],
};
Note: Without this configuration, some `/v1/responses` endpoints will not function correctly.
Why this matters:
- Unlocks new OpenAI ecosystem tools (like Codex)
- Aligns with the latest API standards
- Enables advanced features that require response ID tracking
For more details, see the updated usage and configuration sections below.
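As an illustration, here is a minimal sketch of calling the new endpoints through the proxy with the official openai Python SDK. The proxy URL, API key, and model name are placeholders; substitute whatever you deployed and configured in wrangler-config.js.

```python
# Minimal sketch: exercising the /v1/responses endpoints through the proxy.
# base_url, api_key, and model are placeholders for your own deployment/config.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-proxy.workers.dev/v1",
    api_key="your-configured-api-key",
)

# Create a response; the proxy routes it to one of the providers
# configured for this model.
response = client.responses.create(
    model="gpt-4o-mini",
    input="Why does the proxy need a KV namespace for responses?",
)
print(response.output_text)

# Response ID-based endpoints: fetch the same response later by its ID.
# This is where the RESPONSE_KV binding comes into play.
fetched = client.responses.retrieve(response.id)
print(fetched.status)
```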
- Multi-Provider Support: Aggregate Azure, DeepSeek, Aliyun, and more behind one API
- Smart Request Distribution: Automatically routes requests to available providers (see the sketch after this list)
- Multiple API Key Management: Register multiple keys for the same provider to further increase limits
- Protocol Translation: Handles different provider authentication methods and API formats
- Robust Error Handling: Gracefully handles provider errors and fails over between providers
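The sketch below illustrates the idea behind smart distribution, multiple keys, and failover. It is a conceptual Python example, not the proxy's actual Workers implementation: one simple strategy is to rotate round-robin over the (provider, API key) candidates for a model and fall back to the next candidate on error. The provider URLs and keys mirror the configuration example further down.

```python
# Conceptual sketch only: one way to spread requests over several
# (provider, API key) candidates with failover. The real proxy may use a
# different strategy; all names and URLs here are illustrative.
import itertools
import requests

CANDIDATES = {
    "DeepSeek-R1": [
        {"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
         "api_key": "your-first-aliyun-key", "model": "deepseek-r1"},
        {"base_url": "https://api.deepinfra.com/v1/openai",
         "api_key": "your-deepinfra-key", "model": "deepseek-ai/DeepSeek-R1"},
    ],
}

# One round-robin cursor per model, so successive calls rotate providers.
_cursor = {name: itertools.cycle(entries) for name, entries in CANDIDATES.items()}

def chat(model: str, messages: list[dict]) -> dict:
    entries = CANDIDATES[model]
    cursor = _cursor[model]
    last_error = None
    for _ in range(len(entries)):              # try each candidate at most once
        target = next(cursor)
        try:
            resp = requests.post(
                f"{target['base_url']}/chat/completions",
                headers={"Authorization": f"Bearer {target['api_key']}"},
                json={"model": target["model"], "messages": messages},
                timeout=60,
            )
            resp.raise_for_status()
            return resp.json()                 # first success wins
        except requests.RequestException as err:
            last_error = err                   # fail over to the next candidate
    raise RuntimeError(f"All providers failed for {model}") from last_error
```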
Getting started:
- Clone the repository:
git clone https://github.com/loadchange/apex-ai-proxy.git
cd apex-ai-proxy
- Install dependencies:
pnpm install
- Configure your providers (in `wrangler-config.js`):
// First, define your providers with their base URLs and API keys
const providerConfig = {
aliyuncs: {
base_url: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
api_keys: ['your-aliyun-key'],
},
deepinfra: {
base_url: 'https://api.deepinfra.com/v1/openai',
api_keys: ['your-deepinfra-key'],
},
azure: {
base_url: 'https://:name.azure.com/openai/deployments/:model',
api_keys: ['your-azure-key'],
},
// Add more providers as needed
};
// Then, configure your models and assign providers to them
const modelProviderConfig = {
'gpt-4o-mini': {
providers: [
{
provider: 'azure',
model: 'gpt-4o-mini',
},
// Add more providers for the same model
],
},
'DeepSeek-R1': {
providers: [
{
provider: 'aliyuncs',
model: 'deepseek-r1',
},
{
provider: 'deepinfra',
model: 'deepseek-ai/DeepSeek-R1',
},
// You can still override provider settings for specific models if needed
{
provider: 'azure',
base_url: 'https://your-custom-endpoint.azure.com/openai/deployments/DeepSeek-R1',
api_key: 'your-custom-azure-key',
model: 'DeepSeek-R1',
},
],
},
};
- Deploy to Cloudflare Workers:
pnpm run deploy
Running your traffic through the proxy addresses several pain points:
- Rate Limit Issues: By distributing requests across multiple providers, you can overcome rate limits imposed by individual services
- Cost Optimization: Take advantage of free tiers from different providers
- API Consistency: Use a single, consistent API format (OpenAI-compatible) regardless of the underlying provider
- Simplified Integration: No need to modify your existing code that uses OpenAI clients
# Works with ANY OpenAI client!
from openai import OpenAI
client = OpenAI(
base_url="https://your-proxy.workers.dev/v1",
api_key="your-configured-api-key"
)
# Use any model you've configured in your proxy
response = client.chat.completions.create(
model="DeepSeek-R1", # This will be routed to one of your configured providers
messages=[{"role": "user", "content": "Why is this proxy awesome?"}]
)
You can configure multiple API keys for the same provider to further increase your rate limits:
{
provider: 'aliyuncs',
base_url: 'https://dashscope.aliyuncs.com/compatible-mode/v1',
api_keys: [
'your-first-aliyun-key',
'your-second-aliyun-key',
'your-third-aliyun-key'
],
model: 'deepseek-r1',
}
Found a bug or want to add support for more providers? PRs are welcome!