Cloudflare Worker AI Endpoint

An OpenAI API-compatible endpoint for models served by Cloudflare Workers AI.

Supports multiple models, multiple API keys, streaming output, and more.

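✨ Features

🔄 Dynamically fetches the latest Cloudflare AI model list
🔑 Supports multiple API keys to prevent unauthorized use
🎯 Supports configuring and calling multiple AI models
🌊 Supports streaming output (SSE)
✅ Full parameter validation
🌐 CORS enabled by default
📝 Detailed error messages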

🚀 Quick Start

Installation

# Clone the project
git clone https://github.com/yourusername/cf-ai-endpoint.git
cd cf-ai-endpoint

# Install dependencies
npm install

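Configuration

Set the API key (multiple keys are supported, separated by commas):

# E.g.: generate a single API key and store it as the API_KEY secret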
openssl rand -base64 32 | tr -d '/+' | cut -c1-32 | npx wrangler secret put API_KEY
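
To illustrate how a comma-separated `API_KEY` value could be checked against an incoming request, here is a minimal sketch. The helper name `isAuthorized` and its exact behavior are assumptions made for illustration; the Worker's actual implementation may differ.

```javascript
// Hypothetical helper (not taken from this project): checks a request's
// Authorization header against the comma-separated API_KEY secret.
function isAuthorized(authorizationHeader, apiKeySecret) {
  // Split the secret into individual keys, ignoring whitespace and blanks.
  const validKeys = (apiKeySecret || "")
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);

  // Expect the "Bearer <key>" scheme used by the OpenAI API.
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader || "");
  if (!match) return false;

  return validKeys.includes(match[1].trim());
}
```

With a secret of `key-a,key-b`, a request carrying `Authorization: Bearer key-b` would pass this check, while a missing header or an unknown key would not.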

Configure the list of allowed models (wrangler.toml):

# E.g.: allow the following 3 models to be called
[vars]
MODELS = "@cf/meta/llama-2-7b-chat-int8,@cf/meta/llama-2-7b-chat-fp16,@cf/mistral/mistral-7b-instruct-v0.1"

You can also set the corresponding environment variables manually in the Cloudflare dashboard.
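
As a rough sketch of how the `MODELS` variable might be applied, the snippet below parses the comma-separated value and keeps only the models on that list. The function names are hypothetical, and the choice to return nothing when the list is empty is an assumption of this sketch, not documented behavior.

```javascript
// Hypothetical helpers (not taken from this project).

// Turn "a, b,c" into ["a", "b", "c"], dropping empty entries.
function parseAllowedModels(modelsVar) {
  return (modelsVar || "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}

// Keep only the models whose id appears in the MODELS allowlist.
// Assumption: an empty allowlist yields an empty result.
function filterModels(availableModels, modelsVar) {
  const allowed = new Set(parseAllowedModels(modelsVar));
  return availableModels.filter((model) => allowed.has(model.id));
}
```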

Warning

In the dashboard, store API_KEY as a Secret; it defines the API key(s) required to access the endpoint. Keep the key(s) in a safe place.

Deployment

npm run deploy
# or (on Wrangler v3 and later, `wrangler publish` is deprecated in favor of `npx wrangler deploy`)
npx wrangler publish

📖 API Reference

1. List available models

GET /v1/models
Authorization: Bearer <your-api-key>

Example response:

{
  "object": "list",
  "data": [
    {
      "id": "@cf/meta/llama-2-7b-chat-int8",
      "object": "model",
      "created": 1708661717835,
      "owned_by": "cloudflare",
      "permission": [],
      "root": "@cf/meta/llama-2-7b-chat-int8",
      "parent": null,
      "metadata": {
        "description": "Quantized (int8) generative text model...",
        "task": "Text Generation",
        "context_window": "8192"
      }
    }
  ]
}
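
The metadata in a response like the one above can be flattened for display or selection. The sketch below assumes the response has exactly the shape of the sample; `summarizeModels` is a hypothetical helper, not part of this project. Note that `context_window` is a string in the sample, so it is converted to a number here.

```javascript
// Hypothetical helper: reduce a /v1/models response to the fields a client
// typically needs. Assumes the response shape shown in the sample above.
function summarizeModels(listResponse) {
  return listResponse.data.map((model) => ({
    id: model.id,
    task: model.metadata?.task ?? null,
    // context_window arrives as a string (e.g. "8192") in the sample.
    contextWindow: model.metadata?.context_window
      ? Number(model.metadata.context_window)
      : null,
  }));
}
```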

2. Text completion

POST /v1/completions
Authorization: Bearer <your-api-key>
Content-Type: application/json

{
    "model": "@cf/meta/llama-2-7b-chat-int8",
    "prompt": "Hello",
    "stream": true
}

3. Chat completion

POST /v1/chat/completions
Authorization: Bearer <your-api-key>
Content-Type: application/json

{
    "model": "@cf/meta/llama-2-7b-chat-int8",
    "messages": [
        {"role": "user", "content": "Hello"}
    ],
    "stream": true
}

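👀 Supported parameters

Parameter	Type	Default	Range	Description
model	string	-	-	Required. Model ID
stream	boolean	false	-	Whether to stream the response
max_tokens	integer	256	≥1	Maximum number of tokens to generate
temperature	number	0.6	0-5	Sampling temperature
top_p	number	-	0-2	Nucleus sampling probability
top_k	integer	-	1-50	Number of highest-probability tokens considered (top-k sampling)
frequency_penalty	number	-	0-2	Frequency penalty
presence_penalty	number	-	0-2	Presence penalty
repetition_penalty	number	-	0-2	Repetition penalty
seed	integer	-	1-9999999999	Random seed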
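
The ranges in the table can be checked on the client before a request is sent. The following is a minimal sketch that encodes the table as data; `PARAM_RULES` and `validateParams` are hypothetical names, and the error messages are illustrative rather than the ones the Worker returns.

```javascript
// Numeric rules copied from the parameter table above.
const PARAM_RULES = {
  max_tokens: { integer: true, min: 1 },
  temperature: { min: 0, max: 5 },
  top_p: { min: 0, max: 2 },
  top_k: { integer: true, min: 1, max: 50 },
  frequency_penalty: { min: 0, max: 2 },
  presence_penalty: { min: 0, max: 2 },
  repetition_penalty: { min: 0, max: 2 },
  seed: { integer: true, min: 1, max: 9999999999 },
};

// Hypothetical validator: returns a list of error strings (empty if valid).
function validateParams(body) {
  const errors = [];
  if (typeof body.model !== "string" || body.model.length === 0) {
    errors.push("model is required and must be a string");
  }
  if (body.stream !== undefined && typeof body.stream !== "boolean") {
    errors.push("stream must be a boolean");
  }
  for (const [name, rule] of Object.entries(PARAM_RULES)) {
    const value = body[name];
    if (value === undefined) continue; // optional parameter not supplied
    if (typeof value !== "number" || Number.isNaN(value)) {
      errors.push(`${name} must be a number`);
      continue;
    }
    if (rule.integer && !Number.isInteger(value)) {
      errors.push(`${name} must be an integer`);
      continue;
    }
    if (rule.min !== undefined && value < rule.min) {
      errors.push(`${name} must be >= ${rule.min}`);
    }
    if (rule.max !== undefined && value > rule.max) {
      errors.push(`${name} must be <= ${rule.max}`);
    }
  }
  return errors;
}
```

Returning a list of all violations, rather than stopping at the first one, lets a client report every problem in a single pass.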

💻 Usage examples

Node.js (using the OpenAI SDK)

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://your-worker.workers.dev/v1",
  apiKey: "<your-api-key>",
});

// Streaming response
const stream = await openai.chat.completions.create({
  model: "@cf/meta/llama-2-7b-chat-int8",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

fetch API

const response = await fetch("https://your-worker.workers.dev/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: "Bearer <your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "@cf/meta/llama-2-7b-chat-int8",
    messages: [{ role: "user", content: "Hello" }],
    stream: true,
  }),
});

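// Process the streaming response
const reader = response.body.getReader();
// Reuse one decoder with { stream: true } so multi-byte characters
// split across chunks are decoded correctly
const decoder = new TextDecoder();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true }));
}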
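
The loop above prints raw stream text. To extract the generated tokens, the text has to be split into SSE events. The sketch below assumes the endpoint follows the OpenAI streaming convention (`data: <json>` events separated by a blank line, terminated by `data: [DONE]`) and uses LF line endings; `createSseParser` is a hypothetical helper, not part of this project.

```javascript
// Hypothetical helper: returns a stateful function that accepts decoded
// text chunks and returns the complete SSE events found so far.
function createSseParser() {
  let buffer = ""; // holds any partial event carried over between chunks

  return function feed(textChunk) {
    buffer += textChunk;
    const events = [];
    let boundary;
    // SSE events are separated by a blank line.
    while ((boundary = buffer.indexOf("\n\n")) !== -1) {
      const rawEvent = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      for (const line of rawEvent.split("\n")) {
        if (!line.startsWith("data:")) continue; // skip comments and other fields
        const payload = line.slice(5).trim();
        if (payload === "[DONE]") {
          events.push({ done: true });
        } else if (payload) {
          events.push({ done: false, data: JSON.parse(payload) });
        }
      }
    }
    return events;
  };
}
```

Inside the read loop, each decoded chunk would be passed to the parser, and `event.data.choices[0]?.delta?.content` read from every event that is not marked `done`.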

📝 Notes

Note

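Because the model list is fetched through the Cloudflare AI API, the first request may be slightly slower
Set a stricter CORS policy in production
Multiple API keys are supported, which simplifies access management and key rotation
The model configuration supports dynamic filtering, so the list of available models can be adjusted at any time
Content length is limited to 131072 characters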
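
One way to tighten the CORS policy for production is to echo the request's origin only when it appears on an allowlist. This is a sketch under assumptions: `corsHeaders` and the allowlist parameter are hypothetical, and how these headers would be wired into this Worker's responses is not shown in the source.

```javascript
// Hypothetical helper: build CORS headers that allow only known origins.
function corsHeaders(requestOrigin, allowedOrigins) {
  const headers = {
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Authorization, Content-Type",
    // Responses differ per origin, so caches must key on the Origin header.
    Vary: "Origin",
  };
  // Echo the origin back only if it is explicitly allowed; otherwise omit
  // the header so the browser blocks the cross-origin read.
  if (requestOrigin && allowedOrigins.includes(requestOrigin)) {
    headers["Access-Control-Allow-Origin"] = requestOrigin;
  }
  return headers;
}
```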

📄 License

MIT
