@@ -18,12 +18,15 @@ CodeGate works with the following AI model providers through Continue:
 
 - Local / self-managed:
   - [Ollama](https://ollama.com/)
-  - [llama.cpp](https://github.com/ggerganov/llama.cpp)
   - [vLLM](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html)
+  - [llama.cpp](https://github.com/ggerganov/llama.cpp) (advanced)
 - Hosted:
-  - [OpenRouter](https://openrouter.ai/)
   - [Anthropic](https://www.anthropic.com/api)
   - [OpenAI](https://openai.com/api/)
+  - [OpenRouter](https://openrouter.ai/)
+
+You can also configure [CodeGate muxing](../features/muxing.md) to select your
+provider and model using [workspaces](../features/workspaces.mdx).
 
 ## Install the Continue plugin
 
@@ -92,8 +95,9 @@ To configure Continue to send requests through CodeGate:
      "apiBase": "http://127.0.0.1:8989/<provider>"
    ```
 
-   Replace `/<provider>` with one of: `/anthropic`, `/ollama`, `/openai`, or
-   `/vllm` to match your LLM provider.
+   Replace `/<provider>` with one of: `/v1/mux` (for CodeGate muxing),
+   `/anthropic`, `/ollama`, `/openai`, `/openrouter`, or `/vllm` to match your
+   LLM provider.
 
    If you used a different API port when launching the CodeGate container,
    replace `8989` with your custom port number.
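For example, with CodeGate muxing selected and a custom API port such as `9989`
(an illustrative value, not a default), the resulting entry would read:

```json
"apiBase": "http://127.0.0.1:9989/v1/mux"
```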
@@ -111,49 +115,37 @@ provider. Replace the values in ALL_CAPS. The configuration syntax is the same
 for VS Code and JetBrains IDEs.
 
 <Tabs groupId="provider" queryString="provider">
-<TabItem value="ollama" label="Ollama" default>
-
-You need Ollama installed on your local system with the server running
-(`ollama serve`) to use this provider.
-
-CodeGate connects to `http://host.docker.internal:11434` by default. If you
-changed the default Ollama server port or to connect to a remote Ollama
-instance, launch CodeGate with the `CODEGATE_OLLAMA_URL` environment variable
-set to the correct URL. See [Configure CodeGate](../how-to/configure.md).
+<TabItem value="mux" label="CodeGate muxing" default>
 
-Replace `MODEL_NAME` with the names of model(s) you have installed locally using
-`ollama pull`. See Continue's
-[Ollama provider documentation](https://docs.continue.dev/customize/model-providers/ollama).
+First, configure your [provider(s)](../features/muxing.md#add-a-provider) and
+select a model for each of your
+[workspace(s)](../features/workspaces.mdx#manage-workspaces) in the CodeGate
+dashboard.
 
-We recommend the [Qwen2.5-Coder](https://ollama.com/library/qwen2.5-coder)
-series of models. Our minimum recommendation is:
-
-- `qwen2.5-coder:7b` for chat
-- `qwen2.5-coder:1.5b` for autocomplete
-
-These models balance performance and quality for typical systems with at least 4
-CPU cores and 16GB of RAM. If you have more compute resources available, our
-experimentation shows that larger models do yield better results.
+Configure Continue as shown. Note: the `model` and `apiKey` settings are
+required by Continue, but their values are not used.
 
 ```json title="~/.continue/config.json"
 {
   "models": [
     {
-      "title": "CodeGate-Ollama",
-      "provider": "ollama",
-      "model": "MODEL_NAME",
-      "apiBase": "http://localhost:8989/ollama"
+      "title": "CodeGate-Mux",
+      "provider": "openai",
+      "model": "fake-value-not-used",
+      "apiKey": "fake-value-not-used",
+      "apiBase": "http://localhost:8989/v1/mux"
     }
   ],
   "modelRoles": {
-    "default": "CodeGate-Ollama",
-    "summarize": "CodeGate-Ollama"
+    "default": "CodeGate-Mux",
+    "summarize": "CodeGate-Mux"
   },
   "tabAutocompleteModel": {
-    "title": "CodeGate-Ollama-Autocomplete",
-    "provider": "ollama",
-    "model": "MODEL_NAME",
-    "apiBase": "http://localhost:8989/ollama"
+    "title": "CodeGate-Mux-Autocomplete",
+    "provider": "openai",
+    "model": "fake-value-not-used",
+    "apiKey": "fake-value-not-used",
+    "apiBase": "http://localhost:8989/v1/mux"
   }
 }
 ```
@@ -195,6 +187,54 @@ Replace `YOUR_API_KEY` with your
 }
 ```
 
+</TabItem>
+<TabItem value="ollama" label="Ollama">
+
+You need Ollama installed on your local system with the server running
+(`ollama serve`) to use this provider.
+
+CodeGate connects to `http://host.docker.internal:11434` by default. If you
+changed the default Ollama server port or want to connect to a remote Ollama
+instance, launch CodeGate with the `CODEGATE_OLLAMA_URL` environment variable
+set to the correct URL. See [Configure CodeGate](../how-to/configure.md).
+
+Replace `MODEL_NAME` with the name of a model you have installed locally using
+`ollama pull`. See Continue's
+[Ollama provider documentation](https://docs.continue.dev/customize/model-providers/ollama).
+
+We recommend the [Qwen2.5-Coder](https://ollama.com/library/qwen2.5-coder)
+series of models. Our minimum recommendation is:
+
+- `qwen2.5-coder:7b` for chat
+- `qwen2.5-coder:1.5b` for autocomplete
+
+These models balance performance and quality for typical systems with at least 4
+CPU cores and 16GB of RAM. If you have more compute resources available, our
+experimentation shows that larger models do yield better results.
+
+```json title="~/.continue/config.json"
+{
+  "models": [
+    {
+      "title": "CodeGate-Ollama",
+      "provider": "ollama",
+      "model": "MODEL_NAME",
+      "apiBase": "http://localhost:8989/ollama"
+    }
+  ],
+  "modelRoles": {
+    "default": "CodeGate-Ollama",
+    "summarize": "CodeGate-Ollama"
+  },
+  "tabAutocompleteModel": {
+    "title": "CodeGate-Ollama-Autocomplete",
+    "provider": "ollama",
+    "model": "MODEL_NAME",
+    "apiBase": "http://localhost:8989/ollama"
+  }
+}
+```
+
 </TabItem>
 <TabItem value="openai" label="OpenAI">
 
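As a sketch of the Ollama note above: if your Ollama server runs on another
host, you would launch the CodeGate container with `CODEGATE_OLLAMA_URL`
pointing at it. The image name, ports, and address below are assumptions based
on the standard install command, not values from this page; adjust them to your
environment.

```bash
# Example only: route CodeGate's Ollama provider to a remote instance.
# Image name and ports are assumed from the install guide; the URL is a placeholder.
docker run -d -p 8989:8989 \
  -e CODEGATE_OLLAMA_URL=http://192.168.1.50:11434 \
  ghcr.io/stacklok/codegate:latest
```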
@@ -234,13 +274,26 @@ Replace `YOUR_API_KEY` with your
 </TabItem>
 <TabItem value="openrouter" label="OpenRouter">
 
-CodeGate's vLLM provider supports OpenRouter, a unified interface for hundreds
-of commercial and open source models. You need an
-[OpenRouter](https://openrouter.ai/) account to use this provider.
+OpenRouter is a unified interface for hundreds of commercial and open source
+models. You need an [OpenRouter](https://openrouter.ai/) account to use this
+provider.
+
+:::info Known issues
+
+**Auto-completion support**: currently, CodeGate's `/openrouter` endpoint does
+not work with Continue's `tabAutocompleteModel` setting for fill-in-the-middle
+(FIM). We are
+[working to resolve this issue](https://github.com/stacklok/codegate/issues/980).
+
+**DeepSeek models**: a bug in the current release of Continue affects DeepSeek
+models (for example, `deepseek/deepseek-r1`); you need to run the pre-release
+version of the Continue extension.
+
+:::
 
 Replace `MODEL_NAME` with one of the
 [available models](https://openrouter.ai/models), for example
-`qwen/qwen-2.5-coder-32b-instruct`.
+`anthropic/claude-3.5-sonnet`.
 
 Replace `YOUR_API_KEY` with your
 [OpenRouter API key](https://openrouter.ai/keys).
@@ -250,20 +303,56 @@ Replace `YOUR_API_KEY` with your
   "models": [
     {
       "title": "CodeGate-OpenRouter",
-      "provider": "vllm",
+      "provider": "openrouter",
       "model": "MODEL_NAME",
       "apiKey": "YOUR_API_KEY",
-      "apiBase": "http://localhost:8989/vllm"
+      "apiBase": "http://localhost:8989/openrouter"
     }
   ],
   "modelRoles": {
     "default": "CodeGate-OpenRouter",
     "summarize": "CodeGate-OpenRouter"
+  }
+}
+```
+
+</TabItem>
+<TabItem value="vllm" label="vLLM">
+
+You need a
+[vLLM server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html)
+running locally or access to a remote server to use this provider.
+
+CodeGate connects to `http://localhost:8000` by default. If you changed the
+default vLLM server port or want to connect to a remote vLLM instance, launch
+CodeGate with the `CODEGATE_VLLM_URL` environment variable set to the correct
+URL. See [Configure CodeGate](../how-to/configure.md).
+
+A vLLM server hosts a single model. Continue automatically selects the available
+model, so the `model` parameter is not required. See Continue's
+[vLLM provider guide](https://docs.continue.dev/customize/model-providers/more/vllm)
+for more information.
+
+If your server requires an API key, replace `YOUR_API_KEY` with the key.
+Otherwise, remove the `apiKey` parameter from both sections.
+
+```json title="~/.continue/config.json"
+{
+  "models": [
+    {
+      "title": "CodeGate-vLLM",
+      "provider": "vllm",
+      "apiKey": "YOUR_API_KEY",
+      "apiBase": "http://localhost:8989/vllm"
+    }
+  ],
+  "modelRoles": {
+    "default": "CodeGate-vLLM",
+    "summarize": "CodeGate-vLLM"
   },
   "tabAutocompleteModel": {
-    "title": "CodeGate-OpenRouter-Autocomplete",
+    "title": "CodeGate-vLLM-Autocomplete",
     "provider": "vllm",
-    "model": "MODEL_NAME",
     "apiKey": "YOUR_API_KEY",
     "apiBase": "http://localhost:8989/vllm"
   }
@@ -331,49 +420,6 @@ In the Continue config file, replace `MODEL_NAME` with the file name without the
 }
 ```
 
-</TabItem>
-<TabItem value="vllm" label="vLLM">
-
-You need a
-[vLLM server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html)
-running locally or access to a remote server to use this provider.
-
-CodeGate connects to `http://localhost:8000` by default. If you changed the
-default Ollama server port or to connect to a remote Ollama instance, launch
-CodeGate with the `CODEGATE_VLLM_URL` environment variable set to the correct
-URL. See [Configure CodeGate](../how-to/configure.md).
-
-A vLLM server hosts a single model. Continue automatically selects the available
-model, so the `model` parameter is not required. See Continue's
-[vLLM provider guide](https://docs.continue.dev/customize/model-providers/more/vllm)
-for more information.
-
-If your server requires an API key, replace `YOUR_API_KEY` with the key.
-Otherwise, remove the `apiKey` parameter from both sections.
-
-```json title="~/.continue/config.json"
-{
-  "models": [
-    {
-      "title": "CodeGate-vLLM",
-      "provider": "vllm",
-      "apiKey": "YOUR_API_KEY",
-      "apiBase": "http://localhost:8989/vllm"
-    }
-  ],
-  "modelRoles": {
-    "default": "CodeGate-vLLM",
-    "summarize": "CodeGate-vLLM"
-  },
-  "tabAutocompleteModel": {
-    "title": "CodeGate-vLLM-Autocomplete",
-    "provider": "vllm",
-    "apiKey": "YOUR_API_KEY",
-    "apiBase": "http://localhost:8989/vllm"
-  }
-}
-```
-
 
 </TabItem>
 </Tabs>