
Commit 5b01a28

docs(api): update model descriptions for previous vendors (#3747)
* docs(api): update model descriptions and configuration examples for various AI providers
* docs(api): clarify API endpoint usage and model specifications in documentation
1 parent 56c25c1 commit 5b01a28

File tree

9 files changed: +145 -97 lines changed

````diff
@@ -1,23 +1,31 @@
 # DeepSeek
 
-[DeepSeek](https://www.deepseek.com/) offers a suite of AI models, such as [DeepSeek V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) and [DeepSeek Coder](https://huggingface.co/collections/deepseek-ai/deepseekcoder-v2-666bf4b274a5f556827ceeca), which perform well in coding tasks. Tabby supports DeepSeek's models for both code completion and chat.
+[DeepSeek](https://www.deepseek.com/) is an AI company that develops large language models specialized in coding and general tasks. Their models include [DeepSeek V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) for general tasks and [DeepSeek Coder](https://huggingface.co/collections/deepseek-ai/deepseekcoder-v2-666bf4b274a5f556827ceeca) specifically optimized for programming tasks.
 
-Below is an example
+## Chat model
+
+DeepSeek provides an OpenAI-compatible chat API interface.
 
 ```toml title="~/.tabby/config.toml"
-# Chat model configuration
 [model.chat.http]
-# Deepseek's chat interface is compatible with OpenAI's chat API.
 kind = "openai/chat"
 model_name = "your_model"
 api_endpoint = "https://api.deepseek.com/v1"
-api_key = "secret-api-key"
+api_key = "your-api-key"
+```
+
+## Completion model
+
+DeepSeek offers a specialized completion API interface for code completion tasks.
 
-# Completion model configuration
+```toml title="~/.tabby/config.toml"
 [model.completion.http]
-# Deepseek uses its own completion API interface.
 kind = "deepseek/completion"
 model_name = "your_model"
 api_endpoint = "https://api.deepseek.com/beta"
-api_key = "secret-api-key"
+api_key = "your-api-key"
 ```
+
+## Embeddings model
+
+DeepSeek currently does not provide embedding model APIs.
````
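
Since the chat side is OpenAI-compatible, a one-off request is an easy way to sanity-check the key and endpoint before pointing Tabby at them. A minimal sketch, assuming the `openai` Python package; the model name is an illustrative placeholder, not taken from the diff:

```python
from openai import OpenAI

# Same values as api_endpoint / api_key in the config.toml above.
client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.deepseek.com/v1",
)

# "deepseek-chat" is a placeholder; substitute whichever model your account exposes.
resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Reply with one word."}],
)
print(resp.choices[0].message.content)
```
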
````diff
@@ -1,20 +1,23 @@
 # Jan AI
 
-[Jan](https://jan.ai/) is an open-source alternative to ChatGPT that runs entirely offline on your computer.
+[Jan](https://jan.ai/) is an open-source alternative to ChatGPT that runs entirely offline on your computer. It provides an OpenAI-compatible server interface that can be enabled through the Jan App's `Local API Server` UI.
 
-Jan can run a server that provides an OpenAI-equivalent chat API at https://localhost:1337,
-allowing us to use the OpenAI kinds for chat.
-To use the Jan Server, you need to enable it in the Jan App's `Local API Server` UI.
+## Chat model
 
-However, Jan does not yet provide API support for completion and embeddings.
-
-Below is an example for chat:
+Jan provides an OpenAI-compatible chat API interface.
 
 ```toml title="~/.tabby/config.toml"
-# Chat model
 [model.chat.http]
 kind = "openai/chat"
 model_name = "your_model"
 api_endpoint = "http://localhost:1337/v1"
 api_key = ""
-```
+```
+
+## Completion model
+
+Jan currently does not provide completion API support.
+
+## Embeddings model
+
+Jan currently does not provide embedding API support.
````
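
The same kind of smoke test works against Jan's local server. A sketch assuming the `openai` Python package and that the `Local API Server` is already enabled in the Jan App:

```python
from openai import OpenAI

# Jan ignores the key, but the client wants a non-empty string.
client = OpenAI(api_key="not-needed", base_url="http://localhost:1337/v1")

resp = client.chat.completions.create(
    model="your_model",  # placeholder, match the model loaded in Jan
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.choices[0].message.content)
```
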

website/docs/references/models-http-api/llama.cpp.md

+19 -8

````diff
@@ -1,22 +1,33 @@
 # llama.cpp
 
-[llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#api-endpoints) is a popular C++ library for serving gguf-based models.
+[llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md#api-endpoints) is a popular C++ library for serving gguf-based models. It provides a server implementation that supports completion, chat, and embedding functionalities through HTTP APIs.
 
-Tabby supports the llama.cpp HTTP API for completion, chat, and embedding models.
+## Chat model
+
+llama.cpp provides an OpenAI-compatible chat API interface.
+
+```toml title="~/.tabby/config.toml"
+[model.chat.http]
+kind = "openai/chat"
+api_endpoint = "http://localhost:8888"
+```
+
+## Completion model
+
+llama.cpp offers a specialized completion API interface for code completion tasks.
 
 ```toml title="~/.tabby/config.toml"
-# Completion model
 [model.completion.http]
 kind = "llama.cpp/completion"
 api_endpoint = "http://localhost:8888"
 prompt_template = "<PRE> {prefix} <SUF>{suffix} <MID>" # Example prompt template for the CodeLlama model series.
+```
 
-# Chat model
-[model.chat.http]
-kind = "openai/chat"
-api_endpoint = "http://localhost:8888"
+## Embeddings model
+
+llama.cpp provides embedding functionality through its HTTP API.
 
-# Embedding model
+```toml title="~/.tabby/config.toml"
 [model.embedding.http]
 kind = "llama.cpp/embedding"
 api_endpoint = "http://localhost:8888"
````
website/docs/references/models-http-api/llamafile.md

+18 -15

````diff
@@ -1,38 +1,41 @@
 # llamafile
 
-[llamafile](https://github.com/Mozilla-Ocho/llamafile)
-is a Mozilla Builders project that allows you to distribute and run LLMs with a single file.
+[llamafile](https://github.com/Mozilla-Ocho/llamafile) is a Mozilla Builders project that allows you to distribute and run LLMs with a single file. It embeds a llama.cpp server and provides an OpenAI API-compatible chat-completions endpoint, allowing us to use the `openai/chat`, `llama.cpp/completion`, and `llama.cpp/embedding` types.
 
-llamafile embeds a llama.cpp server and provides an OpenAI API-compatible chat-completions endpoint,
-allowing us to use the `openai/chat`, `llama.cpp/completion`, and `llama.cpp/embedding` types.
+By default, llamafile uses port `8080`, which conflicts with Tabby's default port. It is recommended to run llamafile with the `--port` option to serve on a different port, such as `8081`. For embeddings functionality, you need to run llamafile with both the `--embedding` and `--port` options.
 
-By default, llamafile uses port `8080`, which is also used by Tabby.
-Therefore, it is recommended to run llamafile with the `--port` option to serve on a different port, such as `8081`.
+## Chat model
 
-For embeddings, the embedding endpoint is no longer supported in the standard llamafile server,
-so you need to run llamafile with the `--embedding` and `--port` options.
-
-Below is an example configuration:
+llamafile provides an OpenAI-compatible chat API interface. Note that the endpoint URL must include the `v1` suffix.
 
 ```toml title="~/.tabby/config.toml"
-# Chat model
 [model.chat.http]
 kind = "openai/chat" # llamafile uses openai/chat kind
 model_name = "your_model"
 api_endpoint = "http://localhost:8081/v1" # Please add and conclude with the `v1` suffix
 api_key = ""
+```
+
+## Completion model
 
-# Completion model
+llamafile uses llama.cpp's completion API interface. Note that the endpoint URL should NOT include the `v1` suffix.
+
+```toml title="~/.tabby/config.toml"
 [model.completion.http]
-kind = "llama.cpp/completion" # llamafile uses llama.cpp/completion kind
+kind = "llama.cpp/completion"
 model_name = "your_model"
 api_endpoint = "http://localhost:8081" # DO NOT append the `v1` suffix
 api_key = "secret-api-key"
 prompt_template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>" # Example prompt template for the Qwen2.5 Coder model series.
+```
+
+## Embeddings model
 
-# Embedding model
+llamafile provides embedding functionality through llama.cpp's API interface. Note that the endpoint URL should NOT include the `v1` suffix.
+
+```toml title="~/.tabby/config.toml"
 [model.embedding.http]
-kind = "llama.cpp/embedding" # llamafile uses llama.cpp/embedding kind
+kind = "llama.cpp/embedding"
 model_name = "your_model"
 api_endpoint = "http://localhost:8082" # DO NOT append the `v1` suffix
 api_key = ""
````
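
The `v1` rule above is easiest to see side by side: chat goes through the OpenAI-compatible route under `/v1`, while completion uses llama.cpp's native route at the server root. A sketch with `requests`, assuming llamafile is serving on port 8081 as recommended:

```python
import requests

# Chat: OpenAI-compatible route, /v1 prefix required.
chat = requests.post(
    "http://localhost:8081/v1/chat/completions",
    json={"model": "your_model",
          "messages": [{"role": "user", "content": "hi"}]},
    timeout=30,
)
print(chat.json()["choices"][0]["message"]["content"])

# Completion: llama.cpp's native route, no /v1 prefix.
comp = requests.post(
    "http://localhost:8081/completion",
    json={"prompt": "def fib(n):", "n_predict": 16},
    timeout=30,
)
print(comp.json()["content"])
```
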
````diff
@@ -1,21 +1,31 @@
 # Mistral AI
 
-[Mistral](https://mistral.ai/) is a platform that provides a suite of AI models. Tabby supports Mistral's models for code completion and chat.
+[Mistral](https://mistral.ai/) is a platform that provides a suite of AI models specialized in various tasks, including code generation and natural language processing. Their models are known for high performance and efficiency in both code completion and chat interactions.
 
-To connect Tabby with Mistral's models, you need to apply the following configurations in the `~/.tabby/config.toml` file:
+## Chat model
 
-```toml title="~/.tabby/config.toml"
-# Completion Model
-[model.completion.http]
-kind = "mistral/completion"
-model_name = "codestral-latest"
-api_endpoint = "https://api.mistral.ai"
-api_key = "secret-api-key"
+Mistral provides a specialized chat API interface.
 
-# Chat Model
+```toml title="~/.tabby/config.toml"
 [model.chat.http]
 kind = "mistral/chat"
 model_name = "codestral-latest"
 api_endpoint = "https://api.mistral.ai/v1"
-api_key = "secret-api-key"
+api_key = "your-api-key"
 ```
+
+## Completion model
+
+Mistral offers a dedicated completion API interface for code completion tasks.
+
+```toml title="~/.tabby/config.toml"
+[model.completion.http]
+kind = "mistral/completion"
+model_name = "codestral-latest"
+api_endpoint = "https://api.mistral.ai"
+api_key = "your-api-key"
+```
+
+## Embeddings model
+
+Mistral currently does not provide embedding model APIs.
````
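
A quick credential check against Mistral's chat endpoint, sketched with `requests`; the model name is the `codestral-latest` value from the config above:

```python
import requests

r = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": "Bearer your-api-key"},
    json={
        "model": "codestral-latest",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
r.raise_for_status()
print(r.json()["choices"][0]["message"]["content"])
```
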

website/docs/references/models-http-api/ollama.md

+20 -9

````diff
@@ -1,24 +1,35 @@
 # Ollama
 
-[ollama](https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion) is a popular model provider that offers a local-first experience.
+[ollama](https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-completion) is a popular model provider that offers a local-first experience. It provides support for various models through HTTP APIs, including completion, chat, and embedding functionalities.
 
-Tabby supports the ollama HTTP API for completion, chat, and embedding models.
+## Chat model
+
+Ollama provides an OpenAI-compatible chat API interface.
+
+```toml title="~/.tabby/config.toml"
+[model.chat.http]
+kind = "openai/chat"
+model_name = "mistral:7b"
+api_endpoint = "http://localhost:11434/v1"
+```
+
+## Completion model
+
+Ollama offers a specialized completion API interface for code completion tasks.
 
 ```toml title="~/.tabby/config.toml"
-# Completion model
 [model.completion.http]
 kind = "ollama/completion"
 model_name = "codellama:7b"
 api_endpoint = "http://localhost:11434"
 prompt_template = "<PRE> {prefix} <SUF>{suffix} <MID>" # Example prompt template for the CodeLlama model series.
+```
 
-# Chat model
-[model.chat.http]
-kind = "openai/chat"
-model_name = "mistral:7b"
-api_endpoint = "http://localhost:11434/v1"
+## Embeddings model
 
-# Embedding model
+Ollama provides embedding functionality through its HTTP API.
+
+```toml title="~/.tabby/config.toml"
 [model.embedding.http]
 kind = "ollama/embedding"
 model_name = "nomic-embed-text"
````
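
The `model_name` values above must correspond to models already pulled into Ollama. A sketch that lists what the local daemon is serving, assuming the default port and the `requests` package:

```python
import requests

# Ollama's native API lists pulled models under /api/tags.
tags = requests.get("http://localhost:11434/api/tags", timeout=10).json()
for model in tags.get("models", []):
    print(model["name"])  # e.g. "codellama:7b", "nomic-embed-text:latest"
```
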
````diff
@@ -1,22 +1,17 @@
 # OpenAI
 
-OpenAI is a leading AI company that has developed an extensive range of language models.
-Tabby supports OpenAI's API specifications for chat, completion, and embedding tasks.
-
-The OpenAI API is widely used and is also provided by other vendors,
-such as vLLM, Nvidia NIM, and LocalAI.
-
-Tabby continues to support the OpenAI Completion API specifications due to its widespread usage.
+OpenAI is a leading AI company that has developed an extensive range of language models. Their API specifications have become a de facto standard, also implemented by other vendors such as vLLM, Nvidia NIM, and LocalAI.
 
 ## Chat model
 
+OpenAI provides a comprehensive chat API interface. Note: Do not append the `/chat/completions` suffix to the API endpoint.
+
 ```toml title="~/.tabby/config.toml"
-# Chat model
 [model.chat.http]
 kind = "openai/chat"
 model_name = "gpt-4o" # Please make sure to use a chat model, such as gpt-4o
 api_endpoint = "https://api.openai.com/v1" # DO NOT append the `/chat/completions` suffix
-api_key = "secret-api-key"
+api_key = "your-api-key"
 ```
 
 ## Completion model
@@ -25,11 +20,12 @@ OpenAI doesn't offer models for completions (FIM), its `/v1/completions` API has
 
 ## Embeddings model
 
+OpenAI provides powerful embedding models through their API interface. Note: Do not append the `/embeddings` suffix to the API endpoint.
+
 ```toml title="~/.tabby/config.toml"
-# Embedding model
 [model.embedding.http]
 kind = "openai/embedding"
-model_name = "text-embedding-3-small" # Please make sure to use a embedding model, such as text-embedding-3-small
+model_name = "text-embedding-3-small" # Please make sure to use an embedding model, such as text-embedding-3-small
 api_endpoint = "https://api.openai.com/v1" # DO NOT append the `/embeddings` suffix
-api_key = "secret-api-key"
+api_key = "your-api-key"
 ```
````
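
A one-off embedding call mirroring the config above, sketched with the `openai` Python package:

```python
from openai import OpenAI

client = OpenAI(api_key="your-api-key")

emb = client.embeddings.create(
    model="text-embedding-3-small",
    input="fn main() {}",
)
print(len(emb.data[0].embedding))  # 1536 dimensions for this model
```
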
````diff
@@ -1,44 +1,48 @@
 # vLLM
 
-[vLLM](https://docs.vllm.ai/en/stable/) is a fast and user-friendly library for LLM inference and serving.
+[vLLM](https://docs.vllm.ai/en/stable/) is a fast and user-friendly library for LLM inference and serving. It provides an OpenAI-compatible server interface, allowing the use of OpenAI kinds for chat and embedding, while offering a specialized interface for completions.
 
-vLLM offers an `OpenAI Compatible Server`, enabling us to use the OpenAI kinds for chat and embedding.
-However, for completion, there are certain differences in the implementation.
-Therefore, we should use the `vllm/completion` kind and provide a `prompt_template` depending on the specific models.
+Important requirements for all model types:
 
-Please note that models differ in their capabilities for completion or chat.
-You should confirm the model's capability before employing it for chat or completion tasks.
+- `model_name` must exactly match the one used to run vLLM
+- `api_endpoint` should follow the format `http://host:port/v1`
+- `api_key` should be identical to the one used to run vLLM
 
-Additionally, there are models that can serve both as chat and completion.
-For detailed information, please refer to the [Model Registry](../../models/index.mdx).
+Please note that models differ in their capabilities for completion or chat. Some models can serve both purposes. For detailed information, please refer to the [Model Registry](../../models/index.mdx).
 
-Below is an example of the vLLM running at `http://localhost:8000`:
+## Chat model
 
-Please note the following requirements in each model type:
-1. `model_name` must exactly match the one used to run vLLM.
-2. `api_endpoint` should follow the format `http://host:port/v1`.
-3. `api_key` should be identical to the one used to run vLLM.
+vLLM provides an OpenAI-compatible chat API interface.
 
 ```toml title="~/.tabby/config.toml"
-# Chat model
 [model.chat.http]
 kind = "openai/chat"
-model_name = "your_model" # Please make sure to use a chat model.
+model_name = "your_model" # Please make sure to use a chat model
 api_endpoint = "http://localhost:8000/v1"
-api_key = "secret-api-key"
+api_key = "your-api-key"
+```
+
+## Completion model
 
-# Completion model
+Due to implementation differences, vLLM uses its own completion API interface that requires a specific prompt template based on the model being used.
+
+```toml title="~/.tabby/config.toml"
 [model.completion.http]
 kind = "vllm/completion"
-model_name = "your_model" # Please make sure to use a completion model.
+model_name = "your_model" # Please make sure to use a completion model
 api_endpoint = "http://localhost:8000/v1"
-api_key = "secret-api-key"
-prompt_template = "<PRE> {prefix} <SUF>{suffix} <MID>" # Example prompt template for the CodeLlama model series.
+api_key = "your-api-key"
+prompt_template = "<PRE> {prefix} <SUF>{suffix} <MID>" # Example prompt template for the CodeLlama model series
+```
+
+## Embeddings model
 
-# Embedding model
+vLLM provides an OpenAI-compatible embeddings API interface.
+
+```toml title="~/.tabby/config.toml"
 [model.embedding.http]
 kind = "openai/embedding"
 model_name = "your_model"
 api_endpoint = "http://localhost:8000/v1"
-api_key = "secret-api-key"
+api_key = "your-api-key"
 ```
````
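
Because `model_name` must match the served model verbatim, it helps to ask vLLM what it is serving. A sketch against the OpenAI-compatible `/v1/models` route, assuming the server above on port 8000:

```python
import requests

resp = requests.get(
    "http://localhost:8000/v1/models",
    headers={"Authorization": "Bearer your-api-key"},  # key passed to vLLM
    timeout=10,
)
for model in resp.json()["data"]:
    print(model["id"])  # copy one of these verbatim into model_name
```
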

website/docs/references/models-http-api/voyage-ai.md

+5 -3

````diff
@@ -2,11 +2,13 @@
 
 [Voyage AI](https://voyage.ai/) is a company that provides a range of embedding models. Tabby supports Voyage AI's models for embedding tasks.
 
-Below is an example configuration:
+## Embeddings model
+
+Voyage AI provides specialized embedding models through their API interface.
 
 ```toml title="~/.tabby/config.toml"
 [model.embedding.http]
 kind = "voyage/embedding"
-api_key = "..."
 model_name = "voyage-code-2"
-```
+api_key = "your-api-key"
+```
````
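
A minimal request against Voyage AI's embeddings endpoint, sketched with `requests` under the assumption that the REST shape follows the `input`/`model` convention in Voyage's API docs:

```python
import requests

r = requests.post(
    "https://api.voyageai.com/v1/embeddings",
    headers={"Authorization": "Bearer your-api-key"},
    json={"input": ["def hello(): pass"], "model": "voyage-code-2"},
    timeout=30,
)
r.raise_for_status()
print(len(r.json()["data"][0]["embedding"]))
```
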
