Commit cc712a6

danbev authored and NeoZhangJianyu committed
common : add default embeddings presets (ggml-org#11677)
* common : add default embeddings presets

This commit adds default embeddings presets for the following models:
- bge-small-en-v1.5
- e5-small-v2
- gte-small

These can be used with llama-embedding and llama-server.

For example, with llama-embedding:
```console
./build/bin/llama-embedding --embd-gte-small-default -p "Hello, how are you?"
```

And with llama-server:
```console
./build/bin/llama-server --embd-gte-small-default
```

And the embeddings endpoint can then be called with a POST request:
```console
curl --request POST \
    --url http://localhost:8080/embeddings \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello, how are you?"}'
```

I'm not sure if these are the most common embedding models but hopefully
this can be a good starting point for discussion and further improvements.

Refs: ggml-org#10932
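
Conceptually, each preset is a single command-line flag whose handler fills in several parameters at once. The sketch below illustrates that idea in isolation. It is not the llama.cpp implementation: `Params`, `Option`, `make_options`, and `apply_flag` are hypothetical stand-ins for `common_params`, `common_arg`, and the argument parser, and the initial `n_ctx` value is arbitrary.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the handful of fields a preset touches.
struct Params {
    std::string hf_repo;
    std::string hf_file;
    int  n_ctx     = 4096;   // arbitrary initial value for this sketch
    bool embedding = false;
};

// Hypothetical stand-in for a registered option: a flag name plus a handler
// that mutates the parameters when the flag is seen.
struct Option {
    std::string flag;
    std::function<void(Params &)> handler;
};

// Register one preset, mirroring the values this commit uses for gte-small.
std::vector<Option> make_options() {
    std::vector<Option> opts;
    opts.push_back({"--embd-gte-small-default", [](Params & p) {
        p.hf_repo   = "ggml-org/gte-small-Q8_0-GGUF";
        p.hf_file   = "gte-small-q8_0.gguf";
        p.n_ctx     = 512;
        p.embedding = true;
    }});
    return opts;
}

// Run the handler for `flag` if it is registered; report whether it matched.
bool apply_flag(const std::vector<Option> & opts, const std::string & flag, Params & p) {
    for (const Option & o : opts) {
        if (o.flag == flag) {
            o.handler(p);
            return true;
        }
    }
    return false;
}
```

Calling `apply_flag(make_options(), "--embd-gte-small-default", p)` on a default-constructed `Params p` returns `true` and leaves `p.embedding` set and `p.n_ctx` at 512. An unregistered flag returns `false` and leaves `p` untouched.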
1 parent f2a6480 commit cc712a6

1 file changed, +42 -0 lines changed

common/arg.cpp

@@ -2324,5 +2324,47 @@ common_params_context common_params_parser_init(common_params & params, llama_ex
         }
     ).set_examples({LLAMA_EXAMPLE_TTS}));
 
+    add_opt(common_arg(
+        {"--embd-bge-small-en-default"},
+        string_format("use default bge-small-en-v1.5 model (note: can download weights from the internet)"),
+        [](common_params & params) {
+            params.hf_repo = "ggml-org/bge-small-en-v1.5-Q8_0-GGUF";
+            params.hf_file = "bge-small-en-v1.5-q8_0.gguf";
+            params.pooling_type = LLAMA_POOLING_TYPE_NONE;
+            params.embd_normalize = 2;
+            params.n_ctx = 512;
+            params.verbose_prompt = true;
+            params.embedding = true;
+        }
+    ).set_examples({LLAMA_EXAMPLE_EMBEDDING, LLAMA_EXAMPLE_SERVER}));
+
+    add_opt(common_arg(
+        {"--embd-e5-small-en-default"},
+        string_format("use default e5-small-v2 model (note: can download weights from the internet)"),
+        [](common_params & params) {
+            params.hf_repo = "ggml-org/e5-small-v2-Q8_0-GGUF";
+            params.hf_file = "e5-small-v2-q8_0.gguf";
+            params.pooling_type = LLAMA_POOLING_TYPE_NONE;
+            params.embd_normalize = 2;
+            params.n_ctx = 512;
+            params.verbose_prompt = true;
+            params.embedding = true;
+        }
+    ).set_examples({LLAMA_EXAMPLE_EMBEDDING, LLAMA_EXAMPLE_SERVER}));
+
+    add_opt(common_arg(
+        {"--embd-gte-small-default"},
+        string_format("use default gte-small model (note: can download weights from the internet)"),
+        [](common_params & params) {
+            params.hf_repo = "ggml-org/gte-small-Q8_0-GGUF";
+            params.hf_file = "gte-small-q8_0.gguf";
+            params.pooling_type = LLAMA_POOLING_TYPE_NONE;
+            params.embd_normalize = 2;
+            params.n_ctx = 512;
+            params.verbose_prompt = true;
+            params.embedding = true;
+        }
+    ).set_examples({LLAMA_EXAMPLE_EMBEDDING, LLAMA_EXAMPLE_SERVER}));
+
     return ctx_arg;
 }
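
Each preset sets `params.embd_normalize = 2`. In llama.cpp's `common.h` that value is documented as Euclidean (L2) normalization. The following is a minimal sketch of what such a normalization computes, assuming that reading. `l2_normalize` and `dot` are hypothetical helper names for illustration, not llama.cpp functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of llama.cpp: scale `v` to unit Euclidean length.
// An all-zero input is returned unchanged to avoid dividing by zero.
std::vector<float> l2_normalize(const std::vector<float> & v) {
    double sum_sq = 0.0;
    for (float x : v) {
        sum_sq += static_cast<double>(x) * x;
    }
    const double norm = std::sqrt(sum_sq);

    std::vector<float> out(v);
    if (norm > 0.0) {
        for (float & x : out) {
            x = static_cast<float>(x / norm);
        }
    }
    return out;
}

// Dot product; for two unit-length vectors this equals their cosine similarity.
float dot(const std::vector<float> & a, const std::vector<float> & b) {
    assert(a.size() == b.size());
    double acc = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        acc += static_cast<double>(a[i]) * b[i];
    }
    return static_cast<float>(acc);
}
```

For example, `l2_normalize({3.0f, 4.0f})` yields approximately `{0.6, 0.8}`. Once vectors are unit length, comparing two embeddings by cosine similarity reduces to a plain dot product, which is one reason to normalize by default.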
