
Embedding unreachable, but llama is running #4112

Open
LucaFulchir opened this issue Apr 3, 2025 · 0 comments

LucaFulchir commented Apr 3, 2025

Hello, I'm trying to run/test tabby, but I'm having problems with the embedding instance.
Using version 0.27 on a NixOS unstable server.

AI completion and AI chat seem to work, but I cannot add a git context provider for a public repo: it seems to clone successfully, but can't parse a single file.

config.toml:

[model.completion.local]
model_id = "Qwen2.5-Coder-3B"

[model.chat.local]
model_id = "Qwen2.5-Coder-1.5B-Instruct"

[model.embedding.local]
model_id = "Nomic-Embed-Text"

Running with:

tabby serve --model Qwen2.5-Coder-3B --host 192.168.1.10 --port 11029 --device rocm

Testing on an AMD Ryzen 7 8845HS w/ Radeon 780M Graphics.

On the tabby web interface, on the systems page, I see "Unreachable" only under "Embedding", with the error "error decoding response body".

The llama instance seems to be up, and by dumping the local traffic I see the following requests/responses:

GET /health HTTP/1.1
accept: */*
host: 127.0.0.1:30888

HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Length: 15
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp
------
POST /tokenize HTTP/1.1
content-type: application/json
accept: */*
host: 127.0.0.1:30888
content-length: 25

{"content":"hello Tabby"}

HTTP/1.1 200 OK
Access-Control-Allow-Origin: 
Content-Length: 28
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp

{"tokens":[7592,21628,3762]}
-----------
POST /embeddings HTTP/1.1
content-type: application/json
accept: */*
host: 127.0.0.1:30888
content-length: 27

{"content":"hello Tabby\n"}

HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Length: 16226
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp

{"embedding":[0.0018252730369567871, **a lot more floats**,-0.024591289460659027],"index":0}

Additional tabby log lines, even when running with RUST_LOG=debug, all look like:

WARN tabby_index::indexer: crates/tabby-index/src/indexer.rs:90: Failed to build chunk for document 'git:R1AWw5:::{"path":"/var/lib/tabby/repositories/[redacted]/src/connection/handshake/dirsync/req.rs","language":"rust","git_hash":"906b1491a1a0ecb98781568b24d8ba781d6765e2"}': Failed to embed chunk text: error decoding response body
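
As far as I know, "error decoding response body" is reqwest's wording for a JSON deserialization failure, so I assume the response body doesn't match the shape tabby deserializes it into. A minimal sketch of that failure mode (the struct below is hypothetical, not tabby's actual type; it just mirrors the llama.cpp response above):

use serde::Deserialize;

// Hypothetical mirror of the llama.cpp /embeddings response shown above;
// tabby's real type may expect a different shape.
#[derive(Deserialize)]
struct EmbeddingResponse {
    embedding: Vec<f32>,
    index: usize,
}

async fn embed(
    client: &reqwest::Client,
    url: &str,
    text: &str,
) -> reqwest::Result<EmbeddingResponse> {
    client
        .post(url)
        .json(&serde_json::json!({ "content": text }))
        .send()
        .await?
        // Any mismatch between the returned JSON and EmbeddingResponse
        // surfaces as reqwest's "error decoding response body".
        .json::<EmbeddingResponse>()
        .await
}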

What can I try / what am I doing wrong?
