whisper: CrisperWhisper results in grpc: error while marshaling: string field contains invalid UTF-8 #5038


Open
markuman opened this issue Mar 19, 2025 · 4 comments
Labels
bug Something isn't working unconfirmed

Comments

@markuman

LocalAI version:
localai/localai:v2.26.0-aio-gpu-nvidia-cuda-12

Environment, CPU architecture, OS, and Version:

uname -a
Linux gpu2 6.13.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 13 Mar 2025 18:12:00 +0000 x86_64 GNU/Linux

Describe the bug

Using https://huggingface.co/nyrahealth/CrisperWhisper with local-ai results in

Whisper-Error: 500 - {"error":{"code":500,"message":"rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8","type":""}}
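For context, the gRPC error means the transcription result contained bytes that are not valid UTF-8; protobuf `string` fields must hold valid UTF-8, so marshaling the response fails. A minimal sketch of the failure mode (the byte value is illustrative, not taken from the actual model output):

```python
# Protobuf `string` fields must hold valid UTF-8. A token stream that splits a
# multi-byte character produces bytes that cannot be decoded, e.g. a leading
# byte of a two-byte sequence with no continuation byte after it:
bad = b"\xc3"
try:
    bad.decode("utf-8")
except UnicodeDecodeError as e:
    print("invalid UTF-8:", e.reason)
```

This is consistent with a vocabulary/tokenizer mismatch in the converted model, where emitted token bytes no longer form complete UTF-8 sequences.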

To Reproduce

  1. Create directories and install dependencies
mkdir CrisperWhisper
mkdir CrisperWhisper-out
pip install huggingface_hub torch numpy transformers
git clone https://github.com/openai/whisper
git clone https://github.com/ggerganov/whisper.cpp
  2. Download the model
from huggingface_hub import snapshot_download, login

HUGGINGFACE_TOKEN = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

login(token=HUGGINGFACE_TOKEN)

model_id = "nyrahealth/CrisperWhisper"  # Replace with the ID of the model you want to download
snapshot_download(repo_id=model_id, local_dir="CrisperWhisper")
  3. Convert the model to a single-file ggml
python whisper.cpp/models/convert-h5-to-ggml.py CrisperWhisper/ whisper/ CrisperWhisper-out/

Move the ggml file from CrisperWhisper-out/ to your local-ai model path.
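The move step above can be sketched as follows; `MODEL_DIR` and the converter's output file name (`ggml-model.bin` here) are assumptions, so check `CrisperWhisper-out/` for the actual name produced by `convert-h5-to-ggml.py`:

```shell
# MODEL_DIR is a placeholder for your local-ai models directory.
MODEL_DIR=./models
mkdir -p "$MODEL_DIR" CrisperWhisper-out
# Stand-in for the converter output, so this sketch runs end to end:
touch CrisperWhisper-out/ggml-model.bin
# Rename to the model ID you want local-ai to expose:
mv CrisperWhisper-out/ggml-model.bin "$MODEL_DIR/CrisperWhisper.bin"
ls -lh "$MODEL_DIR"
```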

Expected behavior

Transcription succeeds without any errors.

Logs

7:53AM INF BackendLoader starting backend=whisper modelID=CrisperWhisper.bin o.model=CrisperWhisper.bin
7:54AM INF Success ip=127.0.0.1 latency="28.449µs" method=GET status=200 url=/readyz
7:55AM INF Success ip=127.0.0.1 latency="14.826µs" method=GET status=200 url=/readyz
7:56AM ERR Server error error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8" ip=172.17.0.1 latency=2m39.930538614s method=POST status=500 url=/v1/audio/transcriptions

Additional context

@markuman markuman added bug Something isn't working unconfirmed labels Mar 19, 2025
@markuman
Author

Using CPU-based local-ai results in the same error

quay.io/go-skynet/local-ai:v2.26.0-aio-cpu

Linux mb 6.8.0-55-generic #57-Ubuntu SMP PREEMPT_DYNAMIC Wed Feb 12 23:42:21 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

@mudler
Owner

mudler commented Mar 19, 2025

Hi - can you please share logs with --debug?

also, can you try to set a model name without the "."? I don't think it's a problem per se, but the UTF-8 error is unexpected. Can you also share how you are calling the API?
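One way to expose the model under a dot-free name is a model config YAML next to the model file; a minimal sketch following LocalAI's model config format (the file name and chosen model name are assumptions):

```yaml
# models/crisperwhisper.yaml -- registers the model under a dot-free name
name: crisperwhisper
backend: whisper
parameters:
  model: CrisperWhisper.bin
```

With this in place, the API request would reference `crisperwhisper` instead of `CrisperWhisper.bin`.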

@markuman
Author

Hi - can you please share logs with --debug ?

11:48AM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=http://0.0.0.0:8080
11:49AM DBG context local model name not found, setting to the first model first model name=whisper-1
11:49AM DBG guessDefaultsFromFile: not a GGUF file filePath=/build/models/CrisperWhisper.bin
11:49AM DBG Audio file copied to: /tmp/whisper4121787727/test.mp3
11:49AM INF BackendLoader starting backend=whisper modelID=CrisperWhisper.bin o.model=CrisperWhisper.bin
11:49AM DBG Loading model in memory from file: /build/models/CrisperWhisper.bin
11:49AM DBG Loading Model CrisperWhisper.bin with gRPC (file: /build/models/CrisperWhisper.bin) (backend: whisper): {backendString:whisper model:CrisperWhisper.bin modelID:CrisperWhisper.bin assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0005de008 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh faster-whisper:/build/backend/python/faster-whisper/run.sh kokoro:/build/backend/python/kokoro/run.sh rerankers:/build/backend/python/rerankers/run.sh transformers:/build/backend/python/transformers/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
11:49AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/whisper
11:49AM DBG GRPC Service for CrisperWhisper.bin will be running at: '127.0.0.1:44969'
11:49AM DBG GRPC Service state dir: /tmp/go-processmanager771897440
11:49AM DBG GRPC Service Started
11:49AM DBG Wait for the service to start up
11:49AM DBG Options: ContextSize:512  Seed:1365369429  NBatch:512  MMap:true  NGPULayers:99999999  Threads:8
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr 2025/03/19 11:49:18 gRPC Server listening at 127.0.0.1:44969
11:49AM DBG GRPC Service Ready
11:49AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:0xc000336e58} sizeCache:0 unknownFields:[] Model:CrisperWhisper.bin ContextSize:512 Seed:1365369429 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:8 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/build/models/CrisperWhisper.bin Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 LoadFormat: MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false ModelPath:/build/models LoraAdapters:[] LoraScales:[] Options:[] CacheTypeKey: CacheTypeValue: GrammarTriggers:[]}
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_from_file_with_params_no_state: loading model from '/build/models/CrisperWhisper.bin'
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: use gpu    = 1
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: flash attn = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: gpu_device = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: dtw        = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_with_params_no_state: backends   = 1
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: loading model
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_vocab       = 51866
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_ctx   = 1500
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_state = 1280
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_head  = 20
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_audio_layer = 32
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_ctx    = 448
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_state  = 1280
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_head   = 20
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_text_layer  = 32
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_mels        = 128
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: ftype         = 1
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: qntvr         = 0
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: type          = 5 (large v3)
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: adding 6800 extra tokens
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: n_langs       = 100
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load:      CPU total size =  3094.36 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_model_load: model size    = 3094.36 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: kv self size  =   83.89 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: kv cross size =  251.66 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: kv pad  size  =    7.86 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (conv)   =   36.13 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (encode) =  212.29 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (cross)  =    9.25 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_init_state: compute buffer (decode) =   99.10 MB
11:49AM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr whisper_full_with_state: auto-detected language: de (p = 0.999483)
12:23PM DBG GRPC(CrisperWhisper.bin-127.0.0.1:44969): stderr 2025/03/19 12:23:14 ERROR: [core] [Server #1]grpc: server failed to encode response: rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8
12:23PM ERR Server error error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8" ip=10.0.2.100 latency=33m56.389020712s method=POST status=500 url=/v1/audio/transcriptions

also, can you try to set a model name without the "."?

Hmm, I don't understand this.

curl -s http://localhost:8080/models|jq|grep -i -A 2 -B 2 cris 
    },
    {
      "id": "CrisperWhisper.bin",
      "object": "model"
    },

That's also just the name in the filesystem:

ls -lh localai/models/ |grep -i cr
-rw-rw---- 1 markuman markuman 2.9G Mar 17 07:58 CrisperWhisper.bin

Can you also share how are you calling the API?

import requests


baseurl = "http://127.0.0.1:8080"
transcription = '/v1/audio/transcriptions'

testfile = 'test.mp3'

# transcription with whisper
############################
with open(testfile, "rb") as audio_file:
    files = {"file": ("test.mp3", audio_file)}
    data = {"model": "CrisperWhisper.bin"}
    response = requests.post(baseurl + transcription, files=files, data=data)

print(response)

if response.status_code == 200:
    raw = response.json().get('text')
    print(raw)
else:
    print(f"Whisper-Error: {response.status_code} - {response.text}")

@mjess

mjess commented Apr 25, 2025

I'm experiencing the very same issue. It happens only with the CrisperWhisper model; all other Whisper models I've tried so far work fine. Any further details I can provide to debug this? I'd love to get CrisperWhisper running...

local-ai Version: v2.28.0

To reproduce the issue faster (without model conversion), a ggml model is available here: https://huggingface.co/nyrahealth/CrisperWhisper/commit/0c039779bd37fc1fdd2bbaccaa02dbda7aac37d5#d2h-238772
