[BUG]: Error loading the LLava model #1136
Comments
Does the model you're using work with the corresponding version of llama.cpp? Maybe also test in the previous version of LLamaSharp, to check for regressions. |
I'll try. Can you tell me whether llava should work with any multimodal model? I also noticed that the context is created only from the llama weights; why is that? |
0.21.0 - the same error. |
I tested master with the model used in the unit test successfully. Tested on Windows with CUDA. |
And on the CPU? |
Models/llava-v1.6-mistral-7b.Q3_K_XS.gguf ? |
As I explained, I tested on GPU. The unit tests are running successfully, so it should work. I'm talking about Llava; Qwen-VL is not tested and should not work. |
Does LLava not work on the CPU? |
mmproj-model-f16.gguf - works! |
Should the LLava model be the mmproj file only? Maybe I don't understand correctly how to work with a VLM. |
You need both files. Check https://scisharp.github.io/LLamaSharp/0.23.0/QuickStart/ and the Llava example in the examples project |
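For reference, a minimal sketch of loading both files, following the QuickStart and the Llava example linked above. The file names are the ones mentioned in this thread, and the executor wiring is an assumption that may differ between LLamaSharp versions; it also touches on the earlier question about the context:

```csharp
using LLama;
using LLama.Common;

// The main language model -- the context is created from these
// weights only; the projector is a separate handle, not part of
// the context (which is why LoadFromFile takes only llama weights).
var parameters = new ModelParams("llava-v1.6-mistral-7b.Q3_K_XS.gguf");
using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);

// The CLIP projector (mmproj) -- the second required file.
using var clipModel = LLavaWeights.LoadFromFile("mmproj-model-f16.gguf");

// Both handles are passed to the executor together.
var executor = new InteractiveExecutor(context, clipModel);
```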
That's llama.cpp documentation. Qwen-VL is not in the list of supported models in the LLamaSharp documentation (https://github.com/SciSharp/LLamaSharp). If I find some time, I will test it. |
Thanks. Yes, I get it. |
I found a model (main + mmproj): and here are many mmproj models: All the models load, but the output doesn't work. The models from the example do work, but they are weak and old. I would like to try Qwen2-VL or gemma3 (that doesn't work yet; apparently a newer version of llama.cpp is needed, ggml-org/llama.cpp#12344). |
After conducting several tests and reviewing the current status of llama.cpp in relation to multimodal models, my understanding is as follows:
Regarding LlamaSharp, only Llava and similar models are currently compatible. I don’t believe that replicating the work done in qwen2vl-cli or gemma3-cli would be the best approach. Instead, I recommend waiting for llama.cpp to introduce a vision API and then updating LlamaSharp’s multimodal support to integrate with that API. |
Thanks. Apparently it is, but it is weak by modern standards for Vision: |
I haven't found any information about this; can you show me where they discuss it? |
#1178. That's the information. |
Models:
- https://huggingface.co/benxh/Qwen2.5-VL-7B-Instruct-GGUF
- https://huggingface.co/KBlueLeaf/llama3-llava-next-8b-gguf (from here #897)
- https://huggingface.co/second-state/Llava-v1.5-7B-GGUF
Error:
```
External component has thrown an exception.
System.Runtime.InteropServices.SEHException (0x80004005): External component has thrown an exception.
   at LLama.Native.SafeLlavaModelHandle.clip_model_load(String mmProj, Int32 verbosity)
   at LLama.Native.SafeLlavaModelHandle.LoadFromFile(String modelPath, Int32 verbosity)
   at LLama.LLavaWeights.LoadFromFile(String mmProject)
```
What could be the problem?
I wanted to use a multimodal model for image-to-text conversion.
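A likely explanation, consistent with the "mmproj-model-f16.gguf - works!" comment above: LLavaWeights.LoadFromFile expects the mmproj (CLIP projector) file, and pointing it at the main model gguf makes the native clip_model_load fail, which surfaces as this SEHException. A hedged sketch of the distinction (file names are illustrative):

```csharp
using LLama;

// Wrong: the main language model is not a CLIP projector, so
// clip_model_load fails, surfacing as an SEHException:
// using var clip = LLavaWeights.LoadFromFile("llava-v1.5-7b.Q4_0.gguf");

// Right: the dedicated projector file:
using var clip = LLavaWeights.LoadFromFile("mmproj-model-f16.gguf");
```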
Environment & Configuration