Eval bug: Deepseek V2 Lite no longer working with Vulkan (assert fail during tg) #12956
Also, when using a batch size of 1, the crash happens during prompt processing.
OK, I found a fix. It's working now, but I'm also noticing a small but significant performance regression in prompt processing speed...
With my fix on top of master (7a8be3a):
Name and Version
ggml_vulkan: Found 2 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6800 (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
ggml_vulkan: 1 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 32768 | int dot: 0 | matrix cores: none
version: 5139 (84778e9)
built with MSVC 19.43.34809.0 for x64
Operating systems
Windows
GGML backends
Vulkan
Hardware
Ryzen 5900X + Rx 5700XT + Rx 6800
Models
DeepSeek-V2-Lite-Chat.IQ4_NL.gguf
Problem description & steps to reproduce
The program always crashes after generating a single token. Prompt processing seems to work fine.
.\llama-cli.exe -m .\models\DeepSeek-V2-Lite-Chat.IQ4_NL.gguf -ngl 99 -t 12 -p "Hello"
First Bad Commit
Most likely daa4228. I haven't done a proper bisect, but d6d2c2a was working.
Relevant log output