Name and Version
llama-server.exe, build b7513. I haven't fully tested how many versions back this started.
Operating systems
Windows
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server.exe --model ".\gemma-3-12b-it-qat-UD-Q4_K_XL.gguf" --mmproj ".\gemma-3-12b-it-qat-UD-mmproj.F16.gguf"
Problem description & steps to reproduce
Sending a request with an image through the WebUI triggers the GGML_ASSERT failure shown in the relevant log output below, and the server then quits.
First Bad Commit
Currently on b7513; I haven't tested how many versions back this problem started.
Relevant log output
0.49.164.680 I slot launch_slot_: id 3 | task 0 | processing task
0.49.164.725 I slot update_slots: id 3 | task 0 | new prompt, n_ctx_slot = 16384, n_keep = -1, task.n_tokens = 1465
0.49.164.731 I slot update_slots: id 3 | task 0 | n_tokens = 0, memory_seq_rm [0, end)
0.49.164.778 I slot update_slots: id 3 | task 0 | prompt processing progress, n_tokens = 28, batch.n_tokens = 28, progress = 0.019113
0.51.364.526 I slot update_slots: id 3 | task 0 | n_tokens = 28, memory_seq_rm [28, end)
0.51.364.550 I srv process_chun: processing image...
0.51.364.668 I encoding image slice...
D:\a\llama.cpp\llama.cpp\ggml\src\ggml-vulkan\ggml-vulkan.cpp:5928: GGML_ASSERT(wg0 <= ctx->device->properties.limits.maxComputeWorkGroupCount[0] && wg1 <= ctx->device->properties.limits.maxComputeWorkGroupCount[1] && wg2 <= ctx->device->properties.limits.maxComputeWorkGroupCount[2]) failed
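For context, here is a minimal sketch of the condition the failing assert enforces: the Vulkan backend checks that the workgroup counts it is about to dispatch (wg0, wg1, wg2) fit within the device's maxComputeWorkGroupCount limits, and the image-encoding dispatch here apparently exceeds one of them. The limit and dispatch values below are hypothetical placeholders, not taken from my device.

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // Hypothetical device limits; the real values come from
    // VkPhysicalDeviceLimits::maxComputeWorkGroupCount (see vulkaninfo).
    const uint32_t max_wg_count[3] = {65535, 65535, 65535};

    // Hypothetical dispatch size that overflows dimension 0, mirroring the
    // wg0/wg1/wg2 check asserted in ggml-vulkan.cpp:5928.
    const uint32_t wg[3] = {131072, 1, 1};

    for (int i = 0; i < 3; ++i) {
        if (wg[i] > max_wg_count[i]) {
            std::printf("workgroup count %u in dim %d exceeds device limit %u -> assert fires\n",
                        wg[i], i, max_wg_count[i]);
            return 1;
        }
    }
    std::printf("dispatch fits within device limits\n");
    return 0;
}
```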