llamafile: add rvv support for sgemm kernels #18199
Merged
+768
−0
Summary
This PR adds RISC-V vector (RVV) support for the SGEMM kernels.
These kernels are used by GGML_OP_MUL_MAT for prompt processing.
Key Changes
- Kernels for FP16 (with the zvfh extension) and BF16 (with the zvfbfwma extension).
- 4x6 with LMUL=1 (32 register groups)
- 4x3 with LMUL=2 (16 register groups)
- 2x2 with LMUL=4 (8 register groups)
Testing
Kernels were functionally tested on QEMU at VLENs of 128, 256, 512 and 1024 bits, across a range of input sizes.
Benchmarking Results
End-to-end benchmarking on BananaPI-BPI F3 (VLEN=256)
Prefill / Prompt Processing
Tokens / Second