@taimur-10x
Contributor

Summary

This PR adds RISC-V vector (RVV) support for the SGEMM kernels.

These kernels are used by GGML_OP_MUL_MAT for prompt processing.

Key Changes

  • RVV support is added for the F16 and F32 types (with the zvfh extension) and for BF16 (with the zvfbfwma extension).
  • Tile sizes were chosen per LMUL configuration, based on the number of vector register groups available:
    • 4x6 with LMUL=1 (32 register groups)
    • 4x3 with LMUL=2 (16 register groups)
    • 2x2 with LMUL=4 (8 register groups)
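
The register-group counts above follow from RVV's 32 architectural vector registers divided by LMUL. As a rough sketch of why each tile fits, the snippet below assumes one accumulator register group per tile element, with the remaining groups left for the A and B operands. That accounting is an assumption for illustration, not a description of the kernels in this PR, and `tile_budget` is a hypothetical helper.

```python
# Illustrative register-budget check for the tile shapes listed above.
# Assumption: each element of an R x C tile keeps one accumulator register group.

NUM_VECTOR_REGISTERS = 32  # RVV defines v0..v31


def tile_budget(rows, cols, lmul):
    """Return (available groups, accumulator groups, spare groups) for a tile."""
    groups = NUM_VECTOR_REGISTERS // lmul  # LMUL registers form one group
    accumulators = rows * cols             # one group per tile element (assumed)
    return groups, accumulators, groups - accumulators


# (rows, cols, LMUL) as listed in the PR description
TILES = [(4, 6, 1), (4, 3, 2), (2, 2, 4)]

for rows, cols, lmul in TILES:
    groups, accumulators, spare = tile_budget(rows, cols, lmul)
    print(f"{rows}x{cols} LMUL={lmul}: {groups} groups, "
          f"{accumulators} accumulators, {spare} spare")
```

Under this assumption every listed tile leaves at least four register groups free for loading operands.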

Testing

The kernels were functionally tested under QEMU at VLENs of 128, 256, 512, and 1024 bits across a range of input sizes.
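
To give a sense of what the tested VLENs mean for the kernels, the sketch below computes how many elements one register group holds, using the RVV relation VLMAX = LMUL * VLEN / SEW. The element widths (32 bits for F32, 16 bits for F16 and BF16) are standard; the helper name `vlmax` and the choice of which combinations to print are illustrative only.

```python
# Elements per vector register group for the VLENs tested under QEMU.
# VLMAX = LMUL * VLEN / SEW, per the RVV specification.


def vlmax(vlen_bits, sew_bits, lmul):
    """Number of SEW-bit elements held by one LMUL-sized register group."""
    return vlen_bits * lmul // sew_bits


TESTED_VLENS = (128, 256, 512, 1024)  # bits, from the testing note above
LMULS = (1, 2, 4)                     # LMUL values used by the tile configurations

for vlen in TESTED_VLENS:
    for lmul in LMULS:
        print(f"VLEN={vlen:4d} LMUL={lmul}: "
              f"{vlmax(vlen, 32, lmul):3d} x 32-bit, "
              f"{vlmax(vlen, 16, lmul):3d} x 16-bit elements")
```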

Benchmarking Results

End-to-end benchmarking on a Banana Pi BPI-F3 (VLEN=256)

Prefill / Prompt Processing

Tokens / Second

| Model | Prompt Size | SGEMM 4x6 | SGEMM 4x3 | SGEMM 2x2 | Vector Dot |
| --- | --- | --- | --- | --- | --- |
| Tinyllama F16 1.1B | 32 | 6.08 | 7.89 | 6.26 | 8.42 |
| Tinyllama F16 1.1B | 64 | 6.09 | 7.25 | 11.31 | 7.57 |
| Tinyllama F16 1.1B | 128 | 5.93 | 6.9 | 13.73 | 8.78 |
| Tinyllama F16 1.1B | 256 | 5.54 | 6.79 | 12.56 | 8.57 |
| Tinyllama F16 1.1B | 512 | 5.37 | 6.64 | 13.37 | 8.68 |
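
One way to read the table is to compare each SGEMM tile against the Vector Dot baseline at each prompt size. The sketch below does that arithmetic on the figures reported above; the dictionary layout and the helper names `best_kernel` and `speedup_over_vector_dot` are illustrative and not part of the PR.

```python
# Tokens/second from the table above (Tinyllama F16 1.1B, VLEN=256).
RESULTS = {
    32:  {"SGEMM 4x6": 6.08, "SGEMM 4x3": 7.89, "SGEMM 2x2": 6.26,  "Vector Dot": 8.42},
    64:  {"SGEMM 4x6": 6.09, "SGEMM 4x3": 7.25, "SGEMM 2x2": 11.31, "Vector Dot": 7.57},
    128: {"SGEMM 4x6": 5.93, "SGEMM 4x3": 6.9,  "SGEMM 2x2": 13.73, "Vector Dot": 8.78},
    256: {"SGEMM 4x6": 5.54, "SGEMM 4x3": 6.79, "SGEMM 2x2": 12.56, "Vector Dot": 8.57},
    512: {"SGEMM 4x6": 5.37, "SGEMM 4x3": 6.64, "SGEMM 2x2": 13.37, "Vector Dot": 8.68},
}


def best_kernel(prompt_size):
    """Name of the column with the highest throughput at this prompt size."""
    row = RESULTS[prompt_size]
    return max(row, key=row.get)


def speedup_over_vector_dot(prompt_size, kernel):
    """Throughput of `kernel` relative to the Vector Dot baseline."""
    row = RESULTS[prompt_size]
    return row[kernel] / row["Vector Dot"]


for size in RESULTS:
    name = best_kernel(size)
    ratio = speedup_over_vector_dot(size, name)
    print(f"prompt={size:3d}: best={name}, {ratio:.2f}x vs Vector Dot")
```

On these numbers the 2x2 tile is the fastest column for prompt sizes of 64 and above, while Vector Dot remains fastest at a prompt size of 32.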

Co-authored-by: Rehan Qasim <rehan.qasim@10xengineers.ai>
@github-actions (bot) added the ggml label (changes relating to the ggml tensor library for machine learning) on Dec 19, 2025
@ggerganov merged commit d34d5ca into ggml-org:master on Dec 22, 2025
70 of 71 checks passed