github
GitHub API
Keeps: repo, release, stars delta
- 2026-05-05 ggerganov/llama.cpp b9037: b9037
<details open> Hexagon: Process M-tail rows on HMX instead of HVX (#22724) * hex-mm: process m-tail rows on HMX instead of HVX * hmx-mm: unroll and optimize padded activation loop --------- Co-authored-by: Max Krasnyansky <maxk@qti.qualcomm.com> </details> **macOS/iOS:** -
github:ggerganov/llama.cpp
# Release v5.8.0 ## New Model additions ### DeepSeek-V4 <img width="6604" height="3574" alt="image" src="https://github.com/user-attachments/assets/4c0fdb29-f770-463c-a97b-d24438896a4c" /> DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model fr
github:huggingface/transformers
- 2026-05-05 ggerganov/llama.cpp b9033: b9033
<details open> sync : ggml </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9033/llama-b9033-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled)](https://github.com/ggml-org/llama.cpp/releas
github:ggerganov/llama.cpp
- 2026-05-05 ggerganov/llama.cpp b9031: b9031
<details open> common : only load backends when required (#22290) * common : only load backends when required Signed-off-by: Adrien Gallouët <angt@huggingface.co> * llama : call ggml_backend_load_all() directly from llama_backend_init() Signed-off-by: Adrien Gallouët <angt@h
github:ggerganov/llama.cpp
- 2026-05-05 ggerganov/llama.cpp b9028: b9028
<details open> llama : add option to save memory in device buffers (#22679) * llama : add option to save memory in device buffers * tests : extend llama-save-load-state </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/d
github:ggerganov/llama.cpp
- 2026-05-05 ggerganov/llama.cpp b9026: b9026
<details open> ggml : implement fast walsh-hadamard transform for kv rotation (#21352) (#22631) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9026/llama-b9026-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (a
github:ggerganov/llama.cpp