github
GitHub API
Keeps: repo, release, stars delta
- 2026-04-27 ggerganov/llama.cpp b8952: b8952
server: (router) Forward form-data to model server (Fixes #22044) (#22118)
* This commit enables the router to forward form-data to model server. Fixes #22044 (enabling to use the /v1/audio/transcriptions in router mode)
* Applied the suggestion from Copilots
github:ggerganov/llama.cpp
- 2026-04-27 vllm-project/vllm v0.20.0: v0.20.0
# vLLM v0.20.0
## Highlights
This release features 752 commits from 320 contributors (123 new)!
* **DeepSeek V4**: Initial DeepSeek V4 support landed (#40860), with DSML token-leakage fix in DSV4/3.2 (#40806), DSA + MTP IMA fix (#40772), and a silu clamp limit on the share
github:vllm-project/vllm
- 2026-04-27 ggerganov/llama.cpp b8951: b8951
add fast mat-vec kernels for i-quants (#22344)
**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b8951/llama-b8951-bin-macos-arm64.tar.gz)
- [macOS Apple Silicon (arm64, KleidiAI enabled)](https://g
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8950: b8950
Additional test for common/gemma4 : handle parsing edge cases (#22420)
* Additional test for common/gemma4 : handle parsing edge cases
* Move tests to Gemma 4 test group
**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llam
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8949: b8949
fix: rpc-server cache may not work in Windows environments (#22394)
* fix: create directory and log cache file name.
* Remove GGML_LOG_INFO conditional compilation.
Co-authored-by: kotaro <kotaro.kusunoki@gmail.com>
**macOS/iOS:**
- [mac
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8948: b8948
Fix type casting for unaccounted memory calculation (#22424)
**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b8948/llama-b8948-bin-macos-arm64.tar.gz)
- [macOS Apple Silicon (arm64, KleidiAI enabl
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8947: b8947
download : prefer q8_0 when q4_k not available (#22428)
**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b8947/llama-b8947-bin-macos-arm64.tar.gz)
- [macOS Apple Silicon (arm64, KleidiAI enabled)](
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8946: b8946
model : remove duplicate wo_s scale after build_attn (Qwen3, LLaMA) (#22421)
Signed-off-by: Yash Nankani <ynankani@nvidia.com>
**macOS/iOS:**
- [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b8946/llama-b8946-bi
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8944: b8944
ggml : use 64 bytes aligned tile buffers (#21058)

| Model | Test | t/s OLD | t/s NEW | Speedup |
|:---------------------------------|:-------|----------:|----------:|----------:|
| qwen35 0.8B BF16 | pp512 |
github:ggerganov/llama.cpp
- 2026-04-27 ggerganov/llama.cpp b8943: b8943
common: fix missing exports in llama-common (#22340)
* common: refactor common/debug to move abort_on_nan into base_callback_data
Passing bool abort_on_nan as template parameter for common_debug_cb_eval is unnecessary and creates an issue with LTO. It should jus
github:ggerganov/llama.cpp