github
GitHub API
Keeps: repo, release, stars delta
- 2026-05-25 ggerganov/llama.cpp b9320: b9320
<details open> TP: fix ggml context size calculation (#22616) * TP: fix ggml context size calculation, memory leak * move split state cache back into the context * revert to constant ggml context size for cgraphs * increase headroom for statically allocated tensors * remove
AAPL github:ggerganov/llama.cpp
- 2026-05-25 ggerganov/llama.cpp b9319: b9319
<details open> ggml: `gguf_init_from_callback` and `gguf_init_from_buffer` (#22341) * ggml: implement `gguf_init_from_buffer` * test: `gguf_init_from_buffer` * fix: memory breakdown for a model loaded with `no_alloc` from a file is consistent with being loaded from a buffer
AAPL github:ggerganov/llama.cpp
- 2026-05-25 ggerganov/llama.cpp b9318: b9318
<details open> server: MTP layer kv-cache should respect draft type ctk (#23646) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9318/llama-b9318-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI
AAPL github:ggerganov/llama.cpp
- 2026-05-25 ggerganov/llama.cpp b9315: b9315
<details open> llama : document that only one on-device state can be saved per sequence (#23520) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9315/llama-b9315-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (
AAPL github:ggerganov/llama.cpp