github
GitHub API
Keeps: repo, release, stars delta
- 2026-05-04 ROCm/ROCm rocm-7.2.3: ROCm 7.2.3 Release
AMD github:ROCm/ROCm
- 2026-05-04 ggerganov/llama.cpp b9025: b9025
<details open> kleidiai : update to v1.24.0 and use release archive (#22549) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9025/llama-b9025-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enab
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9023: b9023
<details open> server: implement /models?reload=1 (#21848) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9023/llama-b9023-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled)](https://gith
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9020: b9020
<details open> common/autoparser: fixes for newline handling / forced tool calls (#22654) * chat/autoparser: the fixes * Move optspace() to chat-peg-parser, comment out server tests invalidated due to content now allowed with forced tool calls. * Trim whitespace on apply inst
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9022: b9022
<details open> examples: refactor diffusion generation (#22590) * examples: refactor diffusion generation * renamed enum values </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9022/llama-b9022-bin-macos-arm64
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9019: b9019
<details open> model: move `load_hparams` and `load_tensors` to per-model definition (#22004) * git-friendly migration * add build_graph * nits * exclude old code from build * wip * add llm_arch_model_i * prepare downstream functions * nits * nits * wip * wip * add b
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9018: b9018
<details open> server: Add a simple get_datetime server tool (#22649) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9018/llama-b9018-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled)](h
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9016: b9016
<details open> docs : update speculative decoding parameters after refactor (#22397) (#22539) * docs : update speculative decoding parameters after refactor (#22397) Update docs/speculative.md to reflect the new parameter naming scheme introduced in PR #22397: - Replace --dra
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9015: b9015
<details open> vulkan: delete dead GGML_VK_MAX_NODES def (#22621) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9015/llama-b9015-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled)](https
github:ggerganov/llama.cpp
- 2026-05-04 ggerganov/llama.cpp b9014: b9014
<details open> ggml-webgpu: add layer norm ops (#22406) * shader(norm): add layer norm ops * shader(norm): stablize floating point computation with Kahan summation and handle mixed types * shader(norm): remove the non-contiguous strides * shader(norm): use the original imple
github:ggerganov/llama.cpp