github
GitHub API
Keeps: repo, release, stars delta
- 2026-05-10 ggerganov/llama.cpp b9101: b9101
<details open> server : print warning when HTTP timeout exceeded (#22907) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9101/llama-b9101-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled
github:ggerganov/llama.cpp
- 2026-05-10 ggerganov/llama.cpp b9100: b9100
<details open> backend sampling: support returning post-sampling probs (#22622) * server: Never return 0.0 post-sampling probabilities * backend sampling: support returning post-sampling probs </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-
github:ggerganov/llama.cpp
- 2026-05-10 ggerganov/llama.cpp b9099: b9099
<details open> vendor : update cpp-httplib to 0.43.4 (#22888) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9099/llama-b9099-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled)](https://g
github:ggerganov/llama.cpp
- 2026-05-10 ggerganov/llama.cpp b9097: b9097
<details open> sync : ggml </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9097/llama-b9097-bin-macos-arm64.tar.gz) - [macOS Apple Silicon (arm64, KleidiAI enabled)](https://github.com/ggml-org/llama.cpp/releas
github:ggerganov/llama.cpp
- 2026-05-10 ggerganov/llama.cpp b9095: b9095
<details open> internal AllReduce kernel for CUDA provider (#22299) * ggml-cuda: add internal AllReduce provider for tensor parallelism Introduces a NCCL-free AllReduce implementation for LLAMA_SPLIT_MODE_TENSOR using a single-phase CUDA kernel that pipelines D2H copy, cross-G
github:ggerganov/llama.cpp
- 2026-05-10 ggerganov/llama.cpp b9094: b9094
<details open> model : fix model type check for granite/llama3 and deepseek2/glm4.7 lite (#22870) </details> **macOS/iOS:** - [macOS Apple Silicon (arm64)](https://github.com/ggml-org/llama.cpp/releases/download/b9094/llama-b9094-bin-macos-arm64.tar.gz) - [macOS Apple Silicon
github:ggerganov/llama.cpp
- 2026-05-10 vllm-project/vllm v0.20.2: v0.20.2
# vLLM v0.20.2 ## Highlights This release features 6 commits from 6 contributors (0 new)! This is a small patch release with bug fixes for DeepSeek V4, gpt-oss, and Qwen3-VL ### Bug Fixes * **DeepSeek V4 sparse attention**: Re-enable the persistent topk path on Hopper
github:vllm-project/vllm