Two major open-source LLM inference engines have added support for Cohere's model architecture in recent days.

window 30d · evidence 3

signal brief

Two major open-source LLM inference engines have added support for Cohere's model architecture in recent days. On 2026-06-13, llama.cpp merged architecture support for Cohere MoE (#24260), enabling local execution of Cohere-style models such as North-Mini-Code. On 2026-06-12, vLLM v0.23.0 listed 'Cohere Mini Code' as a newly supported model (release notes). These integrations lower the barrier for developers to experiment with and deploy Cohere's models in self-hosted and edge environments. Neither update is a direct business announcement, but inclusion in both llama.cpp and vLLM signals growing community and ecosystem momentum for Cohere's architecture, which could translate into increased model adoption and longer-term demand for inference compute. Confidence is low: the evidence consists of two independent open-source project updates, and Cohere has made no official announcement.

evidence

Decision support, not stock advice. This signal is research with cited evidence — not a recommendation to buy, sell, or hold any security.