#local-llm
- Google DeepMind Releases Gemma 4 QAT Checkpoints: Sub-1 GB On-Device E2B Model (Google DeepMind, models-llm)
- Ollama v0.24.0: Codex App Integration and MLX Sampler Improvements (Ollama, tools)
- llama.cpp b9161/b9169: Codex CLI Compatibility and Qwen3A Multimodal Support (ggml-org, tools)
- Ollama v0.30.7: Hermes Desktop Support, Gemma 4 QAT, and Nemotron-3-Ultra (Ollama, tools)
- llama.cpp b9589–b9592: CUDA SSM Sync Fix and Mamba Memory Optimization (tools)
- Ollama v0.30.9: Cohere2Moe Support, Coding Agent Single-Token Output Bug Fixed (tools)
- llama.cpp June 16 Builds: Eagle3 Speculative Decoding, Vulkan UMA Memory, NVFP4 Fixes (tools)