quantization — AI Digest

Jun 8 Google DeepMind Releases Gemma 4 QAT Checkpoints: Sub-1 GB On-Device E2B Model Google DeepMind models-llm
Jun 28 ViQ: Text-Aligned Visual Quantized Representations at Any Resolution (ECCV 2026) Tencent Hunyuan research
May 19 LongLive-2.0: NVFP4 Parallel Infrastructure for Long Video Generation (NVIDIA, 1,220 HF upvotes) NVIDIA research
Jun 25 Quantized Reasoning Models Think They Need to Think Longer, but They Do Not Meta research
Jun 12 llama.cpp b9603: Qualcomm Adreno OpenCL Kernels for On-Device Inference ggml-org tools