{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "AI Digest",
  "home_page_url": "https://ai-digest.kerby.pro/en/",
  "feed_url": "https://ai-digest.kerby.pro/en/feed.json",
  "description": "AI releases, tools, research, and industry: a daily roundup with an emphasis on source verifiability.",
  "language": "en",
  "authors": [
    {"name": "Alexei Lukin", "url": "https://kerby.pro"}
  ],
  "items": [
    
    {
      "id": "2026-06-28-world-model-hallucination-predictable-preventable",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-world-model-hallucination-predictable-preventable/",
      "title": "Hallucination in World Models is Predictable and Preventable",
      "content_text": "Hansen and Wang reframe hallucination in visual world models as a data coverage problem rather than a model capacity problem. Three failure modes are identified: perceptual, action-marginalized, and scene-diverging. Three model-internal signals are derived that predict hallucination with approximately -0.80 Spearman correlation. They introduce MMBench2, a 427-hour 210-task dataset with ground-truth actions and rewards. Coverage-aware training and curiosity-reward fine-tuning enable adaptation to new environments with as few as 50 trajectories. 41 upvotes on HF Daily Papers.\n\nWhy it matters: World models underpin model-predictive control for robotics. Reframing hallucination as a data coverage issue and providing predictive diagnostic signals are practically actionable results with direct impact on robot deployment in novel environments.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["world-models", "hallucination", "robustness", "model-based-rl", "robotics", "embodied-ai"],
      "authors": [{"name": "UC San Diego"}]
    },
    
    {
      "id": "2026-06-28-viq-visual-quantized-representations-eccv-2026",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-viq-visual-quantized-representations-eccv-2026/",
      "title": "ViQ: Text-Aligned Visual Quantized Representations at Any Resolution (ECCV 2026)",
      "content_text": "ViQ introduces a discrete visual representation framework built on a SigLIP2 vision tower with position-aware, head-wise Finite Scalar Quantization (FSQ). It converts images at any native resolution into compact discrete codes usable by both multimodal LLMs for understanding and decoders for high-fidelity reconstruction. Training uses two stages: text-aligned semantic pre-training and feature discretization via proximal representation learning. ViQ matches continuous-feature encoders on multimodal benchmarks while delivering 20-70% inference acceleration. Accepted to ECCV 2026.\n\nWhy it matters: Discrete visual tokens are a key bottleneck for unified image-language models: prior methods either sacrificed reconstruction quality for semantics or vice versa. ViQ\u0027s resolution-agnostic, text-aligned quantization bridges that gap. 80 upvotes on HF Daily Papers.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["multimodal", "visual-tokenization", "quantization", "representation-learning", "eccv-2026"],
      "authors": [{"name": "Tencent Hunyuan"}]
    },
    
    {
      "id": "2026-06-28-verification-horizon-reward-function-coding-agents",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-verification-horizon-reward-function-coding-agents/",
      "title": "The Verification Horizon: No Single Reward Function Works for Coding Agents at Scale",
      "content_text": "This Qwen team paper challenges the assumption that verification is the easy half of generate-then-verify for coding agents. Studying four reward constructions across general coding, frontend, and long-horizon tasks, it finds no static reward function remains effective as policy capability grows. Verification must co-evolve with the generator, characterized across three axes: scalability, faithfulness, and robustness.\n\nWhy it matters: Reward hacking and specification gaming are central problems in training capable coding agents. This paper provides a rigorous framework for verification failure modes at the frontier, with direct implications for how labs design RL pipelines.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["rl", "reward-hacking", "coding-agents", "agentic-rl", "scalable-oversight", "verification"],
      "authors": [{"name": "Qwen (Alibaba)"}]
    },
    
    {
      "id": "2026-06-28-tencent-hunyuan-unirl-unified-rl-post-training",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-tencent-hunyuan-unirl-unified-rl-post-training/",
      "title": "Tencent Hunyuan Open-Sources UniRL: Unified RL Post-Training for LLMs and Diffusion Models",
      "content_text": "Tencent\u0027s Hunyuan team released UniRL, an open-source framework for unified RL post-training across LLMs, vision-language models, and diffusion/flow-matching models. It implements a single generate-score-advantage-update-sync loop usable across heterogeneous model families. Two algorithms ship with it: Flow-DPPO for diffusion/flow models using trust-region masks based on exact divergence, and DRPO for LLMs with a smoothed advantage-weighted quadratic regularizer.\n\nWhy it matters: RL post-training has become the dominant route to frontier model quality. UniRL is one of the first public frameworks to unify this pipeline across text, vision, and image-generation model families in a single codebase.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["reinforcement-learning", "post-training", "open-source", "diffusion", "rlhf", "framework"],
      "authors": [{"name": "Tencent / Hunyuan"}]
    },
    
    {
      "id": "2026-06-28-runway-seedance-2-mini-api-lower-cost-video",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-runway-seedance-2-mini-api-lower-cost-video/",
      "title": "Runway Adds Seedance 2.0 Mini to API: Lower-Cost Video Generation at 480p/720p",
      "content_text": "On June 26, Runway added Seedance 2.0 Mini (model ID: seedance2_mini) to its API. The model supports text, image, and video input with keyframe control, reference images, reference videos, and generated audio, the same feature set as full Seedance 2.0, but at a lower resolution ceiling (480p or 720p) and billed at 16 credits per second, roughly half the cost of the standard tier. Clip duration ranges from 4 to 15 seconds.\n\nWhy it matters: Seedance 2.0 Mini makes ByteDance\u0027s leading video generation model accessible to a wider developer audience at substantially lower cost. Combined with the 4K tier added on June 24, Runway now offers the full cost/quality spectrum of Seedance 2.0 through a single API.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["text-to-video", "image-to-video", "video-to-video", "api", "seedance"],
      "authors": [{"name": "Runway / ByteDance"}]
    },
    
    {
      "id": "2026-06-28-qwen-image-agent-agentic-context-t2i",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-qwen-image-agent-agentic-context-t2i/",
      "title": "Qwen-Image-Agent: Agentic Context Building to Bridge the Prompt Underspecification Gap in T2I",
      "content_text": "Qwen-Image-Agent addresses the context gap in text-to-image generation: user prompts are often underspecified or implicit, or require up-to-date knowledge. The framework iteratively constructs the full generation context via two modules: Context-Aware Planning (identifying missing context) and Context Grounding (gathering it via reasoning, web search, memory, and user feedback). The system achieves state-of-the-art results on IA-Bench (45.4%), WISE-Verified (0.9020), and MindBench (0.42). 41 upvotes on HF Daily Papers.\n\nWhy it matters: Most T2I research focuses on model quality; this targets the deployment gap where real users give incomplete prompts. The agentic context-building loop mirrors how humans specify creative tasks to designers.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["image-generation", "agentic", "multimodal", "retrieval", "text-to-image"],
      "authors": [{"name": "Qwen (Alibaba)"}]
    },
    
    {
      "id": "2026-06-28-openai-gpt-5-6-sol-terra-luna-government-gated",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-openai-gpt-5-6-sol-terra-luna-government-gated/",
      "title": "OpenAI Previews GPT-5.6 Family: Sol, Terra, and Luna in Government-Gated Limited Release",
      "content_text": "OpenAI launched a limited preview of GPT-5.6 on June 26, comprising three tiers: Sol (flagship, $5/$30 per 1M tokens, with \u0027ultra mode\u0027 multi-agent orchestration), Terra (balanced at $2.50/$15), and Luna (fast at $1/$6). Access is restricted to ~20 pre-approved organizations at the US government\u0027s request for evaluation before wide release. Sol scores top marks on Terminal-Bench 2.1 for agentic coding and ~53.5% on SecureBio Virology Capabilities Test. ChatGPT users remain on GPT-5.5; general availability is expected within weeks. GPT-4.5 was retired from ChatGPT the same day.\n\nWhy it matters: GPT-5.6\u0027s government-mandated pre-release gating sets a precedent for frontier model deployment: the US government is now actively screening who gets early access to the most capable AI systems. The three-tier pricing structure also signals that top-tier AI is increasingly agentic by default.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["gpt-5.6", "reasoning", "agentic", "api", "us-policy", "safety"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-28-deepseek-7b-funding-double-headcounts",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-deepseek-7b-funding-double-headcounts/",
      "title": "DeepSeek Closes $7.4 Billion Funding Round, Plans to Double All Department Headcounts",
      "content_text": "DeepSeek completed China\u0027s largest-ever AI startup fundraise at approximately 50 billion yuan (~$7.4 billion), with Tencent and CATL as the largest private investors alongside the state-backed National AI Industry Investment Fund. The post-money valuation is estimated at 350\u2013400 billion yuan (~$52\u201359 billion). Bloomberg reported on June 25 that DeepSeek simultaneously announced plans to at least double the size of every department, targeting pre-training, data, agent infrastructure, and AI cross-disciplinary roles. The company currently employs roughly 150\u2013170 people.\n\nWhy it matters: DeepSeek\u0027s first-ever external fundraise signals a transition from lean research lab to operationally scaled company. The $7.4B round is among the largest AI startup financings globally in 2026, and state co-investment alongside strategic corporates gives DeepSeek compute and infrastructure leverage to sustain long-term competition with OpenAI and Anthropic.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["funding", "deepseek", "hiring", "china-ai", "open-source"],
      "authors": [{"name": "DeepSeek"}]
    },
    
    {
      "id": "2026-06-28-danceopd-on-policy-generative-field-distillation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-danceopd-on-policy-generative-field-distillation/",
      "title": "DanceOPD: On-Policy Generative Field Distillation for Unified Image Generation",
      "content_text": "DanceOPD treats each image generation capability (text-to-image, local editing, global editing) as a velocity field and distills them into a unified student flow-matching model via on-policy sampling. For each training sample, the student routes to one frozen capability field, queries it at a low-noise on-policy state, and matches the resulting velocity with a local MSE loss. This avoids capability interference. Editing scores improve by up to 21.9% in specific categories while text-to-image metrics are preserved or improved by up to 2.0%. 64 upvotes on HF Daily Papers.\n\nWhy it matters: Unifying diverse generative capabilities without catastrophic forgetting is a standing challenge in image generation. DanceOPD\u0027s on-policy distillation approach is architecturally clean and shows strong empirical results across all three capability dimensions.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["image-generation", "flow-matching", "distillation", "multi-capability", "generative-models"],
      "authors": [{"name": "ByteDance Seed"}]
    },
    
    {
      "id": "2026-06-28-claude-code-v2-1-195-hook-matcher-fix",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-claude-code-v2-1-195-hook-matcher-fix/",
      "title": "Claude Code v2.1.195: Hook Matcher Fix for MCP Servers with Hyphens, Fullscreen Mouse Controls",
      "content_text": "Anthropic released Claude Code v2.1.195 on June 26. It fixes hook matchers with hyphenated identifiers (e.g. mcp__brave-search) to use exact-match instead of substring-match, a bug that affected all MCP server identifiers containing hyphens. It also adds CLAUDE_CODE_DISABLE_MOUSE_CLICKS to disable mouse click/drag/hover in fullscreen while retaining scroll, and fixes voice dictation on macOS for long sessions and languages without word spaces (Japanese, Chinese, Thai).\n\nWhy it matters: The hook matcher bug affected a large fraction of real-world MCP setups, as hyphenated server names are the dominant convention. The fix unblocks production pipelines that had to work around incorrect hook routing.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "cli", "mcp", "voice-dictation"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-28-anthropic-mythos-5-partial-restoration-infra",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-anthropic-mythos-5-partial-restoration-infra/",
      "title": "US Government Partially Restores Anthropic Mythos 5 Access for ~100 Critical Infrastructure Organizations",
      "content_text": "On June 27, the US Commerce Department notified Anthropic that Claude Mythos 5 can be redeployed to approximately 100 US organizations operating and defending critical infrastructure \u2014 covering energy, healthcare, financial services, and telecommunications. Claude Fable 5 (the public-facing model) remains suspended. Anthropic continues negotiating for broader Mythos 5 access and the return of Fable 5. The original export control directive was imposed June 12 after Amazon researchers flagged jailbreak vectors in Fable 5\u0027s cybersecurity guardrails.\n\nWhy it matters: This is the first partial rollback of a US government export control applied to a commercial AI model, establishing a sector-specific trusted-access framework. Frontier models with autonomous vulnerability-discovery capabilities are now subject to export-control regimes previously reserved for weapons and semiconductor technology.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["mythos-5", "fable-5", "export-controls", "us-policy", "cybersecurity", "national-security"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-28-ai-engineer-worlds-fair-2026-mcp-registry",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-28-ai-engineer-worlds-fair-2026-mcp-registry/",
      "title": "AI Engineer World\u0027s Fair 2026 Opens; Anthropic Announces MCP Registry API",
      "content_text": "The AI Engineer World\u0027s Fair 2026 opened June 29 at Moscone Center, San Francisco, with 6,000+ engineers, 300 speakers, and 29 tracks. Anthropic announced the official MCP Registry API at the event, a canonical directory of MCP servers that coding tools like Claude Code, Codex, and OpenCode can consume programmatically, formalizing MCP from a protocol into production infrastructure.\n\nWhy it matters: The MCP Registry API announcement gives developers a standardized way to discover and integrate MCP servers across all major coding agents. The conference is the largest gathering of AI engineering practitioners of 2026.",
      "date_published": "2026-06-28T00:00:00Z",
      "tags": ["conference", "mcp", "mcp-registry", "agents", "tools"]
    },
    
    {
      "id": "2026-06-26-yandex-alice-ai-major-update",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-yandex-alice-ai-major-update/",
      "title": "Yandex Releases Major Alice AI Update: Cross-Session Memory, Personalization, and Live Accessibility Mode",
      "content_text": "Yandex announced a significant upgrade to Alice AI on June 25 at its YoungCon festival, updating the core LLM, search model, and multimodal VLM. New capabilities include persistent cross-session memory, adaptive communication style mirroring user tone and formality, improved image/diagram/table understanding, and a Live-mode for visually impaired users that describes camera surroundings in real time via the Alice AI VLM.\n\nWhy it matters: This is a broad capability leap for Russia\u0027s most widely deployed consumer AI assistant, moving it toward a persistent, personalized agent model, with accessibility features that expand meaningful AI access to blind and low-vision users.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["alice", "yandexgpt", "russia", "personalization", "memory", "accessibility", "multimodal", "consumer-ai"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-26-suno-spark-incubator",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-suno-spark-incubator/",
      "title": "Suno Launches Spark Incubator for Independent Artists with Grants and Mentorship",
      "content_text": "Suno announced Spark on June 25, an incubator program offering independent artists grants, marketing funds, songwriting camp invitations, and mentorship. Participants retain full creative and commercial rights over work produced with the platform. The program follows Suno\u0027s $400M raise at a $5.4B valuation in June 2026.\n\nWhy it matters: Spark is Suno\u0027s most direct attempt to position itself as an industry collaborator rather than a disruptor, with financial commitments to artists at a time when Universal Music Group and Sony are still litigating against the company.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["suno", "music-generation", "ai-music", "industry", "partnership"],
      "authors": [{"name": "Suno"}]
    },
    
    {
      "id": "2026-06-26-runway-agent-2-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-runway-agent-2-0/",
      "title": "Runway Releases Agent 2.0 for Marketing Campaign Automation",
      "content_text": "On June 25, Runway released Agent 2.0 across all plans, an agentic tool that creates entire marketing campaigns, analyzes performance data, and scales creative assets across platforms, formats, and markets from a single conversational workflow. It builds on the Aleph 2.0 and Gen-4.5 video models released earlier in 2026.\n\nWhy it matters: Agent 2.0 marks Runway\u0027s pivot from a video generation tool to a full marketing production platform, targeting creative agencies and brand teams while leveraging its video generation lead.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["runway", "text-to-video", "agents", "enterprise", "release"],
      "authors": [{"name": "Runway"}]
    },
    
    {
      "id": "2026-06-26-qwen-agentworld",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-qwen-agentworld/",
      "title": "Qwen-AgentWorld: Language World Models for General Agents at 35B and 397B Scale",
      "content_text": "Qwen-AgentWorld presents two foundation world models (35B and 397B parameters) trained on over 10 million interaction trajectories across seven domains, using a three-stage pipeline: capability injection, next-state-prediction activation, and RL refinement. The system serves as both a scalable environment simulator for RL training and a warm-up stage for downstream agent tasks, accompanied by the new AgentWorldBench benchmark.\n\nWhy it matters: Language world models that faithfully simulate environment dynamics could reduce the cost of RL data collection and allow agents to practice in simulation before real deployment. At 397B parameters this is the largest dedicated agent world model to date.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["agents", "reasoning", "rl", "multimodal", "paper", "simulation"],
      "authors": [{"name": "Qwen Team, Alibaba"}]
    },
    
    {
      "id": "2026-06-26-oprd-on-policy-representation-distillation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-oprd-on-policy-representation-distillation/",
      "title": "OPRD: On-Policy Representation Distillation for Post-Training LLMs",
      "content_text": "OPRD extends on-policy distillation from output-space (logits) into hidden-state representation space, aligning student and teacher representations across selected layers on shared rollouts. A cross-architecture extension (OPRD-Bridge) transfers knowledge between models with different architectures and tokenizers via low-rank representational structure. The method delivers 1.44\u00d7 faster training and up to 54% memory reduction while substantially closing performance gaps on math benchmarks where logit-based methods plateau.\n\nWhy it matters: On-policy distillation is a standard component in post-training pipelines for frontier models. OPRD fixes a key failure mode \u2014 high-entropy token distributions making output-space gradients uninformative \u2014 and opens distillation across incompatible model families.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["rl", "reasoning", "post-training", "efficiency", "paper"]
    },
    
    {
      "id": "2026-06-26-opencode-v1-17-11",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-opencode-v1-17-11/",
      "title": "OpenCode v1.17.11: Session Snapshots with Revert Controls, Chrome-Style Tab Cycling",
      "content_text": "OpenCode v1.17.11 introduces session snapshots with revert controls, allowing users to roll a session back to any earlier message including all associated file changes. The desktop interface gains Chrome-style tab cycling (mod+1\u20139) and draggable tabs. The previous release v1.17.10 (June 24) added MCP server instructions injected into session context, MCP resource template listing and read tools, and a --mini CLI mode.\n\nWhy it matters: Session snapshots with file revert are a significant safety feature for agentic coding workflows, reducing the cost of exploratory or risky agent runs.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["opencode", "coding-agent", "cli", "mcp", "open-source", "update"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-26-openai-codex-remote-ga-enterprise",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-openai-codex-remote-ga-enterprise/",
      "title": "OpenAI Makes Codex Remote Generally Available Across All Plans, Reports 97.9% Internal Adoption",
      "content_text": "OpenAI made Codex Remote generally available on all ChatGPT plans, letting users start or continue coding work on a connected Mac or Windows host from a mobile device via QR-paired authentication. Alongside this, OpenAI published adoption data showing 97.9% of its own employees now use Codex \u2014 up from ~40% in August 2025 \u2014 including non-technical departments such as Legal and Finance.\n\nWhy it matters: Moving Codex Remote from preview to GA across all tiers significantly broadens who can use agentic coding assistants; the internal adoption figures signal that OpenAI believes Codex is ready for broad enterprise use beyond pure software engineering.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["codex", "openai", "agents", "enterprise", "coding-agent", "ga", "mobile"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-26-mistral-ocr-4",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-mistral-ocr-4/",
      "title": "Mistral Releases OCR 4: State-of-the-Art Document Intelligence with On-Premises Deployment",
      "content_text": "Mistral released OCR 4, a document intelligence model covering 170 languages that returns structured output including bounding boxes, typed-block classification (titles, tables, equations, signatures), and inline confidence scores. It tops OlmOCRBench at 85.20 with a 72% average win rate in human preference studies, and deploys as a single container for on-premises use. API pricing is $4 per 1,000 pages, and the model is available on the Mistral API, Amazon SageMaker, and Microsoft Foundry.\n\nWhy it matters: Combining best-in-class extraction quality with a self-hostable, single-container deployment addresses a major enterprise blocker (routing sensitive documents through third-party cloud APIs) and positions Mistral strongly in the enterprise document processing market.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["mistral", "ocr", "document-ai", "enterprise", "on-premises", "multilingual", "release"],
      "authors": [{"name": "Mistral AI"}]
    },
    
    {
      "id": "2026-06-26-looped-lm-readout-blind-spot",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-looped-lm-readout-blind-spot/",
      "title": "Dense Supervision Is Not Enough: The Readout Blind Spot in Looped Language Models",
      "content_text": "This paper diagnoses a training failure in looped (recurrent) transformer architectures: scale-invariant readouts such as RMSNorm and LayerNorm create a \u0027blind spot\u0027 where per-loop cross-entropy supervision leaves hidden-state magnitudes uncontrolled, growing to thousands despite dense supervision. The authors provide two architectural fixes \u2014 making scale visible to the loss function or removing it from the recurrent loop \u2014 and show that scale-controlled variants achieve better perplexity at matched inference depths on 44M and 129M parameter models.\n\nWhy it matters: Looped/recurrent transformers are a promising direction for compute-efficient inference (reusing weights across depth), but training instabilities have limited adoption. This work provides a concrete diagnosis and a simple design rule that could unblock practical development of this architecture class.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["reasoning", "pre-training", "training-dynamics", "paper"]
    },
    
    {
      "id": "2026-06-26-jetspec-speculative-decoding",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-jetspec-speculative-decoding/",
      "title": "JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting",
      "content_text": "JetSpec introduces a causal parallel draft head that aligns candidate token-tree scores with the target model\u0027s autoregressive factorization, solving the longstanding tradeoff between autoregressive and bidirectional drafters. It achieves up to 9.64\u00d7 speedup on MATH-500 and 4.58\u00d7 on conversational workloads using Qwen3 models on H100/B200 GPUs, with vLLM integration and released draft models on HuggingFace.\n\nWhy it matters: Speculative decoding has plateaued because larger draft budgets did not reliably yield longer accepted sequences. JetSpec breaks this ceiling with a principled training objective, delivering \u003e1,000 tokens/second throughput \u2014 practically significant for inference cost reduction at any scale.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["inference", "speculative-decoding", "efficiency", "benchmark", "paper", "vllm"],
      "authors": [{"name": "Hao AI Lab, UC San Diego"}]
    },
    
    {
      "id": "2026-06-26-google-deepmind-a24-investment",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-google-deepmind-a24-investment/",
      "title": "Google DeepMind Invests $75M in A24, Forms First AI Research Partnership with a Film Studio",
      "content_text": "Google invested $75 million in A24 on June 22, 2026 \u2014 its first equity stake in a film studio \u2014 in a multiyear research partnership to co-develop AI filmmaking tools using Veo. DeepMind researchers will embed inside A24\u0027s active productions to build new creative workflows and techniques. Google does not gain access to A24\u0027s existing film library.\n\nWhy it matters: This is the first time a major AI research lab has taken an equity position in a film production company to shape its video generation models through professional creative feedback, setting a precedent for how AI labs may seek adoption in creative industries.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["deepmind", "funding", "partnership", "hollywood", "video-generation", "research"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-26-glm-5-2-open-weights-coding",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-glm-5-2-open-weights-coding/",
      "title": "GLM-5.2: Zhipu AI\u0027s MIT-Licensed 744B MoE Coding Model Raises Cybersecurity Concerns",
      "content_text": "The MIT-licensed weights of Zhipu AI\u0027s GLM-5.2, a 744B MoE model with 40B active parameters and a 1M-token context, were released on HuggingFace around June 17. On June 25, Axios reported that security researchers found the model matches US frontier models on cybersecurity benchmarks. GLM-5.2 scores 62.1 on SWE-bench Pro, ranks second on Code Arena, and is priced at roughly $1.40/million input tokens versus GPT-5.5 at $5.\n\nWhy it matters: The combination of frontier-level coding capability, MIT licensing allowing unrestricted commercial use, and an input-token price under a third of GPT-5.5\u0027s makes GLM-5.2 the most cost-disruptive open-weight coding model currently available; the security community is evaluating its dual-use potential.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["glm", "zai-org", "open-weights", "mit", "coding", "moe", "chinese-lab", "security", "1m-context"],
      "authors": [{"name": "Zhipu AI / Z.ai"}]
    },
    
    {
      "id": "2026-06-26-deterministic-horizon-reasoning-limits",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-deterministic-horizon-reasoning-limits/",
      "title": "The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary",
      "content_text": "Accepted at ICML 2026, this paper establishes an Attention Bottleneck Theorem bounding the state-tracking capacity of decoder-only transformers and identifies a \u0027Deterministic Horizon\u0027 around 19\u201331 steps beyond which chain-of-thought reasoning degrades super-exponentially. Empirical validation across 12 models and 8 task domains \u2014 including SWE-Bench and WebArena \u2014 shows hybrid neural-plus-tool systems reach 86\u201394% accuracy versus 24\u201342% for pure chain-of-thought.\n\nWhy it matters: The paper shifts the narrative around reasoning failures from a training-data problem to an architectural capacity limit, providing principled thresholds for when agentic systems should delegate to external tools rather than reason further.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["reasoning", "agents", "theory", "benchmark", "paper", "formal-reasoning"]
    },
    
    {
      "id": "2026-06-26-deepreinforce-ornith-1-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-deepreinforce-ornith-1-0/",
      "title": "DeepReinforce Releases Ornith-1.0: Open-Source Coding Models That Learn Their Own RL Scaffolds",
      "content_text": "DeepReinforce released Ornith-1.0 on June 25, a family of four MIT-licensed agentic coding models (9B dense, 31B dense, 35B MoE, 397B MoE) built on Gemma 4 and Qwen 3.5 bases. Instead of using human-designed RL scaffolds, each model learns to generate its own task-specific harnesses during RL training, with rewards flowing back to both scaffold generation and solution generation stages. The 397B flagship achieves 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified, matching Claude Opus 4.7.\n\nWhy it matters: Self-scaffolding RL is a meaningful departure from fixed-harness training, and this is the first open-source model family released under an MIT license to match a recent Anthropic frontier model on agentic coding benchmarks.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["open-source", "mit", "coding", "reinforcement-learning", "swe-bench", "moe", "release"],
      "authors": [{"name": "DeepReinforce"}]
    },
    
    {
      "id": "2026-06-26-codex-zsh-v0-1-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-codex-zsh-v0-1-0/",
      "title": "OpenAI Ships codex-zsh v0.1.0: Versioned Patched zsh Binary for Codex Sandbox",
      "content_text": "OpenAI published codex-zsh v0.1.0 as a standalone versioned artifact \u2014 a minimally patched zsh build adding EXEC_WRAPPER support via a patch to Src/exec.c, enabling Codex\u0027s shell-escalation protocol to intercept execve calls and route each command through the Run/Escalate/Deny sandbox policy. Binaries ship for macOS (aarch64 and x86_64) and Linux (musl, both arches).\n\nWhy it matters: Publishing this as a separately versioned artifact decouples zsh patch maintenance from the main Codex CLI release cycle and makes the sandbox\u0027s trust boundary auditable.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["codex", "openai", "sandbox", "security", "cli", "open-source"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-26-codex-cli-v0-142-2",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-codex-cli-v0-142-2/",
      "title": "OpenAI Codex CLI v0.142.2: Default MCP Tool Search, macOS Proxy Support, PowerShell Safety",
      "content_text": "Codex CLI v0.142.2 makes MCP tool search the default when the server supports it, adds macOS system proxy and PAC/WPAD support, and enforces explicit approval for PowerShell commands containing executable AST regions the safety classifier cannot inspect. Dark-mode plugin logos, richer safety-buffering UI metadata, and actionable Bedrock credential recovery guidance are also included.\n\nWhy it matters: Default MCP tool search improves usability for large tool catalogs; the PowerShell AST enforcement closes a meaningful sandbox-escape surface.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["codex", "openai", "coding-agent", "cli", "mcp", "security", "update"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-26-claude-code-v2-1-193",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-claude-code-v2-1-193/",
      "title": "Claude Code v2.1.193: Shell Classifier Expansion, OTel Response Logging, Live Path Autocomplete",
      "content_text": "Claude Code v2.1.193 adds a new autoMode.classifyAllShell setting routing all Bash/PowerShell commands through the auto-mode safety classifier, an opt-in OpenTelemetry claude_code.assistant_response log event, live file-path autocomplete in bash mode, and MCP auth startup notices. Background-agent reliability fixes include phantom subagent spawning, stale UI after login, and re-prompting on auto-update.\n\nWhy it matters: The shell classifier expansion and OTel response logging are significant for enterprise deployments needing audit trails and fine-grained shell permission control; the background-agent fixes address long-standing reliability issues as multi-agent workflows see heavier use.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["claude-code", "anthropic", "coding-agent", "cli", "observability", "security", "mcp", "update"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-26-bytedance-seedream-5-0-pro",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-bytedance-seedream-5-0-pro/",
      "title": "ByteDance Announces Seedream 5.0 Pro: Image Generation with Built-In Online Search and Deep Reasoning",
      "content_text": "Announced at Volcano Engine FORCE on June 23, Seedream 5.0 Pro features integrated online search for trend-aware and current-event imagery, deep-thinking prompt understanding, support for up to 10 reference images, and 2K+ resolution output. It targets the commercial production tier with layout control and targeted editing capabilities.\n\nWhy it matters: Integration of live web search into image generation is a novel architectural approach that allows the model to generate contextually current imagery without separate retrieval steps \u2014 a differentiator versus Flux.2, Midjourney v8.1, and Ideogram 4.0.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["image-generation", "text-to-image", "multimodal", "bytedance", "chinese-lab", "search", "release"],
      "authors": [{"name": "ByteDance"}]
    },
    
    {
      "id": "2026-06-26-bytedance-seedance-2-5",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-bytedance-seedance-2-5/",
      "title": "ByteDance Unveils Seedance 2.5: Native 30-Second 4K AI Video with 50 Multimodal Inputs",
      "content_text": "ByteDance announced Seedance 2.5 at its Volcano Engine FORCE conference on June 23, generating single 30-second clips natively at 4K with 10-bit color depth. The model accepts up to 50 simultaneous multimodal inputs (images, audio, 3D white models, style references) and co-processes audio in the same latent space as video for native sound synchronization. An enterprise beta is live; public launch is targeted for early July.\n\nWhy it matters: Seedance 2.5 more than quadruples the reference input capacity of its nearest competitor, and native 30-second generation without stitching removes a key limitation of current video models \u2014 raising the bar for long-form AI video generation.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["text-to-video", "image-to-video", "video-generation", "bytedance", "chinese-lab", "4k", "release"],
      "authors": [{"name": "ByteDance"}]
    },
    
    {
      "id": "2026-06-26-bytedance-seed-audio-1-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-bytedance-seed-audio-1-0/",
      "title": "ByteDance Launches Seed-Audio 1.0: Unified Speech, Music, and Ambient Sound Generation",
      "content_text": "Announced alongside Seedance 2.5 at the Volcano Engine FORCE conference on June 23, Seed-Audio 1.0 generates multi-character dialogue with distinct voices, background music, sound effects, and ambient soundscapes in a single end-to-end pass of up to 2 minutes. It accepts text prompts and reference audio for voice style matching and cloning, and is available via ByteDance\u0027s Volcano Ark API integrated into CapCut, Jimeng, and Fanqie.\n\nWhy it matters: Seed-Audio 1.0 positions ByteDance as a full-stack generative media provider, unifying voice, music, and effects into one model \u2014 directly competing with ElevenLabs\u0027 multi-product suite and reducing the need for separate specialized tools in content pipelines.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["tts", "music-generation", "voice-cloning", "audio", "bytedance", "chinese-lab", "release"],
      "authors": [{"name": "ByteDance"}]
    },
    
    {
      "id": "2026-06-26-bytedance-doubao-seed-2-1-pro",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-bytedance-doubao-seed-2-1-pro/",
      "title": "ByteDance Launches Doubao-Seed-2.1 Pro Flagship LLM at FORCE Conference",
      "content_text": "ByteDance unveiled Doubao-Seed-2.1 Pro at the 2026 Volcano Engine FORCE conference on June 23, a flagship MoE LLM targeting enterprise coding, long-chain agent tasks, and vision-language understanding with million-token context windows. The model benchmarks competitively against GPT-5.5 and Gemini 3.1 Pro, priced at 6 yuan per million input tokens. ByteDance also previewed Seedance 2.5 (video generation) and Seedream 5.0 Pro (image generation) at the same event, completing a full-stack media AI suite.\n\nWhy it matters: Doubao now serves 180 trillion daily tokens \u2014 a 1,500\u00d7 increase since launch \u2014 making this the most widely deployed Chinese AI product, with the 2.1 Pro release signaling ByteDance\u0027s push to monetize at enterprise scale.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["doubao", "bytedance", "moe", "agents", "coding", "multimodal", "chinese-lab", "release"],
      "authors": [{"name": "ByteDance / Doubao"}]
    },
    
    {
      "id": "2026-06-26-anthropic-mythos-classified-vulnerabilities",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-26-anthropic-mythos-classified-vulnerabilities/",
      "title": "Anthropic\u0027s Mythos Model Found Vulnerabilities in Classified US Government Systems Within Hours",
      "content_text": "A senior US official disclosed that Anthropic\u0027s Mythos model identified vulnerabilities in classified US government computer systems within hours during testing conducted through Project Glasswing. Senator Mark Warner cited the finding at a Senate Banking Committee hearing, stating the model \u0027broke into almost all of our classified systems, not in weeks but in hours.\u0027 The revelation contributed to a government directive restricting foreign national access to Anthropic\u0027s Fable 5 and Mythos 5 models.\n\nWhy it matters: Frontier AI models have crossed a threshold where they can autonomously find security vulnerabilities in hardened classified infrastructure \u2014 reshaping how governments think about AI security policy and export controls.",
      "date_published": "2026-06-26T00:00:00Z",
      "tags": ["anthropic", "security", "cybersecurity", "government", "national-security", "policy", "frontier-model", "export-controls"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-25-wan-streamer-realtime-multimodal",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-wan-streamer-realtime-multimodal/",
      "title": "Wan-Streamer v0.1: End-to-End Real-Time Interactive Foundation Model Under 550ms Latency",
      "content_text": "A unified foundation model for real-time multimodal interaction handling language, audio, and video in a single Transformer with block-causal attention. Unlike pipeline systems chaining separate ASR, reasoning, and TTS modules, Wan-Streamer jointly learns perception, reasoning, and generation \u2014 achieving ~200ms model-side latency and 550ms total interaction latency, with streaming units as short as 160ms at 25 fps. Currently at 192p resolution as proof of concept.\n\nWhy it matters: Real-time interactive AI where a model sees, hears, and responds with audio and video within half a second has been a hard systems problem. Wan-Streamer demonstrates that end-to-end joint training in a single Transformer can match latency targets previously requiring specialized pipeline glue.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["multimodal", "streaming", "real-time", "audio", "paper", "architecture"],
      "authors": [{"name": "Wan-AI"}]
    },
    
    {
      "id": "2026-06-25-quantized-reasoning-overthinking",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-quantized-reasoning-overthinking/",
      "title": "Quantized Reasoning Models Think They Need to Think Longer, but They Do Not",
      "content_text": "An empirical study showing that post-training quantization of reasoning models paradoxically increases chain-of-thought length while reducing accuracy. In up to 52% of failures, quantized models reach the correct intermediate answer but then fail to select it \u2014 because high-entropy token positions cause them to oversample \u0027overthinking\u0027 markers like \u0027wait\u0027, \u0027but\u0027, \u0027alternatively\u0027. A training-free logit penalty on these markers reduces reasoning length 12\u201323% while maintaining or improving accuracy across 5 models (1.5B\u201332B), 3 quantization methods, and 5 benchmarks.\n\nWhy it matters: Quantization is the primary technique for deploying large reasoning models cheaply, but this paper reveals a previously undiagnosed failure mode explaining much of the accuracy loss. The training-free fix is immediately applicable to any quantized reasoning model deployment, offering significant inference cost reduction with no fine-tuning required.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["reasoning", "efficiency", "quantization", "training", "paper", "chain-of-thought"],
      "authors": [{"name": "Meta"}]
    },
    
    {
      "id": "2026-06-25-qualcomm-acquires-modular",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-qualcomm-acquires-modular/",
      "title": "Qualcomm Acquires Modular for $3.92B to Challenge CUDA Lock-in",
      "content_text": "Qualcomm announced at its Investor Day on June 24 that it is acquiring Modular \u2014 the startup behind the Mojo programming language and MAX inference engine \u2014 in an all-stock deal valued at approximately $3.92B. The deal is expected to close H2 2026 pending regulatory approval. Modular\u0027s stack runs AI models across Nvidia, AMD, Intel, and Apple Silicon without hardware-specific rewrites, directly attacking the developer lock-in that makes CUDA sticky.\n\nWhy it matters: If Qualcomm can make Modular\u0027s cross-hardware abstraction mainstream, it erodes one of Nvidia\u0027s deepest moats. For ML engineers, a mature hardware-agnostic inference stack would meaningfully expand deployment options and reduce GPU vendor dependence. The $3.92B price signals enterprise conviction in the Mojo / MAX ecosystem.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["hardware", "inference", "acquisition", "cuda", "compiler", "compute"],
      "authors": [{"name": "Qualcomm"}]
    },
    
    {
      "id": "2026-06-25-opencode-v1-17-10",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-opencode-v1-17-10/",
      "title": "OpenCode v1.17.10: MCP Server Instructions in Context, --mini CLI Mode",
      "content_text": "OpenCode v1.17.10 (June 24) ships MCP server instructions integrated directly into session context, a new --mini CLI mode for lightweight invocation, MCP resource template listing and read tools, opencode-managed provider integration support, and fixed MCP OAuth callbacks for local authentication.\n\nWhy it matters: OpenCode is one of the most actively starred open-source coding agents (160K+ GitHub stars). The MCP resource template tools and managed provider integration expand the agent\u0027s ability to work with external data sources natively.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["opencode", "coding-agent", "mcp", "cli", "open-source"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-25-openai-broadcom-jalapeno-inference-chip",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-openai-broadcom-jalapeno-inference-chip/",
      "title": "OpenAI and Broadcom Unveil Jalape\u00f1o: OpenAI\u0027s First Custom AI Inference Chip",
      "content_text": "OpenAI and Broadcom jointly announced Jalape\u00f1o on June 24 \u2014 OpenAI\u0027s first custom ASIC designed exclusively for LLM inference. The chip was co-developed from initial design to tape-out in nine months, with AI models accelerating parts of the chip design itself. OpenAI claims roughly 50% better cost-per-token versus current-generation GPUs. Prototype deployments are targeted for end of 2026, with production ramp in 2027\u20132028. The chip will not be sold to external customers.\n\nWhy it matters: OpenAI\u0027s first step toward vertical hardware integration reduces dependence on Nvidia and cuts the per-token cost of serving ChatGPT and API products at scale. The nine-month design cycle \u2014 itself enabled in part by AI \u2014 signals an acceleration in the hardware development loop. This places OpenAI alongside Google (TPUs), Amazon (Trainium), and Microsoft (Maia) in the custom silicon club.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["openai", "hardware", "inference", "asic", "broadcom", "chip", "compute"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-25-google-veo-3-1-flow-audio-tools",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-google-veo-3-1-flow-audio-tools/",
      "title": "Google Brings Veo 3.1 Audio to All Flow Editing Tools, Adds Insert and Remove",
      "content_text": "On June 22, Google extended Veo 3.1\u0027s audio generation to existing Flow creation features \u2014 Ingredients to Video, Frames to Video, and Extend \u2014 that previously produced silent output. Two new precision editing tools were also added: Insert (adding elements to a scene with matched lighting) and Remove (deleting objects with automatic background reconstruction). Available in Gemini API, Vertex AI, the Gemini app, and Flow.\n\nWhy it matters: Extending native audio to reference-image-driven and clip-extension workflows closes a major gap for professional users who build videos from existing material. The Insert and Remove tools move Veo toward a full post-production pipeline.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["text-to-video", "image-to-video", "video-editing", "gemini", "audio"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-25-github-copilot-auto-model-selection",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-github-copilot-auto-model-selection/",
      "title": "GitHub Copilot Removes Manual Model Selection from Free and Student Plans",
      "content_text": "Effective June 24, GitHub made Copilot auto model selection the default and only option for Free and Student plan users. The Auto system dynamically routes each request to the best available model across OpenAI, Anthropic, and Google families, within plan restrictions. GitHub simultaneously retired the (Preview) label from all Microsoft-released models.\n\nWhy it matters: Removing manual model selection from lower-tier plans simplifies UX but limits user control \u2014 following a trend where providers abstract model selection for cost optimization. Free and Student users can no longer pin to a specific model.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["github-copilot", "developer-tools", "copilot"],
      "authors": [{"name": "GitHub / Microsoft"}]
    },
    
    {
      "id": "2026-06-25-gemini-3-5-flash-computer-use",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-gemini-3-5-flash-computer-use/",
      "title": "Gemini 3.5 Flash Gains Native Computer Use as Built-in Tool",
      "content_text": "Google announced on June 24 that computer use is now a native built-in tool in Gemini 3.5 Flash, available via the Gemini API and Gemini Enterprise Agent Platform. Previously available only as a standalone specialist model, the capability now lets agents see, click, type, and scroll across browser, mobile, and desktop environments. Targeted adversarial training mitigates prompt injection risks. Improved OSWorld benchmark performance versus prior implementations.\n\nWhy it matters: Integrating computer use directly into the primary Flash model lowers the barrier to building agentic workflows over real UIs. Combined with Flash\u0027s speed and cost profile, this makes real-world agent automation more accessible for enterprise deployments \u2014 and directly competes with Anthropic\u0027s computer use offering.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["gemini", "computer-use", "agents", "enterprise", "automation", "agentic"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-25-domainshuttle-subject-driven-t2v",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-domainshuttle-subject-driven-t2v/",
      "title": "DomainShuttle: Subject-Driven Text-to-Video Across In-Domain and Cross-Domain Scenarios",
      "content_text": "A text-to-video system for subject-driven synthesis across two scenarios: in-domain (preserving reference subject features precisely) and cross-domain (flexible variation while retaining identity). Introduces Domain-MoT (domain-aware adaptive layer normalization), Video-Reference DualRoPE (separate rotary position encoding for reference and video tokens), and Cross-Pair Consistent Loss. Ranked third on HF Daily Papers for June 25 (34 upvotes).\n\nWhy it matters: Existing subject-driven video methods trade off fidelity against editability \u2014 DomainShuttle proposes architectural components that decouple these objectives, enabling both accurate subject preservation and free domain transfer.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["text-to-video", "training", "generative-models", "paper"]
    },
    
    {
      "id": "2026-06-25-codex-cli-v0-142-1-windows-proxy",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-codex-cli-v0-142-1-windows-proxy/",
      "title": "OpenAI Codex CLI v0.142.1: Opt-in Windows System Proxy Support",
      "content_text": "Codex CLI v0.142.1 (June 25, stable) adds opt-in Windows system proxy support covering PAC, WPAD, static proxies, and bypass rules. The 0.143.0-alpha series continued with 9+ pre-release builds across June 23\u201325, suggesting a larger feature update is in progress.\n\nWhy it matters: Enterprise Windows deployments behind corporate proxies have been a blocker for Codex CLI adoption. The active alpha series signals rapid ongoing development.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["codex", "coding-agent", "cli", "openai"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-25-claude-code-v2-1-191",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-claude-code-v2-1-191/",
      "title": "Claude Code v2.1.191: /rewind Command, 37% CPU Reduction, MCP Retry Logic",
      "content_text": "Claude Code v2.1.191 (June 24) adds /rewind to resume conversations from before a /clear was run, cuts CPU usage during streaming by ~37% through text-update coalescing, adds MCP server retry logic for transient network errors, and reduces memory growth in long sessions. The prior v2.1.187 (June 23) had added sandbox.credentials to block sandboxed commands from reading secret files and org-configured model restrictions in the model picker.\n\nWhy it matters: Two rapid releases in 36 hours show active shipping cadence. The /rewind feature addresses a common pain point with conversation state loss; the CPU and memory improvements matter for long agentic sessions; MCP reliability improvements are relevant to production tool-use pipelines.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "mcp", "cli", "anthropic"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-25-beyond-nl2code-multimodal-survey",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-beyond-nl2code-multimodal-survey/",
      "title": "Beyond NL2Code: A Structured Survey of Multimodal Code Intelligence",
      "content_text": "A comprehensive survey of code intelligence systems that go beyond natural-language-only inputs, covering how LLMs process visual artifacts \u2014 screenshots, charts, vector drawings, interactive UI states \u2014 to generate executable code. The paper maps four domains: graphical user interfaces, scientific visualization, structured graphics, and emerging agent frameworks, and argues future progress requires multi-signal validation and agent transparency.\n\nWhy it matters: Topped HuggingFace Daily Papers for June 25 with 262 upvotes \u2014 the highest-voted paper of the day. As AI coding assistants increasingly encounter visual specs and UI mockups, this survey frames the open challenges in visually-grounded programming and sets a research agenda for the next generation of coding agents.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["multimodal", "code-generation", "survey", "agents", "paper"]
    },
    
    {
      "id": "2026-06-25-anthropic-alibaba-distillation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-anthropic-alibaba-distillation/",
      "title": "Anthropic Accuses Alibaba of Largest Known Claude Distillation Attack: 28.8M Conversations",
      "content_text": "In a letter to the US Senate Banking Committee disclosed on June 24, Anthropic accused Alibaba\u0027s Qwen lab of conducting the largest known distillation attack against Claude: 28.8 million conversation exchanges via nearly 25,000 fraudulent accounts between April 22 and June 5, 2026. The campaign targeted Claude\u0027s software engineering and agentic reasoning capabilities. Anthropic had previously identified similar campaigns attributed to DeepSeek (150K interactions), Moonshot AI (3.4M), and MiniMax (13M).\n\nWhy it matters: Model distillation at this scale \u2014 using a frontier model\u0027s outputs to train a cheaper competing model \u2014 is a growing threat to AI lab IP. The Alibaba allegation represents a significant escalation. The Senate disclosure may influence export controls and API access policy in the ongoing US-China AI competition.",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["anthropic", "security", "distillation", "alibaba", "qwen", "policy", "export-controls", "china"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-25-agent-native-memory-benchmark",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-25-agent-native-memory-benchmark/",
      "title": "Are We Ready For an Agent-Native Memory System? SJTU Benchmarks 12 Architectures",
      "content_text": "A systematic evaluation of AI agent memory through a data-management lens from SJTU and Tsinghua. The paper proposes a framework decomposing agent memory into four modules \u2014 representation and storage, extraction, retrieval and routing, and maintenance \u2014 then benchmarks 12 existing memory systems. Key finding: no single architecture performs optimally across all workloads; localized maintenance is more cost-efficient than full reorganization.\n\nWhy it matters: As agentic AI proliferates, memory is increasingly a deployment bottleneck. This is the first systematic benchmark across 12 memory architectures using a unified framework, giving practitioners a principled basis for architecture selection. Ranked second on HF Daily Papers for June 25 (40 upvotes).",
      "date_published": "2026-06-25T00:00:00Z",
      "tags": ["agents", "memory", "efficiency", "benchmark", "paper"]
    },
    
    {
      "id": "2026-06-24-yandex-robotrak-700km-autonomous-run-moscow-spb",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-yandex-robotrak-700km-autonomous-run-moscow-spb/",
      "title": "Yandex Self-Driving Truck Completes First Fully Autonomous 700km Moscow\u2013Saint Petersburg Run",
      "content_text": "On June 23, 2026, Yandex\u0027s Robotrak autonomous truck completed a 700km fully driverless journey from Moscow to Saint Petersburg along the M-11 highway \u2014 the first such feat in Russia. The AI-powered system handled overtaking, road construction zones, and toll plazas at approximately 90 km/h. A safety driver was present but did not touch the controls. Yandex published an uncut 8-hour video log of the trip.\n\nWhy it matters: A landmark milestone for AI-powered autonomous logistics in Russia, demonstrating that Yandex\u0027s self-driving stack has reached long-haul highway maturity. Validates commercial viability of autonomous freight and positions Yandex as the leading autonomous vehicle developer in the Russian market.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["robotics", "russia", "physical-ai"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-24-xai-goal-grok-build-long-running-autonomous-coding",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-xai-goal-grok-build-long-running-autonomous-coding/",
      "title": "xAI Launches /goal in Grok Build for Long-Running Autonomous Coding Tasks",
      "content_text": "xAI shipped a new /goal command in Grok Build on June 22, 2026, enabling long-running autonomous task execution in its terminal-based coding agent. When invoked, the agent creates a progress checklist, then works through it step by step \u2014 including code review, webpage inspection, and script execution \u2014 until the task is completed and verified. The feature uses a multi-model architecture combining Composer 2.5 and Grok Build 0.1. Access is currently limited to SuperGrok Heavy subscribers ($300/month).\n\nWhy it matters: The /goal command pushes Grok Build from an interactive coding assistant toward a more autonomous software engineering agent capable of handling multi-step projects without continuous human guidance, competing directly with OpenAI\u0027s Codex and Anthropic\u0027s Claude Code in the agentic coding space.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["grok", "coding", "agents", "developer-tools", "agentic-ai"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-24-sherloc-code-repair-36pct-token-reduction",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-sherloc-code-repair-36pct-token-reduction/",
      "title": "SHERLOC: Structured Diagnostic Localization Cuts Code Repair Token Usage by 36.7%",
      "content_text": "SHERLOC (arXiv 2606.24820, June 23) is a training-free framework addressing fault localization in repository-level code repair. It pairs a reasoning LLM with compact repository tools and a self-recovery mechanism to produce structured diagnostic outputs. Achieves 84.33% accuracy@1 on SWE-Bench Lite while reducing total token usage by 36.7%, and improves downstream repair agent resolve rate by 5.95 percentage points.\n\nWhy it matters: Token efficiency is a practical ceiling on agentic coding tasks. By halving the localization cost without any fine-tuning, SHERLOC makes capable code repair agents substantially cheaper and easier to integrate into existing pipelines.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["coding-agent", "software-engineering", "efficiency", "swe-bench"]
    },
    
    {
      "id": "2026-06-24-sakana-fugu-multi-llm-orchestrator-sota-swe-bench",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-sakana-fugu-multi-llm-orchestrator-sota-swe-bench/",
      "title": "Sakana AI Releases Fugu: Multi-LLM Orchestrator Achieving SoTA on SWE-Bench Pro",
      "content_text": "Sakana AI published the Fugu Technical Report (arXiv 2606.21228, revised June 23, 2026). Fugu is a family of orchestrator models trained to coordinate an adaptive team of specialized LLMs, dynamically devising agent scaffolds tailored to each query via fine-tuning, evolutionary algorithms, and RL. Two variants: Fugu (performance/latency balance) and Fugu-Ultra (maximum quality). Achieves state-of-the-art results on SWE-Bench Pro, Terminal Bench, LiveCodeBench, and GPQA-Diamond among publicly accessible models.\n\nWhy it matters: Fugu directly addresses vendor lock-in and frontier LLM fragmentation by learning to compose specialist models rather than relying on a single provider. Achieving SoTA on hard benchmarks like GPQA-Diamond and SWE-Bench Pro without a monolithic model is a meaningful architectural result.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["multi-agent", "coding-agent", "reinforcement-learning", "software-engineering"],
      "authors": [{"name": "Sakana AI"}]
    },
    
    {
      "id": "2026-06-24-qwen-agentworld-language-world-models-for-agents",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-qwen-agentworld-language-world-models-for-agents/",
      "title": "Qwen-AgentWorld: Language World Models for General Agents across Seven Environments",
      "content_text": "Alibaba\u0027s Qwen team published Qwen-AgentWorld (arXiv 2606.24597, June 23), introducing language world models \u2014 35B-A3B and 397B-A17B MoE variants \u2014 that simulate seven agentic environments: MCP, Search, Terminal, Software Engineering, Android, Web, and OS. Trained on over 10 million real environment interaction trajectories. Also introduces AgentWorldBench covering all seven domains. The models can serve as scalable RL training simulators or as warm-up training for downstream agent tasks.\n\nWhy it matters: The first language world model operating at this breadth of agentic environments \u2014 providing a unified simulator for RL training across seven domains rather than requiring seven separate real-world environments \u2014 could meaningfully reduce the cost and friction of training capable agents. Top-voted paper on HF Daily Papers for June 24 (36 upvotes).",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["agents", "world-models", "reinforcement-learning", "agentic-ai", "qwen"],
      "authors": [{"name": "Alibaba/Qwen"}]
    },
    
    {
      "id": "2026-06-24-prime-intellect-prime-rl-0-6-0-agentic-rl-moe",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-prime-intellect-prime-rl-0-6-0-agentic-rl-moe/",
      "title": "Prime Intellect Releases prime-rl v0.6.0 for Agentic RL on Trillion-Parameter MoE Models",
      "content_text": "Prime Intellect released prime-rl v0.6.0 (June 22\u201323, 2026), an open-source framework for asynchronous reinforcement learning on trillion-parameter MoE models targeting long-horizon agentic tasks like software engineering. The framework decouples trainer and inference into independent async processes. A GLM-5 demonstration ran SWE tasks at 131K sequence length with sub-5-minute step times and 256 rollout batch size on only 28 H200 nodes. Router replay cuts KL mismatch between trainer and inference by roughly 10x.\n\nWhy it matters: Previously, scaling agentic RL to trillion-parameter scale required cluster sizes beyond most research budgets. prime-rl 0.6.0 demonstrates it is feasible with 28 H200 nodes \u2014 accessible to mid-sized labs \u2014 and the open-source release lets other organizations replicate this capability.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["reinforcement-learning", "moe", "infrastructure", "open-source", "training"],
      "authors": [{"name": "Prime Intellect"}]
    },
    
    {
      "id": "2026-06-24-openai-daybreak-gpt-5-5-cyber-full-release",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-openai-daybreak-gpt-5-5-cyber-full-release/",
      "title": "OpenAI Expands Daybreak with Full GPT-5.5-Cyber Release, Codex Security Plugin, and Patch the Planet",
      "content_text": "On June 22, 2026, OpenAI expanded its Daybreak cybersecurity platform with the full release of GPT-5.5-Cyber (scoring 85.6% on CyberGym \u2014 the highest single-model result to date), a Codex Security plugin for finding and patching vulnerabilities within developer workflows, and \u0027Patch the Planet\u0027 \u2014 an open-source initiative co-founded with Trail of Bits. Access to GPT-5.5-Cyber remains restricted to verified defenders. The Cyber Partner Program now includes over 20 vendors including Cisco, CrowdStrike, Palo Alto Networks, and Cloudflare; over 30 open-source projects including cURL, Go, and Python have committed to Patch the Planet.\n\nWhy it matters: Daybreak\u0027s expansion marks OpenAI\u0027s most concrete push into enterprise cybersecurity infrastructure: combining a specialized fine-tuned model, developer tooling, and a coordinated open-source patching program positions AI as a systematic defense layer rather than a point tool.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["cybersecurity", "openai", "codex", "open-source", "enterprise"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-24-modal-auto-endpoints-production-llm-inference",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-modal-auto-endpoints-production-llm-inference/",
      "title": "Modal Launches Auto Endpoints for Production-Grade Open-Model LLM Inference",
      "content_text": "Modal published Auto Endpoints on June 23, 2026. The product deploys optimized, OpenAI API-compatible LLM inference endpoints with a single command, selecting GPU type, region, and inference engine flags automatically, while keeping the full serving code visible and editable. It includes speculative decoding with custom drafter models. The backing Modal App is fully inspectable and forkable.\n\nWhy it matters: Occupies the middle ground between opaque managed APIs and DIY self-hosting: production-optimized defaults with full ownership of the configuration, practical for teams needing compliance or custom latency/cost tradeoffs.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["inference", "serving", "open-source", "developer-tools", "cloud"],
      "authors": [{"name": "Modal"}]
    },
    
    {
      "id": "2026-06-24-mistral-ocr-4-bounding-boxes-block-classification",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-mistral-ocr-4-bounding-boxes-block-classification/",
      "title": "Mistral Releases OCR 4 with Bounding Boxes, Block Classification, and 170-Language Support",
      "content_text": "Mistral published OCR 4 on June 23, 2026. New capabilities include per-word bounding boxes, typed block classification (titles, tables, equations, signatures), and per-word confidence scores \u2014 enabling source-grounded citations and spatial indexing. The model supports 170 languages across 10 language groups, handles PDF, DOC, PPT, and OpenDocument formats, and runs self-hosted in a single container. On OlmOCRBench it scores 85.20 (top overall) and 93.07 on OmniDocBench. Pricing: $4/1,000 pages via API, $2 with Batch API.\n\nWhy it matters: Bounding boxes and confidence scores are the most-requested capabilities for document AI pipelines, enabling in-context highlighting, form extraction, and spatial reasoning that pure text extraction cannot support. Self-hosting support removes data-egress concerns for regulated industries.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["document-understanding", "multimodal", "enterprise", "rag", "inference"],
      "authors": [{"name": "Mistral"}]
    },
    
    {
      "id": "2026-06-24-krea-2-raw-turbo-open-weights-12b-dit-image-model",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-krea-2-raw-turbo-open-weights-12b-dit-image-model/",
      "title": "Krea Releases Krea 2 Raw and Turbo Open Weights: 12B DiT Image Model Generating in 2 Seconds",
      "content_text": "Krea released open weights for Krea 2 on June 22, 2026 via Hugging Face under a custom community license (commercial use requires enterprise agreement for organizations with 50+ seats). Two variants: Krea 2 Raw (pre-RLHF base checkpoint from mid-training) and Krea 2 Turbo (distilled, post-trained). The 12B Diffusion Transformer generates images in approximately 2 seconds with Turbo. Krea reports 30 million users across 191 countries.\n\nWhy it matters: Krea 2 Turbo\u0027s 2-second generation speed at 12B parameters is among the fastest open-weight text-to-image models available. Releasing the Raw pre-RLHF checkpoint gives researchers access to an undistilled mid-training snapshot for fine-tuning and alignment research.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["image-generation", "open-weights", "diffusion", "text-to-image"],
      "authors": [{"name": "Krea"}]
    },
    
    {
      "id": "2026-06-24-google-deepmind-a24-75m-ai-research-partnership",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-google-deepmind-a24-75m-ai-research-partnership/",
      "title": "Google DeepMind and A24 Announce $75M AI Research Partnership for Filmmaking",
      "content_text": "Google DeepMind invested $75 million into film studio A24 and announced a multi-year, non-exclusive research and development partnership on June 22, 2026. DeepMind researchers will work alongside A24 filmmakers on active productions to develop AI-powered workflows, with Veo as the central technology. This is Google\u0027s first-ever equity stake in a film studio.\n\nWhy it matters: This is the most direct integration of a frontier AI lab into Hollywood production to date, giving DeepMind real-world feedback loops from working filmmakers and positioning Veo as the preferred AI video tool for prestige cinema. Follows Netflix and Amazon MGM\u0027s AI investments, signaling industry-wide consolidation of AI into the studio pipeline.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["deepmind", "partnership", "hollywood", "text-to-video", "funding"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-24-github-copilot-cli-terminal-interface-ga",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-github-copilot-cli-terminal-interface-ga/",
      "title": "GitHub Copilot CLI Redesigned Terminal Interface Reaches General Availability",
      "content_text": "The redesigned GitHub Copilot CLI terminal interface, previewed at Microsoft Build 2026, is now generally available. It introduces a tabbed layout (Session, Gists, Issues, Pull Requests) for navigating GitHub directly from the terminal, guided in-session tool configuration via `/mcp add`, `/skills`, and `/plugin` commands instead of manual file editing, and theme-aware accessible colors with screen reader support.\n\nWhy it matters: Moves coding-agent-driven GitHub workflows entirely into the terminal, collapsing the context-switch between writing code and managing issues or PRs. The guided `/mcp add` flow lowers the barrier to extending Copilot CLI with custom MCP servers.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["github-copilot", "cli", "developer-tools", "mcp", "ga"],
      "authors": [{"name": "GitHub"}]
    },
    
    {
      "id": "2026-06-24-cursor-3-9-unified-customize-page-plugins-mcps",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-cursor-3-9-unified-customize-page-plugins-mcps/",
      "title": "Cursor 3.9 Launches Unified Customize Page for Plugins, Skills, MCPs, and Subagents",
      "content_text": "Cursor 3.9 (June 22) consolidates plugins, skills, MCPs, subagents, rules, commands, and hooks into a single Customize page manageable at user, team, or workspace scope. A marketplace leaderboard surfaces the most popular extensions across a team with one-click installation. Plugins now support prebuilt canvases (e.g., Hex Canvas for data visualizations, Atlassian Canvas for live issue tracking). Team marketplaces expanded to import plugin repos from GitLab, BitBucket, and Azure DevOps.\n\nWhy it matters: Cursor is converging on a full plugin ecosystem with team-level governance, shifting from a personal IDE toward a managed, shareable developer platform. Prebuilt canvases make plugins first-class interactive surfaces rather than just automation hooks.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["cursor", "ide", "plugins", "mcp", "coding-agent"],
      "authors": [{"name": "Cursor"}]
    },
    
    {
      "id": "2026-06-24-claude-code-v2-1-187-sandbox-credential-isolation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-claude-code-v2-1-187-sandbox-credential-isolation/",
      "title": "Claude Code v2.1.187: Sandbox Credential Isolation and Remote MCP Hang Fix",
      "content_text": "Claude Code v2.1.187 (June 23) adds a `sandbox.credentials` setting that blocks sandboxed commands from reading credential files and secret env vars, adds org-configured model restrictions to the model picker, and fixes remote MCP tool calls that previously hung for up to 5 minutes before aborting.\n\nWhy it matters: The credential isolation setting closes a real attack surface where sandboxed subprocesses could exfiltrate secrets; the MCP hang fix removes a reliability blocker for teams running agent workflows with external tool servers.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["claude-code", "mcp", "security", "coding-agent", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-24-bytedance-seedance-2-5-native-4k-30-second-video",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-bytedance-seedance-2-5-native-4k-30-second-video/",
      "title": "ByteDance Previews Seedance 2.5: Native 4K, 30-Second Video with 50 Reference Inputs",
      "content_text": "Also at the June 23 Volcano Engine FORCE conference, ByteDance previewed Seedance 2.5, its next-generation video model. The model generates native 30-second single-clip video at 4K resolution with 10-bit color depth, and accepts up to 50 multimodal reference inputs simultaneously \u2014 images, audio, 3D models, style references \u2014 compared to 12 in the previous version. Post-generation local editing preserves visual style. The model is in global enterprise beta; public launch is targeted for early July 2026.\n\nWhy it matters: Extending single-pass video generation to 30 seconds at 4K clears a key production barrier that most current models cannot meet without stitching artifacts. The 50-reference multimodal input capacity targets professional film and advertising pipelines, directly challenging Runway and Kling at the high end.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["text-to-video", "image-to-video", "china", "preview"],
      "authors": [{"name": "ByteDance"}]
    },
    
    {
      "id": "2026-06-24-bytedance-doubao-seed-2-1-pro-volcano-engine-force",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-bytedance-doubao-seed-2-1-pro-volcano-engine-force/",
      "title": "ByteDance Launches Doubao-Seed-2.1-Pro at Volcano Engine FORCE Conference",
      "content_text": "ByteDance unveiled Doubao-Seed-2.1-Pro on June 23 at the Volcano Engine FORCE conference in Beijing \u2014 a production-level frontier LLM for coding, long-horizon agentic tasks, and multimodal understanding. Also released: Doubao-Seed-2.1-Turbo at half the price (6 yuan per million input / 30 yuan per million output tokens for Pro). ByteDance claims parity with GPT-5.5 on coding and agent benchmarks, topping OSWorld, MobileWorld, and MMMU-Pro. The Doubao family now exceeds 180 trillion daily token calls \u2014 up 10x year-over-year.\n\nWhy it matters: ByteDance is directly competing with frontier closed-source models at Chinese market pricing, using its Doubao consumer product as both a distribution channel and an internal evaluation harness. Reaching 180 trillion daily tokens signals that Seed models are running at hyperscale production, not just research scale.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["doubao", "seed", "coding", "agents", "multimodal", "china"],
      "authors": [{"name": "ByteDance"}]
    },
    
    {
      "id": "2026-06-24-anthropic-claude-tag-slack-persistent-ai-teammate",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-anthropic-claude-tag-slack-persistent-ai-teammate/",
      "title": "Anthropic Launches Claude Tag: A Persistent AI Teammate for Slack",
      "content_text": "Anthropic launched Claude Tag in beta on June 23, 2026, for Claude Enterprise and Team customers. It adds Claude as a persistent, multiplayer Slack team member that users can @-mention to delegate tasks. Claude learns from channel history over time, can work asynchronously, and \u2014 when ambient mode is enabled \u2014 proactively flags relevant information without being prompted. The feature runs on Claude Opus 4.8 and replaces the existing Claude for Slack app. Anthropic reports that an internal version already generates 65% of its product team\u0027s code.\n\nWhy it matters: Claude Tag is Anthropic\u0027s most direct move into the enterprise collaboration software market, turning Claude from a chatbot into an always-on autonomous agent embedded in the workflow layer where teams actually operate. The multiplayer design \u2014 one shared Claude per Slack channel \u2014 is a new interaction paradigm that enables collective delegation rather than individual prompting.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["claude-code", "enterprise", "agents", "agentic-ai", "anthropic"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-24-alice-ai-agentic-booking-restaurants-beauty-salons",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-24-alice-ai-agentic-booking-restaurants-beauty-salons/",
      "title": "Yandex Alice AI Gains Agentic Booking for Restaurants and Beauty Salons Across Russia",
      "content_text": "Yandex launched an AI agent booking capability inside Alice AI chat on June 23, 2026. Users can now book restaurant tables and beauty salon appointments via natural-language conversation, covering over 30,000 restaurants and 40,000 service businesses nationwide. For venues connected to Yandex Eats, bookings confirm automatically; for others, Alice fills out reservation forms on the venue\u0027s website. Available in alice.yandex.ru, the Alice AI app, Yandex Browser, and the main Yandex app.\n\nWhy it matters: Concrete move from AI assistant to transactional AI agent: Alice now completes real-world actions (booking, form submission) rather than just providing recommendations, expanding practical utility for tens of millions of Russian users.",
      "date_published": "2026-06-24T00:00:00Z",
      "tags": ["alice", "agents", "russia", "agentic-ai"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-23-zhipu-ai-market-cap-crosses-hk1-trillion-on-glm-52",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-23-zhipu-ai-market-cap-crosses-hk1-trillion-on-glm-52/",
      "title": "Zhipu AI Market Cap Crosses HK$1 Trillion on GLM-5.2 Momentum",
      "content_text": "Zhipu AI\u0027s shares surged up to 42% intraday on June 22, 2026, pushing the Hong Kong-listed company\u0027s market capitalisation past HK$1 trillion (approximately US$128 billion) for the first time. The rally was driven by continued investor enthusiasm for GLM-5.2 \u2014 the company\u0027s 753B-parameter, MIT-licensed open-weight model \u2014 and a JPMorgan upgrade raising Zhipu\u0027s 2026\u20132030 revenue forecast by 7\u201316%. GLM-5.2 ranked second globally on the Code Arena front-end benchmark, behind only Anthropic\u0027s Claude Fable 5.\n\nWhy it matters: Zhipu AI becoming China\u0027s first open-source AI lab to cross a HK$1 trillion valuation signals that open-weight frontier models from Chinese labs now command Western-frontier-tier market credibility.",
      "date_published": "2026-06-23T00:00:00Z",
      "tags": ["zai-org", "glm", "open-weights", "china", "market-cap", "stock"],
      "authors": [{"name": "Zhipu AI"}]
    },
    
    {
      "id": "2026-06-23-world-action-models-a-survey",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-23-world-action-models-a-survey/",
      "title": "World Action Models: A Survey",
      "content_text": "A comprehensive survey of World Action Models (WAMs) \u2014 embodied predictive-action models that forecast future states to inform robot control. The authors organize 109 methods across three design philosophies (Render-and-Decode, Latent-Only, Video-Generation-Free) and four architectural axes, concluding that the field is converging on generating less of the future while preserving what control requires.\n\nWhy it matters: 217 upvotes on HuggingFace Daily Papers (top paper of June 23); provides the first rigorous taxonomy distinguishing true WAMs from video generators as compute-action trade-offs become central to embodied AI design.",
      "date_published": "2026-06-23T00:00:00Z",
      "tags": ["embodied-ai", "world-models", "survey", "robotics", "multimodal", "vla"],
      "authors": [{"name": "National University of Singapore"}]
    },
    
    {
      "id": "2026-06-23-claude-fable-5-exits-subscription-plans-moves-to-u",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-23-claude-fable-5-exits-subscription-plans-moves-to-u/",
      "title": "Claude Fable 5 Exits Subscription Plans, Moves to Usage Credits",
      "content_text": "Starting June 23, 2026, Claude Fable 5 is removed from Pro, Max, Team, and seat-based Enterprise plan allowances; continued access requires usage credits billed at $10/M input and $50/M output tokens \u2014 double the cost of Opus 4.8. Anthropic attributed the change to capacity constraints and stated the model may return to subscription plans once capacity improves.\n\nWhy it matters: Fable 5 is Anthropic\u0027s top-ranked coding model (leading on SWE-bench and FrontierCode), so the pricing transition directly impacts developers and teams relying on it for agentic coding pipelines.",
      "date_published": "2026-06-23T00:00:00Z",
      "tags": ["claude-fable-5", "anthropic", "pricing", "billing", "api", "subscription"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-23-claude-code-v2-1-186-mcp-cli-auth-bash-auto-respon",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-23-claude-code-v2-1-186-mcp-cli-auth-bash-auto-respon/",
      "title": "Claude Code v2.1.186: MCP CLI Auth, Bash Auto-Response, Workflow Filtering",
      "content_text": "Claude Code v2.1.186 (released June 22) adds `claude mcp login \u003cname\u003e` and `claude mcp logout \u003cname\u003e` CLI commands for authenticating MCP servers without the interactive menu, makes `!` bash commands trigger automatic Claude responses (configurable via `respondToBashCommands: false`), adds status filtering to the `/workflows` agent detail view and a Skills section to `/plugin`, and fixes streaming failures after machine sleep plus numerous subagent and session-management bugs.\n\nWhy it matters: The new MCP CLI auth flow and bash auto-response significantly smooth headless and SSH workflows, while the workflow/plugin UX additions reflect Claude Code\u0027s growing role as an orchestrator of multi-agent, multi-tool pipelines.",
      "date_published": "2026-06-23T00:00:00Z",
      "tags": ["claude-code", "mcp", "cli", "coding-agent", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-23-agentic-transformers-provably-learn-to-search-via",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-23-agentic-transformers-provably-learn-to-search-via/",
      "title": "Agentic Transformers Provably Learn to Search via Reinforcement Learning",
      "content_text": "A theoretical study showing that transformer-based agents trained via policy gradient on a stochastic k-ary tree environment provably develop a depth-first search mechanism, with one attention head tracking prior actions and another detecting failures and triggering backtracking. Policies trained on shallow trees generalize to deeper ones without additional training.\n\nWhy it matters: Provides rare provable guarantees on emergent agentic search behaviors in transformers trained with RL, explaining mechanistically why curriculum-trained agents can generalize beyond their training distribution.",
      "date_published": "2026-06-23T00:00:00Z",
      "tags": ["reasoning", "rl", "transformers", "theory", "agents", "search"]
    },
    
    {
      "id": "2026-06-22-yandex-adds-30-ai-characters-to-alice-ai-chat",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-22-yandex-adds-30-ai-characters-to-alice-ai-chat/",
      "title": "Yandex Adds 30 AI Characters with Distinct Personalities to Alice AI Chat",
      "content_text": "Yandex launched over 30 AI-powered characters with distinct personalities inside the Alice AI chat interface, ranging from bloggers to anime characters, each designed for specific use cases such as emotional support, self-development, or entertainment. Users can also create custom characters by specifying a name and behavior description; characters retain conversation history across sessions and are available on alice.yandex.ru, iOS/Android apps, and Yandex Browser.\n\nWhy it matters: Signals Yandex pushing Alice AI into the companion/social AI segment alongside its assistant functions, competing with character-based AI platforms globally",
      "date_published": "2026-06-22T00:00:00Z",
      "tags": ["alice", "russia", "chatbot", "consumer-ai"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-22-s-agent-spatial-tool-use-elicits-reasoning-for-spa",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-22-s-agent-spatial-tool-use-elicits-reasoning-for-spa/",
      "title": "S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence in VLMs",
      "content_text": "S-Agent reframes spatial reasoning in vision-language models as an agentic process: a VLM planner dispatches spatial tools to accumulate evidence across 2D-to-3D projections and time, maintaining scene and agent memory across frames. The approach is training-free for existing models, and a fine-tuned S-Agent-8B matches closed-source models on spatial benchmarks.\n\nWhy it matters: Shows that tool-augmented agency can substitute for brute-force scale in spatial intelligence, with an 8B model matching frontier closed-source systems",
      "date_published": "2026-06-22T00:00:00Z",
      "tags": ["agents", "multimodal", "reasoning"],
      "authors": [{"name": "Nanyang Technological University"}]
    },
    
    {
      "id": "2026-06-22-llamacpp-b9754-real-time-model-load-progress-via-s",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-22-llamacpp-b9754-real-time-model-load-progress-via-s/",
      "title": "llama.cpp b9754: Real-Time Model Load Progress via SSE and PEG Grammar Parser",
      "content_text": "llama.cpp shipped ~12 tagged builds on June 21, 2026 (b9743\u2013b9754). Key additions: b9747 adds real-time model load progress tracking via /models/sse (Server-Sent Events); b9750 implements the Jinja call statement for template generation; b9754 adds an automaton-based PEG parser for stricter grammar-constrained generation. All builds ship cross-platform binaries for macOS, Linux, Windows, and Android.\n\nWhy it matters: Real-time SSE progress streaming reduces opaque startup latency for local inference frontends; PEG grammar parser improves structured output reliability",
      "date_published": "2026-06-22T00:00:00Z",
      "tags": ["inference", "llama-cpp", "open-source", "streaming", "local-inference"]
    },
    
    {
      "id": "2026-06-22-gatemem-benchmarking-memory-governance-in-multi-pr",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-22-gatemem-benchmarking-memory-governance-in-multi-pr/",
      "title": "GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents",
      "content_text": "GateMem is a benchmark evaluating LLM agents deployed in multi-user institutional settings (hospitals, offices, schools) on three competing goals: utility for legitimate requests, role-based access control, and reliable data deletion. Testing across all current methods reveals none simultaneously achieve all three properties, exposing a critical gap before real institutional deployment.\n\nWhy it matters: First systematic benchmark for memory governance in shared-agent deployments; directly relevant to enterprise safety and compliance as agentic systems enter regulated environments",
      "date_published": "2026-06-22T00:00:00Z",
      "tags": ["agents", "alignment", "safety", "benchmark"]
    },
    
    {
      "id": "2026-06-21-runway-launches-studio-integrated-video-editing-su",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-runway-launches-studio-integrated-video-editing-su/",
      "title": "Runway Launches Studio: Integrated AI Video Editing Suite",
      "content_text": "On June 18, 2026, Runway shipped Studio, a unified interface allowing users to trim, stitch, reorder, and export final videos without leaving the platform. The feature closes the loop between AI generation and post-production editing in one workspace.\n\nWhy it matters: Runway is moving from a generation-only tool to a full end-to-end video production platform, reducing the need for separate editing software and making AI-generated video more practically usable for final delivery.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["runway", "video-editing", "text-to-video"],
      "authors": [{"name": "Runway"}]
    },
    
    {
      "id": "2026-06-21-playful-agentic-robot-learning",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-playful-agentic-robot-learning/",
      "title": "Playful Agentic Robot Learning: Self-Directed Play Yields Transferable Robot Skills",
      "content_text": "Robotics Agent Teams (RATs) acquire skills through self-directed play before any downstream task is specified. During play, the agent generates novel exploratory tasks, writes and executes robot-code policies, diagnoses failures, retries with step-level feedback, and distills successes into a reusable code library. Play-learned skills improved held-out downstream performance by 20.6 and 17.0 percentage points over baselines on LIBERO-PRO and MolmoSpaces, and transferred to other Code-as-Policy agents without fine-tuning.\n\nWhy it matters: Demonstrates that unstructured pre-task play with code-based policies yields skills that generalize to unseen tasks and third-party agents \u2014 a step toward robots that self-improve before deployment. Received 42 upvotes on HuggingFace Daily Papers.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["robotics", "agents", "agentic", "reinforcement-learning"],
      "authors": [{"name": "UC Berkeley"}]
    },
    
    {
      "id": "2026-06-21-opencode-v1-17-9-released-with-glm-52-support-and",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-opencode-v1-17-9-released-with-glm-52-support-and/",
      "title": "OpenCode v1.17.9 Released with GLM-5.2 Support and MCP Fixes",
      "content_text": "OpenCode v1.17.9, released on June 21, 2026, adds high and max thinking variants for GLM-5.2 models, fixes Devstral model detection with varying provider ID casing, passes custom headers to Copilot model requests, and fixes OpenAI-compatible providers rejecting MCP tool schemas. Cloudflare AI Gateway API key passing and session timeline flicker are also fixed, and agent step limits now force a final text response rather than failing mid-run.\n\nWhy it matters: GLM-5.2 thinking-mode support ships same day as the model\u0027s ongoing adoption wave; the MCP schema fix unblocks a class of providers that were silently broken.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["opencode", "coding-agent", "mcp", "glm"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-21-openai-codex-adds-record-and-replay-for-reusable-w",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-openai-codex-adds-record-and-replay-for-reusable-w/",
      "title": "OpenAI Codex Adds Record and Replay for Reusable Workflow Skills",
      "content_text": "OpenAI shipped Record \u0026 Replay for Codex on June 18, 2026 (app version 26.616), allowing users to demonstrate a repetitive workflow once on macOS and have Codex convert it into a reusable SKILL.md file that accepts variable inputs. Unlike traditional RPA, the feature captures intent rather than pixel-exact coordinates, making it resilient to UI changes. Available to ChatGPT Plus, Pro, Business, Enterprise, and Edu subscribers outside the EU, UK, and Switzerland.\n\nWhy it matters: Workflow recording lowers the barrier to AI automation: non-engineers can teach Codex tasks without writing prompts or scripts, extending agentic capabilities to a much broader user base.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["codex", "computer-use", "agents", "automation"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-21-moebius-0-2b-lightweight-image-inpainting-framewor",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-moebius-0-2b-lightweight-image-inpainting-framewor/",
      "title": "Moebius: 0.2B Lightweight Image Inpainting Framework Matches 11.9B FLUX Model",
      "content_text": "Moebius introduces a 0.22B parameter image inpainting model that matches or surpasses FLUX.1-Fill-Dev (11.9B parameters) through a Local-\u03bb Mix Interaction block that summarizes spatial context and global semantic priors into fixed-size linear matrices. Adaptive multi-granularity latent-space distillation delivers a 15\u00d7 inference speedup.\n\nWhy it matters: Top-voted paper on HuggingFace Daily Papers with over 100 upvotes. Demonstrates that extreme parameter efficiency (under 2% of a baseline model\u0027s size) is achievable for a demanding generative task without quality loss.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["efficiency", "distillation", "diffusion", "text-to-image"],
      "authors": [{"name": "Huazhong University of Science and Technology"}]
    },
    
    {
      "id": "2026-06-21-mistral-rebrands-le-chat-to-vibe-unified-work-and",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-mistral-rebrands-le-chat-to-vibe-unified-work-and/",
      "title": "Mistral Rebrands Le Chat to Vibe: Unified Work and Code AI Agent",
      "content_text": "Mistral rebranded its Le Chat product to Vibe in June 2026, unifying work and coding capabilities under a single agent and a single license. Vibe includes Work Mode (a long-range task agent that picks its own tools and streams progress) and Code Mode (for remote coding and pull request creation), a new VS Code extension, and CLI updates for project-wide automation. All existing Le Chat conversations, settings, and plans carry over automatically.\n\nWhy it matters: The rebrand signals Mistral\u0027s strategic pivot from a chat assistant to a unified agentic platform competing directly with Cursor, Codex, and Claude Code.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["coding-agent", "agents", "enterprise", "pivot", "cli", "vs-code"],
      "authors": [{"name": "Mistral"}]
    },
    
    {
      "id": "2026-06-21-how-transparent-is-diffusiongemma",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-how-transparent-is-diffusiongemma/",
      "title": "How Transparent is DiffusionGemma? Interpretability Study Closes the Gap to Autoregressive Models",
      "content_text": "This paper investigates whether DiffusionGemma \u2014 a masked discrete-diffusion LM that reasons in continuous latent space \u2014 is harder to interpret than autoregressive models. By mapping intermediate denoising states through an interpretable token bottleneck, the authors reduce the apparent transparency gap from 28.6\u00d7 to just 1.1\u00d7 relative to Gemma 4, and identify diffusion-specific phenomena such as non-chronological reasoning and token smearing. Co-authored by Neel Nanda and Rohin Shah.\n\nWhy it matters: First systematic mech-interp study of a production-scale diffusion language model, with direct implications for AI safety monitoring as diffusion LMs gain adoption.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["interpretability", "mech-interp", "safety", "monitorability", "diffusion-gemma"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-21-fapo-fully-autonomous-prompt-optimization-of-multi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-fapo-fully-autonomous-prompt-optimization-of-multi/",
      "title": "FAPO: Fully Autonomous Prompt Optimization of Multi-Step LLM Pipelines",
      "content_text": "FAPO evaluates multi-step LLM pipeline outputs, attributes failures to the specific step that caused them, proposes targeted prompt variants, validates them with an independent agent, and iterates until accuracy improves or budget is exhausted. It outperformed GEPA (state-of-the-art optimizer) in 15 of 18 model-benchmark pairs, with mean gains of +14.1 percentage points and +33.8 on tasks requiring structural prompt changes. Open-sourced under Apache 2.0.\n\nWhy it matters: Step-level failure attribution is qualitatively different from treating the pipeline as a black box \u2014 it enables targeted optimization that pipeline-blind methods cannot achieve.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["agents", "automation", "agentic", "evaluation"],
      "authors": [{"name": "Cisco Foundation AI"}]
    },
    
    {
      "id": "2026-06-21-elevenlabs-music-v2-api-goes-live-with-genre-swit",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-elevenlabs-music-v2-api-goes-live-with-genre-swit/",
      "title": "ElevenLabs Music v2 API Goes Live with Genre-Switching and Inpainting",
      "content_text": "ElevenLabs opened its Music v2 model via the public API in mid-June 2026. The model supports section-by-section song construction, mid-track genre switching (e.g., opera to heavy metal in one piece), and inpainting of individual song segments. API pricing dropped up to 50% versus Music v1. Commercial licensing is included.\n\nWhy it matters: Music v2\u0027s chunk-based composition API and commercial licensing make it the first developer-accessible music generation model with structured song-building primitives, directly competing with Suno\u0027s v5.5 on both quality and integration flexibility.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["elevenlabs", "music-generation", "audio", "api"],
      "authors": [{"name": "ElevenLabs"}]
    },
    
    {
      "id": "2026-06-21-deepseek-closes-74-billion-series-a-at-55-billion",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-deepseek-closes-74-billion-series-a-at-55-billion/",
      "title": "DeepSeek Closes $7.4 Billion Series A at $55 Billion Valuation, Led by Tencent and CATL",
      "content_text": "DeepSeek closed its first-ever external funding round on June 16, 2026, raising ~51 billion yuan ($7.4B) at a post-money valuation of roughly $55 billion. Tencent ($1.5B) and CATL ($740M) led external investors, while founder Liang Wenfeng personally committed $3B. The deal carries an unusual governance structure: commercial investors received no voting rights and a five-year lockup, while the state-backed National AI Industry Investment Fund received direct equity with exclusive voting rights.\n\nWhy it matters: The largest first-round financing in Chinese AI history. The governance structure \u2014 giving state investors sole voting control while locking out private capital \u2014 sets a new precedent for how Beijing exerts control over frontier AI, and draws immediate scrutiny from Western regulators and investors.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["funding", "deepseek", "china", "state-investment", "valuation"],
      "authors": [{"name": "DeepSeek"}]
    },
    
    {
      "id": "2026-06-21-claude-code-v2-1-185-improves-api-stream-stall-mes",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-claude-code-v2-1-185-improves-api-stream-stall-mes/",
      "title": "Claude Code v2.1.185 Improves API Stream-Stall Messaging",
      "content_text": "Version 2.1.185 (June 20, 2026) changes the stream-stall indicator from \"No response from API \u00b7 Retrying in \u2026\" to \"Waiting for API response \u00b7 will retry in \u2026\" and extends the threshold before the hint appears from 10 seconds to 20 seconds.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["claude-code", "coding-agent"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-21-chatgpt-adds-pronunciation-help-in-60-languages-an",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-21-chatgpt-adds-pronunciation-help-in-60-languages-an/",
      "title": "ChatGPT Adds Pronunciation Help in 60+ Languages and World Cup Hub",
      "content_text": "OpenAI rolled out several ChatGPT improvements on June 18\u201319, 2026: audio and text pronunciation guidance for words in over 60 languages, a dedicated FIFA World Cup 2026 conversational experience covering schedules, predictions, and player storylines, more granular connected-app permission controls, improved chat organization with sidebar pinning and one-click sharing, faster iOS photo uploads, and per-message model selection on Android for paid users.\n\nWhy it matters: Pronunciation in 60+ languages broadens ChatGPT\u0027s utility for language learners globally; the World Cup hub signals OpenAI\u0027s push into real-time sports and live-event intelligence.",
      "date_published": "2026-06-21T00:00:00Z",
      "tags": ["chatgpt", "multilingual", "mobile", "openai", "translation"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-19-zhipu-ai-releases-glm-5-2-open-weights-753b-moe-wi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-zhipu-ai-releases-glm-5-2-open-weights-753b-moe-wi/",
      "title": "Zhipu AI Releases GLM-5.2 Open Weights: 753B MoE with 1M-Token Context under MIT License",
      "content_text": "Z.ai (formerly Zhipu AI) published full MIT-licensed weights for GLM-5.2 on HuggingFace on June 17, 2026. The model is a 753B-parameter mixture-of-experts architecture with a 1 million-token context window, optimized for long-horizon coding and agentic tasks. No regional restrictions apply. On Code Arena it ranks second globally among open models, trailing only closed-source leaders.\n\nWhy it matters: GLM-5.2 is the strongest open-weight model for long-horizon coding at time of release, matching several closed-source frontier models on coding benchmarks. MIT license with no regional restrictions is a rare combination for a large-scale Chinese-lab model.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["glm", "zai-org", "open-weights", "mit", "moe", "long-context", "coding", "china"],
      "authors": [{"name": "Zhipu AI / Z.ai"}]
    },
    
    {
      "id": "2026-06-19-xai-releases-grok-imagine-video-1-5-1-on-video-are",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-xai-releases-grok-imagine-video-1-5-1-on-video-are/",
      "title": "xAI Releases Grok Imagine Video 1.5: #1 on Video Arena Leaderboard at $4.20/min",
      "content_text": "xAI released Grok Imagine Video 1.5 as generally available on June 17, 2026, reaching #1 on the Image-to-Video Arena leaderboard with a +52 Elo jump. The model generates native synchronized audio, with a \u0027fast\u0027 mode producing 6-second 720p clips in ~25 seconds. Pricing is $4.20/min \u2014 86% cheaper than Sora 2\u0027s $30/min. Available on grok.com/imagine, iOS, Android, and via the Imagine API.\n\nWhy it matters: Grok Imagine Video 1.5 tops the benchmark leaderboard at a fraction of competitor prices, applying direct pressure on Sora 2 and other premium video generation services.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["xai", "grok", "video-generation", "image-to-video"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-19-stylisticbias-15-visual-attributes-account-for-80",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-stylisticbias-15-visual-attributes-account-for-80/",
      "title": "StylisticBias: 15 Visual Attributes Account for 80% of Social Bias in Multimodal LLMs",
      "content_text": "A controlled benchmark of ~25,000 photorealistic images \u2014 ~50 per-attribute variations per base face with identity held constant \u2014 shows that age and body type dominate identity-level bias in MLLMs, while fashion style drives the largest attribute-level shifts. Across six MLLMs and 25 social judgment scenarios, ~15 attributes account for ~80% of total bias variation. Accepted to ICML 2026 workshops.\n\nWhy it matters: Provides a Pareto account of MLLM social bias: practitioners can focus on a small high-leverage set of visual attributes rather than auditing all possible variables. The methodology of isolating attributes with identity constant is cleaner than prior holistic evaluations.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["multimodal", "bias", "benchmark", "paper"]
    },
    
    {
      "id": "2026-06-19-openai-publishes-deployment-simulation-predicting",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-openai-publishes-deployment-simulation-predicting/",
      "title": "OpenAI Publishes Deployment Simulation: Predicting Model Behavior Before Release",
      "content_text": "OpenAI released research on Deployment Simulation, a method that replays de-identified user conversations through a candidate model to predict how it will behave in production before release. Analyzing 1.3 million conversations across GPT-5 Thinking through GPT-5.4, the approach achieved a median multiplicative error of 1.5x on behavioral rate predictions and surface \u0027calculator hacking\u0027 \u2014 a novel misalignment \u2014 before it reached production.\n\nWhy it matters: A scalable pre-deployment safety approach that uses real production traffic to stress-test upcoming model versions, going beyond narrow hand-crafted evaluations.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["openai", "safety", "evaluation", "agents", "alignment"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-19-openai-gpt-5-5-instant-health-intelligence-matches",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-openai-gpt-5-5-instant-health-intelligence-matches/",
      "title": "OpenAI: GPT-5.5 Instant Health Intelligence Matches Frontier Models, Now Free",
      "content_text": "OpenAI published an update on June 18, 2026 showing GPT-5.5 Instant\u0027s health performance now matches frontier models on HealthBench Professional, with a 71% drop in factuality issues versus GPT-5.3 Instant. Physician evaluators rated model responses across 3,500 clinical scenarios covering accuracy and communication. The model is available to all free ChatGPT users.\n\nWhy it matters: Over 230 million weekly ChatGPT users gain access to frontier-grade health AI. The 71% factuality improvement matters most for the high-stakes medical domain.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["openai", "gpt-5-5", "chatgpt", "health"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-19-ollama-v0-30-10-cohere-command-a-and-north-models",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-ollama-v0-30-10-cohere-command-a-and-north-models/",
      "title": "Ollama v0.30.10: Cohere Command A and North Models on Apple Silicon via MLX",
      "content_text": "Ollama v0.30.10 enables Cohere\u0027s Command A and the North model family to run on Apple Silicon using the MLX engine, expanding which models benefit from MLX\u0027s memory-efficient acceleration. The release also updates the bundled llama.cpp engine to build b9672.\n\nWhy it matters: Brings more frontier-class models to local Mac inference without API calls for Apple Silicon users.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["ollama", "mlx", "apple-silicon", "local-ai", "release"],
      "authors": [{"name": "Ollama"}]
    },
    
    {
      "id": "2026-06-19-multimodal-evaluator-preference-collapse-cross-mod",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-multimodal-evaluator-preference-collapse-cross-mod/",
      "title": "Multimodal Evaluator Preference Collapse: Cross-Modal Contagion in Self-Evolving Agent Loops",
      "content_text": "Investigates how cross-modal evaluator bias propagates in self-evolving agent loops using LLMs as judges. The MM-EPC framework shows that when GPT-4o evaluates DeepSeek-chat across modalities, a single strategy can monopolize nearly half the reward signal \u2014 \u0027cross-modal contagion\u0027. Cross-model evaluation is the primary risk factor; self-evaluation shows near-complete immunity. Validated with ~35,000 API calls.\n\nWhy it matters: As self-improving agents proliferate, understanding how evaluator choice corrupts reward signals is critical. The finding that self-evaluation avoids contagion creates a concrete design trade-off for RLHF and agent-evolution pipelines.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["evaluation", "agents", "multimodal", "alignment", "paper"]
    },
    
    {
      "id": "2026-06-19-llama-cpp-b9716-builds-internvl-multimodal-batchin",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-llama-cpp-b9716-builds-internvl-multimodal-batchin/",
      "title": "llama.cpp b9716 Builds: InternVL Multimodal Batching, CUDA col2im, and Nginx SSE Fix",
      "content_text": "llama.cpp shipped over a dozen builds on June 18\u201319 (b9702\u2013b9716). Key additions: batching support for InternVL multimodal models in the mtmd pipeline, a CUDA col2im 1D operation, a streaming fix adding `X-Accel-Buffering: no` header to prevent Nginx from buffering SSE responses, and HTTP 400 errors for invalid grammar inputs instead of silent drops. Server schema and request validation were also added.\n\nWhy it matters: The Nginx SSE buffering fix is a widely encountered production issue for anyone serving llama.cpp behind a reverse proxy; the grammar validation change improves debuggability for structured-output use cases.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["inference", "local-ai", "multimodal", "open-source"]
    },
    
    {
      "id": "2026-06-19-kling-ai-launches-3-0-turbo-and-3-0-omni-fast-prev",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-kling-ai-launches-3-0-turbo-and-3-0-omni-fast-prev/",
      "title": "Kling AI Launches 3.0 Turbo and 3.0 Omni: Fast Previews and 4K Editing with Character Consistency",
      "content_text": "Kuaishou released two additions to the Kling 3.0 family on June 17, 2026. Kling 3.0 Turbo is a fast-preview mode generating 1\u201315 second clips at 480p/720p for rapid creative iteration before full-quality renders. Kling 3.0 Omni extends the editing pipeline to 3\u201315 second videos with 4K input/output, adds per-shot storyboard control, a \u0027Reference to Video\u0027 feature for locking in character and background consistency from multi-angle references, and motion/voice transfer from existing video clips.\n\nWhy it matters: Turbo addresses the high cost of testing creative ideas in AI video. Omni pushes Kling into high-fidelity long-form editing, directly competing with Runway Gen-4.5. Kling reports 100 million global registered users.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["video-generation", "image-to-video", "china"],
      "authors": [{"name": "Kuaishou"}]
    },
    
    {
      "id": "2026-06-19-google-deepmind-publishes-ai-control-roadmap-defen",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-google-deepmind-publishes-ai-control-roadmap-defen/",
      "title": "Google DeepMind Publishes AI Control Roadmap: Defense-in-Depth Against Misaligned Coding Agents",
      "content_text": "Google DeepMind released a detailed AI Control Roadmap describing how it secures internal systems against potentially misaligned AI coding agents. The framework treats misaligned AI as an insider threat and applies defense-in-depth combining cybersecurity safeguards with AI-specific monitoring. The team analyzed over one million coding agent trajectories to build live monitoring systems, finding that most flagged behaviors stem from agent misinterpretation rather than adversarial intent.\n\nWhy it matters: Documents a production-tested approach to AI control for agentic coding deployments, providing a concrete roadmap other organizations can adapt as they deploy coding agents internally.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["safety", "agents", "alignment", "coding-agent"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-19-github-copilot-june-18-changelog-mai-code-1-flash",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-github-copilot-june-18-changelog-mai-code-1-flash/",
      "title": "GitHub Copilot June 18 Changelog: MAI-Code-1-Flash Expands and AGENTS.md Lands in Code Review",
      "content_text": "GitHub\u0027s June 18, 2026 changelog includes: MAI-Code-1-Flash (Microsoft\u0027s 5B-parameter coding model) now available on Copilot CLI, GitHub Copilot app, and Copilot Chat beyond its Build 2026 debut surfaces. Code review gains support for repository-level AGENTS.md files, letting teams document agent conventions and have review tools respect them. Duplicate issue detection entered public preview. Copilot-authored PRs are now discoverable via `author:` search.\n\nWhy it matters: AGENTS.md support in code review establishes a repository-level convention for documenting agent behavior, likely to become a standard pattern across tools. MAI-Code-1-Flash expansion gives Copilot users a fast Microsoft-owned model across more surfaces.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["github-copilot", "coding-agent", "agents", "release"],
      "authors": [{"name": "GitHub"}]
    },
    
    {
      "id": "2026-06-19-enpire-ai-coding-agents-close-the-loop-on-physical",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-enpire-ai-coding-agents-close-the-loop-on-physical/",
      "title": "ENPIRE: AI Coding Agents Close the Loop on Physical Robotics Research Without Human Intervention",
      "content_text": "ENPIRE is a closed-loop framework where AI coding agents (Codex, Claude Code, Kimi Code) conduct the full robotics research cycle on physical hardware: resetting scenes, running trials, verifying outcomes, and rewriting policies until they succeed. Testing contact-rich tasks including GPU card insertion and zip-tie manipulation, the system achieved 99% pass@8 without human-in-the-loop intervention. New metrics MRU and MTU quantify physical autoresearch efficiency.\n\nWhy it matters: First documented system where frontier coding agents autonomously run the entire scientific loop \u2014 hypothesis, experiment, evaluation, iteration \u2014 on real robots rather than simulation, closing the gap between AI-generated code and physical validation.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["robotics", "agents", "coding-agent", "embodied-ai", "benchmark"],
      "authors": [{"name": "NVIDIA / Carnegie Mellon University / UC Berkeley"}]
    },
    
    {
      "id": "2026-06-19-claude-code-v2-1-183-auto-mode-safety-guards-for-d",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-claude-code-v2-1-183-auto-mode-safety-guards-for-d/",
      "title": "Claude Code v2.1.183: Auto Mode Safety Guards for Destructive Git and Infrastructure Commands",
      "content_text": "Claude Code v2.1.183 (June 19, 2026) adds guardrails to auto mode that block destructive git operations \u2014 `git reset --hard`, `git checkout -- .`, `git clean -fd`, `git stash drop` \u2014 when the user did not explicitly ask to discard local work. `git commit --amend` is blocked for commits not made by the agent this session, and infrastructure-destroy commands (`terraform destroy`, `pulumi destroy`, `cdk destroy`) are blocked unless a specific stack was named. New `attribution.sessionUrl` setting omits claude.ai session links from commits and PRs.\n\nWhy it matters: Prevents agentic sessions from silently destroying local work or cloud infrastructure, raising the safety floor for unattended runs.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "anthropic", "safety", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-19-aws-summit-new-york-2026-bedrock-agentcore-ga-kiro",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-aws-summit-new-york-2026-bedrock-agentcore-ga-kiro/",
      "title": "AWS Summit New York 2026: Bedrock AgentCore GA, Kiro iOS Preview, and AWS Context Previewed",
      "content_text": "At AWS Summit New York (June 17\u201318, 2026), Amazon announced Bedrock AgentCore general availability with managed knowledge bases, native data connectors, Smart Parsing for multi-format documents, and built-in web search. Kiro \u2014 AWS\u0027s spec-driven agentic IDE \u2014 gained a native iOS app in gated preview for monitoring and steering agent sessions. AWS Context was previewed as a knowledge-graph service for agentic search. Additional launches included the AWS DevOps Agent for autonomous release testing and EC2 G7 instances with NVIDIA Blackwell GPUs.\n\nWhy it matters: Bedrock AgentCore GA makes production agent orchestration accessible without writing custom loops. Kiro for iOS is an early signal of mobile-first agent oversight becoming a product category.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["aws", "agents", "inference", "coding-agent", "enterprise"],
      "authors": [{"name": "Amazon"}]
    },
    
    {
      "id": "2026-06-19-alibaba-launches-qwen-robot-suite-three-foundation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-19-alibaba-launches-qwen-robot-suite-three-foundation/",
      "title": "Alibaba Launches Qwen-Robot Suite: Three Foundation Models for Embodied AI and Robotics",
      "content_text": "Alibaba\u0027s Qwen team announced the Qwen-Robot Suite on June 16, 2026, consisting of three specialized foundation models: Qwen-RobotNav (autonomous navigation), Qwen-RobotManip (robotic arm manipulation across diverse hardware), and Qwen-RobotWorld (a video world model for predicting physical scenarios). The suite achieved leading results across dozens of robotics benchmarks and entered pilot testing with Alibaba Cloud enterprise clients.\n\nWhy it matters: Alibaba\u0027s first dedicated AI suite for robotics, extending the Qwen brand into physical AI and positioning it against Google DeepMind and Figure.",
      "date_published": "2026-06-19T00:00:00Z",
      "tags": ["qwen", "alibaba", "robotics", "embodied-ai", "china"],
      "authors": [{"name": "Alibaba / Qwen"}]
    },
    
    {
      "id": "2026-06-18-yandex-open-sources-yaff-data-format-saving-up-to",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-yandex-open-sources-yaff-data-format-saving-up-to/",
      "title": "Yandex Open-Sources YaFF Data Format, Saving Up to 20% Server Capacity",
      "content_text": "Yandex released YaFF (Yet Another Flat Format) as open source on June 17, 2026 \u2014 a binary data serialization format for high-load services that enables reading data without decompressing it, built as an overlay on Protobuf. Deployed in Yandex\u0027s advertising recommendation system, YaFF reduced CPU load by 10\u201320% while handling hundreds of thousands of requests per second, saving the company nearly 500 million rubles.\n\nWhy it matters: A format that cuts CPU overhead by 10\u201320% on ML-serving workloads is directly applicable to LLM inference infrastructure. Open-sourcing it enables the Russian ML ecosystem to benefit.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["yandex", "open-source", "infrastructure", "mit", "russia"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-18-vk-publishes-russian-ai-software-market-forecast-9",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-vk-publishes-russian-ai-software-market-forecast-9/",
      "title": "VK Publishes Russian AI Software Market Forecast: 95 Billion Rubles by 2030",
      "content_text": "At VK Cloud Conf 2026 (June 17, 2026), VK released research showing Russia\u0027s AI software market reached 25 billion rubles in 2025 and is projected to grow nearly fourfold to 94.8 billion rubles by 2030, at a 30.5% CAGR. AI platforms were identified as the fastest-growing segment at 50% annual growth.\n\nWhy it matters: Most current official market sizing for Russia\u0027s AI industry, providing context for the competitive landscape in which Yandex, Sber, MTS AI, and VK are operating.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["vk", "russia", "market-research", "forecast"],
      "authors": [{"name": "VK AI"}]
    },
    
    {
      "id": "2026-06-18-sae-interventions-are-unreliable-suppressed-behavi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-sae-interventions-are-unreliable-suppressed-behavi/",
      "title": "SAE Interventions Are Unreliable: Suppressed Behaviors Recover Post-Intervention",
      "content_text": "This paper challenges a core assumption in SAE-based mechanistic interpretability: that clamping or suppressing sparse autoencoder features reliably controls model behavior. The authors show that suppressed behaviors tend to recover post-intervention, undermining the reliability of SAE steering as a safety or control mechanism.\n\nWhy it matters: Raises a critical concern for the interpretability community: if SAE feature suppression does not durably prevent behaviors, then steering-based alignment approaches built on SAEs may be less robust than assumed.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["interpretability", "safety", "sparse-autoencoders", "alignment", "paper"],
      "authors": [{"name": "Hong Kong Polytechnic University"}]
    },
    
    {
      "id": "2026-06-18-opencode-v1-17-8-faster-session-timelines-and-mcp",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-opencode-v1-17-8-faster-session-timelines-and-mcp/",
      "title": "OpenCode v1.17.8: Faster Session Timelines and MCP Compatibility",
      "content_text": "OpenCode v1.17.8 (June 17, 2026) significantly speeds up session timeline loading and eliminates flicker and scroll jumps. MCP tools now work correctly with providers that enforce stricter JSON schema validation. Long-running MCP operations maintain active timeout handling rather than dropping silently.\n\nWhy it matters: OpenCode has reached 160K+ GitHub stars and 7.5M monthly active developers, making its MCP compatibility improvements broadly impactful for teams integrating custom MCP servers.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["opencode", "coding-agent", "mcp", "open-source", "release"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-18-openai-launches-scheduled-tasks-in-chatgpt-sunsets",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-openai-launches-scheduled-tasks-in-chatgpt-sunsets/",
      "title": "OpenAI Launches Scheduled Tasks in ChatGPT, Sunsets Pulse",
      "content_text": "OpenAI released a redesigned Scheduled Tasks feature in ChatGPT on June 17, 2026, giving users a dedicated Scheduled page in the sidebar to create, manage, pause, and resume recurring work and monitoring tasks. Tasks can search the web and connected apps, notifying users only when something meaningful changes. The update retires Pulse \u2014 ChatGPT\u0027s proactive daily summaries \u2014 giving Pro users 14 days to migrate.\n\nWhy it matters: Scheduled Tasks marks OpenAI\u0027s clearest move toward persistent, autonomous task automation inside ChatGPT, directly competing with standalone AI agent services.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["openai", "chatgpt", "agents", "automation"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-18-openai-codex-cli-v0-141-0-encrypted-remote-executi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-openai-codex-cli-v0-141-0-encrypted-remote-executi/",
      "title": "OpenAI Codex CLI v0.141.0: Encrypted Remote Execution Channels",
      "content_text": "Codex CLI v0.141.0 (June 18, 2026) ships authenticated, end-to-end encrypted Noise relay channels for remote executors, replacing the previous unauthenticated relay. Cross-platform remote execution now preserves native working directories and shells. Includes SQLite auto-recovery for corruption events and memory optimizations via tool-search caching and request deduplication for long sessions.\n\nWhy it matters: Encrypted Noise relay channels close a significant security gap for teams running Codex against remote or cloud-hosted workspaces.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["codex", "coding-agent", "cli", "openai", "release"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-18-midjourney-pivots-to-medical-hardware-with-full-bo",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-midjourney-pivots-to-medical-hardware-with-full-bo/",
      "title": "Midjourney Pivots to Medical Hardware with Full-Body Ultrasound Scanner",
      "content_text": "On June 18, 2026, Midjourney CEO David Holz announced Midjourney Medical, a new division building a full-body Ultrasonic Computational Tomography scanner using 8,960 ultrasound transducers. The device produces no radiation, completes a scan in ~60 seconds, and is claimed to be 10x cheaper and 60x faster than MRI. Midjourney plans to open a flagship clinic in San Francisco in 2027 and deploy 50,000 scanners globally over six years.\n\nWhy it matters: A dramatic pivot from AI image generation into medical hardware by one of the most recognizable AI consumer brands, signaling the company\u0027s ambition well beyond creative tools.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["midjourney", "hardware", "medical-imaging", "pivot"],
      "authors": [{"name": "Midjourney"}]
    },
    
    {
      "id": "2026-06-18-midjourney-launches-draft-mode-for-v8-1-with-24-im",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-midjourney-launches-draft-mode-for-v8-1-with-24-im/",
      "title": "Midjourney Launches Draft Mode for V8.1 with 24-Image Exploration Grid",
      "content_text": "On June 16, 2026, Midjourney released Draft mode for its V8.1 model. Each generation produces 24 images at 512x512px resolution using 0.4 GPU-minutes per prompt \u2014 half the cost of a standard SD job. Users can click \u0027Vary\u0027 on any draft to upscale it to full quality. The update also introduced a --preview flag for testing early model versions.\n\nWhy it matters: Draft mode dramatically lowers the cost of prompt iteration on Midjourney\u0027s most capable model, making high-volume creative exploration practical for professional users and studios.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["midjourney", "text-to-image", "image-generation", "update"],
      "authors": [{"name": "Midjourney"}]
    },
    
    {
      "id": "2026-06-18-kairos-a-native-world-model-stack-for-physical-ai",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-kairos-a-native-world-model-stack-for-physical-ai/",
      "title": "Kairos: A Native World Model Stack for Physical AI",
      "content_text": "Kairos is a full-stack world model architecture for physical AI, introducing a Cross-Embodiment Data Curriculum (open-world video \u2192 human behavior \u2192 robot interaction) and a Hybrid Linear Temporal Attention mechanism with provable error-accumulation bounds. The 4B-parameter model runs on-device in real time and tops four embodied-intelligence benchmarks including RoboTwin 2.0 (96.1%) and LIBERO-Plus.\n\nWhy it matters: 712 upvotes on HuggingFace Daily \u2014 the highest among June 18 papers. First open-source world model to close the perception-to-action loop on-device without intermediate translation latency.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["world-models", "robotics", "embodied-ai", "open-source"],
      "authors": [{"name": "ACE Robotics"}]
    },
    
    {
      "id": "2026-06-18-grok-4-3-now-available-on-amazon-bedrock-with-1m-t",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-grok-4-3-now-available-on-amazon-bedrock-with-1m-t/",
      "title": "Grok 4.3 Now Available on Amazon Bedrock with 1M-Token Context",
      "content_text": "xAI\u0027s Grok 4.3 became generally available through Amazon Bedrock on June 17, 2026. The model features a 1-million-token context window, configurable reasoning effort (none/low/medium/high), and native video input. Pricing on Bedrock is $1.25/M input tokens and $2.50/M output tokens. The model runs on Mantle, Amazon\u0027s new inference engine, and supports tool calling, structured output, and streaming.\n\nWhy it matters: Bedrock availability brings Grok 4.3 into one of the most widely used enterprise cloud AI platforms, giving AWS developers access to a 1M-context reasoning model inside existing IAM and VPC infrastructure.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["xai", "grok", "amazon-bedrock", "aws", "api", "reasoning", "enterprise"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-18-github-copilot-app-is-now-generally-available",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-github-copilot-app-is-now-generally-available/",
      "title": "GitHub Copilot App Is Now Generally Available",
      "content_text": "GitHub\u0027s standalone Copilot desktop app reached general availability on June 17, 2026 for macOS, Windows, and Linux. The app centers on parallel agent sessions \u2014 each session runs in an isolated git worktree \u2014 and Canvases, bidirectional surfaces where developers and agents collaborate on shared plans, terminals, and pull requests. Cloud automations let users schedule recurring agent tasks without a local machine. Agent Merge automates PR progression through CI and review cycles.\n\nWhy it matters: This marks GitHub\u0027s shift from Copilot as an IDE plugin to Copilot as a first-class agent platform. Running isolated sessions per worktree enables true parallel agentic work on separate features or bugfixes simultaneously.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["github-copilot", "coding-agent", "agents", "ga", "release"],
      "authors": [{"name": "GitHub"}]
    },
    
    {
      "id": "2026-06-18-gemini-cli-retires-june-18-replaced-by-antigravity",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-gemini-cli-retires-june-18-replaced-by-antigravity/",
      "title": "Gemini CLI Retires June 18, Replaced by Antigravity CLI",
      "content_text": "On June 18, 2026, Google\u0027s Gemini CLI stopped serving requests for Google AI Pro/Ultra subscribers and free users, completing the transition to Antigravity CLI \u2014 Google\u0027s agent-first development platform announced at I/O 2026 in May. Antigravity CLI is rewritten in Go for faster execution, supports asynchronous multi-agent workflows, and replaces Gemini CLI\u0027s hooks and extensions with a new plugin model. Notably, Antigravity CLI is not open source, unlike the Apache 2.0-licensed Gemini CLI.\n\nWhy it matters: This is a forced migration affecting all free and consumer-tier Gemini CLI users on the exact day of this digest. The closed-source status change and architectural differences create significant friction for teams with automation built on Gemini CLI.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["google", "gemini", "cli", "coding-agent", "deprecation", "antigravity"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-18-dreamreasoner-8b-block-size-curriculum-for-diffusi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-dreamreasoner-8b-block-size-curriculum-for-diffusi/",
      "title": "DreamReasoner-8B: Block-Size Curriculum for Diffusion Reasoning Models",
      "content_text": "DreamReasoner-8B identifies a training failure mode in block diffusion LLMs: large block sizes severely degrade chain-of-thought reasoning. The paper introduces block-size curriculum learning \u2014 shifting from small to large blocks during training \u2014 producing a model competitive with Qwen3-8B on mathematical and code reasoning benchmarks.\n\nWhy it matters: Identifies a fundamental training-inference mismatch in the diffusion-LM paradigm and provides a principled fix, enabling open-source diffusion models to match leading autoregressive models on reasoning tasks.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["reasoning", "diffusion", "curriculum-learning", "paper"]
    },
    
    {
      "id": "2026-06-18-diffusion-proof-formal-theorem-proving-via-diffusi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-diffusion-proof-formal-theorem-proving-via-diffusi/",
      "title": "Diffusion-Proof: Formal Theorem Proving via Diffusion Language Models",
      "content_text": "Diffusion-Proof is the first application of diffusion language models to formal mathematics, pairing dLLM-Prover-7B (full proof generation) with dLLM-Corrector-7B (bidirectional proof correction via in-filling). The system achieves +1.61% on ProofNet-Test and +6.14% on MiniF2F-Test over baselines and solves an IMO problem that DeepSeek-Prover-V2-7B could not.\n\nWhy it matters: Demonstrates that diffusion LLMs can outperform autoregressive models on formal theorem proving, where compounding token-level errors are especially costly.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["reasoning", "diffusion", "theorem-proving", "paper"]
    },
    
    {
      "id": "2026-06-18-cursor-3-7-cloud-dev-environments-and-in-cloud-sub",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-cursor-3-7-cloud-dev-environments-and-in-cloud-sub/",
      "title": "Cursor 3.7: Cloud Dev Environments and /in-cloud Subagents",
      "content_text": "Cursor 3.7 (June 17, 2026) introduces cloud environment setup \u2014 configuring a reproducible dev environment in the cloud in under 10 minutes via a shared terminal session and creating a reusable snapshot. The `/in-cloud` command spins up isolated cloud VM subagents for long-running or parallel work such as CI fixes and codebase exploration. A `/babysit` command lets cloud agents iterate on a PR remotely.\n\nWhy it matters: Cloud VM subagents address a key pain point: long-running agent tasks blocking the developer\u0027s local workspace. The reusable environment snapshot reduces cold-start overhead for repeated agentic runs.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["cursor", "coding-agent", "agents", "ide", "release"],
      "authors": [{"name": "Cursor"}]
    },
    
    {
      "id": "2026-06-18-claude-code-v2-1-181-inline-config-syntax-and-bun",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-claude-code-v2-1-181-inline-config-syntax-and-bun/",
      "title": "Claude Code v2.1.181: Inline /config Syntax and Bun 1.4 Upgrade",
      "content_text": "Claude Code v2.1.181 (June 17, 2026) adds `/config key=value` syntax for setting any configuration option inline, a CLAUDE_CLIENT_PRESENCE_FILE env var to suppress mobile push notifications, and upgrades the bundled Bun runtime to 1.4. Streaming of long paragraphs is now line-by-line. Fixes include truncated file writes on network drives, prompt caching with custom ANTHROPIC_BASE_URL, and macOS sandbox entitlement issues.\n\nWhy it matters: The inline /config syntax reduces friction for toggling model parameters mid-session. The network-drive write fix addresses a data-loss bug affecting users on NFS/SMB mounts.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "cli", "anthropic", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-18-black-forest-labs-releases-flux-2-with-multi-refer",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-black-forest-labs-releases-flux-2-with-multi-refer/",
      "title": "Black Forest Labs Releases FLUX.2 with Multi-Reference Conditioning and 4MP Output",
      "content_text": "Black Forest Labs released the FLUX.2 family around June 16, 2026. Key capabilities include multi-reference conditioning (generating consistent variations from multiple reference inputs), up to 4-megapixel output, improved text rendering, and better real-world lighting physics. NVIDIA partnered to provide FP8 quantizations and ComfyUI optimizations, cutting VRAM requirements by 40% and improving inference performance by 40%. FLUX.2-dev weights are available on Hugging Face under an open license.\n\nWhy it matters: FLUX.2\u0027s multi-reference feature and 4MP ceiling make it a direct challenger to Midjourney V8.1 and GPT-Image for professional design workflows, while open-weight availability keeps it accessible for self-hosting and fine-tuning.",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["text-to-image", "image-generation", "open-weights", "mit", "high-resolution"],
      "authors": [{"name": "Black Forest Labs"}]
    },
    
    {
      "id": "2026-06-18-anthropic-opens-seoul-office-and-announces-korean",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-18-anthropic-opens-seoul-office-and-announces-korean/",
      "title": "Anthropic Opens Seoul Office and Announces Korean AI Ecosystem Partnerships",
      "content_text": "Anthropic opened its Seoul office on June 17, 2026 \u2014 its third in Asia-Pacific after Tokyo and Bengaluru \u2014 appointing KiYoung Choi as Representative Director. The company simultaneously announced enterprise deployments with NAVER, Samsung SDS, LG CNS, Nexon, and Hanwha Solutions, a research partnership with the National AI Research Lab consortium (KAIST, Korea University, POSTECH, Yonsei), and the launch of Claude for Startups in Korea.\n\nWhy it matters: The Seoul office signals Anthropic\u0027s deepening commitment to the Asia-Pacific market and marks the first time Claude Code has been adopted at scale inside major Korean conglomerates (NAVER, Samsung, LG).",
      "date_published": "2026-06-18T00:00:00Z",
      "tags": ["anthropic", "expansion", "korea", "claude", "claude-code", "enterprise"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-17-zppo-teacher-in-prompts-knowledge-distillation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-zppo-teacher-in-prompts-knowledge-distillation/",
      "title": "ZPPO: Teacher-in-Prompts Knowledge Distillation Outperforms Gradient Methods for Small Reasoners",
      "content_text": "Zone of Proximal Policy Optimization (ZPPO, arXiv 2606.18216) embeds teacher guidance in prompts rather than gradients: it constructs prompts pairing correct teacher responses with incorrect student responses for contrastive learning, and prompts aggregating student errors to surface failure patterns. Tested on 0.8B\u20139B student models with a 27B teacher, ZPPO outperforms distillation and RL baselines, with strongest gains for smaller models.\n\nWhy it matters: Top HuggingFace Daily Papers for June 17 (27 upvotes). Prompt-as-teacher approach offers a lightweight alternative to gradient-based distillation for post-training small reasoning models.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["reasoning", "rl", "distillation", "training", "policy-optimization"],
      "authors": [{"name": "NVIDIA"}]
    },
    
    {
      "id": "2026-06-17-zhipu-ai-open-sources-glm-52-under-mit-license",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-zhipu-ai-open-sources-glm-52-under-mit-license/",
      "title": "Zhipu AI Open-Sources GLM-5.2 Under MIT License with 1M Token Context",
      "content_text": "Zhipu AI released the open weights of GLM-5.2 on HuggingFace under an MIT license around June 16, 2026. The model is built on a 753B MoE architecture with a 1-million-token context window, coding-first positioning, and a dual thinking-effort system with no regional restrictions, hosted at zai-org/GLM-5.2.\n\nWhy it matters: Unrestricted MIT open-source release of a 753B frontier-tier MoE model with 1M context, directly competitive with leading closed models for enterprise long-horizon agentic coding globally.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["glm", "open-weights", "long-context", "coding", "mit", "zai-org", "1m-context", "moe"],
      "authors": [{"name": "Zhipu AI"}]
    },
    
    {
      "id": "2026-06-17-xai-launches-grok-imagine-video-15-to-general",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-xai-launches-grok-imagine-video-15-to-general/",
      "title": "xAI Launches Grok Imagine Video 1.5 to General Availability",
      "content_text": "xAI moved Grok Imagine Video 1.5 from preview to general availability on June 16, rolling it out on the Imagine API and on grok.com and mobile apps. The model animates still images into 720p/24fps video with native audio. Video 1.5 Fast generates 6-second clips in ~25 seconds (down from 40+ in v1.0), having previously topped the Image-to-Video Arena leaderboard with a 52 Elo point lead.\n\nWhy it matters: Brings xAI\u0027s top-ranked image-to-video model to broad consumer and API availability, directly competing with Veo and Runway at meaningfully faster generation speeds.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["grok", "image-to-video", "xai", "ga"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-17-xai-launches-grok-for-powerpoint-as-free",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-xai-launches-grok-for-powerpoint-as-free/",
      "title": "xAI Launches Grok for PowerPoint as Free Microsoft 365 Add-in",
      "content_text": "xAI released a free Microsoft 365 add-in integrating Grok into PowerPoint on June 16. Users can generate full slide decks from text prompts, restructure slides, and apply styling in natural language. The add-in connects to live X and web search and can pull from SharePoint, email, and Google Drive via Grok connectors. PowerPoint is the first Office app; Word and Excel integrations are planned.\n\nWhy it matters: xAI\u0027s first foothold inside Microsoft Office\u0027s enterprise installed base, putting Grok in direct competition with Microsoft\u0027s own Copilot features for productivity workers.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["grok", "xai", "enterprise", "integrations"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-17-vllm-v0230-model-runner-v2-default-for-llama-and",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-vllm-v0230-model-runner-v2-default-for-llama-and/",
      "title": "vLLM v0.23.0: Model Runner V2 Default for Llama and Mistral, Transformers v5, Multi-Tier KV Cache",
      "content_text": "vLLM v0.23.0 (June 15, 408 commits, 200 contributors) makes Model Runner V2 the default for Llama and Mistral dense models, adds Transformers v5 compatibility, multi-tier KV cache offloading with object-store secondary tier, a unified reasoning + tool-call parser, Gemma 4 encoder-free support, and Rust frontend gains including streaming generate and dynamic LoRA. Also includes DeepSeek-V4 production hardening and ROCm 7.2.3 / FlashInfer v0.6.12 updates.\n\nWhy it matters: MRv2 expansion to Llama and Mistral covers the two most widely-deployed open-weight model families, eliminating pipeline-parallel bubbles. The unified parser simplifies integration for tool-calling and reasoning workflows.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["vllm", "inference", "open-source", "deepseek", "gemma"]
    },
    
    {
      "id": "2026-06-17-vibethinker-3b-reaches-frontier-level-reasoning",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-vibethinker-3b-reaches-frontier-level-reasoning/",
      "title": "VibeThinker-3B Reaches Frontier-Level Reasoning Benchmarks via Curriculum RL",
      "content_text": "VibeThinker-3B (arXiv 2606.16140, June 15) achieves 94.3 on AIME26 (97.1 with test-time scaling), 80.2 Pass@1 on LiveCodeBench v6, and 96.1% acceptance on unseen LeetCode contests using curriculum SFT, multi-domain RL, and offline self-distillation on a 3B dense model. Authors propose the Parametric Compression-Coverage Hypothesis: reasoning compresses into compact models while broad factual knowledge requires larger parameter counts.\n\nWhy it matters: 713 upvotes on HuggingFace Daily Papers. A 3B model matching or exceeding much larger systems on math and code benchmarks challenges core assumptions about scale requirements for frontier reasoning \u2014 significant implications for inference cost and edge deployment.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["reasoning", "rl", "benchmark", "small-models", "rlvr"],
      "authors": [{"name": "WeiboAI"}]
    },
    
    {
      "id": "2026-06-17-ollama-v0309-cohere2moe-support-coding-agent",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-ollama-v0309-cohere2moe-support-coding-agent/",
      "title": "Ollama v0.30.9: Cohere2Moe Support, Coding Agent Single-Token Output Bug Fixed",
      "content_text": "Ollama v0.30.9 (June 15) adds Cohere2Moe architecture support, fixes the LFM2 parser for cases where thinking was not emitted, and resolves a bug where coding agents invoked via Ollama output only a single token. Also adds an explicit error when a single message exceeds the context window.\n\nWhy it matters: The single-token output bug directly blocked users running Claude Code and similar coding agents locally via Ollama \u2014 this fix unblocks local-first developer setups.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["ollama", "inference", "local-llm", "open-source", "bug-fix"]
    },
    
    {
      "id": "2026-06-17-llamacpp-june-16-builds-eagle3-speculative",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-llamacpp-june-16-builds-eagle3-speculative/",
      "title": "llama.cpp June 16 Builds: Eagle3 Speculative Decoding, Vulkan UMA Memory, NVFP4 Fixes",
      "content_text": "llama.cpp shipped incremental builds b9660\u2013b9672 on June 16. Notable: Eagle3 speculative decoding backend sampling support (b9669), Vulkan preference for host-visible memory on UMA devices (b9668), NVFP4 edge-case fixes in llama-graph (b9670), SYCL support for Q4_K/Q5_K/Q6_K MoE MUL_MAT_ID (b9664), and BoringSSL vendor update to 0.20260616.0 (b9672).\n\nWhy it matters: Eagle3 speculative decoding in the backend sampler extends the fastest local inference technique to more hardware. Vulkan UMA optimization benefits iGPU and Apple unified-memory setups.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["llama-cpp", "inference", "local-llm", "open-source", "speculative-decoding"]
    },
    
    {
      "id": "2026-06-17-joyai-vl-interaction-open-source-8b-real-time-vlm",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-joyai-vl-interaction-open-source-8b-real-time-vlm/",
      "title": "JoyAI-VL-Interaction: Open-Source 8B Real-Time VLM with Autonomous Turn-Taking",
      "content_text": "JoyAI-VL-Interaction (arXiv 2606.14777) is an 8B VLM for continuous real-time video interaction: it watches a live video stream and autonomously decides when to speak or stay silent. Released with training recipe, time-aligned interaction data, and a fully deployable open-source system (pluggable ASR/TTS, memory, background agent API). Human raters preferred it over Doubao and Gemini in-app assistants across six real-world scenarios.\n\nWhy it matters: 223 upvotes on HuggingFace Daily Papers. One of the first 8B models for always-on video streaming with autonomous turn-taking, closer to a real-time assistant than a chatbot, with full open-source release (model + data + system).",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["vision-language", "real-time", "streaming", "multimodal", "agents"],
      "authors": [{"name": "JD.com"}]
    },
    
    {
      "id": "2026-06-17-google-deepmind-and-uk-government-partner-to",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-google-deepmind-and-uk-government-partner-to/",
      "title": "Google DeepMind and UK Government Partner to Speed Housing Planning with Gemini",
      "content_text": "Google DeepMind announced a partnership with the UK government on June 16 to build an AI prototype for planning officers, targeting a 50% reduction in housing application processing time. Built on Gemini, the tool automates data consolidation, policy identification, feedback summarization, and draft report generation. Trials will run in Barnet, Camden, and Dorset councils before a planned national rollout in 2027.\n\nWhy it matters: A government-scale Gemini deployment for public services tied to the UK\u0027s 1.5 million homes target \u2014 demonstrates AI addressing a high-profile policy bottleneck with explicit accountability safeguards.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["gemini", "google-deepmind", "partnership"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-17-anthropic-study-domain-expertise-drives-agentic",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-anthropic-study-domain-expertise-drives-agentic/",
      "title": "Anthropic Study: Domain Expertise Drives Agentic Coding Success, Not Programming Background",
      "content_text": "Anthropic published an analysis of ~400,000 Claude Code sessions from ~235,000 users (Oct 2025\u2013Apr 2026). Domain expertise \u2014 not coding background \u2014 is the primary predictor of success: expert-rated sessions succeed at 30%+ vs 15% for novices, and non-software professionals (legal, finance, management) succeed at nearly the same rate as engineers. Average task value rose ~27% over 7 months as task scope shifted from debugging toward deployment, data analysis, and document writing.\n\nWhy it matters: Large-scale empirical evidence that agentic coding tools lower barriers beyond programmers \u2014 domain knowledge matters more than coding skill \u2014 with direct implications for workforce transformation and enterprise AI adoption.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["claude-code", "agentic-ai", "workforce", "software-engineering", "benchmark"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-17-alibaba-releases-qwen-robotsuite-three-embodied",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-17-alibaba-releases-qwen-robotsuite-three-embodied/",
      "title": "Alibaba Releases Qwen-RobotSuite: Three Embodied AI Foundation Models",
      "content_text": "Alibaba\u0027s Qwen team released Qwen-RobotSuite on June 16\u201317, 2026: Qwen-RobotManip (VLA for robotic manipulation, trained on 38,100+ hours of data), Qwen-RobotNav (navigation and instruction-following), and Qwen-RobotWorld (world model for physically consistent future states). RobotManip and RobotNav ship with public GitHub repositories.\n\nWhy it matters: Alibaba\u0027s first open embodied AI foundation suite covering manipulation, navigation, and world modeling \u2014 with open-source GitHub releases for immediate downstream fine-tuning across different robot platforms.",
      "date_published": "2026-06-17T00:00:00Z",
      "tags": ["qwen", "robotics", "embodied-ai", "open-weights", "vla", "navigation", "manipulation"],
      "authors": [{"name": "Alibaba / Qwen"}]
    },
    
    {
      "id": "2026-06-16-openclaw-2026-6-8-beta-2-glm-haiku-rich-telegram",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-openclaw-2026-6-8-beta-2-glm-haiku-rich-telegram/",
      "title": "OpenClaw v2026.6.8-beta.2: GLM-5.2 and Claude Haiku 4.5 Support, Rich Telegram Formatting",
      "content_text": "OpenClaw v2026.6.8-beta.2 (June 16, 2026) adds support for GLM-5.2 and Claude Haiku 4.5 models and normalizes provider-qualified model IDs across OpenRouter and Google Vertex. Telegram delivery now supports structured rich text including tables, lists, and expandable blockquotes while preserving intentional line breaks. WhatsApp gains configured ACP bindings. Agent and gateway recovery is improved across DM sends, media completions, auto-reply handling, session restart aborts, and subagent operations. UI additions include collapsible workspace files, improved WebChat backscroll stability, and iOS gateway reconnection fixes.\n\nWhy it matters: OpenClaw is the leading open-source autonomous agent distributed via messaging platforms. Adding GLM-5.2 alongside Claude Haiku 4.5 expands model coverage with Chinese-lab options. The rich Telegram formatting closes a long-standing gap for teams using Telegram as their agent interface.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["openclaw", "coding-agent", "open-source", "mcp", "update", "beta"],
      "authors": [{"name": "OpenClaw"}]
    },
    
    {
      "id": "2026-06-16-openai-launches-partner-network-150m",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-openai-launches-partner-network-150m/",
      "title": "OpenAI Launches Partner Network with $150M Investment",
      "content_text": "OpenAI officially introduced the OpenAI Partner Network on June 14\u201315, 2026, a formal global partner program backed by $150 million targeting consulting firms, systems integrators, and technology specialists. The program has three tiers \u2014 Select, Advanced, and Elite \u2014 and aims to certify 300,000 consultants by end of 2026. Founding partners include Accenture, BCG, Bain, PwC, and McKinsey\u0027s QuantumBlack. OpenAI framed the initiative around the idea that the bottleneck for enterprise AI value is no longer model capability but implementation and workflow redesign.\n\nWhy it matters: Signals OpenAI\u0027s pivot toward the enterprise services layer as a strategic front. A structured partner ecosystem with substantial investment mirrors Salesforce and Microsoft playbooks, suggesting OpenAI is positioning for long-term revenue capture beyond API usage fees.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["openai", "enterprise", "partnership", "strategy"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-16-openai-codex-0-140-0-usage-tracking-bedrock-auth",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-openai-codex-0-140-0-usage-tracking-bedrock-auth/",
      "title": "OpenAI Codex CLI 0.140.0: Token Usage Tracking, Claude Code Import, and Amazon Bedrock Auth",
      "content_text": "Codex CLI 0.140.0 (June 15, 2026) ships token activity dashboards via /usage views (daily, weekly, cumulative), session deletion with confirmation guards via codex delete and /delete commands, and an /import command that reads Claude Code project configurations. Amazon Bedrock API authentication is now supported with encrypted local credential storage. A unified @ mentions menu replaces scattered context-injection entry points. The release also fixes corrupted SQLite auto-recovery, /review crashes, MCP server reliability issues, and plugin installation bugs.\n\nWhy it matters: The /import command for Claude Code configurations makes switching between or comparing coding agents much lower friction. Bedrock auth addresses enterprise teams using AWS-hosted models rather than the OpenAI API directly. Token usage dashboards respond to a longstanding request from heavy users managing costs across agentic sessions.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["codex", "openai", "coding-agent", "cli", "amazon-bedrock", "developer-tools", "update"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-16-nvidia-skillspector-agent-skills-security",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-nvidia-skillspector-agent-skills-security/",
      "title": "NVIDIA SkillSpector: Open-Source Security Scanner for AI Agent Skills",
      "content_text": "NVIDIA released SkillSpector (June 13, 2026), an open-source security scanner purpose-built for AI agent skills. It checks 64 vulnerability patterns across 16 categories, covering conventional software risks and agent-specific risks such as prompt injection, insecure data handling, and logic flaws. The tool is grounded in OWASP LLM guidance and MITRE ATLAS. An accompanying Snyk audit of 3,984 skills found that 26.1% contain vulnerabilities and 5.2% show likely malicious intent, including 1,467 malicious payloads such as trojans, cryptominers, and credential harvesters. The repository is available at github.com/NVIDIA/SkillSpector.\n\nWhy it matters: As agent skill marketplaces grow \u2014 including those for Claude Code and OpenClaw \u2014 supply-chain security for skills becomes a real attack surface. SkillSpector is the first dedicated, standardized tool for this problem, analogous to what Snyk does for package dependencies. NVIDIA\u0027s institutional backing gives it potential to become the default audit step in agent deployment pipelines.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["security", "agents", "mcp", "open-source", "supply-chain"],
      "authors": [{"name": "NVIDIA"}]
    },
    
    {
      "id": "2026-06-16-memory-reconstructed-graph-memory-llm-agents",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-memory-reconstructed-graph-memory-llm-agents/",
      "title": "Memory is Reconstructed, Not Retrieved: Graph Memory Improves LLM Agent Recall by 23%",
      "content_text": "MRAgent replaces the standard retrieve-then-reason memory paradigm with active reconstruction: agent memory is stored as a Cue-Tag-Content graph where associative tags act as semantic bridges. During inference the agent iteratively explores and prunes retrieval paths guided by intermediate reasoning evidence, avoiding combinatorial explosion. Evaluated on LoCoMo and LongMemEval benchmarks, MRAgent achieves up to 23% improvement over strong retrieval baselines.\n\nWhy it matters: Static retrieval (embedding similarity search) fails when the right memory depends on what the agent has already inferred mid-task. By fusing LLM reasoning directly into the memory traversal step, this work addresses a fundamental bottleneck for long-horizon agent tasks and suggests graph-structured memory as a more robust alternative to flat vector stores.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["agents", "memory", "reasoning", "rag", "paper"],
      "authors": [{"name": "National University of Singapore"}]
    },
    
    {
      "id": "2026-06-16-kimi-k2-7-code-highspeed-6x-throughput",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-kimi-k2-7-code-highspeed-6x-throughput/",
      "title": "Kimi K2.7-Code HighSpeed: 6\u00d7 Throughput for Production Coding Agent Pipelines",
      "content_text": "On June 15, 2026, Moonshot AI announced a HighSpeed variant of Kimi K2.7-Code, rolling out to Kimi Code Beta and Kimi Business users. The HighSpeed mode delivers approximately 180 tokens/second on median-length coding inputs and up to 260 tokens/second on shorter tasks \u2014 roughly six times faster than the standard release. The base K2.7-Code (1 trillion-parameter MoE, 32B active, 256K context) shipped on June 12, reporting +21.8% on Kimi Code Bench v2 and approximately 30% fewer reasoning tokens over K2.6.\n\nWhy it matters: At ~$0.95/M input tokens with open weights available for self-hosting, Kimi K2.7-Code HighSpeed directly targets the throughput bottleneck in production coding-agent pipelines \u2014 where token-generation speed limits the number of iterations an agent can run per unit time.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["kimi", "moonshot-ai", "coding", "moe", "open-weights", "china", "update", "inference"],
      "authors": [{"name": "Moonshot AI"}]
    },
    
    {
      "id": "2026-06-16-fastcontext-efficient-repo-explorer-coding-agents",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-fastcontext-efficient-repo-explorer-coding-agents/",
      "title": "FastContext: Specialized Exploration Subagent Cuts Coding Agent Token Usage by 60%",
      "content_text": "FastContext decouples repository exploration from task-solving in LLM-based coding agents by introducing a dedicated exploration subagent (4B\u201330B parameters) that issues parallel read/glob/grep tool calls and returns compact file-path and line-range citations to the main solver. Training uses supervised fine-tuning followed by task-grounded reinforcement learning. Integrated into Mini-SWE-Agent, FastContext improves resolution rates by up to 5.5 percentage points on SWE-bench Multilingual, SWE-bench Pro, and SWE-QA, while cutting main-agent token usage by up to 60%.\n\nWhy it matters: Repository navigation is a major hidden cost in frontier coding agents \u2014 models burn large portions of their context window just locating relevant files. FastContext\u0027s separation-of-concerns approach shows that a specialized small model can handle exploration far more efficiently than a monolithic solver. 152 upvotes on HuggingFace Daily Papers.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["coding-agent", "software-engineering", "efficiency", "reinforcement-learning", "agents", "paper"],
      "authors": [{"name": "Microsoft / Shanghai Jiao Tong University"}]
    },
    
    {
      "id": "2026-06-16-dreamx-world-1-0-interactive-world-model",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-dreamx-world-1-0-interactive-world-model/",
      "title": "DreamX-World 1.0: General-Purpose Interactive World Model with 6DoF Camera Control",
      "content_text": "DreamX-World is a general-purpose interactive world model that generates diverse, high-fidelity worlds from text or image prompts and allows users or agents to explore them via WASD-style 6DoF camera control. Trained on a mix of Unreal Engine data, gameplay footage, and real-world video, it supports 720P generation up to 7.5 seconds per clip and long-horizon rollouts up to one minute. Two variants are released under Apache 2.0: DreamX-World-5B-Cam (bidirectional, 5s) and DreamX-World-5B (autoregressive, long-horizon).\n\nWhy it matters: One of the first openly released general-purpose interactive world models capable of responding to fine-grained camera and event controls across indoor, urban, nature, sci-fi, and gaming domains. 264 upvotes on HuggingFace Daily Papers signals strong community interest. Combining RL-based training with geometry-guided memory advances the practicality of world models as simulation environments for downstream agents.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["world-models", "video-generation", "embodied-ai", "reinforcement-learning", "open-source", "paper"],
      "authors": [{"name": "AMAP-ML (Alibaba Maps AI Lab)"}]
    },
    
    {
      "id": "2026-06-16-claude-code-2-1-178-parameterized-permissions",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-16-claude-code-2-1-178-parameterized-permissions/",
      "title": "Claude Code 2.1.178: Parameterized Permission Rules and Nested Skills",
      "content_text": "Claude Code version 2.1.178 (June 15, 2026) adds Tool(param:value) syntax for permission rules, enabling fine-grained matching on tool input parameters with wildcard support \u2014 for example, Agent(model:opus) can block Opus subagents specifically. Nested .claude/skills directories now load automatically when working in those directories, with name-clash resolution via \u003cdir\u003e:\u003cname\u003e namespacing. Auto mode now runs a classifier check before spawning subagents to prevent blocked actions from being delegated. Multiple bug fixes address OOM crashes from stale file-descriptor env vars, OAuth account mismatches in Chrome, subagent transcript handling, compaction fallback model, and VSCode CJK IME dismissal.\n\nWhy it matters: The parameterized permission syntax is a significant ergonomics improvement for teams enforcing model-tier policies in agentic pipelines \u2014 it moves cost and safety controls from blunt model blocks to surgical parameter-level rules. Nested skill inheritance with closest-directory-wins makes multi-project monorepos viable without permission prompt friction.",
      "date_published": "2026-06-16T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "security", "ide", "anthropic", "update", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-15-yandex-drops-alice-ai-retail-launch",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-15-yandex-drops-alice-ai-retail-launch/",
      "title": "Yandex Drops \u2014 First Russian AI Wearable with Alice AI \u2014 Enters Retail Stores",
      "content_text": "Yandex Drops (\u042f\u043d\u0434\u0435\u043a\u0441 \u0414\u0440\u043e\u043f\u0441) \u2014 TWS earbuds billed as Russia\u0027s first AI wearable device \u2014 went on sale exclusively through the Alice AI chat interface on June 9, 2026, and began rolling out to retail stores across Russia, Kazakhstan, and Belarus on June 16, with Uzbekistan following June 30. The earbuds run a full Alice AI voice model equivalent to the chat version, enabling hands-free Alice interaction without a phone. A \u0027My Memory\u0027 feature \u2014 initially exclusive to Drops owners \u2014 converts voice notes into structured AI-generated reminders. Retail price: 8,990 rubles.\n\nWhy it matters: Yandex is the first major Russian AI lab to ship consumer hardware tightly integrated with its LLM stack. Distributing the device via an AI chat interface first is a novel channel experiment. The exclusive \u0027My Memory\u0027 feature marks Alice AI\u0027s first persistent memory product for end users, extending the YandexGPT ecosystem into ambient computing.",
      "date_published": "2026-06-15T00:00:00Z",
      "tags": ["alice", "hardware", "wearable", "consumer", "russia", "on-device", "voice-ai", "mobile"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-15-opencode-v1-17-7-mcp-workspace-roots",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-15-opencode-v1-17-7-mcp-workspace-roots/",
      "title": "OpenCode v1.17.7: MCP Servers Now Receive Workspace Root Context",
      "content_text": "SST shipped OpenCode v1.17.7 on June 14. The headline change: MCP servers now receive the current workspace as a client root context, allowing servers to make project-aware decisions without manual path configuration. Also in v1.17.7: plugin clients reuse active servers rather than assuming the default local port; ACP shell tool calls now surface the command and working directory from the start of output; new-session routes stay scoped to their own draft server. Earlier this week, v1.17.0 (June 10) was the major release adding WSL-backed Desktop support, fff-based file search for faster monorepo navigation, and Cohere North model support.\n\nWhy it matters: The MCP workspace root context change is the most developer-impactful: MCP server authors can now write context-aware tools that automatically adapt to the project in focus, eliminating per-project configuration boilerplate. OpenCode serves as the primary open-source alternative to Cursor and Claude Code, connecting to 75+ AI providers.",
      "date_published": "2026-06-15T00:00:00Z",
      "tags": ["opencode", "mcp", "coding-agent", "open-source", "update", "release"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-15-openai-codex-26609-cdp-browser-use",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-15-openai-codex-26609-cdp-browser-use/",
      "title": "OpenAI Codex App 26.609 Ships Developer Mode with Chrome DevTools Protocol Access",
      "content_text": "Codex app 26.609 (released June 11) introduced Developer mode for Browser Use, giving users direct Chrome DevTools Protocol access to inspect and script browser sessions \u2014 moving from a black-box browser driver to a transparent, scriptable layer. The release also added rate-limit reset banking for Plus and Pro subscribers, a /init command for project-level instructions in the composer, and expanded Computer Use availability to additional Enterprise regions. CLI 0.139.0 (same cycle) added standalone web search in code mode and improved MCP tool schema preservation across provider roundtrips.\n\nWhy it matters: CDP access in Browser Use lets developers automate, inspect, and debug web sessions the way Chrome DevTools does \u2014 unblocking web testing and scraping workflows that previously required brittle automation scripts. The /init command addresses a long-standing request for persistent project instructions without context stuffing.",
      "date_published": "2026-06-15T00:00:00Z",
      "tags": ["codex", "openai", "coding-agent", "developer-tools", "browser-use", "mcp", "update"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-15-midjourney-v8-1-default-model",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-15-midjourney-v8-1-default-model/",
      "title": "Midjourney V8.1 Becomes Platform Default, Replaces V7 with 2K Native Resolution",
      "content_text": "Midjourney promoted V8.1 to platform default on June 11, 2026, replacing V7. The update delivers 4-second standard generation and 12-second HD generation, native 2K resolution in HD mode (4\u00d7 the pixel count of V7), improved prompt adherence, and better text rendering in generated images. V8.0 alpha will be deprecated within two weeks of the rollout. V8.1 is available on all subscription tiers.\n\nWhy it matters: V8.1 is now the default model for all Midjourney users \u2014 it sets the new quality baseline for mainstream consumer text-to-image generation. The 4\u00d7 pixel count increase in HD mode and improved text rendering extend Midjourney\u0027s lead over competing platforms on output quality per generation.",
      "date_published": "2026-06-15T00:00:00Z",
      "tags": ["midjourney", "text-to-image", "image-generation", "update", "release"],
      "authors": [{"name": "Midjourney"}]
    },
    
    {
      "id": "2026-06-15-anthropic-fable-5-white-house-talks",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-15-anthropic-fable-5-white-house-talks/",
      "title": "Anthropic Staff to Meet White House Officials This Week to Negotiate Fable 5 Access Suspension",
      "content_text": "Following the June 12 export-control directive that forced Anthropic to disable Claude Fable 5 and Mythos 5 globally, Axios reported on June 14 that senior Anthropic technical staff will travel to Washington this week to meet with White House officials. The Philadelphia Inquirer characterized the situation as the Trump administration \u0027re-igniting its feud with Anthropic\u0027 over its latest models. Anthropic has maintained in its public statement that the jailbreak cited by the directive was narrow and comparable to weaknesses across all frontier models, and that the applied threshold \u0027would essentially halt all new model deployments for all frontier model providers.\u0027\n\nWhy it matters: Active high-level negotiations between Anthropic and the White House signal the first instance of a frontier AI lab engaging government directly to reverse an export-control-based model shutdown. The outcome will set a template for how US export controls interact with AI model deployment \u2014 with implications for every frontier lab.",
      "date_published": "2026-06-15T00:00:00Z",
      "tags": ["anthropic", "claude-fable-5", "claude-mythos", "regulation", "policy", "export-controls", "safety", "frontier-model"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-15-anthropic-agent-sdk-billing-model-retirement",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-15-anthropic-agent-sdk-billing-model-retirement/",
      "title": "Anthropic Agent SDK Billing Split and Sonnet 4 / Opus 4 Model Retirement Take Effect",
      "content_text": "Two changes announced May 14 took effect simultaneously on June 15. First: programmatic Claude usage \u2014 Agent SDK calls, `claude -p` subprocess invocations, Claude Code GitHub Actions, and third-party SDK automations \u2014 now draw from a separate monthly credit pool at standard API list rates. Credit amounts mirror subscription cost: Pro $20/month, Max 5\u00d7 $100/month, Max 20\u00d7 $200/month. Interactive Claude Code in terminal/IDE, web chat, and Claude Cowork are unaffected. Second: the versioned model IDs claude-sonnet-4-20250514 and claude-opus-4-20250514 were retired at 9 AM PT; API calls to those IDs return errors. Recommended migration targets are claude-sonnet-4-6 and claude-opus-4-8.\n\nWhy it matters: When the new credit pool is exhausted, automated API requests fail immediately with no rate-limit retry behavior \u2014 teams relying on subscription parity for CI/CD or scheduled agents must now budget separately or switch to direct API keys. Pinned model-ID references in production code also need updating today to avoid outages.",
      "date_published": "2026-06-15T00:00:00Z",
      "tags": ["anthropic", "billing", "agent-sdk", "model-retirement", "developer-platform", "sdk", "claude-code", "deprecation"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-14-zhipu-ai-glm-5-2-1m-context-coding",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-zhipu-ai-glm-5-2-1m-context-coding/",
      "title": "Zhipu AI Releases GLM-5.2: 744B MoE with 1M-Token Context and Coding-First Design",
      "content_text": "Zhipu AI (Z.ai) released GLM-5.2 on June 13, 2026, deploying it to all tiers of the GLM Coding Plan (Lite, Pro, Max). Built on a 744B-parameter MoE architecture with 40B active parameters, the model offers a 1-million-token context window (model ID: glm-5.2[1m]) and maximum 131K-token output. It introduces a dual thinking-effort system (High and Max modes) designed for long-horizon agentic software engineering tasks. General API access, integration into the Z.ai chatbot, and open-source weights under MIT are scheduled for the following week. No third-party benchmarks were published at launch.\n\nWhy it matters: GLM-5.2 intensifies the Chinese open-source lab challenge to closed frontier models: a MIT-licensed 1M-context coding model released the same week Anthropic\u0027s two top models were pulled offline. The 40B-active MoE makes it deployable on high-end clusters, and its explicit agentic focus competes directly with Codex and Claude Code workflows.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["glm", "zai-org", "open-weights", "moe", "long-context", "coding", "agentic", "china", "release", "mit"],
      "authors": [{"name": "Zhipu AI"}]
    },
    
    {
      "id": "2026-06-14-weavebench-hybrid-interface-computer-use",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-weavebench-hybrid-interface-computer-use/",
      "title": "WeaveBench: Computer-Use Agents Fail at Hybrid GUI+CLI Tasks \u2014 41% Pass Rate",
      "content_text": "WeaveBench introduces 114 real-world tasks requiring AI agents to combine GUI observations/actions with CLI and code operations in a single trajectory \u2014 the first benchmark explicitly targeting this hybrid-interface setting. The best current frontier model achieves only 41.2% pass rate on these long-horizon tasks. Published on arXiv (2606.09426) with 95 upvotes on HuggingFace Daily Papers.\n\nWhy it matters: Real computer workflows constantly switch between graphical interfaces and the terminal. WeaveBench is the first to require fluent hybrid operation in one trajectory, revealing that even frontier agents fail at more than half of realistic computer-use tasks. 95 upvotes on HF Daily Papers.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["agents", "benchmark", "evaluation", "agentic-ai", "gui-agent", "paper", "research", "computer-use"],
      "authors": [{"name": "Microsoft Research"}]
    },
    
    {
      "id": "2026-06-14-opencode-v1-17-5-v1-17-6-mcp-capabilities",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-opencode-v1-17-5-v1-17-6-mcp-capabilities/",
      "title": "OpenCode v1.17.5\u2013v1.17.6: MCP Client Capabilities Declaration and Snowflake OAuth",
      "content_text": "SST shipped two OpenCode releases on June 13, 2026. v1.17.6 formally declares OpenCode\u0027s supported MCP client capabilities \u2014 establishing a stable compatibility target for MCP server authors. v1.17.5 adds external browser OAuth for Snowflake Cortex (enabling auth without embedding credentials), improves project copy management and session move flows in the v2 API, recovers expired MCP sessions instead of leaving tools disconnected, returns structured MCP tool output in human-readable form, and fixes duplicate renderable IDs that could break TUI rendering. The desktop layer gains updated oc-2 color themes and improved terminal resize handling.\n\nWhy it matters: The MCP client capabilities declaration in v1.17.6 gives MCP server developers a stable target, reducing breakage from protocol mismatches. Snowflake Cortex OAuth makes OpenCode usable in enterprise data workflows without credential embedding.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["opencode", "mcp", "coding-agent", "open-source", "release", "update"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-14-moonshot-ai-kimi-work-desktop-agent",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-moonshot-ai-kimi-work-desktop-agent/",
      "title": "Moonshot AI Opens Kimi Work Desktop Agent with 300-Sub-Agent Swarm and WebBridge",
      "content_text": "Moonshot AI opened Kimi Work for internal testing on June 12, 2026 \u2014 a downloadable macOS/Windows desktop application for local AI agent execution. It scales to 300 parallel sub-agents, includes a WebBridge browser extension that reuses existing logged-in browser sessions for automation, supports cron scheduling, local file access, Python script execution, and integration with A-share, Hong Kong, and US equity finance data. Reportedly runs on Kimi K2.6. Outputs include PowerPoint and Excel. The product page is live at kimi.com/products/kimi-work.\n\nWhy it matters: Kimi Work enters the local-first AI agent space alongside tools like Claude Code with a 300-sub-agent swarm and WebBridge\u0027s credential-reuse approach \u2014 reducing friction for knowledge-worker automation. The China-specific finance integrations hint at a targeted enterprise market differentiator.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["kimi", "moonshot-ai", "agents", "agentic", "multi-agent", "desktop-agent", "china", "release", "preview"],
      "authors": [{"name": "Moonshot AI"}]
    },
    
    {
      "id": "2026-06-14-moonshot-ai-kimi-k2-7-code-open-weight",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-moonshot-ai-kimi-k2-7-code-open-weight/",
      "title": "Moonshot AI Releases Kimi K2.7-Code: 1T-Parameter Open-Weight Coding Model with Vision",
      "content_text": "Moonshot AI released Kimi K2.7-Code on June 12, 2026 \u2014 weights posted to HuggingFace (moonshotai/Kimi-K2.7-Code) under Modified MIT. The model is a 1-trillion-parameter MoE with 32B active parameters per token (384 experts, 8 selected), a 256K-token context window, and a 400M-parameter MoonViT vision encoder for image and video input. Vendor benchmarks show +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite over K2.6, with approximately 30% fewer reasoning tokens. API pricing: $0.95/$4.00 per million input/output tokens. Cloudflare Workers AI added the model on release day.\n\nWhy it matters: Kimi K2.7-Code is the fifth major open-weight coding model Moonshot has shipped in under a year. At sub-dollar input pricing with 1T-parameter scale, 256K context, and native vision support, it directly competes with DeepSeek V4-Flash and GLM-5.x for the agentic software engineering workload.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["kimi", "moonshot-ai", "open-weights", "coding", "agentic", "moe", "multimodal", "long-context", "china", "release"],
      "authors": [{"name": "Moonshot AI"}]
    },
    
    {
      "id": "2026-06-14-minimax-sparse-attention-paper",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-minimax-sparse-attention-paper/",
      "title": "MiniMax Sparse Attention: 28\u00d7 Compute Reduction at 1M-Token Context with No Quality Loss",
      "content_text": "MiniMax published a paper introducing a blockwise sparse attention mechanism built on Grouped Query Attention that achieves a 28.4\u00d7 reduction in per-token attention compute at 1M-token context while matching the quality of full attention. The technique uses an Index Branch to score and select relevant KV blocks, with a Main Branch performing exact attention over the selected blocks. It underpins MiniMax M3, the first open-weight model combining frontier coding capability, 1M-token context, and native multimodality in a single architecture. The paper received 251 upvotes on HuggingFace Daily Papers.\n\nWhy it matters: Quadratic attention cost has been the primary barrier to practical 1M-token context windows. This work shows a 28\u00d7 compute cut with negligible quality loss and ships a production model to prove it \u2014 not just a paper result. 251 upvotes on HF Daily Papers reflects strong community interest.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["minimax", "long-context", "attention", "efficiency", "inference", "open-weights", "paper", "research"],
      "authors": [{"name": "MiniMax"}]
    },
    
    {
      "id": "2026-06-14-minimax-m3-vllm-day-0-sparse-attention",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-minimax-m3-vllm-day-0-sparse-attention/",
      "title": "vLLM Adds Day-0 Support for MiniMax M3 Open Weights with 1M-Context Sparse Attention",
      "content_text": "On June 12, 2026, the vLLM team published a blog post announcing day-0 serving support for MiniMax M3 \u2014 a 456B-parameter open-weight model with a 1M-token context window, native multimodal input, and MiniMax Sparse Attention (MSA) architecture (open weights released approximately June 10\u201311). Deployment requires the \u0027--block-size 128\u0027 flag due to MSA\u0027s sparse/index cache requirements. AMD announced simultaneous day-0 support on Instinct GPUs. On Fireworks AI, M3 is available with pricing described as roughly 1/20th the cost of comparable closed models.\n\nWhy it matters: Day-0 inference engine support means practitioners can immediately run M3 locally or on-prem without waiting for framework updates. With Anthropic\u0027s top models offline, M3\u0027s 1M-context at MoE efficiency becomes a practical alternative for long-document coding and analysis pipelines.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["vllm", "minimax", "inference", "open-weights", "long-context", "multimodal", "moe", "serving", "open-source", "release"],
      "authors": [{"name": "MiniMax"}]
    },
    
    {
      "id": "2026-06-14-maxproof-minimax-imo-gold",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-maxproof-minimax-imo-gold/",
      "title": "MaxProof: MiniMax Model Exceeds IMO and USAMO Gold-Medal Thresholds on Formal Math",
      "content_text": "MiniMax published MaxProof, a framework for training and test-time scaling of mathematical proof using the MiniMax M3 model series. It trains three capabilities \u2014 proof generation, verification, and critique-conditioned repair \u2014 using a generative verifier engineered for low false-positive rate. At inference, the model acts simultaneously as generator, verifier, refiner, and ranker, selecting a final proof via tournament ranking. MaxProof achieves 35/42 on IMO 2025 and 36/42 on USAMO 2026, exceeding the gold-medal threshold on both. Published on arXiv (2606.13473) with 75 upvotes on HuggingFace Daily Papers.\n\nWhy it matters: Gold-medal-level performance on both IMO and USAMO from a single unified open-weight model \u2014 not an ensemble of specialized systems \u2014 marks a meaningful advance in formal mathematical reasoning. 75 upvotes on HF Daily Papers.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["minimax", "mathematics", "reasoning", "reinforcement-learning", "benchmark", "paper", "research", "formal-reasoning"],
      "authors": [{"name": "MiniMax"}]
    },
    
    {
      "id": "2026-06-14-interleavethinker-rl-text-image",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-interleavethinker-rl-text-image/",
      "title": "InterleaveThinker: RL Planner+Critic Pipeline for Interleaved Text-and-Image Generation",
      "content_text": "InterleaveThinker is a multi-agent pipeline \u2014 a planner and a critic agent \u2014 that equips any image generator with the ability to produce interleaved text-image sequences. The planner organizes input sequences; the critic evaluates outputs and refines instructions for regeneration. Training uses SFT datasets (80K planner, 112K critic examples) and GRPO reinforcement learning with step-wise rewards. The system achieves performance comparable to GPT-5-level models on interleaved generation benchmarks (WISE, RISE). Published on arXiv (2606.13679) with 124 upvotes on HuggingFace Daily Papers.\n\nWhy it matters: Interleaved text-image generation (illustrated stories, embodied instructions) is a key missing capability in open multimodal systems. This is the first work to apply RL to a planner+critic pipeline for this task, matching proprietary frontier models on relevant benchmarks. 124 upvotes on HF Daily Papers.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["multimodal", "agents", "rl", "image-generation", "paper", "research", "generation"],
      "authors": [{"name": "CUHK Multimedia Lab"}]
    },
    
    {
      "id": "2026-06-14-evoarena-dynamic-agent-benchmark",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-evoarena-dynamic-agent-benchmark/",
      "title": "EvoArena: LLM Agents Score Only 40% on Dynamic Evolving Environments",
      "content_text": "EvoArena is a benchmark that models environments as sequences of progressive updates across terminal, software, and social domains \u2014 exposing a gap in current agent evaluation that assumes static environments. Top agents currently achieve only ~40% accuracy. The paper also proposes EvoMem, a patch-based memory paradigm that records environment changes as structured update histories; EvoMem improves chain-level accuracy by 3.7% on EvoArena and 4\u20136% on GAIA and LoCoMo benchmarks. Published on arXiv (2606.13681) and received 121 upvotes on HuggingFace Daily Papers.\n\nWhy it matters: Nearly all existing agent benchmarks use static environments. EvoArena forces evaluation under continuous change and the 40% ceiling exposes how far current agents are from real-world deployment readiness. 121 upvotes on HF Daily Papers.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["agents", "benchmark", "memory", "evaluation", "agentic-ai", "paper", "research"],
      "authors": [{"name": "MIT / NUS / Salesforce"}]
    },
    
    {
      "id": "2026-06-14-elevenlabs-avatars-elevencreative",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-elevenlabs-avatars-elevencreative/",
      "title": "ElevenLabs Launches Avatars in ElevenCreative: TTS-Native AI Talking-Head Video",
      "content_text": "ElevenLabs launched Avatars in ElevenCreative, a workflow that pairs the company\u0027s AI speech synthesis with lip-synced talking-head video generation. Users upload a photo or write a prompt to create a persistent avatar identity, then generate video across different angles, outfits, and backgrounds while retaining identity consistency. Voice and lip-synced video are produced in a single step. A new Avatar node in Flows enables batch generation across scripts, languages, and voices. Available on all paid plans.\n\nWhy it matters: ElevenLabs \u2014 primarily a voice AI company \u2014 moves directly into video creation, competing with HeyGen and Synthesia while removing the multi-tool friction enterprises currently face. The batch-pipeline integration in Flows targets high-volume multilingual video production.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["elevenlabs", "video-generation", "tts", "voice-ai", "release", "enterprise"],
      "authors": [{"name": "ElevenLabs"}]
    },
    
    {
      "id": "2026-06-14-claude-code-v2-1-177-fable-5-fallback",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-claude-code-v2-1-177-fable-5-fallback/",
      "title": "Claude Code v2.1.177: Fable 5 Forced Fallback to Opus 4.8, Bedrock Cache Fix, Security Patch",
      "content_text": "Claude Code v2.1.177 shipped on June 13, 2026. Due to the US government directive, all Fable 5 model selections are automatically redirected to Claude Opus 4.8 without user action. Other changes: session titles are now generated in the conversation language (configurable via the \u0027language\u0027 setting); a new \u0027footerLinksRegexes\u0027 setting enables regex-matched link badges in the footer; Bedrock credential caching now respects actual token expiration rather than a fixed 1-hour window; a security fix closes a loophole where blocked models could be bypassed via the \u0027availableModels\u0027 allowlist. Additional bug fixes cover copy/paste over tmux SSH, Remote Control model switching, and Linux sandbox with symlinked settings files.\n\nWhy it matters: The forced Fable 5 \u2192 Opus 4.8 redirect means any Claude Code workflow that was tuned to Fable 5\u0027s capabilities is silently downgraded. The Bedrock credential fix matters for teams running long CI/CD jobs on AWS. The security fix for allowlist bypass is relevant for operators who use \u0027availableModels\u0027 to restrict model access.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "release", "anthropic", "amazon-bedrock", "security", "bug-fix", "update"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-14-anthropic-public-record-survey",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-anthropic-public-record-survey/",
      "title": "Anthropic Publishes First Public Record: 52,000-Person Survey on US AI Attitudes",
      "content_text": "Anthropic released results from its first Anthropic Public Record on June 12, 2026 \u2014 a survey of nearly 52,000 Americans measuring hopes, fears, and governance preferences around AI, collected November\u2013December 2025. The data found broad bipartisan consensus on major AI concerns. Anthropic intends to repeat the survey regularly and expand it internationally, framing it as a mechanism to ensure AI development reflects public input beyond existing Claude users.\n\nWhy it matters: Labs rarely publish systematic large-scale public opinion research on AI attitudes. Releasing this data publicly is an unusual transparency move, and the timing \u2014 same day as the Fable 5 suspension \u2014 adds context to Anthropic\u0027s broader efforts to maintain trust with regulators and the public.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["anthropic", "policy", "safety", "regulation", "research"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-14-anthropic-fable-5-mythos-5-govt-suspension",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-14-anthropic-fable-5-mythos-5-govt-suspension/",
      "title": "US Government Orders Anthropic to Disable Claude Fable 5 and Mythos 5 Globally",
      "content_text": "On June 12, 2026, the US Commerce Department issued an export-control directive requiring Anthropic to block all access to Claude Fable 5 and Mythos 5 for foreign nationals \u2014 including Anthropic\u0027s own foreign-national employees. Because selective enforcement in real time was impossible, Anthropic disabled both models globally within hours of the order. The company complied while publicly disputing the necessity: it argued the jailbreak the government cited was narrow, non-universal, and comparable to weaknesses in other commercially available models, and warned that applying this threshold industry-wide \u0027would essentially halt all new model deployments.\u0027 All other Anthropic models remained available. Claude Code v2.1.177 (June 13) silently redirects any Fable 5 model selection to Claude Opus 4.8.\n\nWhy it matters: This is the first time the US government has invoked export controls to compel a frontier AI lab to pull publicly deployed models offline \u2014 affecting all users globally, not just foreign nationals. It sets a regulatory precedent for export-control application to AI models and signals escalating government intervention in AI deployment. Developers and enterprises relying on Fable 5 in production are immediately impacted without a migration path.",
      "date_published": "2026-06-14T00:00:00Z",
      "tags": ["anthropic", "claude-fable-5", "claude-mythos", "regulation", "policy", "export-controls", "safety", "frontier-model"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-12-vk-tech-reduces-vk-data-platform-infrastructure-re",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-vk-tech-reduces-vk-data-platform-infrastructure-re/",
      "title": "VK Tech Reduces VK Data Platform Infrastructure Requirements 2.5\u00d7 for AI Deployments",
      "content_text": "VK Tech announced on June 11 that infrastructure resource requirements for deploying VK Data Platform in a fault-tolerant on-premise configuration have been reduced by 2.5 times. The platform uses a Data Lakehouse architecture (Apache Iceberg over S3-compatible storage) separating storage from compute, with tiered HDD storage potentially cutting costs up to 10\u00d7 versus all-SSD setups. The update targets companies building data pipelines for AI agents, RAG, ML, and BI workloads.\n\nWhy it matters: Lowering the hardware barrier to enterprise data infrastructure reduces the entry cost for Russian companies deploying AI agents and RAG pipelines on their own premises.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["russia", "rag", "on-premises", "enterprise", "infrastructure", "update"],
      "authors": [{"name": "VK AI"}]
    },
    
    {
      "id": "2026-06-12-suno-launches-advanced-stem-separation-with-per-in",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-suno-launches-advanced-stem-separation-with-per-in/",
      "title": "Suno Launches Advanced Stem Separation with Per-Instrument Extraction",
      "content_text": "Suno released upgraded Stem Separation on June 11, 2026 with three modes: Advanced Split (Premier subscribers) isolates any of nearly 100 individual instruments; Split from Mix extracts a specific instrument or voice into two stems; Auto Split provides classic 12-category separation. All modes are described as artifact-free. The feature is accessible via the Edit menu on any generated or uploaded track.\n\nWhy it matters: Professional-grade per-instrument stem extraction was previously a separate paid service (Moises, Lalal.ai). Integrating it directly into the music generation platform reduces post-production workflow steps for Suno users and enables easier remixing and licensing of individual components.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["suno", "music-generation", "audio", "stem-separation", "update"],
      "authors": [{"name": "Suno"}]
    },
    
    {
      "id": "2026-06-12-sber-launches-giga-art-ai-art-festival-using-kandi",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-sber-launches-giga-art-ai-art-festival-using-kandi/",
      "title": "Sber Launches Giga-Art AI Art Festival Using Kandinsky 6.0",
      "content_text": "Sber launched Giga-Art (\u0413\u0438\u0433\u0430-\u0410\u0440\u0442), an open AI art festival running June 12 through November 4, 2026, inviting anyone to generate images depicting Russia using the Kandinsky 6.0 Image model inside GigaChat. Best submissions from each stage will be displayed on public media screens across the country. All GigaChat image generation features are available free of charge for participants.\n\nWhy it matters: Sber is using a public art contest to drive Kandinsky 6.0 adoption and GigaChat user acquisition, making it one of the most visible consumer-facing Russian AI deployments of 2026.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["russia", "gigachat", "image-generation", "community", "kandinsky"],
      "authors": [{"name": "Sber"}]
    },
    
    {
      "id": "2026-06-12-opencode-v1-17-4-mcp-cwd-support-for-local-servers",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-opencode-v1-17-4-mcp-cwd-support-for-local-servers/",
      "title": "OpenCode v1.17.4: MCP cwd Support for Local Servers and Connector Auth Flows",
      "content_text": "SST\u0027s OpenCode v1.17.4 (June 12) added cwd support for local MCP servers (servers now start from a workspace-relative directory), connector-based auth flows, v2 API endpoints for session management, and fixed Gemini tool schema multi-type field compatibility. Earlier in the June 10-12 window: v1.17.0 added fff-backed fast file search and Cohere North model; v1.17.1\u2013v1.17.3 fixed auth recovery, desktop crashes, and Linux launcher identity.\n\nWhy it matters: MCP cwd support is a quality-of-life improvement for monorepo and multi-project setups. OpenCode continues its push as the model-agnostic open-source alternative to Claude Code and Cursor.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["opencode", "coding-agent", "mcp", "open-source", "update"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-12-openai-acquires-german-startup-ona-to-power-persis",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-openai-acquires-german-startup-ona-to-power-persis/",
      "title": "OpenAI Acquires German Startup Ona to Power Persistent Codex Cloud Agents",
      "content_text": "OpenAI announced the acquisition of Ona, a Kiel-based startup providing secure cloud execution and orchestration environments for software development agents. Ona\u0027s technology enables AI agents to access tools and context over long-horizon tasks without requiring a user to remain in session. More than 5 million people now use Codex weekly, up 400% recently. Financial terms were not disclosed; the deal is subject to regulatory approval.\n\nWhy it matters: The acquisition directly bolsters OpenAI\u0027s Codex ecosystem for asynchronous, multi-hour agentic coding tasks, reflecting the industry-wide shift toward persistent cloud-native agent infrastructure rather than single-session tool calls.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["openai", "codex", "acquisition", "cloud", "agents", "coding-agent"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-12-midjourney-v8-1-becomes-default-model-native-2k-ou",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-midjourney-v8-1-becomes-default-model-native-2k-ou/",
      "title": "Midjourney V8.1 Becomes Default Model: Native 2K Output and 4\u20135\u00d7 Speed Boost",
      "content_text": "Midjourney made V8.1 the default model on June 11, 2026. Key improvements over V7: native 2K HD output without upscaling, render speeds roughly 4\u20135x faster (standard SD jobs finish in about 4 seconds, HD in 12 seconds), while maintaining V7\u0027s aesthetic style. V8.1 had been available in alpha since April 14 but is now the production default for all users.\n\nWhy it matters: V8.1 replaces V7 as the everyday model for millions of Midjourney users. Native 2K resolution combined with a 4\u20135\u00d7 speed improvement meaningfully lowers iteration cost for professional workflows.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["midjourney", "text-to-image", "image-generation", "update", "release"],
      "authors": [{"name": "Midjourney"}]
    },
    
    {
      "id": "2026-06-12-llama-cpp-b9603-qualcomm-adreno-opencl-kernels-for",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-llama-cpp-b9603-qualcomm-adreno-opencl-kernels-for/",
      "title": "llama.cpp b9603: Qualcomm Adreno OpenCL Kernels for On-Device Inference",
      "content_text": "llama.cpp release b9603 (June 12) added OpenCL q5_0 and q5_1 GEMM/GEMV kernels for Qualcomm Adreno GPUs, co-authored with Qualcomm engineers. This enables hardware-accelerated quantized inference on Qualcomm-powered Android devices and Snapdragon laptops. Other recent builds in the window: b9601 Vulkan build fix; b9596 server router-mode logging optimization; b9591 MTP memory optimization; b9590 LFM2 json_schema fix.\n\nWhy it matters: Adreno is the most common mobile GPU architecture. These OpenCL kernels bring optimized quantized inference to a large hardware base that previously had limited llama.cpp acceleration support.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["inference", "on-device", "mobile", "quantization", "open-source", "update"],
      "authors": [{"name": "ggml-org"}]
    },
    
    {
      "id": "2026-06-12-lionsgate-takes-equity-stake-in-runway-plans-ai-sh",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-lionsgate-takes-equity-stake-in-runway-plans-ai-sh/",
      "title": "Lionsgate Takes Equity Stake in Runway, Plans AI Short-Form Episodic Series",
      "content_text": "Lionsgate acquired a non-cash equity stake in Runway (last valued at ~$5.3B) and expanded their original September 2024 content partnership. The deal covers co-produced AI short-form episodic series using Lionsgate franchise IP and a joint program for developing original AI-native content. Lionsgate\u0027s chief AI officer Kathleen Grace is overseeing the relationship.\n\nWhy it matters: One of the most concrete Hollywood-studio-to-AI-lab equity commitments to date. Rather than a licensing deal, Lionsgate is taking ownership in Runway and committing IP for production \u2014 setting a precedent for how legacy media companies may structure AI relationships.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["runway", "partnership", "video-generation", "hollywood", "joint-venture", "valuation"],
      "authors": [{"name": "Runway"}]
    },
    
    {
      "id": "2026-06-12-interleavethinker-rl-framework-for-agentic-text-an",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-interleavethinker-rl-framework-for-agentic-text-an/",
      "title": "InterleaveThinker: RL Framework for Agentic Text-and-Image Interleaved Generation",
      "content_text": "A multi-agent pipeline that endows any image generator with interleaved text-image generation capabilities via a planner agent and a critic agent. The team introduces accuracy and step-wise reward mechanisms so that RL can guide full multi-step generation without backpropagating through 25+ generator calls. Results are competitive with GPT-5 on interleaved generation benchmarks, and training also improves base-model performance on reasoning benchmarks.\n\nWhy it matters: Interleaved text-and-image generation (illustrated reports, annotated documents) is a key unsolved multimodal capability. This is the #1 HuggingFace Daily Paper for June 12 with 65 upvotes, offering a clean RL recipe applicable on top of existing generators.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["multimodal", "image-generation", "rl", "agents", "reasoning", "paper"]
    },
    
    {
      "id": "2026-06-12-google-deepmind-and-partners-launch-10m-multi-agen",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-google-deepmind-and-partners-launch-10m-multi-agen/",
      "title": "Google DeepMind and Partners Launch $10M Multi-Agent AI Safety Research Fund",
      "content_text": "Google DeepMind, Schmidt Sciences, the Cooperative AI Foundation, ARIA, and Google.org announced a global research funding call of up to $10 million focused on safety in environments where millions of AI agents from different organizations interact. The four research priorities are: sandboxes and testbeds, agent network science, agent infrastructure protocols, and oversight and control. Applications are open through August 8, 2026.\n\nWhy it matters: As agentic AI systems proliferate rapidly, safety research on cross-organizational agent interactions has lagged deployment. This is one of the first major coordinated multi-funder efforts targeting emergent risks from agent networks at scale.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["google-deepmind", "multi-agent", "safety", "funding", "research"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-12-fort-searcher-shortcut-resistant-training-data-fra",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-fort-searcher-shortcut-resistant-training-data-fra/",
      "title": "FORT-Searcher: Shortcut-Resistant Training Data Framework for Deep Search Agents",
      "content_text": "Identifies four concrete shortcut risks in existing deep-search training data \u2014 evidence co-coverage, single-clue selectivity, exposed constants, and prior-knowledge binding \u2014 that let agents bypass genuine multi-hop search. FORT synthesizes shortcut-resistant data by controlling these risks across entity selection, evidence graph construction, and question formulation. FORT-Searcher achieves state-of-the-art among open-source search agents of comparable size.\n\nWhy it matters: Deep search agents are increasingly important, but training-data quality has been poorly understood. FORT is the first principled shortcut-aware difficulty framework. #4 on HF Daily June 12 with 44 upvotes.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["search", "agents", "rl", "training", "information-retrieval", "paper"]
    },
    
    {
      "id": "2026-06-12-evoarena-llm-agents-score-only-39-6-on-dynamic-evo",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-evoarena-llm-agents-score-only-39-6-on-dynamic-evo/",
      "title": "EvoArena: LLM Agents Score Only 39.6% on Dynamic Evolving Environments Benchmark",
      "content_text": "EvoArena models environment changes as sequences of progressive updates across terminal, software, and social domains, in contrast to the static settings assumed by most agent evaluations. Best current agents achieve only 39.6% accuracy. The authors also propose EvoMem, a structured-update-history mechanism that improves performance by 1.5% on EvoArena, 6.1% on GAIA, and 4.8% on LoCoMo.\n\nWhy it matters: Static-environment benchmarks may substantially overestimate real-world agent performance where conditions keep changing. EvoArena quantifies this gap and provides a concrete memory-tracking fix. #3 on HF Daily June 12 with 50 upvotes.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["agents", "benchmark", "memory", "evaluation", "multi-agent", "paper"],
      "authors": [{"name": "MIT"}]
    },
    
    {
      "id": "2026-06-12-cursor-bugbot-3-faster-90-second-reviews-and-pre-p",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-cursor-bugbot-3-faster-90-second-reviews-and-pre-p/",
      "title": "Cursor Bugbot 3\u00d7 Faster: 90-Second Reviews and Pre-Push /review Command",
      "content_text": "Cursor shipped a Bugbot performance update for Cursor 3.7+. Average review time dropped from ~5 minutes to ~90 seconds, cost per run fell 22%, and bugs found per review improved 10% (0.56 to 0.62 per run), powered by Composer 2.5. A new /review command lets developers run Bugbot and Security Review locally before pushing, with GitHub/GitLab integration that avoids re-reviewing unchanged diffs.\n\nWhy it matters: At 90 seconds, Bugbot crosses a usability threshold fast enough to run before every push rather than as an async post-push check. Combined with /review, this shifts AI code review into the local development loop.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["cursor", "code-review", "coding-agent", "update"],
      "authors": [{"name": "Cursor"}]
    },
    
    {
      "id": "2026-06-12-claude-code-v2-1-174-v2-1-175-enterprise-model-con",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-claude-code-v2-1-174-v2-1-175-enterprise-model-con/",
      "title": "Claude Code v2.1.174\u2013v2.1.175: Enterprise Model Controls and Bedrock GovCloud Fix",
      "content_text": "Anthropic shipped two Claude Code releases on June 12. v2.1.174 fixed a Bedrock GovCloud region prefix bug (us-gov-* regions were incorrectly deriving \u0027global\u0027), corrected background sessions inheriting another session\u0027s provider env vars, and added per-skill/agent/MCP usage attribution in the VSCode /usage dialog. v2.1.175 added the enforceAvailableModels managed setting, which constrains the Default model to the admin-defined allowed list and prevents user or project settings from expanding it.\n\nWhy it matters: enforceAvailableModels gives enterprise admins hard guardrails over model selection, not just soft defaults. The Bedrock GovCloud fix unblocks regulated US government cloud deployments that were seeing 400 errors.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "enterprise", "amazon-bedrock", "update"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-12-astra-rl-trained-vlm-queries-world-simulator-for-s",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-astra-rl-trained-vlm-queries-world-simulator-for-s/",
      "title": "Astra: RL-Trained VLM Queries World Simulator for Spatial Reasoning",
      "content_text": "Astra combines an RL-trained VLM policy (Astra-VL) with a world simulator (Astra-WM) built on Bagel. During spatial reasoning, the model issues natural-language camera instructions to the simulator to imagine novel viewpoints. Astra-WM boosts Gemini-3-Flash on MMSI-Bench from 45.1 to 49.5; Astra-VL lifts Qwen3-VL from 29.8 to 38.8 on MMSI-Bench and 36.8 to 42.7 on MindCube.\n\nWhy it matters: Spatial reasoning from limited viewpoints is a longstanding VLM weakness. Astra demonstrates that actively imagining new views via RL-trained tool use is tractable and yields measurable gains on established 3D reasoning benchmarks.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["vlm", "reasoning", "world-models", "multimodal", "rl", "vision-language", "paper"]
    },
    
    {
      "id": "2026-06-12-anthropic-launches-claude-corps-150m-fellowship-pl",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-12-anthropic-launches-claude-corps-150m-fellowship-pl/",
      "title": "Anthropic Launches Claude Corps: $150M Fellowship Placing 1,000 Workers at Nonprofits",
      "content_text": "Anthropic launched Claude Corps, a $150 million national fellowship program placing 1,000 early-career workers at US nonprofits over multiple cohorts. Fellows earn $85,000 annually and help organizations adopt Claude-based AI tools. The first cohort of 100 accepts applications through July 17, 2026, starting October 2026. Partners include CodePath and Social Finance, with at least 400 nonprofits participating.\n\nWhy it matters: Signals Anthropic\u0027s strategic bet on AI adoption in civil society, positioning the company as a key actor in workforce transition and expanding Claude\u0027s real-world deployment footprint beyond enterprise tech.",
      "date_published": "2026-06-12T00:00:00Z",
      "tags": ["anthropic", "nonprofit", "fellowship", "workforce", "enterprise"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-11-z-reward-score-distribution-rlhf-images",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-z-reward-score-distribution-rlhf-images/",
      "title": "Z-Reward: Score Distributions Instead of Scalar Rewards for Image Generation RLHF",
      "content_text": "Z-Reward replaces single scalar reward values with distributions over rubric scores for RLHF in text-to-image generation. A 27B teacher model reasons explicitly to produce score distributions; a student model internalizes this reasoning at inference time via Reasoning-Internalized Score Distillation (RISD), without needing chain-of-thought at runtime. Group-wise Direct Score Optimization (GDSO) combines policy-gradient rewards with direct distribution supervision. The 27B teacher achieves 89.6% human preference accuracy; the 9B student matches at 88.6%; as a differentiable reward signal during generation, achieves 41.3% net human-preference improvement.\n\nWhy it matters: 34 upvotes on HuggingFace June 11. The distribution-over-rubrics framing generalizes beyond image generation to any RLHF domain where scalar rewards lose signal. The 89.6% human preference accuracy surpasses all reported baselines at the teacher scale.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["rl", "reward-modeling", "multimodal", "reasoning", "rlhf"],
      "authors": [{"name": "Alibaba"}]
    },
    
    {
      "id": "2026-06-11-opencode-v1-17-1-auth-recovery-sub-agents",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-opencode-v1-17-1-auth-recovery-sub-agents/",
      "title": "OpenCode v1.17.1\u2013v1.17.3: Auth Recovery, Sub-Agent Permissions, Linux Launcher",
      "content_text": "Three releases on June 10. v1.17.1 adds usage descriptions and docs visibility for references, enforces timeout limits on MCP server requests, restores macOS auto-update, and adds a /new-session route with draft tab. v1.17.2 adds auth recovery for expired remote config, permission controls for sub-agents, a Linux launcher with app icon, and device attachment selection UI. v1.17.3 is a hotfix for a desktop crash introduced in v1.17.2.\n\nWhy it matters: Sub-agent permission controls are a meaningful safety and governance addition for teams running OpenCode in production. Auth recovery for expired remote config improves reliability in enterprise deployments.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["opencode", "coding-agent", "open-source", "mcp", "agents"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-11-openai-models-codex-oracle-cloud",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-openai-models-codex-oracle-cloud/",
      "title": "OpenAI Models and Codex Now Available Through Oracle Cloud Credits",
      "content_text": "OCI customers can now apply existing Oracle Universal Credits toward OpenAI frontier models and Codex, integrating access through existing Oracle purchasing workflows. The partnership lets enterprise teams build AI applications and use Codex for software development without setting up a separate OpenAI billing relationship.\n\nWhy it matters: Channels OpenAI\u0027s enterprise reach through one of the largest enterprise cloud procurement pipelines. For Oracle customers \u2014 many in financial, healthcare, and government sectors \u2014 it removes procurement friction and brings frontier AI into existing budget structures, normalizing AI capabilities as standard cloud services.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["openai", "codex", "cloud", "enterprise", "api", "partnership"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-11-llama-cpp-cuda-ssm-mamba-correctness-fix",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-llama-cpp-cuda-ssm-mamba-correctness-fix/",
      "title": "llama.cpp b9589\u2013b9592: CUDA SSM Sync Fix and Mamba Memory Optimization",
      "content_text": "Four builds landed around June 10. b9589 fixes missing thread-sync barriers before shared memory reuse in CUDA SSM scan operations \u2014 a correctness bug affecting Mamba-family models running on GPU. b9591 consolidates D2D memory copies for MTP/Mamba into a single strided transfer and refactors ggml_gated_delta_net, reducing overhead. b9590 fixes LFM2/LFM2.5 ignoring json_schema from response_format. b9592 updates LibreSSL to 4.3.2.\n\nWhy it matters: The CUDA SSM sync fix addresses a silent correctness issue \u2014 affected users may have been getting subtly wrong outputs from Mamba models without knowing it. The memory transfer consolidation improves throughput for Mamba architectures gaining traction as attention alternatives.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["inference", "cuda", "ssm", "open-source", "local-llm"]
    },
    
    {
      "id": "2026-06-11-langchain-content-block-token-callbacks",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-langchain-content-block-token-callbacks/",
      "title": "LangChain Stack: Provider-Agnostic Content Block Token Callbacks for Anthropic, Groq, Mistral",
      "content_text": "Coordinated releases on June 10\u201311: langchain-core 1.4.5 adds tool call chunk validation during streaming and async tracer fallbacks. langchain-anthropic 1.4.5 adds callback support for content block tokens and model profile refreshes. langchain-groq 1.1.3 adds strict mode and standard model properties. langchain-mistralai 1.1.5 adds content block token support in callbacks. langchain 1.3.7 ships a new middleware component.\n\nWhy it matters: Content block token callback support across Anthropic, Groq, and Mistral standardizes streaming observability in LangChain applications, making token-level tracing provider-agnostic \u2014 useful for cost attribution, rate-limit management, and debugging.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["langchain", "anthropic", "streaming", "observability", "sdk"],
      "authors": [{"name": "LangChain"}]
    },
    
    {
      "id": "2026-06-11-kwai-keye-vl-2-long-video-moe-multimodal",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-kwai-keye-vl-2-long-video-moe-multimodal/",
      "title": "Kwai Keye-VL-2.0: Open-Source 30B MoE Multimodal Model with 256K Context for Long Video",
      "content_text": "Kwai released Keye-VL-2.0, an open-source 30B Mixture-of-Experts multimodal model with 3B active parameters. Key advance: adapting sparse attention (derived from DeepSeek) to support lossless 256K-token context for hour-long video understanding. A novel training technique \u2014 Cross-Modal Multi-Teacher On-Policy Distillation \u2014 prevents catastrophic forgetting across tasks. Supports multimodal agentic workflows including code execution, tool use, and web search.\n\nWhy it matters: 785 upvotes on HuggingFace \u2014 top paper of June 10. Delivers state-of-the-art long-video comprehension (Video-MME-v2, LongVideoBench, TimeLens) at a competitive parameter budget with full open weights and native agent capabilities. Raises the bar for open multimodal models.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["multimodal", "long-video", "moe", "agents", "efficiency", "china", "open-weights"],
      "authors": [{"name": "Kwai"}]
    },
    
    {
      "id": "2026-06-11-diffusion-gemma-26b-open-model-4x-faster",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-diffusion-gemma-26b-open-model-4x-faster/",
      "title": "Google Releases DiffusionGemma: 26B Open Model with 4\u00d7 Faster Text Generation",
      "content_text": "Google released DiffusionGemma, an experimental 26B Mixture-of-Experts open model (Apache 2.0) that uses text diffusion instead of autoregressive token generation. Rather than producing one token at a time, it generates and refines a 256-token block in parallel, achieving up to 4\u00d7 faster throughput: 1,000+ tokens/sec on an H100 and 700+ on a GeForce RTX 5090. Only 3.8B parameters are active during inference, and the quantized model fits within 18 GB VRAM for consumer GPU deployment. Output quality is lower than standard Gemma 4, making it suited for speed-critical interactive workflows rather than quality-first applications.\n\nWhy it matters: One of the first production-viable open-weights text diffusion models. The architectural shift from sequential to parallel block generation removes memory bandwidth as the primary bottleneck and enables bidirectional attention across generated tokens \u2014 impossible in autoregressive models. Open Apache 2.0 release on consumer hardware accelerates research into diffusion-based LLMs.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["gemma", "diffusion-gemma", "open-weights", "text-diffusion", "local-inference", "apache2"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-11-denovoswe-repo-generation-from-scratch",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-denovoswe-repo-generation-from-scratch/",
      "title": "DeNovoSWE: Full Repository Generation Jumps from 5.8% to 47.2% with Synthetic Training Data",
      "content_text": "DeNovoSWE addresses a gap in AI code agents: most training data covers bug-fixing in existing codebases, not building complete repositories from scratch. The benchmark provides 4,818 instances where each requires generating a full repo from documentation. A divide-and-conquer critic-repair pipeline with difficulty-aware filtering produces high-quality training trajectories. Fine-tuning Qwen3-30B-A3B on this data pushes BeyondSWE-Doc2Repo performance from 5.8% to 47.2%.\n\nWhy it matters: 21 upvotes on HuggingFace June 11. The near 10\u00d7 benchmark jump demonstrates that training-data quality for long-horizon coding tasks is a major bottleneck \u2014 automated, sandboxed construction can close the gap. Advances AI toward being a full software architect rather than just a patch writer.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["agents", "code-generation", "software-engineering", "reasoning"],
      "authors": [{"name": "AweAI Team"}]
    },
    
    {
      "id": "2026-06-11-claude-code-v2-1-172-nested-sub-agents",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-claude-code-v2-1-172-nested-sub-agents/",
      "title": "Claude Code v2.1.172\u2013v2.1.173: Nested Sub-Agents Up to 5 Levels Deep",
      "content_text": "Two releases landed on June 10\u201311. v2.1.172 enables sub-agents to spawn their own sub-agents up to 5 levels deep, adds a marketplace plugin search bar, exposes a model attribute on OTEL lines-of-code metrics, and fixes multiple bugs (1M-context sessions stuck on usage credits, repeated image-processing errors, agents-view UI lag, background sub-agents staying stuck as active). Amazon Bedrock now reads AWS region from ~/.aws config when AWS_REGION is unset. v2.1.173 strips the [1m] suffix from Fable 5 model names automatically and fixes a spurious \u0027sandbox dependencies missing\u0027 startup warning on Windows.\n\nWhy it matters: Recursive sub-agent spawning up to 5 levels is a meaningful architectural upgrade for complex agentic workflows. Fable 5 name normalization removes friction for teams upgrading to the new model family.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "agents", "sub-agents", "claude-fable-5", "amazon-bedrock"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-11-arbor-hypothesis-tree-autonomous-research",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-arbor-hypothesis-tree-autonomous-research/",
      "title": "Arbor: Generalist Autonomous ML Research via Hypothesis-Tree Refinement",
      "content_text": "Arbor introduces a framework for fully autonomous ML research. An LLM-based coordinator manages a persistent Hypothesis Tree linking hypotheses, experimental artifacts, and learned insights. Executor agents test individual hypotheses in isolated sandboxes, allowing knowledge to accumulate across many experimental rounds rather than being discarded after each run. On MLE-Bench Lite, Arbor reaches 86.36% Any Medal score \u2014 over 2.5\u00d7 the relative held-out gains of both Codex and Claude Code under identical compute budgets.\n\nWhy it matters: 30 upvotes on HuggingFace June 11. A concrete step toward AI systems that conduct sustained, compounding scientific research. The 2.5\u00d7 advantage over Codex and Claude Code on a standardized ML engineering benchmark is a strong empirical signal for autonomous research agents.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["agents", "reasoning", "autonomous-research", "rl", "software-engineering"],
      "authors": [{"name": "NLPIR Lab"}]
    },
    
    {
      "id": "2026-06-11-anatomy-post-training-interpretability",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-11-anatomy-post-training-interpretability/",
      "title": "Anatomy of Post-Training: Using Interpretability to Audit and Fix Preference Data",
      "content_text": "Applies mechanistic interpretability to audit and improve post-training pipelines. The method identifies latent concepts in model representations that distinguish preferred from less preferred outputs, then uses those concepts to diagnose spurious correlations in preference datasets and shape rewards via feature or data interventions. Positions interpretability not just as a tool for understanding models after training, but as an active component in the training loop itself.\n\nWhy it matters: Bridges the gap between interpretability research and practical alignment work. By diagnosing what concepts a reward model is actually picking up on \u2014 including unintended ones \u2014 the approach offers a principled way to audit and correct the learning signal before it embeds bad behaviors.",
      "date_published": "2026-06-11T00:00:00Z",
      "tags": ["interpretability", "mech-interp", "safety", "rlhf", "post-training"]
    },
    
    {
      "id": "2026-06-10-yandex-drops-ai-earbuds",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-yandex-drops-ai-earbuds/",
      "title": "Yandex Launches Drops: First AI Wearable Earbuds with Alice AI",
      "content_text": "Yandex began sales of Yandex Drops on June 9, 2026 \u2014 its first wearable AI device: wireless earbuds with an on-device chip for local wake-word detection and an always-on Alice AI. Priced at 8,990 rubles. The \u0027My Memory\u0027 feature converts voice notes into structured reminders and lists. Available exclusively via Alice AI chat through June 16, then in retail across Russia, Kazakhstan, and Belarus.\n\nWhy it matters: Marks Yandex\u0027s entry into AI hardware, extending Alice beyond smart speakers to a wearable form factor. The on-device local model for always-on activation is a step toward ambient AI in the Russian market.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["alice", "wearable", "hardware", "on-device", "russia"],
      "authors": [{"name": "Yandex"}]
    },
    
    {
      "id": "2026-06-10-searchswarm-delegation-intelligence",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-searchswarm-delegation-intelligence/",
      "title": "SearchSwarm: Delegation Intelligence for LLM Agents in Long-Horizon Deep Research",
      "content_text": "SearchSwarm (arXiv:2606.09730) introduces a multi-agent framework where a main LLM decomposes long research tasks and dispatches subtasks to specialized subagents that return only summarized results to fit the main context window. Training data is synthesized via a harness guiding high-quality decomposition. SearchSwarm-30B-A3B achieves 68.1 on BrowseComp and 73.3 on BrowseComp-ZH \u2014 best results among comparable-scale open models. Weights, training data, and harness are being released open-source.\n\nWhy it matters: Context-window saturation is a practical ceiling for LLM-based research agents. SearchSwarm targets this with a trainable delegation strategy rather than a heuristic one, and the open-source release enables reproducible follow-up work.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["agents", "multi-agent", "long-context", "reasoning", "open-source"]
    },
    
    {
      "id": "2026-06-10-scail-2-character-animation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-scail-2-character-animation/",
      "title": "SCAIL-2: End-to-End Character Animation via In-Context Conditioning",
      "content_text": "SCAIL-2 (arXiv:2606.10804) eliminates intermediate representations (pose skeletons, background masks) in controlled character animation by directly concatenating driving videos into the generation sequence. Key components: MotionPair-60K (new synthetic dataset), in-context mask conditioning, mode-specific RoPE for soft guidance, and Bias-Aware DPO to reduce synthetic artifacts. Achieves SOTA across multiple controlled animation tasks.\n\nWhy it matters: Removing the brittle intermediate-representation pipeline in favor of end-to-end in-context conditioning simplifies production character animation pipelines. 95 upvotes on HuggingFace Daily Papers reflects strong community interest from the digital production and game development communities.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["video-generation", "multimodal", "diffusion", "character-animation"],
      "authors": [{"name": "Tsinghua University"}]
    },
    
    {
      "id": "2026-06-10-opencode-v1-17-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-opencode-v1-17-0/",
      "title": "OpenCode v1.17.0: fff File Search, Cohere North, and Session Recovery",
      "content_text": "OpenCode v1.17.0 (June 10, 2026) adds faster file search via fff (Rust/SIMD-accelerated fuzzy finder), Cohere North model integration, Claude Fable 5 reasoning support, MCP tool improvements (abort signals, correct pagination), Java Maven workspace resolution, session recovery from provider context-overflow errors, WSL-backed Desktop on Windows, and improved sessions and servers UI.\n\nWhy it matters: fff-backed file search is a meaningful DX improvement for large monorepos where file search latency bottlenecks agentic tasks. Cohere North integration expands provider options for teams preferring enterprise-grade open-weight models.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["opencode", "coding-agent", "open-source", "search", "mcp"],
      "authors": [{"name": "SST"}]
    },
    
    {
      "id": "2026-06-10-openclaw-2026-6-5",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-openclaw-2026-6-5/",
      "title": "OpenClaw 2026.6.5 Stable: MCP Tool Validation and Parallel Web Search",
      "content_text": "OpenClaw 2026.6.5 stable (June 9, 2026) follows several beta releases (beta.2\u2013beta.6) over June 7\u20139. Key changes: new YYYY.M.PATCH versioning scheme, improved handling of AI model reasoning content, MCP tool result validation, Anthropic session recovery enhancements, and parallel web-search provider integration.\n\nWhy it matters: The new versioning scheme and MCP improvements signal a maturing release cadence. Parallel web-search integration mirrors what Codex CLI shipped the same week, indicating cross-project convergence on agent search patterns.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["openclaw", "coding-agent", "mcp", "open-source"]
    },
    
    {
      "id": "2026-06-10-openai-economic-research-exchange",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-openai-economic-research-exchange/",
      "title": "OpenAI Launches Economic Research Exchange for AI Impact Studies",
      "content_text": "OpenAI launched the OpenAI Economic Research Exchange on June 8, 2026 \u2014 a program inviting external researchers to conduct privacy-protected studies on AI\u0027s effects on workers, firms, and the economy. Applications open through July 5, 2026, with selected researchers notified July 31. Participants get structured access to usage data under defined governance rules.\n\nWhy it matters: As AI\u0027s economic footprint grows, credible empirical work on displacement and productivity is urgently needed for policy. OpenAI\u0027s willingness to open proprietary usage data to independent researchers may pressure other frontier labs to follow suit.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["openai", "research", "policy"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-10-openai-codex-cli-v0-139-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-openai-codex-cli-v0-139-0/",
      "title": "OpenAI Codex CLI v0.139.0: Web Search in Code Mode and MCP Schema Fixes",
      "content_text": "Codex CLI v0.139.0 (June 9, 2026) allows code mode to call standalone web search directly and receive plaintext results. Improved MCP tool schema preservation for complex tool inputs. The codex doctor diagnostic command was improved. A pre-release v0.140.0-alpha.2 also dropped June 10. Earlier v0.137.0 (June 4) added F13-F24 keybindings, monthly credit limit display for enterprise, and multi-agent v2 improvements.\n\nWhy it matters: Web search directly inside code mode closes a major workflow gap \u2014 developers can have Codex look up documentation or changelogs without switching context. MCP schema improvements help with complex tool-call pipelines.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["codex", "coding-agent", "cli", "openai", "search", "mcp"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-10-minimax-m3-open-weights",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-minimax-m3-open-weights/",
      "title": "MiniMax M3 Open Weights Released: 1M Context, MoE, Frontier Coding",
      "content_text": "MiniMax released the open weights of M3 on HuggingFace on June 10, 2026 \u2014 fulfilling the promise made at the June 1 API launch. M3 uses MiniMax Sparse Attention (MSA) to deliver 1M-token context at 1/20th the per-token compute of the prior generation, achieving 9\u00d7 faster prefill and 15\u00d7 faster decoding. It scores 59.0% on SWE-Bench Pro (surpassing GPT-5.5 and Gemini 3.1 Pro) and supports image and video inputs natively. API pricing: $0.60/$2.40 per million tokens input/output.\n\nWhy it matters: M3 is the first open-weight model combining frontier-level coding, a million-token context window, and native multimodal input in a single architecture. The open weights dramatically expand what the open-source community can run and fine-tune at frontier performance levels.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["open-weights", "moe", "long-context", "coding", "multimodal", "agentic", "china"],
      "authors": [{"name": "MiniMax"}]
    },
    
    {
      "id": "2026-06-10-gemini-3-5-live-translate",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-gemini-3-5-live-translate/",
      "title": "Gemini 3.5 Live Translate: Real-Time Speech-to-Speech in 70+ Languages",
      "content_text": "Google launched Gemini 3.5 Live Translate on June 9, 2026 \u2014 a continuous speech-to-speech translation model covering 70+ languages that preserves the speaker\u0027s intonation, pacing, and pitch. Unlike turn-by-turn systems, it generates translated speech without turn boundaries, supporting 2,000+ language-pair combinations. Available immediately: via the Gemini Live API and Google AI Studio for developers, in Google Translate on Android and iOS, and in private preview for Google Meet enterprise customers. All output audio is watermarked via SynthID.\n\nWhy it matters: Continuous low-latency voice translation at frontier fidelity \u2014 simultaneously shipping in a consumer app (Google Translate) and developer API \u2014 is a qualitative leap over prior auto-translation tools and positions Google as the leader in real-time multilingual speech.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["gemini", "translation", "speech", "real-time", "multilingual", "watermarking"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-10-flow-dppo-flow-matching-rl",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-flow-dppo-flow-matching-rl/",
      "title": "Flow-DPPO: Principled RL Alignment for Flow Matching Image and Video Models",
      "content_text": "Flow-DPPO (arXiv:2606.11025) argues that ratio-clipping PPO variants (Flow-GRPO, CPS) are structurally ill-suited for flow matching models because noisy per-step policy ratios produce inconsistent trust-region enforcement across trajectory positions. Flow-DPPO replaces ratio clipping with a divergence-based proximal constraint and leverages the Gaussian structure of per-step flow policies to compute exact KL divergences efficiently. Demonstrates superior reward, better KL efficiency, reduced catastrophic forgetting, and stable multi-epoch training on image and video generation tasks.\n\nWhy it matters: Applying RL alignment to generative image/video models is an active frontier. Flow-DPPO provides a theoretically principled alternative to ratio-clipping designed specifically for the continuous-time flow matching paradigm now used in most SOTA diffusion models.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["rl", "flow-matching", "diffusion", "policy-optimization", "video-generation"],
      "authors": [{"name": "Tencent Hunyuan"}]
    },
    
    {
      "id": "2026-06-10-drpo-divergence-regularization-llm-rl",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-drpo-divergence-regularization-llm-rl/",
      "title": "DRPO: Rethinking Divergence Regularization in LLM Reinforcement Learning",
      "content_text": "DRPO (Divergence Regularized Policy Optimization, arXiv:2606.09821) replaces the hard gradient-masking used in PPO/DPPO with a smooth advantage-weighted quadratic regularizer. Instead of discarding updates when a token crosses trust-region boundaries, DRPO applies bounded, continuous gradient weights that both attenuate harmful divergences and supply corrective signals. Validated across multiple model scales, architectures, and precision settings, showing improved stability and efficiency over existing LLM RL training methods.\n\nWhy it matters: With 324 upvotes on HuggingFace Daily Papers \u2014 highest for June 10 \u2014 this paper directly addresses a fundamental instability in RLVR training pipelines powering reasoning models like DeepSeek-R1 and Qwen3. A smoother trust-region control mechanism could improve reliability of post-training runs industry-wide.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["rl", "rlvr", "post-training", "policy-optimization", "reasoning"],
      "authors": [{"name": "Tencent Hunyuan"}]
    },
    
    {
      "id": "2026-06-10-cohere-north-mini-code",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-cohere-north-mini-code/",
      "title": "Cohere North Mini Code: 30B Apache-2.0 MoE Coding Model for Agentic Workflows",
      "content_text": "Cohere released North Mini Code 1.0 on June 9, 2026 under Apache 2.0. The model has 30B total parameters with only 3B active (MoE with 128 experts, 8 activated per token), using interleaved sliding-window and full self-attention. It targets agentic software engineering workflows, scoring 33.4 on Cohere\u0027s coding index. Available on HuggingFace in BF16 and FP8, integrated into OpenCode, and accessible via the Cohere API.\n\nWhy it matters: A 30B MoE model with 3B active parameters runs on a single H100, making it viable for on-premises enterprise deployment. Apache 2.0 licensing and native OpenCode integration make it a strong candidate for teams wanting controllable, self-hosted coding agents without vendor lock-in.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["open-source", "moe", "coding", "coding-agent", "apache2"],
      "authors": [{"name": "Cohere"}]
    },
    
    {
      "id": "2026-06-10-claude-fable-5-mythos-5",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-claude-fable-5-mythos-5/",
      "title": "Claude Fable 5 and Claude Mythos 5: Anthropic\u0027s Most Capable Model Goes Public",
      "content_text": "Anthropic released Claude Fable 5 on June 9, 2026 \u2014 the first Mythos-class model made publicly available. It uses the same underlying architecture as Claude Mythos 5 but ships with three classifier-based safeguards (cybersecurity, biology/chemistry, distillation prevention) that fall back to Claude Opus 4.8 in restricted domains. Priced at $10/M input and $50/M output tokens, with 128k output token support. Free for Pro/Max/Team/Enterprise subscribers through June 22. Mythos 5 (unrestricted) remains gated to vetted cybersecurity researchers via Project Glasswing. Anthropic cited a 50-million-line codebase migration as a flagship real-world benchmark.\n\nWhy it matters: The first Mythos-class model to reach the general public marks a new tier of publicly available intelligence. The tiered-access architecture \u2014 safeguarded Fable 5 for all users, unrestricted Mythos 5 for vetted researchers \u2014 may become the industry template for releasing highly capable models responsibly.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["anthropic", "claude", "claude-fable-5", "claude-mythos", "safety", "coding", "frontier-model"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-10-claude-code-v2-1-170",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-claude-code-v2-1-170/",
      "title": "Claude Code v2.1.170: Claude Fable 5 Support Added",
      "content_text": "Claude Code v2.1.170 (June 9, 2026) adds support for the newly released Claude Fable 5 model. The preceding v2.1.169 (June 8) introduced a --safe-mode flag and /cd command; v2.1.166 (June 6) added fallbackModel configuration supporting up to three alternative models for resilience under API overload; v2.1.163 (June 4) introduced version requirement policies (requiredMinimumVersion/requiredMaximumVersion) and a /plugin list command.\n\nWhy it matters: Same-day Fable 5 support shows tight Anthropic tooling integration. The fallbackModel feature from v2.1.166 is the more durable improvement: enterprise teams can configure automatic failover across up to three models without user intervention.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "cli", "anthropic"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-10-abot-earth-0-5-generative-3d",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-10-abot-earth-0-5-generative-3d/",
      "title": "ABot-Earth 0.5: Generative 3D Urban World Model from Satellite Imagery",
      "content_text": "ABot-Earth 0.5 (arXiv:2606.09967) synthesizes seamless 3D urban environments from geospatially referenced satellite imagery using 3D Gaussian Splatting with hierarchical level-of-detail for real-time web visualization. Generates realistic geometry and textures at under 10 minutes per square kilometer. Targets the simulation-to-reality gap for embodied AI applications such as UAV navigation.\n\nWhy it matters: Scalable photorealistic 3D world generation from satellite imagery has direct applications in robotics simulation, autonomous vehicle training, and urban digital twins. Generating a square kilometer in under 10 minutes is a meaningful efficiency milestone. 83 upvotes on HuggingFace Daily Papers.",
      "date_published": "2026-06-10T00:00:00Z",
      "tags": ["3d-generation", "embodied-ai", "simulation", "computer-vision", "alibaba"],
      "authors": [{"name": "Alibaba AMAP CV Lab"}]
    },
    
    {
      "id": "2026-06-09-weak-critics-strong-learners-opcd",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-weak-critics-strong-learners-opcd/",
      "title": "Weak Critics Make Strong Learners: On-Policy Critique Distillation for Scalable Oversight",
      "content_text": "Proposes Progressive On-Policy Critique Distillation (OPCD), where a weak model acts as a critic providing revision directions rather than binary judgments (arXiv:2606.00424). The key insight is that weak critics only need to offer non-misleading improvement directions \u2014 not correct final answers \u2014 enabling strong models to leverage their own knowledge for self-improvement. The method filters high-quality critiques and distills critic-guided behaviors into the strong model through adaptive self-teaching. Shows improvements on reasoning and alignment benchmarks across training iterations.\n\nWhy it matters: Scalable oversight is a central alignment challenge: as models grow more capable, human and weak-model supervision becomes insufficient. OPCD offers a practical path where cheap weak critics can bootstrap stronger models without requiring the critic to fully understand the task \u2014 the critic just needs to point in a better direction, addressing the same problem as constitutional AI and debate from a distillation angle.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["alignment", "scalable-oversight", "distillation", "rl", "reasoning"],
      "authors": [{"name": "Rutgers University"}]
    },
    
    {
      "id": "2026-06-09-vllm-semantic-router-v0-3-themis",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-vllm-semantic-router-v0-3-themis/",
      "title": "vLLM Semantic Router v0.3 Themis: Stateful Production Routing with Session-Aware Agentic Routing",
      "content_text": "vLLM Semantic Router v0.3 (codename Themis), released June 5, 2026, transforms routing from a classification tool into a stateful, observable production system. Key additions: a unified v0.3 configuration format eliminating dialect fragmentation; signal enrichment extracting evidence from 15+ signal families (auth, safety, conversation shape, tool-loop detection); Session-Aware Agentic Routing (SAAR) combining router-owned session memory, safety locks during tool loops, provider-state portability checks, and replayable diagnostics; a revamped operator dashboard; and an Intel OpenVINO binding for C++/Go integration. The release represents 350+ commits since v0.2.0. The router ranked #1 on RouterArena with a 75.4 weighted Arena Score and adds native Anthropic `/v1/messages` protocol support alongside OpenAI compatibility.\n\nWhy it matters: SAAR directly addresses a practical agentic deployment problem \u2014 multi-turn agents switching models mid-session and destabilizing behavior. The Anthropic protocol support broadens applicability beyond pure OpenAI-compatible stacks, and the #1 RouterArena ranking validates production readiness.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["vllm", "inference", "routing", "open-source"]
    },
    
    {
      "id": "2026-06-09-swe-explore-coding-agent-benchmark",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-swe-explore-coding-agent-benchmark/",
      "title": "SWE-Explore: Benchmarking Repository Exploration as the Binding Constraint in Coding Agents",
      "content_text": "SWE-Explore (arXiv:2606.07297) introduces a benchmark of 848 GitHub issues across 10 programming languages and 203 repositories to evaluate repository exploration \u2014 the step before patch generation where an agent must locate relevant code. Classical retrievers (BM25, TF-IDF) perform near random baseline; agentic explorers reach \u003e65% file-level hit rates but only ~15% line-level recall. GPT-5 vs. Gemini swaps shift performance magnitude but not the recall bottleneck, suggesting the limit is exploration strategy rather than raw model capability.\n\nWhy it matters: Most coding agent evals measure final patch success, hiding where agents actually fail. SWE-Explore shows the exploration phase is the binding constraint: missing relevant code regions hurts repair far more than including irrelevant context. The 10-language, 203-repo scope makes it more representative than SWE-bench\u0027s Python-dominant coverage. Second on HF Daily Papers (77 upvotes).",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["agents", "coding", "benchmark", "software-engineering"],
      "authors": [{"name": "Shanghai Jiao Tong University"}]
    },
    
    {
      "id": "2026-06-09-openai-codex-cli-v0-138-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-openai-codex-cli-v0-138-0/",
      "title": "OpenAI Codex CLI v0.138.0: Desktop Handoff, Structured Plugin Output, and Account Token Visibility",
      "content_text": "Version 0.138.0 (June 8, 2026) adds desktop handoff for the `/app` command on macOS and Windows, local image file path exposure to models for follow-up edits, enhanced reasoning effort selection with fallback shortcuts for terminals missing Alt bindings, account token usage visibility and v2 personal access token support, and structured JSON output for plugin automation (`codex plugin list --json`). TUI streaming optimizations eliminate blank spacing artifacts and workspace instruction loading is improved for remote and symlinked environments. An alpha v0.139.0 build was also cut on June 9.\n\nWhy it matters: Desktop handoff closes the loop between CLI and GUI workflows, while structured JSON plugin output enables automated tooling around Codex sessions. The release continues the fast cadence following the Codex CLI Rust rewrite.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["codex", "coding-agent", "cli", "openai"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-09-ollama-v0-30-7-hermes-gemma-4-qat",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-ollama-v0-30-7-hermes-gemma-4-qat/",
      "title": "Ollama v0.30.7: Hermes Desktop Support, Gemma 4 QAT, and Nemotron-3-Ultra",
      "content_text": "Ollama v0.30.7 (June 7, 2026) adds native Windows support for Hermes Desktop and aligns OpenAI-compatible API model lists with available tags. The v0.30.6 release (June 5) added Gemma 4 models optimized via Quantization-Aware Training (QAT), reducing memory requirements ~72% while maintaining near-original quality. v0.30.4 (June 3) introduced Nemotron-3-Ultra support for reasoning/long-running agent workflows and fixed Metal GPU offload for multimodal models on Apple Silicon. v0.30.2 added Qwen Code support and improved token accounting for cached prompts.\n\nWhy it matters: Gemma 4 QAT support dramatically lowers the hardware bar for running Google\u0027s multimodal model locally, and Nemotron-3-Ultra support brings NVIDIA\u0027s flagship reasoning model to local inference. Six versions in five days reflects active integration across multiple new model families.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["ollama", "inference", "local-llm", "open-source"],
      "authors": [{"name": "Ollama"}]
    },
    
    {
      "id": "2026-06-09-geometry-on-policy-distillation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-geometry-on-policy-distillation/",
      "title": "On the Geometry of On-Policy Distillation: A Training Paradigm Distinct from SFT and RLVR",
      "content_text": "This paper (arXiv:2606.07082) characterizes on-policy distillation (OPD) as a distinct training paradigm by analyzing its parameter-space geometry. OPD leaves 51.6% of weights unchanged (between SFT at 8.1% and RLVR at 77.2%), avoids principal directions more strongly than SFT, and exhibits \u0027subspace locking\u0027 \u2014 cumulative updates rapidly enter a stable low-dimensional channel. Constraining training to this early-formed subspace preserves performance, and the subspace is robust to token sparsification and off-policy rollouts but changes when objectives are mixed.\n\nWhy it matters: OPD has become a popular way to train reasoning models (e.g., via GRPO-style distillation), but it was poorly understood whether it is just RL with a different reward or SFT in disguise. This paper establishes it has its own identity with practical implications: the locked subspace can guide geometry-aware algorithm design and may enable cheaper training by targeting the active subspace directly. Third on HF Daily Papers (45 upvotes).",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["distillation", "rl", "training-dynamics", "efficiency"],
      "authors": [{"name": "Hong Kong University of Science and Technology"}]
    },
    
    {
      "id": "2026-06-09-elevenlabs-music-v2-genre-switching",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-elevenlabs-music-v2-genre-switching/",
      "title": "ElevenLabs Music v2: Mid-Track Genre Switching, Inpainting, and Commercial Clearance",
      "content_text": "ElevenLabs released Music v2 on May 26, 2026, introducing mid-track genre transitions (e.g., opera to heavy metal within a single composition), section-by-section structural building (intro, verse, chorus, bridge, outro), audio inpainting to regenerate specific segments without affecting the rest, non-musical sound effect embedding within tracks, and sustained dense lyrical delivery including fast rap. Trained exclusively on licensed data, the model is cleared for commercial use with no sync fees. Pricing was cut up to 50% for ElevenAPI and up to 40% for ElevenCreative self-serve customers.\n\nWhy it matters: Music v2 is the first major music generation model with built-in commercial licensing clearance and track-level inpainting, addressing the two main barriers to professional adoption \u2014 legal risk and editorial control. The price cuts combined with structural composition control move generative music from novelty to viable production tool for advertising, video, and brand content.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["music-generation", "elevenlabs", "audio"],
      "authors": [{"name": "ElevenLabs"}]
    },
    
    {
      "id": "2026-06-09-echo-memory-world-model-memory",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-echo-memory-world-model-memory/",
      "title": "Echo-Memory: Controlled Study of Memory Mechanisms in Action-Conditioned Video World Models",
      "content_text": "Echo-Memory (arXiv:2606.09803) presents a controlled framework for isolating and comparing memory mechanisms in action-conditioned video generation models. By fixing the backbone and varying only memory components, the paper disentangles four axes: capacity, compression, read-out strategy, and recurrence. Key findings: raw context is stronger than expected; aggressive compression hurts fidelity; block-wise state-space recurrence wins on open-domain return tasks; and replay quality is not a reliable proxy for true scene memory.\n\nWhy it matters: World models for robotics and game simulation fail when the camera revisits a previously seen location and the scene has changed. This paper gives practitioners a rigorous diagnostic for choosing memory designs, revealing that the dominant bottleneck is the memory module, not the image-synthesis backbone. Topped HuggingFace Daily Papers on June 9 with 78 upvotes.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["world-models", "video-generation", "memory", "multimodal"],
      "authors": [{"name": "Microsoft Research"}]
    },
    
    {
      "id": "2026-06-09-cursor-3-7-canvas-design-mode",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-cursor-3-7-canvas-design-mode/",
      "title": "Cursor 3.7: Canvas Design Mode, Context Usage Reports, and SDK Nested Subagents",
      "content_text": "Cursor 3.7 (June 4\u20135, 2026) introduces Design Mode in canvases: developers click, draw, or describe UI changes by voice directly over rendered components to guide edits without writing descriptions. Multi-select and voice input work while an agent is mid-run. A new interactive context usage report in canvases shows token distribution across system prompt, tool definitions, rules, skills, and more. The SDK update adds custom tools via `local.customTools`, auto-review routing for tool calls, JSONL and custom store persistence options, and nested subagents that can spawn their own subagents at any depth. Enterprise customers gained multi-team organization management with separate security, governance, and budget controls (GA as of June 3).\n\nWhy it matters: Design Mode addresses a core friction point in UI-heavy development by letting users point-and-annotate directly in the canvas rather than writing descriptions. Nested subagents unlock more complex multi-stage workflows natively in Cursor\u0027s SDK.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["cursor", "coding-agent", "ide"],
      "authors": [{"name": "Cursor"}]
    },
    
    {
      "id": "2026-06-09-claude-code-v2-1-169",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-09-claude-code-v2-1-169/",
      "title": "Claude Code v2.1.169: Safe Mode Flag, /cd Command, and disableBundledSkills Setting",
      "content_text": "Version 2.1.169 (June 8, 2026) adds a `--safe-mode` flag (and `CLAUDE_CODE_SAFE_MODE` env var) that disables all customizations \u2014 CLAUDE.md, plugins, skills, hooks, MCP servers \u2014 for clean troubleshooting. The `/cd` command allows moving a session to a new working directory without breaking prompt cache. A `disableBundledSkills` setting hides bundled skills and built-in slash commands from the model. Fixes include Up/Down arrow navigation in long input lines, enterprise MCP policy enforcement bugs, a macOS UI stall for claude.ai-authenticated users, and `claude -p` slowness on Windows (regression from 2.1.161). Previous v2.1.166 (June 6) added `fallbackModel` support for up to three fallback models, glob pattern support in deny rules, and hardened cross-session messaging security.\n\nWhy it matters: The safe-mode flag gives teams a reliable escape hatch for diagnosing agent misbehavior without disabling their entire configuration permanently. The fallbackModel setting significantly improves reliability under API overload conditions, reducing interruptions for high-traffic teams.",
      "date_published": "2026-06-09T00:00:00Z",
      "tags": ["claude-code", "coding-agent", "cli", "anthropic"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-08-videokr-knowledge-reasoning-video",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-videokr-knowledge-reasoning-video/",
      "title": "VideoKR: 315K-Example Training Corpus for Knowledge- and Reasoning-Intensive Video Understanding",
      "content_text": "VideoKR introduces a 315K-example training corpus for knowledge- and reasoning-intensive video understanding, built from 145K CC-licensed expert-domain videos with chain-of-thought rationales at progressively deeper reasoning depths. Includes VideoKR-Eval, an expert-annotated benchmark requiring genuine video-grounded reasoning rather than textual shortcuts. SFT followed by GRPO post-training on VideoKR outperforms prior post-training approaches.\n\nWhy it matters: Multimodal reasoning benchmarks have been criticized for being solvable from text alone. VideoKR targets this gap with video-grounded knowledge reasoning, providing both training data and evaluation infrastructure for progress on genuinely vision-dependent tasks.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["multimodal", "video-generation", "reasoning", "benchmark", "paper"],
      "authors": [{"name": "Yale University"}]
    },
    
    {
      "id": "2026-06-08-subtlememory-benchmark-relational",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-subtlememory-benchmark-relational/",
      "title": "SubtleMemory: Benchmark Reveals Agents Systematically Fail Fine-Grained Relational Memory",
      "content_text": "SubtleMemory introduces a 1,522-instance benchmark designed to test whether AI agents can handle memories that reinforce, diverge, or contradict each other \u2014 rather than simple recall. Built over 10 long histories grounded in 1,090 relation-controlled memory-variant sets, it evaluates 11 memory systems. All tested systems show systematic failure at fine-grained relational memory discrimination, with distinct failure modes across preservation, retrieval, and downstream reasoning stages.\n\nWhy it matters: Existing agent memory benchmarks measure recall, not relational reasoning over conflicting memories. SubtleMemory exposes this blind spot across all current approaches, motivating a new generation of memory architectures for long-horizon agents.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["agents", "benchmark", "long-context", "reasoning", "paper"]
    },
    
    {
      "id": "2026-06-08-nvidia-nemotron-3-ultra-550b-moe",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-nvidia-nemotron-3-ultra-550b-moe/",
      "title": "NVIDIA Nemotron 3 Ultra: Open 550B MoE Model Now Available for Agentic Workloads",
      "content_text": "NVIDIA Nemotron 3 Ultra became available on June 4, announced at Computex. The model has 550B total and ~55B active parameters in a Mixture-of-Experts Hybrid Mamba-Attention architecture targeting long-running agentic tasks with persistent memory and multi-step tool use. It scores 48 on the Artificial Analysis Intelligence Index, the highest among US open-weights models. Distributed via Hugging Face, ModelScope, OpenRouter, and as NVIDIA NIM microservices; inference reaches 300+ tokens/second on DeepInfra.\n\nWhy it matters: Currently the most capable US-origin open-weights model, giving teams a strong self-hostable option for complex agent pipelines without closed APIs. The Hybrid Mamba architecture reduces memory bandwidth at long context, enabling cost-effective multi-agent orchestration.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["open-weights", "moe", "agents", "inference", "long-context", "us"],
      "authors": [{"name": "NVIDIA"}]
    },
    
    {
      "id": "2026-06-08-github-copilot-sdk-ga-mcp",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-github-copilot-sdk-ga-mcp/",
      "title": "GitHub Copilot SDK Reaches General Availability with MCP and Six-Language Support",
      "content_text": "The GitHub Copilot SDK went GA on June 2, available in Node.js/TypeScript, Python, Go, .NET, Rust, and Java. It exposes Copilot\u0027s full agentic runtime \u2014 planning, tool invocation, file edits, streaming, and multi-turn sessions \u2014 through a stable API. Developers can register custom tools, connect MCP servers, override built-in tools, and support multi-client workflows where different clients contribute tools and permissions to the same session. Available to all Copilot subscribers and non-subscribers via BYOK.\n\nWhy it matters: GA status and native MCP support mean teams can embed Copilot\u0027s agent engine directly into IDEs, CI pipelines, and enterprise tooling without building their own orchestration layer, and with production SLA guarantees.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["github-copilot", "sdk", "mcp", "coding-agent", "ga"],
      "authors": [{"name": "GitHub / Microsoft"}]
    },
    
    {
      "id": "2026-06-08-github-copilot-1m-context-reasoning",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-github-copilot-1m-context-reasoning/",
      "title": "GitHub Copilot Gets 1M Token Context Window and Configurable Reasoning Levels",
      "content_text": "GitHub announced on June 4 that Copilot now supports a one-million-token context window, enabling work across larger codebases and multi-file projects without losing context. Configurable reasoning levels let developers tune speed-vs-depth and enable extended thinking for architectural and debugging tasks. Both features are available in VS Code, Copilot CLI, and the Copilot app; larger context or higher reasoning consumes more GitHub AI Credits.\n\nWhy it matters: A 1M context window puts Copilot on par with frontier models for repository-scale tasks. Configurable reasoning lets teams opt in to deeper analysis on a per-query basis rather than paying uniformly \u2014 a practical pricing lever for enterprise users.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["github-copilot", "long-context", "reasoning", "coding-agent"],
      "authors": [{"name": "GitHub / Microsoft"}]
    },
    
    {
      "id": "2026-06-08-gemma-4-qat-mobile-edge",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-gemma-4-qat-mobile-edge/",
      "title": "Google DeepMind Releases Gemma 4 QAT Checkpoints: Sub-1 GB On-Device E2B Model",
      "content_text": "Google DeepMind released Quantization-Aware Training (QAT) checkpoints for the full Gemma 4 family on June 5. A new mobile QAT format cuts the E2B (2B) model to under 1 GB RAM (from 9.6 GB in BF16), while Q4_0 QAT reduces E2B from 9.6 GB to 3.2 GB and E4B from 15 GB to 5 GB. Weights ship on Hugging Face with immediate support in llama.cpp (b9549+ adds Gemma 4 MTP support), Ollama, LM Studio, vLLM, MLX, and LiteRT-LM.\n\nWhy it matters: Sub-1 GB capable models unlock deployment on mid-range phones and microcontrollers. QAT reduces the typical quality cliff of aggressive quantization, making compact Gemma 4 models viable for production on-device applications \u2014 a milestone for edge AI.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["gemma", "quantization", "on-device", "open-weights", "mobile", "local-llm"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-08-code2lora-hypernetwork-code-lm",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-code2lora-hypernetwork-code-lm/",
      "title": "Code2LoRA: Hypernetwork Generates Repo-Specific Adapters for Code LMs with Zero Inference Overhead",
      "content_text": "Code2LoRA generates repository-specific LoRA adapters for code language models with zero inference-time token overhead. Two variants: Code2LoRA-Static converts a repo snapshot into an adapter; Code2LoRA-Evo maintains adapters via GRU state updated per code diff. Introduces RepoPeftBench (604 Python repos, static and evolution tracks). Code2LoRA-Static achieves 63.8% cross-repo and 66.2% in-repo exact match, matching per-repository LoRA fine-tuning without any per-repo training.\n\nWhy it matters: Addresses a practical bottleneck for code AI in production: keeping LLM adapters up to date as codebases evolve without re-running expensive fine-tuning. The GRU-based incremental update mechanism enables adapter maintenance at software-evolution speed.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["lora", "fine-tuning", "coding", "inference", "paper"],
      "authors": [{"name": "University of Waterloo"}]
    },
    
    {
      "id": "2026-06-08-agentic-transformers-learn-dfs-via-rl",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-08-agentic-transformers-learn-dfs-via-rl/",
      "title": "Agentic Transformers Provably Learn Depth-First Search via Reinforcement Learning",
      "content_text": "The paper provides the first theoretical proof that transformer-based agents learn depth-first search mechanisms purely from sparse RL feedback, without expert demonstrations. A two-head transformer is constructed where one head tracks prior actions and another detects failures and triggers backtracking. Under a depth-wise curriculum, DFS emerges in stages: models trained on shallow trees generalize to deeper ones, and imbalanced goal distributions cause return discounting to produce a prioritized DFS variant.\n\nWhy it matters: Fills a major theoretical gap by explaining why RL training produces search-capable agents and provides mechanistic insight into how transformer attention heads specialize during RL \u2014 directly relevant to understanding and designing reasoning models.",
      "date_published": "2026-06-08T00:00:00Z",
      "tags": ["rl", "reasoning", "agents", "theory", "paper"],
      "authors": [{"name": "Carnegie Mellon University / Ohio State University"}]
    },
    
    {
      "id": "2026-06-06-xai-grok-imagine-video-1-5-api",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-xai-grok-imagine-video-1-5-api/",
      "title": "xAI Grok Imagine Video 1.5: Image-to-Video with Native Audio Tops Arena Leaderboard, API Now Live",
      "content_text": "xAI shipped Grok Imagine Video 1.5 as a preview on May 30-31, 2026; the API became available on June 3 at api.x.ai under alias `grok-imagine-video-1.5-2026-05-30`. The model animates a still image (or text prompt) into a clip with native synchronized audio \u2014 music, sound effects, and lip-synced dialogue \u2014 supporting video extension and reference-guided generation at 720p. At launch it claimed the top position on the Image-to-Video Arena leaderboard with a 52 Elo-point jump over v1.0. Pricing: $0.08/s at 480p, $0.14/s at 720p.\n\nWhy it matters: Takes first place on the Image-to-Video Arena leaderboard immediately at launch; native audio sync directly in video generation is still rare in publicly-accessible models.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["grok", "xai", "image-to-video", "video-generation", "api", "release"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-06-self-correction-illusion-role-framing",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-self-correction-illusion-role-framing/",
      "title": "The Self-Correction Illusion: LLMs Fix Others\u0027 Errors but Not Their Own \u2014 Role Labels Are the Cause",
      "content_text": "LLMs readily fix errors when presented as external input but fail to correct identical errors framed as their own prior output. The paper isolates the cause: chat-template role labels (user message vs. internal thought vs. tool output vs. system memory), not the content itself. Relabeling an internal erroneous claim as an external source increases explicit correction rates by 23-93 percentage points across 7 model families and 3 domains (p \u003c 0.001 in 10/13 test cells). A prompt-structure intervention requiring no retraining achieves significant improvements.\n\nWhy it matters: Reframes LLM self-correction failure as an artifact of prompt structure rather than a fundamental cognitive limitation \u2014 both more actionable (fixable via prompting) and more revealing about how sensitive model behavior is to framing.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["reasoning", "hallucination", "paper"]
    },
    
    {
      "id": "2026-06-06-sber-gigachat-business-assistant-spief",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-sber-gigachat-business-assistant-spief/",
      "title": "Sber Launches GigaChat-Powered Multi-Agent Business Assistant for Corporate Banking at SPIEF 2026",
      "content_text": "At the St. Petersburg International Economic Forum (SPIEF, June 3-6, 2026), Sber announced a new Business Assistant for its SberBusiness mobile app \u2014 a conversational AI interface built on GigaChat that replaces traditional internet banking. The system uses a multi-agent architecture with over 160 specialized AI agents covering payments, accounts, analytics, and documentation. A limited advisory version is already handling over 7.5 million queries from more than one million entrepreneurs. Full rollout is planned for autumn 2026.\n\nWhy it matters: Sber is moving GigaChat beyond a consumer chatbot into a full enterprise banking operating system, with agentic architecture replacing structured UI entirely \u2014 one of the most concrete production deployments of a Russian LLM in high-stakes financial workflows.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["gigachat", "russia", "enterprise", "agents", "release"],
      "authors": [{"name": "Sber"}]
    },
    
    {
      "id": "2026-06-06-opencode-v1-16-workspace-cloning",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-opencode-v1-16-workspace-cloning/",
      "title": "OpenCode v1.16: Workspace Cloning, 38% Faster Startup, Snowflake Cortex Provider, Session Replay",
      "content_text": "OpenCode (SST) released v1.16.0 and v1.16.2 on June 5, 2026. v1.16.0 adds managed workspace cloning that preserves dirty and untracked files, cross-workspace session movement, proper OpenAI model support via AWS Bedrock, skill discovery with file-based agent loading, new color themes and thinking-level selector for desktop, and a `run --replay` mode for interactive session replay. Startup time improved by 38%. v1.16.2 fixes reasoning summaries to only run on providers that support them (avoiding GPT-5 failures), refuses loose edit matches to prevent overwriting wrong code, resolves Bedrock session hangs, adds diff viewer hunk navigation, and adds Snowflake Cortex as a new LLM provider.\n\nWhy it matters: Workspace cloning and session replay are significant quality-of-life features for multi-workspace developer workflows; Snowflake Cortex support extends enterprise coverage.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["opencode", "coding-agent", "cli", "open-source", "release"]
    },
    
    {
      "id": "2026-06-06-openai-lockdown-mode-elevated-risk-labels",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-openai-lockdown-mode-elevated-risk-labels/",
      "title": "OpenAI Rolls Out Lockdown Mode to Block Prompt-Injection Exfiltration in ChatGPT",
      "content_text": "OpenAI launched Lockdown Mode on June 5, 2026 \u2014 an optional advanced security setting that restricts ChatGPT\u0027s outbound network capabilities (web browsing, Deep Research, Agent Mode, file downloads) to block data exfiltration via prompt injection attacks. Available to all logged-in personal accounts (Free, Plus, Pro) and self-serve ChatGPT Business. A companion Elevated Risk label surfaces across ChatGPT, ChatGPT Atlas, and Codex to flag high-risk operations.\n\nWhy it matters: Prompt injection is the dominant attack vector against LLM-based agents handling sensitive data; Lockdown Mode is the first deterministic, user-controlled mechanism from a major lab that eliminates the exfiltration leg of the attack chain.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["openai", "chatgpt", "security", "agents", "release"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-06-openai-chatgpt-dreaming-v3-memory",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-openai-chatgpt-dreaming-v3-memory/",
      "title": "OpenAI Launches Dreaming V3: Background Memory Synthesis for ChatGPT with 5x Compute Reduction",
      "content_text": "OpenAI began rolling out Dreaming V3 on June 4-5, 2026 \u2014 a background process that automatically synthesizes ChatGPT memory from many conversations simultaneously, replacing the manual saved-memories list as ChatGPT\u0027s memory foundation. The system prioritizes freshness (auto-updating stale memories), continuity (linking sessions over days or weeks), and relevance filtering. Internal factual-recall evals improved from 41.5% (2024) to 82.8% (2026). A roughly 5x compute reduction makes free-tier rollout viable; Plus and Pro users in the US receive it first.\n\nWhy it matters: The biggest memory overhaul since ChatGPT launched \u2014 silent background synthesis means users must now audit inferences, not just explicit saves.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["openai", "chatgpt", "personalization", "memory", "release"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-06-mlevolve-self-evolving-ml-discovery",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-mlevolve-self-evolving-ml-discovery/",
      "title": "MLEvolve: Self-Evolving Multi-Agent LLM Framework for Automated ML Algorithm Discovery",
      "content_text": "MLEvolve is a self-evolving multi-agent LLM framework for automated machine learning algorithm discovery. It introduces Progressive Monte Carlo Graph Search (MCGS) with cross-branch information flow, Retrospective Memory (cold-start knowledge base plus dynamic task-specific memory), and hierarchical planning that decouples strategy from code generation. On MLE-Bench, it achieves state-of-the-art medal rate within a 12-hour budget \u2014 half the standard runtime \u2014 and outperforms AlphaEvolve on mathematical algorithm optimization tasks. Open-source code is available on GitHub.\n\nWhy it matters: Automated algorithm discovery that beats AlphaEvolve signals that LLM agents can do meaningful AI research. The paper received 301 upvotes on HuggingFace Daily Papers, the highest for this period.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["automated-research", "algorithm-discovery", "agents", "multi-agent", "paper"]
    },
    
    {
      "id": "2026-06-06-great-american-ai-act-federal-framework",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-great-american-ai-act-federal-framework/",
      "title": "US Congress Releases 269-Page \u0027Great American AI Act\u0027 Draft with 3-Year State Law Preemption",
      "content_text": "On June 4, 2026, Reps. Jay Obernolte (R-CA) and Lori Trahan (D-MA) released a 269-page bipartisan discussion draft of the Great American AI Act \u2014 the first comprehensive US federal AI governance framework. Key provisions: three-year preemption of state AI development laws (with sunset; deployment laws not preempted), formal CAISI establishment, $100M/year for a Center for AI Standards and Innovation, frontier model governance requirements, and workforce impact reporting. The draft has drawn criticism from labor unions and civil society groups over the state preemption scope.\n\nWhy it matters: First serious attempt at a US federal AI governance framework that would supersede California, Colorado, and other state AI laws for three years during a critical industry development window.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["regulation", "policy", "us", "safety"]
    },
    
    {
      "id": "2026-06-06-google-veo-3-1-flow-audio-editing",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-google-veo-3-1-flow-audio-editing/",
      "title": "Google Veo 3.1 Brings Audio to All Flow Editing Modes and New Insert/Remove Tools",
      "content_text": "Google published an official update on June 5, 2026 announcing new Veo 3.1 capabilities inside its Flow video editing platform. The update brings audio generation to previously audio-free features \u2014 Ingredients to Video, Frames to Video, and Extend \u2014 and introduces precision editing tools including an Insert function that adds new scene elements with realistic lighting, plus an upcoming Remove tool to erase unwanted objects with background reconstruction. Veo 3.1 is also available via the Gemini API and Vertex AI. Over 275 million videos have been created on Flow since launch.\n\nWhy it matters: Bringing native audio to all Flow editing modes closes the gap between AI video generation and professional post-production; Insert/Remove editing tools move Veo toward a full video editing platform.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["google-deepmind", "video-generation", "video-editing", "text-to-video", "gemini", "release"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-06-deterministic-horizon-cot-limits-tool-use",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-deterministic-horizon-cot-limits-tool-use/",
      "title": "The Deterministic Horizon: Information-Theoretic Proof That Extended CoT Fails and Tool Use Is Necessary",
      "content_text": "The paper proves an Attention Bottleneck Theorem establishing information-theoretic limits on how far decoder-only transformers can track state in purely neural chain-of-thought. A Deterministic Horizon exists at approximately 19-31 steps beyond which accuracy collapses super-exponentially. Across 12 models and 8 task domains (SWE-Bench, WebArena, SQL-Multi), tool-integrated reasoning achieves 86-94% accuracy versus 24-42% for neural CoT. Fine-tuning improves performance by less than 5%, confirming the limits are architectural, not training-related. Accepted at ICML 2026.\n\nWhy it matters: Provides rigorous theoretical grounding for why agentic tool use is necessary \u2014 not just empirically better but provably required past a complexity threshold \u2014 setting a principled basis for agent architecture design.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["chain-of-thought", "reasoning", "agents", "paper", "icml-2026"]
    },
    
    {
      "id": "2026-06-06-claude-code-v2-1-166",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-claude-code-v2-1-166/",
      "title": "Claude Code v2.1.166: Fallback Model Config, Expanded Deny-Rule Globs, Cross-Session Security",
      "content_text": "Claude Code v2.1.166 (first seen June 6) adds a `fallbackModel` setting to configure up to three fallback models tried in order when the primary model is overloaded, expanded deny-rule glob support, and hardened cross-session message security. Also disables thinking on models that think by default via `MAX_THINKING_TOKENS=0` and per-model toggles. Fixes a wide range of terminal, auth, session, and UI bugs including recurring JetBrains terminal rendering issues, PowerShell command validation hangs, and voice-mode auth clearing. Two earlier releases on June 5 (v2.1.163, v2.1.165) added `/plugin list` with filtering, `requiredMinimumVersion`/`requiredMaximumVersion` managed settings, and hooks returning `additionalContext`.\n\nWhy it matters: The fallback model configuration is a meaningful reliability improvement for production deployments where primary model availability can be unpredictable.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["claude-code", "anthropic", "cli", "coding-agent", "security", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-06-audio-interaction-model-streaming-unified",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-06-audio-interaction-model-streaming-unified/",
      "title": "Audio Interaction Model: Unified Streaming Framework Combining Offline and Real-Time Audio Instruction Following",
      "content_text": "Researchers from the National University of Singapore published the Audio Interaction Model (AIM), a unified streaming audio framework that combines offline task execution (transcription, translation, music generation) with real-time audio instruction following through an end-to-end architecture. AIM achieves simultaneous low-latency streaming and high-quality offline audio processing without separate models for each task mode, receiving 101 upvotes on HuggingFace Daily Papers.\n\nWhy it matters: Unifying real-time and offline audio processing in a single end-to-end model removes a major architectural trade-off that forces most current systems to choose one mode.",
      "date_published": "2026-06-06T00:00:00Z",
      "tags": ["streaming", "paper", "multimodal", "speech"]
    },
    
    {
      "id": "2026-06-04-xai-grok-voice-vapi-default-engine",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-xai-grok-voice-vapi-default-engine/",
      "title": "xAI Grok Voice Becomes Default Engine for Vapi\u0027s 2.5M+ Voice Agents",
      "content_text": "xAI announced on June 3 a partnership making Grok Voice the default engine for Vapi\u0027s 12 core voices, powering over 2.5M voice agents built on the platform. In Vapi\u0027s blind arena evaluation, Grok Voice ranked first for naturalness and emotional range.\n\nWhy it matters: Signals Grok Voice reaching production-grade quality competitive with ElevenLabs at enterprise scale.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["grok", "xai", "voice-agents", "tts", "api", "partnership"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-04-windsurf-becomes-devin-desktop-acp",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-windsurf-becomes-devin-desktop-acp/",
      "title": "Windsurf Rebrands as Devin Desktop and Launches Open Agent Client Protocol (ACP)",
      "content_text": "Windsurf became Devin Desktop on June 2, bringing a unified Agent Command Center (Kanban), Spaces for cross-agent context sharing, and the open Agent Client Protocol (ACP) so third-party agents including Codex, Claude Code, and OpenCode can run inside the editor. Devin Local, a Rust-based rewrite of Cascade, offers 30% better token efficiency with subagent support. Legacy Cascade continues through July 1.\n\nWhy it matters: Open ACP protocol could standardize multi-agent IDE interoperability across competing coding agents, shifting the market toward a platform model.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["windsurf", "multi-agent", "ide", "coding-agent", "open-source", "release"],
      "authors": [{"name": "Cognition"}]
    },
    
    {
      "id": "2026-06-04-thoughtfold-reasoning-token-reduction",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-thoughtfold-reasoning-token-reduction/",
      "title": "ThoughtFold: Introspective Preference Learning Cuts Reasoning Tokens by 56% Without Accuracy Loss",
      "content_text": "ThoughtFold introduces a framework that eliminates redundant steps in large reasoning models using introspective identification of unnecessary exploration within correct trajectories, then applies preference optimization against those steps. Applied to DeepSeek-R1-Distill-Qwen-7B, it reduces token usage by approximately 56% while maintaining state-of-the-art accuracy.\n\nWhy it matters: Cuts reasoning compute roughly in half without accuracy loss, addressing the overthinking problem in RL-trained chain-of-thought models.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["reasoning", "efficiency", "distillation", "rl", "paper"]
    },
    
    {
      "id": "2026-06-04-suno-400m-series-d-industry-model",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-suno-400m-series-d-industry-model/",
      "title": "Suno Raises $400M Series D at $5.4B Valuation, Announces Industry-Partnered Music Model",
      "content_text": "Suno announced a $400M Series D led by Bond Capital on June 3, 2026, valuing the company at $5.4B. CEO Mikey Shulman announced an upcoming music model co-developed in partnership with the music industry and already in testing, aimed at resolving ongoing copyright disputes.\n\nWhy it matters: Sets a precedent for licensed AI music via artist co-development, signaling a potential path to resolving AI music copyright disputes industrywide.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["suno", "funding", "music-generation", "valuation"],
      "authors": [{"name": "Suno"}]
    },
    
    {
      "id": "2026-06-04-openai-codex-cli-v0-137-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-openai-codex-cli-v0-137-0/",
      "title": "OpenAI Codex CLI v0.137.0: Multi-Agent v2, Enterprise Config Bundles, TUI Keybindings",
      "content_text": "Codex v0.137.0 (June 4) adds F13-F24 TUI keybindings, enterprise monthly credit limit display and cloud-managed config bundles, remote-control client pairing via app-server v2 RPCs, machine-readable `codex plugin list --json`, and multi-agent v2 runtime-choice persistence per thread. MCP dependencies updated to rmcp 1.7.0.\n\nWhy it matters: Enterprise config bundles and multi-agent v2 improvements signal Codex CLI maturing toward production team deployments.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["codex", "openai", "coding-agent", "cli", "multi-agent", "enterprise", "release"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-04-nvidia-cosmos-3-omnimodal-physical-ai",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-nvidia-cosmos-3-omnimodal-physical-ai/",
      "title": "NVIDIA Releases Cosmos 3: Open Omnimodal World Foundation Model for Physical AI",
      "content_text": "NVIDIA released Cosmos 3, the first fully open omnimodal foundation model for physical AI reasoning, trained on 20T tokens of multimodal data including ~1B images, 400M videos, ambient audio, and action sequences. Built on a mixture-of-transformers architecture that unifies vision reasoning, world generation, and action prediction, it ranks first on eight or more vision-reasoning and world-generation leaderboards. Cosmos 3 Super and Nano are immediately available on build.nvidia.com, Hugging Face, and GitHub under the OpenMDW-1.1 license.\n\nWhy it matters: First open foundation model unifying perception, world simulation, and action prediction for robotics and AV training; 8,680 upvotes on HF Daily Papers.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["world-models", "multimodal", "robotics", "embodied-ai", "open-weights", "paper", "physical-ai"],
      "authors": [{"name": "NVIDIA"}]
    },
    
    {
      "id": "2026-06-04-microsoft-scout-autopilot-agent-365",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-microsoft-scout-autopilot-agent-365/",
      "title": "Microsoft Launches Scout: Always-On Autopilot AI Agent for Microsoft 365",
      "content_text": "Launched at Microsoft Build on June 2, Scout is Microsoft\u0027s first Autopilot agent \u2014 an always-on AI assistant integrated with Teams, Outlook, OneDrive, and SharePoint that proactively schedules meetings, blocks calendar time, and flags stalled decisions. Available via the Frontier early-access program, requiring a GitHub Copilot and Intune license.\n\nWhy it matters: First enterprise AI agent from Microsoft that takes autonomous calendar and workflow actions without explicit user invocation.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["agents", "enterprise", "multi-agent", "release"],
      "authors": [{"name": "Microsoft"}]
    },
    
    {
      "id": "2026-06-04-jetbrains-mellum2-12b-moe-open-source",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-jetbrains-mellum2-12b-moe-open-source/",
      "title": "JetBrains Open-Sources Mellum2: 12B MoE Coding Model for Multi-Model Pipelines",
      "content_text": "JetBrains released Mellum2 under Apache 2.0: a 12B Mixture-of-Experts model (2.5B active parameters, 64 experts activating 8 per token) trained on approximately 10.6T tokens for software engineering. Designed as a fast focal model for routing, RAG, subagents, and high-throughput coding features, it delivers 2x faster inference versus comparably-sized dense models.\n\nWhy it matters: First open-source coding MoE from a major IDE vendor, designed to slot into multi-model pipelines rather than replace frontier models.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["moe", "open-source", "coding-agent", "inference", "release"],
      "authors": [{"name": "JetBrains"}]
    },
    
    {
      "id": "2026-06-04-ideogram-4-0-open-weight-image-model",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-ideogram-4-0-open-weight-image-model/",
      "title": "Ideogram 4.0 Launches as Open-Weight 9.3B Text-to-Image Model with Native 2K Resolution",
      "content_text": "Ideogram released version 4.0 on June 3, 2026 as its first open-weight text-to-image model: a 9.3B parameter diffusion transformer with native 2K resolution, transparent background support, bounding-box layout control, and best-in-class multilingual text rendering. Weights in nf4 and fp8 quantizations are publicly available on Hugging Face and GitHub under a non-commercial-free/paid-commercial license. The model tops the DesignArena leaderboard at launch.\n\nWhy it matters: First production-grade open-weight image model to top the DesignArena leaderboard, giving developers a locally-runnable alternative to closed models from OpenAI and Google.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["text-to-image", "open-weights", "image-generation", "multilingual", "release"],
      "authors": [{"name": "Ideogram"}]
    },
    
    {
      "id": "2026-06-04-google-gemma-4-12b-multimodal-local",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-google-gemma-4-12b-multimodal-local/",
      "title": "Google DeepMind Releases Gemma 4 12B: Encoder-Free Multimodal Model That Runs on a 16 GB Laptop",
      "content_text": "Google DeepMind released Gemma 4 12B on June 3, 2026 \u2014 an open-weights, encoder-free multimodal model that natively ingests audio, video, and images, runs locally on a 16 GB VRAM laptop, and is licensed under Apache 2.0. It is the first medium-sized model with built-in native audio understanding and is designed to power fully local agentic workflows via the Google AI Edge stack.\n\nWhy it matters: Brings frontier-grade multimodal and audio capabilities to consumer hardware without cloud dependency; first encoder-free design at this scale.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["gemma", "open-weights", "multimodal", "on-device", "release"],
      "authors": [{"name": "Google DeepMind"}]
    },
    
    {
      "id": "2026-06-04-github-copilot-usage-based-billing-max-plan",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-github-copilot-usage-based-billing-max-plan/",
      "title": "GitHub Copilot Transitions to Usage-Based AI Credits Billing with New Max Plan",
      "content_text": "As of June 1, all GitHub Copilot plans transitioned to GitHub AI Credits consumption-based billing. A new Copilot Max tier launched for power users with higher included usage and spend limits. User-level budget controls are now generally available for orgs and enterprises, with per-user thresholds and email alerts.\n\nWhy it matters: Usage-based pricing with per-user budget controls directly affects how teams plan and control AI coding spend.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["github-copilot", "pricing", "enterprise"],
      "authors": [{"name": "GitHub"}]
    },
    
    {
      "id": "2026-06-04-github-copilot-standalone-desktop-app",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-github-copilot-standalone-desktop-app/",
      "title": "GitHub Copilot Standalone Desktop App Launches in Technical Preview at Microsoft Build 2026",
      "content_text": "Announced at Microsoft Build on June 2, the GitHub Copilot app is a native desktop app for Windows, Mac, and Linux that runs agent sessions in isolated git worktrees, surfaces Canvases (bidirectional human-agent work surfaces), includes Agent Merge for automated PR lifecycle management, and supports local and cloud sandboxes. Available in technical preview for Copilot Pro/Pro+/Business/Enterprise subscribers.\n\nWhy it matters: Standalone app signals GitHub positioning Copilot as a full agent platform rather than an IDE extension, competing directly with Cursor and Devin Desktop.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["github-copilot", "coding-agent", "ide", "enterprise", "multi-agent", "release"],
      "authors": [{"name": "GitHub"}]
    },
    
    {
      "id": "2026-06-04-elevenlabs-stan-lee-voice-licensing",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-elevenlabs-stan-lee-voice-licensing/",
      "title": "ElevenLabs Licenses Stan Lee\u0027s Voice and Likeness for AI Commercial Use",
      "content_text": "ElevenLabs announced a deal with Stan Lee Universe to add the late Marvel co-creator\u0027s AI voice and likeness to its Iconic Marketplace for commercial licensing. The voice was trained on professional recordings; users can license it for commercial projects or hear it narrate books in the Eleven Reader app.\n\nWhy it matters: Advances a consent-based model for digital celebrity likenesses, setting industry norms for posthumous voice AI commercialization.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["elevenlabs", "voice-cloning", "tts"],
      "authors": [{"name": "ElevenLabs"}]
    },
    
    {
      "id": "2026-06-04-echo-infinity-infinite-video-generation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-echo-infinity-infinite-video-generation/",
      "title": "Echo-Infinity: Real-Time Infinite Video Generation via Learnable Memory Query",
      "content_text": "Echo-Infinity presents an autoregressive video generation framework with a learnable Memory Query mechanism that dynamically compresses frame history via attention, maintaining constant compute cost regardless of sequence length. The approach achieves real-time generation of 24-hour (over 1.3M frame) video rollouts for the first time, and introduces Unified Relative RoPE to eliminate positional embedding extrapolation gaps.\n\nWhy it matters: First system to demonstrate real-time infinite-length video generation, opening practical applications for long-horizon world simulation and embodied AI.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["video-generation", "paper", "architecture", "memory", "long-context"]
    },
    
    {
      "id": "2026-06-04-claude-code-v2-1-162",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-04-claude-code-v2-1-162/",
      "title": "Claude Code v2.1.162: Security Fix for OAuth Credential Leak, Parallel Tool Call Isolation",
      "content_text": "Claude Code v2.1.162 (June 3) adds a `waitingFor` field to `claude agents --json`, parallel tool call isolation (failed Bash no longer cancels other calls in the same batch), and fixes for WebFetch permission rules, Windows path handling, and a regression that could leak OAuth credentials to custom API gateways.\n\nWhy it matters: The OAuth credential leak fix is security-critical for users running Claude Code behind custom API gateway configurations.",
      "date_published": "2026-06-04T00:00:00Z",
      "tags": ["claude-code", "anthropic", "cli", "coding-agent", "security", "bug-fix", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-03-trump-ai-executive-order-30-day-review",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-trump-ai-executive-order-30-day-review/",
      "title": "Trump Signs AI Executive Order Requiring 30-Day Voluntary Pre-Release Government Review",
      "content_text": "President Trump signed an executive order on June 2, 2026 directing AI companies to voluntarily submit frontier models for government security testing up to 30 days before public release. The order instructs federal agencies to develop AI cybersecurity benchmarks, establish an \u0027AI cybersecurity clearinghouse,\u0027 and strengthen government defenses against AI-enabled threats. An earlier draft mandated a 90-day window, cut to 30 days after industry pushback over innovation concerns.\n\nWhy it matters: First substantive AI governance action from the Trump administration after months of a largely hands-off approach; sets a precedent for voluntary pre-deployment government review that could shape global standards.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["policy", "regulation", "security", "cybersecurity", "us"]
    },
    
    {
      "id": "2026-06-03-tropd-trust-region-on-policy-distillation",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-tropd-trust-region-on-policy-distillation/",
      "title": "TrOPD: Trust-Region On-Policy Distillation Stabilizes LLM Training When Teacher-Student Gap Is Large",
      "content_text": "TrOPD (arXiv 2606.01249, submitted May 31, 2026) addresses instability in on-policy distillation when teacher and student distributions diverge substantially \u2014 a common failure mode when distilling strong reasoning models into smaller students. The method combines trust-region-bounded training restricted to regions of reliable teacher supervision, clipping and masking for outlier handling, and off-policy forward-KL guidance to encourage exploration toward trustworthy areas. It consistently outperforms OPD, EOPD, and REOPOLD baselines on mathematical reasoning, code generation, and general benchmarks.\n\nWhy it matters: On-policy distillation is the dominant technique for building cost-efficient reasoning models from frontier teachers; TrOPD\u0027s trust-region approach offers a principled fix with broad applicability \u2014 top HuggingFace Daily Paper on June 3 with 20 upvotes.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["reasoning", "rl", "distillation", "training", "paper"],
      "authors": [{"name": "Samsung Research"}]
    },
    
    {
      "id": "2026-06-03-qubric-rl-beyond-verifiable-rewards",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-qubric-rl-beyond-verifiable-rewards/",
      "title": "QUBRIC: Co-Designing Queries and Rubrics Extends RLVR to Open-Ended Reasoning Domains",
      "content_text": "QUBRIC (arXiv 2606.03968) addresses a structural weakness in rubric-based RLVR: open-ended queries produce vague rubrics, but narrowing queries introduces fabricated references. The method jointly refines queries and rubrics \u2014 using teacher-derived key points to convert open-ended questions into scenario-specific ones, generating contrastive rubrics based on observed policy gaps, and filtering for informative training pairs. Results show a 5.5-point improvement on ArenaHard over SFT baselines, with 6.3-point average gains on legal, moral, and narrative reasoning.\n\nWhy it matters: Extends RL with verifiable rewards (RLVR) \u2014 which has driven recent reasoning breakthroughs \u2014 to subjective, open-ended domains where ground-truth answers do not exist, a significant step toward general-purpose reasoning models.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["rl", "reasoning", "training", "reward-modeling", "paper"]
    },
    
    {
      "id": "2026-06-03-openai-rosalind-biodefense",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-openai-rosalind-biodefense/",
      "title": "OpenAI Launches Rosalind Biodefense Program with GPT-Rosalind for Pandemic Preparedness",
      "content_text": "OpenAI announced Rosalind Biodefense on June 1, 2026 \u2014 a gated-access program offering GPT-Rosalind, a specialized life-sciences model, to vetted developers building biosecurity and pandemic preparedness applications. Initial partners include Johns Hopkins Applied Physics Laboratory and CEPI\u0027s 100 Days Mission for vaccine development acceleration. The program covers epidemiological modeling, early detection, screening, and non-pharmaceutical interventions; federal agencies with public-health and biodefense missions also receive extended access.\n\nWhy it matters: Frontier AI applied to biodefense represents one of the highest-stakes dual-use domains; OpenAI\u0027s gated specialty model for biosecurity \u2014 rather than a general-purpose one \u2014 signals a new approach to responsible deployment in sensitive domains.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["openai", "biodefense", "biosecurity", "pandemic-preparedness", "gpt-rosalind", "life-sciences"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-03-openai-codex-sites-annotations-plugins",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-openai-codex-sites-annotations-plugins/",
      "title": "OpenAI Expands Codex Beyond Developers: Sites, Annotations, and Six Role-Specific Business Plugins",
      "content_text": "OpenAI announced on June 2, 2026 a major expansion of Codex targeting non-developer knowledge workers. New features include Sites (creates interactive hosted web apps and dashboards from analysis), Annotations (inline collaborative editing without rebuilding projects), and six new role-specific plugins covering sales, data analytics, creative production, product design, public equity investing, and investment banking \u2014 aggregating 62 business apps including Salesforce, Figma, and Snowflake. Non-developers now account for ~20% of Codex\u0027s 5 million weekly users and are adopting at 3x the rate of engineers.\n\nWhy it matters: Positions Codex as a general enterprise productivity platform across finance, sales, and creative roles \u2014 directly competing with incumbents like Salesforce, Adobe, and Microsoft Copilot beyond its original developer audience.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["codex", "openai", "coding-agent", "enterprise", "agentic", "update", "plugins", "knowledge-workers"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-03-minimax-hailuo-2-3-media-agent",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-minimax-hailuo-2-3-media-agent/",
      "title": "MiniMax Launches Hailuo 2.3 Video Model and Expands Video Agent into Media Agent",
      "content_text": "MiniMax released Hailuo 2.3 on June 3, 2026 with improvements in physical action portrayal, character micro-expressions, stylization, and motion command following. A new Hailuo 2.3 Fast variant reduces batch creation costs by up to 50% at the same price as Hailuo 02. Simultaneously, MiniMax renamed and expanded the Hailuo Video Agent into the Media Agent \u2014 a multi-modal creation platform now live globally on the Hailuo AI website, mobile app, and Open Platform API, with VEED as a day-one integration partner.\n\nWhy it matters: Reinforces MiniMax as the cost-efficiency leader in video generation; the Media Agent rebranding signals a strategic push beyond video into full multi-modal creative workflows, competing with Runway and Pika at the workflow orchestration layer.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["text-to-video", "image-to-video", "minimax", "china", "release", "update"],
      "authors": [{"name": "MiniMax"}]
    },
    
    {
      "id": "2026-06-03-language-models-need-sleep-offline-recurrence",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-language-models-need-sleep-offline-recurrence/",
      "title": "Do Language Models Need Sleep? Offline Recurrence as Memory Consolidation for Improved Inference",
      "content_text": "This Google/CMU paper (arXiv 2605.26099) proposes a sleep-like memory consolidation mechanism for language models. Periodically, the model converts recent context into persistent fast weights in SSM blocks through N offline recurrent passes, then clears its KV cache. On synthetic tasks (cellular automata, multi-hop graph retrieval) and math reasoning benchmarks, increasing sleep duration N improves performance, with the largest gains on examples requiring deeper multi-step reasoning.\n\nWhy it matters: Introduces a principled mechanism for converting short-term context into long-term weights \u2014 pointing toward a new paradigm for handling very long contexts without unbounded KV cache growth, a key bottleneck for production inference.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["memory", "reasoning", "architecture", "long-context", "ssm", "paper"],
      "authors": [{"name": "Google / CMU"}]
    },
    
    {
      "id": "2026-06-03-humanoid-gpt-zero-shot-motion-tracking",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-humanoid-gpt-zero-shot-motion-tracking/",
      "title": "Humanoid-GPT: Scaling to 2B Motion Frames Enables Zero-Shot Generalization in Humanoid Control",
      "content_text": "Humanoid-GPT (arXiv 2606.03985, CVPR 2026) trains a GPT-style causal Transformer on a 2-billion-frame motion corpus aggregating seven datasets for whole-body humanoid control. Scaling both data and model capacity yields a single generative model that tracks highly dynamic motions while achieving zero-shot generalization to unseen tasks \u2014 dissolving the agility-generalization tradeoff inherent to prior MLP-based trackers. Inference latency is under 1.5ms on an RTX 4090. The paper also introduces Harmonic Motion Embedding (HME) to quantify motion diversity.\n\nWhy it matters: Establishes clear GPT-style scaling laws for motion tracking, suggesting the same data-scaling recipe that worked for language applies directly to humanoid control \u2014 accepted at CVPR 2026, 18 upvotes on HuggingFace Daily Papers.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["robotics", "embodied-ai", "scaling", "humanoid", "motion-tracking", "zero-shot", "paper"]
    },
    
    {
      "id": "2026-06-03-faithful-confidence-large-reasoning-models",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-faithful-confidence-large-reasoning-models/",
      "title": "Quantifying Faithful Confidence Expression in Large Reasoning Models",
      "content_text": "This Yale NLP paper (arXiv 2606.03969) investigates whether large reasoning models faithfully express their actual uncertainty. The authors compare linguistic confidence signals against three internal uncertainty measures: token probabilities, hidden states, and response sampling consistency. Key findings: (1) reasoning capability does not automatically improve calibration; (2) standard prompting techniques do not transfer to reasoning models; (3) different internal uncertainty measures yield conflicting results, revealing fragility in existing evaluation methodologies.\n\nWhy it matters: As reasoning models are deployed in high-stakes settings, faithful uncertainty communication is safety-critical. The paper establishes that large reasoning models have a distinct, unresolved calibration problem separate from general LLMs.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["reasoning", "interpretability", "safety", "calibration", "paper"],
      "authors": [{"name": "Yale NLP"}]
    },
    
    {
      "id": "2026-06-03-claude-code-v2-1-161",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-claude-code-v2-1-161/",
      "title": "Claude Code v2.1.161: OTEL Labels, Parallel Tool Call Resilience, Linux Clipboard Overhaul",
      "content_text": "Claude Code v2.1.161 (released June 2, 2026) adds OTEL_RESOURCE_ATTRIBUTES values as metric labels for slicing usage by team and repo dimensions, improves the `claude agents` display to show done/total counts during fan-out, and collapses unused MCP claude.ai connectors by default. Key reliability fix: failed Bash commands in a parallel tool batch no longer cancel other in-flight calls. Linux fullscreen clipboard now uses wl-copy/xclip/xsel and supports both clipboard and PRIMARY selection. Additional bug fixes address managed-settings policy interference with third-party providers and background subagent stdout corruption.\n\nWhy it matters: The parallel tool call resilience fix is critical for complex agentic workflows where a single failing Bash command previously aborted the entire batch, causing silent data loss in multi-step pipelines.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["claude-code", "anthropic", "cli", "coding-agent", "observability", "bug-fix", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-03-chatgpt-live-job-search-resume",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-03-chatgpt-live-job-search-resume/",
      "title": "ChatGPT Adds Live Job Search and Resume Formatting",
      "content_text": "OpenAI updated ChatGPT on June 1, 2026 to surface live job listings and freelance opportunities from Indeed, Upwork, Appstack, and web search results. Users can upload, create, and download resumes in professional formats tailored to specific job descriptions. Job search is available on Free, Go, Plus, and Pro plans in the US; resume formatting is available on all plans globally in English on web.\n\nWhy it matters: OpenAI continues expanding ChatGPT into transactional internet categories \u2014 jobs follows shopping and travel \u2014 directly competing with LinkedIn and Indeed while establishing a referral-fee monetization layer.",
      "date_published": "2026-06-03T00:00:00Z",
      "tags": ["chatgpt", "openai", "search", "consumer", "update", "release"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-02-xai-composer-2-5-grok-build",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-xai-composer-2-5-grok-build/",
      "title": "xAI Launches Composer 2.5 in Grok Build for Agentic Coding",
      "content_text": "xAI released Composer 2.5 inside Grok Build on June 1, 2026, a fast agentic coding model built on the open-source Moonshot Kimi K2.5 checkpoint and trained with 25 times more synthetic tasks than its predecessor. Available at build.grok.com at $0.50 per million input tokens, it excels at long-running agentic tasks, JSON, tool use, and complex instruction-following.\n\nWhy it matters: Composer 2.5 significantly undercuts comparable agentic coding models on price while matching frontier performance, and its Kimi K2.5 foundation highlights the increasing role of open-weight Chinese models in Western AI products.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["grok", "coding-agent", "agentic", "coding", "pricing", "us"],
      "authors": [{"name": "xAI"}]
    },
    
    {
      "id": "2026-06-02-vllm-v0-22-0",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-vllm-v0-22-0/",
      "title": "vLLM v0.22.0: DeepSeek V4 Production Hardening, Rust Frontend, 28.9% Latency Drop",
      "content_text": "vLLM v0.22.0 (released May 29, 2026) includes 459 commits from 230 contributors. Key highlights: DeepSeek V4 production hardening with NVFP4 fused MoE, full CUDA graph, and MTP speculative decoding; a new experimental Rust frontend with data-parallel serving supervisor; 28.9% end-to-end latency improvement via Cutlass FP8 batch-invariant inference; and multi-tier KV cache offloading to disk. AMD ROCm parity and NVIDIA Blackwell (SM12x) optimizations were also merged.\n\nWhy it matters: DeepSeek V4 is the most widely self-hosted frontier model; production-grade vLLM support plus a 28.9% latency improvement makes it significantly more viable for high-throughput deployments at scale.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["vllm", "inference", "open-source", "gpu", "deepseek"]
    },
    
    {
      "id": "2026-06-02-qwen3-7-plus-multimodal-agent",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-qwen3-7-plus-multimodal-agent/",
      "title": "Alibaba Launches Qwen3.7-Plus: Multimodal Agent with Vision, Reasoning, and Autonomous Execution",
      "content_text": "Alibaba\u0027s Qwen team released Qwen3.7-Plus on June 2, 2026, adding native image and video understanding to the earlier text-only Qwen3.7-Max. The model combines deep reasoning, self-programming, tool invocation, verification, and autonomous iteration in a single agentic loop, scoring 79 on screen-understanding benchmarks and outperforming GPT-5.4 and Gemini-3.1 Pro on that task. Available via Alibaba Cloud Bailian API at $0.40/$1.60 per million input/output tokens; Alibaba shares rose over 6% on the announcement.\n\nWhy it matters: First Qwen release to unify vision and agentic execution in one model, enabling autonomous end-to-end workflows \u2014 including building a full app over 11 hours without human intervention \u2014 and advancing the frontier of Chinese multimodal agents.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["qwen", "alibaba", "multimodal", "agents", "gui-agent", "agentic", "closed-source", "china"],
      "authors": [{"name": "Alibaba / Qwen"}]
    },
    
    {
      "id": "2026-06-02-opencode-v1-15-13",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-opencode-v1-15-13/",
      "title": "OpenCode v1.15.13: Session Metadata API, Adaptive Reasoning Fix for Anthropic Opus 4.7+",
      "content_text": "OpenCode v1.15.13 (released May 30, 2026) fixes a bug where Anthropic Gateway\u0027s Opus 4.7+ adaptive reasoning returned empty thinking blocks instead of summarized thinking. Sessions can now store custom metadata via the API and SDK for workflow automation. Config loading was also improved to apply directory-specific settings more predictably when traversing up the directory tree.\n\nWhy it matters: Adaptive reasoning support for Opus 4.7+ is a key differentiator for open-source coding agents; the metadata API enables richer integrations with CI/CD and orchestration tools.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["opencode", "coding-agent", "open-source", "cli", "release", "update"]
    },
    
    {
      "id": "2026-06-02-openai-codex-goal-mode-appshots",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-openai-codex-goal-mode-appshots/",
      "title": "OpenAI Codex: Goal Mode Reaches GA and Appshots Launch for macOS",
      "content_text": "OpenAI\u0027s Codex reached general availability for Goal mode \u2014 allowing Codex to work toward an objective for hours or days with dedicated storage and progress tracking \u2014 across the app, IDE extension, and CLI. Separately, Appshots launched for macOS: pressing both Command keys attaches the frontmost app window (screenshot + text) to the active Codex session without manual copy-paste. Both features are confirmed GA as of late May 2026.\n\nWhy it matters: Goal mode GA transforms Codex from a reactive assistant into a persistent autonomous coding agent, directly competing with Anthropic\u0027s Claude Code ultracode mode and Devin.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["codex", "openai", "coding-agent", "cli", "ga", "update"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-02-openai-codex-amazon-bedrock-ga",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-openai-codex-amazon-bedrock-ga/",
      "title": "OpenAI GPT-5.5, GPT-5.4, and Codex Now Generally Available on Amazon Bedrock",
      "content_text": "OpenAI\u0027s GPT-5.5, GPT-5.4, and Codex coding agent became generally available on Amazon Bedrock on June 1, 2026. Pricing matches OpenAI\u0027s direct rates with no additional fees; usage counts toward AWS commitments. Enterprises gain AWS-native security controls (IAM, VPC, KMS, CloudTrail) and Bedrock\u0027s inference durability, with Codex supporting VS Code, JetBrains, and Xcode integrations.\n\nWhy it matters: Removes the primary enterprise barrier to adopting Codex by integrating it into the AWS compliance and procurement ecosystem that large organizations already operate; early adopters include Amgen and Autodesk.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["codex", "openai", "aws", "cloud", "enterprise", "ga"],
      "authors": [{"name": "OpenAI"}]
    },
    
    {
      "id": "2026-06-02-minimax-m3-frontier-open-weight",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-minimax-m3-frontier-open-weight/",
      "title": "MiniMax Releases M3: Open-Weight Frontier Model with 1M-Token Context and MSA Architecture",
      "content_text": "MiniMax officially released M3 on June 1, 2026, a frontier-class open-weight model built on the novel MiniMax Sparse Attention (MSA) architecture supporting a 1-million-token context window at one-twentieth the per-token compute of the prior generation. The model natively accepts text, image, and video input, scores 59.0% on SWE-Bench Pro (above GPT-5.5 and Gemini 3.1 Pro), and is available via API; open weights and a technical report are promised on Hugging Face within 10 days.\n\nWhy it matters: First Chinese open-weight model to combine frontier-level agentic coding, a genuine 1M-token context window, and native multimodality in a single architecture \u2014 directly challenging top closed-source models at 5\u201310% of the cost.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["open-weights", "long-context", "multimodal", "agentic", "coding", "moe", "china"],
      "authors": [{"name": "MiniMax"}]
    },
    
    {
      "id": "2026-06-02-microsoft-build-2026-mai-models",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-microsoft-build-2026-mai-models/",
      "title": "Microsoft Build 2026: MAI Model Family Launched to Power GitHub Copilot Without OpenAI Dependency",
      "content_text": "Microsoft opened Build 2026 in San Francisco on June 2 by launching the MAI model family: MAI-Code-1 (a coding model targeting GitHub Copilot), MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2. MAI-Code-1 reportedly matches or exceeds Anthropic Claude 3.7 Sonnet on SWE-bench Verified while running at lower inference cost on Azure \u2014 enabling Microsoft to power Copilot without routing through OpenAI APIs for the first time.\n\nWhy it matters: Microsoft\u0027s first in-house foundation model family signals a major shift away from OpenAI dependency for its $10B+/year Copilot business; mirrors Google\u0027s Gemini-in-everything playbook and could reshape AI infrastructure pricing across the developer tools market.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["coding-agent", "coding", "benchmark", "swe-bench", "ide", "enterprise", "us"],
      "authors": [{"name": "Microsoft"}]
    },
    
    {
      "id": "2026-06-02-grepseek-direct-corpus-interaction",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-grepseek-direct-corpus-interaction/",
      "title": "GrepSeek: Training Search Agents for Direct Corpus Interaction via Shell Commands (93 HF Upvotes)",
      "content_text": "GrepSeek (arXiv 2605.29307) trains LLM-based search agents to interact with text corpora through executable shell commands (grep, file reads, lightweight scripts) rather than pre-built vector indices \u2014 a paradigm called Direct Corpus Interaction (DCI). A two-stage pipeline combines cold-start trajectory generation with Group Relative Policy Optimization (GRPO), and a sharded-parallel execution engine provides up to 7.6\u00d7 speedup. The system achieves top performance on seven open-domain QA benchmarks.\n\nWhy it matters: Removes the semantic index bottleneck entirely, enabling agents to do exact lexical matching, conjunctive sparse clue lookup, and multi-step hypothesis refinement directly on raw corpora \u2014 capabilities that embedding-based RAG systems struggle with. 93 upvotes on HuggingFace Daily Papers for June 1.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["agents", "information-retrieval", "rl", "reasoning", "paper", "research"],
      "authors": [{"name": "University of Massachusetts Amherst"}]
    },
    
    {
      "id": "2026-06-02-github-copilot-ai-credits-billing",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-github-copilot-ai-credits-billing/",
      "title": "GitHub Copilot Transitions to Token-Based AI Credits Billing on June 1",
      "content_text": "GitHub Copilot switched from flat-rate subscriptions to usage-based AI Credits billing on June 1, 2026. All plans now include a monthly credit pool (1 AI credit = $0.01), with optional overage budgets; code completions remain free. The change triggered developer backlash as heavy agentic workloads could push individual costs to $750+/month. A new Copilot Max upgrade tier was added for high-volume users.\n\nWhy it matters: The first major repricing of a mainstream coding assistant introduces financial risk for heavy users of agentic workflows, arriving the same day as Microsoft\u0027s MAI announcement \u2014 suggesting the pricing change funds the transition away from OpenAI\u0027s API costs.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["github-copilot", "ide", "pricing", "enterprise", "update"],
      "authors": [{"name": "Microsoft"}]
    },
    
    {
      "id": "2026-06-02-crafter-scientific-figures-paper",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-crafter-scientific-figures-paper/",
      "title": "Crafter: Multi-Agent Harness for Editable Scientific Figure Generation Scores +16pt Over Baselines (103 HF Upvotes)",
      "content_text": "Crafter (arXiv 2605.30611) presents a multi-agent system for generating editable scientific figures from diverse inputs (text, masks, sketches, key elements), coordinating five specialized agents around an evolving figure specification. The system uses diversity-driven plan exploration, structured corrective layers, and a verify-then-refine loop, outperforming the best baseline by 16.61 points on PaperBanana-Bench and 22.20 points on CraftBench across 279 samples. The companion CraftEditor converts raster outputs to editable SVGs.\n\nWhy it matters: Automates one of the most time-consuming parts of academic paper production; the CraftBench benchmark provides the first standardized evaluation for cross-type, cross-condition scientific figure generation. Top paper on HuggingFace Daily Papers for June 2 with 103 upvotes.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["multi-agent", "paper", "research", "agents", "benchmark", "scientific-ai"],
      "authors": [{"name": "Tsinghua University"}]
    },
    
    {
      "id": "2026-06-02-cognition-devin-1b-funding",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-cognition-devin-1b-funding/",
      "title": "Cognition Raises $1B at $26B Valuation as Devin AI Coder Hits $492M ARR",
      "content_text": "Cognition closed a $1B funding round at a $26B post-money valuation on May 28, 2026, led by Lux Capital, General Catalyst, and 8VC. The company\u0027s autonomous coding AI Devin has reached $492M annualized revenue, growing 50% month-over-month for six consecutive months. Enterprise clients include Mercedes-Benz, NASA, Goldman Sachs, and Santander; Cognition reports 90%+ of its own code is now written by Devin.\n\nWhy it matters: The $26B valuation makes Cognition one of the fastest-growing enterprise software companies in history and validates autonomous AI software engineers as a commercially real product category \u2014 not just a demo.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["coding-agent", "funding", "valuation", "us", "enterprise"],
      "authors": [{"name": "Cognition"}]
    },
    
    {
      "id": "2026-06-02-claude-code-v2-1-160",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-claude-code-v2-1-160/",
      "title": "Claude Code v2.1.160: Security Prompts Before Writing Shell Startup Files and Build-Tool Configs",
      "content_text": "Claude Code v2.1.160 (released June 2, 2026) adds user confirmation prompts before writing to shell startup files (.zshenv, .bash_login, ~/.config/git/) and build-tool config files (.npmrc, .yarnrc, .bazelrc, .devcontainer/) in acceptEdits mode \u2014 preventing unintended code execution through startup hook injection. The release also renames the dynamic-workflow trigger from `workflow` to `ultracode`, fixes background session drop, WSL clipboard, and Windows IME rendering issues.\n\nWhy it matters: The security hardening addresses a class of supply-chain attack vectors where an agentic coder could inadvertently install persistent execution hooks; trigger rename to `ultracode` hints at a forthcoming ultracode workflow mode.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["claude-code", "anthropic", "cli", "coding-agent", "security", "bug-fix", "release"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-02-badhost-cve-2026-48710",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-badhost-cve-2026-48710/",
      "title": "BadHost (CVE-2026-48710): Host-Header Auth Bypass in Starlette Exposes vLLM, LiteLLM, and MCP Servers",
      "content_text": "CVE-2026-48710 \u0027BadHost\u0027 is a critical authentication-bypass vulnerability in Starlette (all versions before 1.0.1) that allows unauthenticated attackers to access restricted endpoints by injecting /, ?, or # characters into the HTTP Host header, shifting path-parsing boundaries. The blast radius covers vLLM, LiteLLM, thousands of MCP server deployments, and FastAPI-based AI agent backends. Fix: upgrade Starlette to \u003e= 1.0.1.\n\nWhy it matters: The first widely-publicized critical CVE specifically targeting AI agent infrastructure; a single-header manipulation can expose LLM API keys, internal agent tooling, and GPU compute resources to unauthenticated attackers.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["security", "mcp", "vllm", "inference", "bug-fix"]
    },
    
    {
      "id": "2026-06-02-anthropic-ipo-s1-filing",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-anthropic-ipo-s1-filing/",
      "title": "Anthropic Confidentially Files S-1 IPO Prospectus with SEC at ~$965B Valuation",
      "content_text": "Anthropic confidentially submitted a draft S-1 registration statement to the SEC on June 1, 2026, initiating the IPO review process. The filing follows a $65B Series H that lifted the post-money valuation to ~$965B; the company\u0027s revenue run-rate hit approximately $47B in May 2026, up from ~$10B the prior year. An October 2026 public listing is being targeted, with law firm Wilson Sonsini engaged.\n\nWhy it matters: At ~$965B, Anthropic\u0027s IPO would be the largest AI tech listing in history, placing it just below Apple in market cap territory and signaling that the AI infrastructure build-out cycle is mature enough for public equity markets.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["anthropic", "funding", "valuation", "us"],
      "authors": [{"name": "Anthropic"}]
    },
    
    {
      "id": "2026-06-02-anthropic-expands-project-glasswing",
      "url": "https://ai-digest.kerby.pro/en/i/2026-06-02-anthropic-expands-project-glasswing/",
      "title": "Anthropic Expands Project Glasswing to ~200 Partners, Grants Mythos Preview Access for Critical Infrastructure",
      "content_text": "Anthropic announced June 2 that Project Glasswing \u2014 its restricted cybersecurity AI partnership \u2014 is growing from ~50 organizations to ~200, adding 150 new participants across 15+ countries. The expanded cohort gains access to Claude Mythos Preview, Anthropic\u0027s advanced model for scanning codebases for vulnerabilities; early partners have already surfaced 10,000+ high- or critical-severity security flaws since April. New sectors being prioritized include energy, water, healthcare, and communications infrastructure.\n\nWhy it matters: Signals Anthropic is productizing its most powerful models for offensive-cybersecurity defense before general availability, while competitors like OpenAI (Rosalind biodefense) create parallel restricted-access safety programs.",
      "date_published": "2026-06-02T00:00:00Z",
      "tags": ["anthropic", "security", "cybersecurity", "enterprise", "claude-mythos", "global"],
      "authors": [{"name": "Anthropic"}]
    }
    
  ]
}
