#software-engineering
- Sakana AI Releases Fugu: Multi-LLM Orchestrator Achieving SoTA on SWE-Bench Pro (Sakana AI, research)
- Arbor: Generalist Autonomous ML Research via Hypothesis-Tree Refinement (NLPIR Lab, research)
- DeNovoSWE: Full Repository Generation Jumps from 5.8% to 47.2% with Synthetic Training Data (AweAI Team, research)
- FastContext: Specialized Exploration Subagent Cuts Coding Agent Token Usage by 60% (Microsoft / Shanghai Jiao Tong University, research)
- Anthropic Study: Domain Expertise Drives Agentic Coding Success, Not Programming Background (Anthropic, research)
- SWE-Explore: Benchmarking Repository Exploration as the Binding Constraint in Coding Agents (Shanghai Jiao Tong University, research)
- SHERLOC: Structured Diagnostic Localization Cuts Code Repair Token Usage by 36.7% (research)