ai today
What shipped today.
Today · June 4, 2026. Releases, benchmarks, and papers — curated by humans, indexed by us. No takes, no Twitter drama.
Model + product drops.
08:14 Anthropic
Claude Opus 4.7 (1M context) GA
1M-token window, 40% cheaper than Opus 4.6 on long-context requests, new memory tool.
06:02 OpenAI
GPT-5.1 Realtime adds Turkish
Median voice latency drops to 220 ms; 28 new languages.
04:47 DeepSeek
DeepSeek R2 weights live
Beats o3-mini on AIME at ~1/8 the inference cost.
yesterday Google
Gemini 3 Ultra benchmarks leaked
Pre-release card hints at 92% on MMLU-Pro, 87% on GPQA.
Where the leaderboards moved.
MMLU-Pro
Opus 4.7 leads at 89.1%
SWE-Bench Verified
Opus 4.7 at 73.4%
AIME 2025
DeepSeek R2 at 88.0%
Cost per 1M output tokens
DeepSeek V3 at $1.10
Worth your 20 minutes.
arXiv:2606.01421
Long-context attention without quadratic blow-up
Anthropic + ETH. Sliding window plus summary tokens make the full 1M-token context useful.
arXiv:2606.00098
RL from execution feedback for code agents
GitHub + CMU. +7.2% on SWE-Bench from a feedback loop over real PRs.
arXiv:2605.21902
Tiny multimodal models hit GPT-4V on charts
ByteDance. 2B params, mobile-class inference.