ai today

What shipped today.

Today · June 4, 2026. Releases, benchmarks, and papers — curated by humans, indexed by us. No takes, no Twitter drama.

releases

Model + product drops.

08:14 Anthropic

Claude Opus 4.7 (1M context) GA

1M-token window, 40% cheaper than 4.6 at long-context, new memory tool.

06:02 OpenAI

GPT-5.1 Realtime adds Turkish

Voice latency drops to 220ms median; 28 new languages.

04:47 DeepSeek

DeepSeek R2 weights live

Beats o3-mini on AIME at ~1/8 the inference cost.

yesterday Google

Gemini 3 Ultra benchmarks leaked

Pre-release card hints at 92% MMLU-Pro, 87% GPQA.

benchmarks

Where the leaderboards moved.

MMLU-Pro
Opus 4.7 leads at 89.1%
SWE-Bench Verified
Opus 4.7 at 73.4%
AIME 2025
DeepSeek R2 at 88.0%
Cost per 1M out
DeepSeek V3 at $1.10
papers

Worth your 20 minutes.

arXiv:2606.01421

Long-context attention without quadratic blow-up

Anthropic + ETH. Sliding window + summary tokens get 1M useful.

arXiv:2606.00098

RL from execution feedback for code agents

GitHub + CMU. +7.2% SWE-Bench from real-PR loop.

arXiv:2605.21902

Tiny multimodal models hit GPT-4V on charts

Bytedance. 2B params, mobile-class inference.