GPT-5 Takes #1 on SWE-bench at 78.4%

OpenAI's flagship surpasses all previous models on the canonical software-engineering benchmark. PAPER

DeepSeek Publishes V4 Architecture Notes

The Chinese lab reveals mixture-of-experts details and training compute breakdowns for DeepSeek-V4. FUNDING

Mistral Closes $640M Series C at $6B Valuation

The Paris-based lab will use the funds to expand its API, enterprise sales, and open-weight research. KolayVibe Built in Istanbul. Charting AI for the curious, the cautious, and the shipping. Learn Courses Learning Paths Prompt Library AI Glossary Discover Compare Models AI News Vibe Pulse AI Legends Platform Overview Pricing Marketplace API Access Company About Blog Careers Contact © 2026 KolayVibe · All rights reserved Privacy Terms

"Chain-of-Draft" Cuts Token Usage by 70%

Chain-of-Draft (CoD), introduced by Xu et al. at MIT/Stanford, asks the model to produce terse 'drafts' of reasoning steps rather than full natural-language chains.

On GSM8K, MATH, and BIG-Bench Hard, CoD matched chain-of-thought accuracy while emitting 70% fewer tokens — a meaningful cost win for production deployments.

FAQ

What is Chain-of-Draft?

Chain-of-Draft (CoD) is a prompting technique from Xu et al. at MIT and Stanford. Instead of producing verbose natural-language reasoning, the model emits terse 'draft' intermediate steps that compress the same logical chain.

How much does Chain-of-Draft cut token usage?

Across GSM8K, MATH, and BIG-Bench Hard, CoD emitted roughly 70% fewer tokens than equivalent chain-of-thought prompting, while matching CoT on final-answer accuracy.

Does CoD hurt reasoning accuracy?

No measurable hit on the three benchmarks reported. The authors note that the terser format encourages the model to commit to step boundaries rather than ramble, which may explain why accuracy holds.

Why does Chain-of-Draft matter for production?

Token costs scale linearly with output length. A 70% reduction on reasoning traces is a 70% cost cut on inference, plus correspondingly lower latency — both binding constraints at scale.