Claude Opus 4.6 Thinking Retakes Text #1
The text crown has changed hands on a single Elo point: Claude Opus 4.6 Thinking is back at #1, Opus 4.7 Thinking slips to #2, and Meta Muse Spark enters the top five. Noisy, but revealing.
Anthropic just changed the code leaderboard again. Claude Opus 4.7 Thinking has overtaken base Opus 4.7 to become the new #1 coding model on LMArena. The top four is now all Anthropic.
Anthropic dropped Claude Opus 4.7 and it now holds #1 on both text and code. The code jump — from 1548 to 1583 Elo — is the biggest single-model gain we've tracked. Anthropic now holds six of the top seven code slots.
Meta's first model from Alexandr Wang's $14.3B superintelligence lab enters LMArena at rank #3 with 1493 Elo — ahead of every Google and OpenAI model on text. 174 rank positions above Meta's previous best. Here's what it means.
Google's Gemma 4 enters LMArena at rank #29 with 1451 Elo — a 31B open-weight model beating proprietary rivals many times its size. Apache 2.0 licensed, runnable on a workstation. The open-weight era just got serious.
OpenAI ships GPT-5.4 with a 1M token context window and improved efficiency. It debuts at #7 on LMArena text with 1480 Elo — strong, but 24 points behind leader Claude Opus 4.6. The enterprise play matters more than the ranking.
xAI's Grok 4.20 Beta debuts at #4 on LMArena with 1492 Elo — ahead of every Google and OpenAI model on text. Trained on 200K GPUs with native multi-agent architecture, xAI is no longer just participating in the race.
A new GPT-5.2 checkpoint vaults from #29 to #5 on LMArena's Text leaderboard — a 40-point Elo jump. But the company that held #1 for 651 days is still 27 points behind leader Claude Opus 4.6.
Claude Sonnet 4.6 scores 1524 on Code — beating Opus 4.5 (1496) and Opus 4.5 Thinking (1510). In just 4 months, the mid-tier model surpassed the previous flagship at one-fifth the price. Today's best is tomorrow's budget tier.
Anthropic's mid-tier model debuts at #3 on LMArena's Code leaderboard with 1524 Elo — beating every OpenAI and Google model. At $3/$15 per million tokens, it's flagship performance at Sonnet prices.
Google's Gemini 3.1 Pro scores 77.1% on ARC-AGI-2, more than doubling its predecessor. It leads most major benchmarks. But with Claude Opus 4.6 just 5 Elo points ahead, the AI race has never been tighter.
After 23 days of Google's Gemini Pro dominance, Anthropic has reclaimed the top spot. Here's what changed, why it matters, and what it signals about the future of AI development.