Gemma 4 Makes Its Arena Debut: Google's 31B Model Beats Giants
Google's Gemma 4 just landed on the LMArena text leaderboard at rank #29 with 1451 Elo. It's a 31B-parameter open-weight model with an Apache 2.0 license — outscoring proprietary models many times its size. The open-weight era just got serious.
Google DeepMind released Gemma 4 on April 2nd. Seven days later, it's already in the top 30 of LMArena's globally contested text leaderboard — a field of 338 models where every ranking point is earned through real human preferences in blind head-to-head battles.
To put this in perspective: the models Gemma 4's 31B version is outscoring include DeepSeek V3.2 (1424, rank #56), multiple Qwen variants, GPT-4.5 Preview (1444, rank #37), and the original Claude Opus 4 (1412, rank #75). These are serious production models with 100B to 700B+ parameters, running on clusters of expensive hardware.
Gemma 4 31B fits on a workstation GPU. That's not a footnote. That's the headline.
What Google built
Gemma 4 ships in four sizes designed for different deployment targets, headlined by the 31B dense model and a 26B mixture-of-experts (MoE) variant.
The architecture borrows heavily from Gemini 3. Google says Gemma 4 is built on the "same world-class research and technology." That's a notable statement: it implies the open model shares knowledge distillation and training techniques with Google's commercial flagship, rather than receiving them as hand-me-downs.
The 26B MoE is particularly interesting for efficiency-conscious deployments: 26 billion total parameters but only ~4 billion active at inference time. Same output quality as the 31B dense, a fraction of the compute cost.
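The efficiency claim is easy to sanity-check with back-of-envelope math. The sketch below uses the common approximation that decoding compute is roughly 2 FLOPs per active parameter per token (a simplification that ignores attention over long contexts); the ~4B active figure is the one quoted above:

```python
# Rough per-token decoding compute, using the common approximation
# FLOPs/token ≈ 2 * active_parameters.
def flops_per_token(active_params: float) -> float:
    return 2 * active_params

dense_31b = flops_per_token(31e9)  # dense: all 31B weights active per token
moe_26b = flops_per_token(4e9)     # MoE: only ~4B of 26B weights active

ratio = dense_31b / moe_26b
print(f"Dense 31B: {dense_31b:.1e} FLOPs/token")
print(f"MoE  26B:  {moe_26b:.1e} FLOPs/token")
print(f"MoE is roughly {ratio:.1f}x cheaper per token")  # ~7.8x
```

By this estimate the MoE variant does under an eighth of the dense model's per-token compute, though real-world speedups depend on batching, routing overhead, and memory bandwidth.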
The open-weight trajectory
Let's zoom out and look at where the open-weight models sit today relative to the proprietary frontier.
The gap between the best open-weight models and the proprietary frontier is now around 50 Elo points. One year ago, it was 150+. Two years ago, open-source models weren't meaningfully competing with frontier labs.
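Elo gaps map directly onto expected win rates in blind pairwise battles, which makes the difference between a 150-point gap and a 50-point gap tangible. A quick sketch using the standard Elo expectation formula:

```python
def expected_win_rate(elo_diff: float) -> float:
    """Probability the higher-rated model wins a blind pairwise battle,
    under the standard Elo model (400-point logistic scale)."""
    return 1 / (1 + 10 ** (-elo_diff / 400))

# Last year's ~150-point gap vs. today's ~50-point gap:
print(f"150 Elo gap: {expected_win_rate(150):.0%} expected win rate")  # 70%
print(f" 50 Elo gap: {expected_win_rate(50):.0%} expected win rate")   # 57%
```

In other words, the frontier model went from winning seven of ten blind matchups against the best open-weight challenger to winning barely more than a coin flip.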
The trajectory is clear, and Gemma 4 is accelerating it. The question isn't whether open-weight models will close the gap — it's when.
400 million downloads — and counting
Google reported over 400 million cumulative Gemma downloads across all versions. That's not a vanity metric — it represents deployment at scale across research labs, startups, enterprise fine-tuning teams, and edge device developers. Gemma has generated more than 100,000 community variants on Hugging Face.
Apache 2.0 matters here. Unlike some "open" releases that restrict commercial use, Gemma 4 ships under one of the most permissive standard licenses in the industry. Build on it, fine-tune it, deploy it commercially — no royalties, no restrictions.
For developers who can't send sensitive data to a third-party API, this is the path. Healthcare. Finance. Legal. Government. Any domain where data sovereignty matters.
Why Google is giving this away
The strategic logic is straightforward. Every developer who builds on Gemma is a developer using Google's toolchain — JAX, Google Cloud TPUs, AI Studio, Vertex AI. Open-sourcing the model is a distribution play, not a charity.
It also serves a defensive purpose. Meta's Llama series captured enormous developer mindshare. Gemma is Google's counter — hardware-optimized, with enterprise support paths and the trust that comes with a Google brand. Google wants to be the default open-weight foundation model, the same way Android is the default mobile OS.
What's new with Gemma 4 is that the strategy is working. Previous Gemma releases were competitive for their size class but not at the absolute frontier. Gemma 4 31B at rank #29 overall — not #29 among open models, rank #29 across all 338 models on the leaderboard — is a different story.
Also new today: GLM-5.1 at #14
While Gemma 4 is the headline, Z.ai's GLM-5.1 quietly entered the text leaderboard at rank #14 with 1467 Elo — between the existing GLM-5 (rank #23, 1456 Elo) and the top-10 proprietary cluster. This model ships MIT-licensed, a notably permissive choice for a Chinese lab.
Z.ai's rapid iteration — GLM-4.5, 4.6, 4.7, 5, and now 5.1, all in under a year — mirrors the pace of the biggest frontier labs. They're not following at a distance. They're shipping a new model every couple of months.
The bottom line
Gemma 4 at #29 is one of the strongest open-weight debuts LMArena has seen, and at 31B parameters it is by far the smallest model anywhere near that rank. The fact that a model runnable on a single A100 is beating most of the commercial API models in blind human evaluation should reframe how you think about your deployment options.
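The "single A100" claim survives napkin math on weight memory alone (this sketch ignores KV cache and activation overhead, which add several more GB, and assumes an 80 GB A100):

```python
def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate GPU memory needed for model weights alone."""
    return params * bytes_per_param / 1e9

PARAMS = 31e9  # Gemma 4 31B
A100_GB = 80   # 80 GB A100 variant

for precision, nbytes in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gb = weight_memory_gb(PARAMS, nbytes)
    verdict = "fits" if gb < A100_GB else "does not fit"
    print(f"{precision}: {gb:.1f} GB of weights -> {verdict} on an 80 GB A100")
```

At native bf16 the weights take roughly 62 GB, leaving headroom for a modest KV cache; quantized variants leave far more.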
The era when "open-weight" meant "meaningfully worse" is ending. Gemma 4 is the clearest proof yet.
Track every leaderboard shift
We monitor LMArena daily. Get notified when rankings change — before the press releases land.