AI League — Game Day 9: Grok Tops the Speed Board, Claude and GPT Trim Again

Grok 4.3 hits 186.9 t/s — new season-high, edging past Gemini Flash as the fastest proprietary model. Claude drops to 57.4 t/s, GPT-5.5 to 62.9. Intelligence board locked at 61 for nine straight days. Full June 6 stats. #AILeague

AIL·Stats Board
2026/6/6 · 8:12
Saturday, June 6, 2026 · Full matchday stats
xAI's Grok 4.3 has been creeping up the speed charts all week. Today it cleared 186.9 t/s, nudging past Google's Gemini 3.5 Flash at 182.5 t/s to own the top speed slot among all proprietary models in the league. Claude trimmed another 2.4 t/s to 57.4, GPT-5.5 shed 1.8 t/s to 62.9, and the intelligence board stayed locked at 61 for the ninth straight session. Two straight sessions of Claude and GPT speed pullback since their Day 7 highs, one Grok breakout, and the standings board refuses to move. Let's run the tape. [1]

Intelligence board: nine sessions, no movement

The top five on the Artificial Analysis Intelligence Index haven't shifted since opening night. [1]

| Rank | Model | AA Index | vs. Day 8 |
|---|---|---|---|
| 🥇 1 | Claude Opus 4.8 (Anthropic) | 61 | ↔ |
| 🥈 2 | GPT-5.5 xhigh (OpenAI) | 60 | ↔ |
| 🥉 3 | GPT-5.5 high (OpenAI) | 59 | ↔ |
| 4 | Claude Opus 4.7 (Anthropic) | 57 | ↔ |
| 5 | Gemini 3.1 Pro Preview (Google) | 57 | ↔ |

Kimi K2.6 continues to lead all open-weights models at index 54, a full two points ahead of DeepSeek V4 Pro. [1]
Nine sessions without a single intelligence-board change: statistically, that looks more like a platform-metered refresh cycle than pure performance drift. The real competition right now is on the speed and pricing panels.
Figure: Claude Opus 4.8 intelligence vs. price positioning (Artificial Analysis). [2]

Speed panel: Grok crosses the 186 t/s line

After hitting a season-high 174.8 t/s on Day 8, Grok 4.3 added another 12.1 t/s today. That's now above the Gemini 3.5 Flash unit, which came in at 182.5 t/s, a 1.7 t/s recovery from its Day 8 reading. [3][4]

| Model | Speed (t/s) | vs. Day 8 | Season-high |
|---|---|---|---|
| Grok 4.3 (xAI) | 186.9 | +12.1 ↑ | NEW |
| Gemini 3.5 Flash (Google) | 182.5 | +1.7 ↑ | 207 t/s (Day 1) |
| Gemini 3.1 Pro Preview (Google) | 139.8 | +1.7 ↑ | 143.1 t/s (Day 4) |
| GPT-5.5 xhigh (OpenAI) | 62.9 | –1.8 ↓ | 68.2 t/s (Day 7) |
| Claude Opus 4.8 (Anthropic) | 57.4 | –2.4 ↓ | 63.7 t/s (Day 7) |
| DeepSeek V4 Pro (DeepSeek) | 52.3 | | 55.9 t/s (Day 2) |
Grok's run from 145.2 t/s on Day 3 to 186.9 t/s today is +41.7 t/s across six sessions. That's a 29% surge while its intelligence score has not moved a single point. xAI is buying speed; it's an open question whether more reasoning effort would move the intelligence score. [3]
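For anyone checking the tape at home, the Grok figures can be reproduced from the two readings quoted above. This is a minimal sketch; `pct_change` is a hypothetical helper name, not something from the Artificial Analysis data feed.

```python
def pct_change(start: float, end: float) -> float:
    """Relative change from start to end, expressed in percent."""
    return (end - start) / start * 100.0


day3_speed = 145.2  # t/s, Grok 4.3 on Day 3 (figure from the article)
day9_speed = 186.9  # t/s, Grok 4.3 today (figure from the article)

gain = day9_speed - day3_speed
surge = pct_change(day3_speed, day9_speed)

print(f"gain: {gain:+.1f} t/s, surge: {surge:.1f}%")
# prints "gain: +41.7 t/s, surge: 28.7%"
```

The unrounded figure is 28.7%, which rounds to the 29% quoted on the board.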
Claude and GPT are on a two-session slide from their respective Day 7 highs (63.7 and 68.2 t/s). Both are still within normal measurement variance (Anthropic's API has always shown wider day-to-day swings on this model), but two down-sessions in a row is a pattern worth flagging.
Figure: Grok 4.3 speed data, June 6 2026. [3]

Pricing war: DeepSeek reprices upward — slightly

DeepSeek's API pricing has been a consistent $0.18/1M blended since the promo expired on May 31. Today's Artificial Analysis data shows the input price at $0.435/1M (from $0.40 previously) and output at $0.87/1M. That still works out to a blended price of roughly $0.18/1M at the 7:2:1 cache-heavy ratio, but the raw input/output line for uncached usage is slightly higher. [5]
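To see why the blended number barely moves, here is a rough sketch of a weighted blend. It assumes the 7:2:1 ratio means cached input : uncached input : output, which the article does not spell out. The cached-input price is not published here either, so it is left at zero to isolate the uncached share. `blended_price` is a hypothetical helper.

```python
def blended_price(cached_in, uncached_in, output, weights=(7, 2, 1)):
    """Weighted-average price per 1M tokens.

    ASSUMPTION: weights are cached input : uncached input : output.
    """
    total = sum(weights)
    w_cached, w_uncached, w_output = (w / total for w in weights)
    return w_cached * cached_in + w_uncached * uncached_in + w_output * output


INPUT_PRICE = 0.435   # $/1M, from the article
OUTPUT_PRICE = 0.87   # $/1M, from the article

# Cached price set to 0.0 on purpose: this shows only the uncached contribution.
uncached_share = blended_price(0.0, INPUT_PRICE, OUTPUT_PRICE)
print(f"uncached input + output contribute ${uncached_share:.3f}/1M")
```

Under that reading, the uncached input and output terms account for about $0.174 of the roughly $0.18 blend, so a small input tick only nudges the total.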
The other five core models are unchanged on price:

| Team | Input ($/1M) | Output ($/1M) | Blended ($/1M) |
|---|---|---|---|
| Anthropic — Claude Opus 4.8 | $6.25 | $25.00 | $4.10 |
| OpenAI — GPT-5.5 xhigh | $5.00 | $30.00 | $4.35 |
| Google — Gemini 3.1 Pro Preview | $2.00 | $12.00 | $1.74 |
| Google — Gemini 3.5 Flash | $1.50 | $9.00 | $1.31 |
| xAI — Grok 4.3 | $1.25 | $2.50 | $0.64 |
| DeepSeek — V4 Pro Max | $0.435 | $0.87 | ~$0.18 |

At $0.64/1M blended, Grok 4.3 is currently the best value in the top-six bracket: index-53 intelligence, 186.9 t/s speed, and a price about 51% below the nearest competitor at scale (Gemini 3.5 Flash at $1.31). For teams running volume inference rather than frontier reasoning tasks, that's a hard math problem for the other five to answer. [3]
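The price gap can be checked straight from the blended column of the pricing table. A minimal sketch, using only the blended figures listed above; the dictionary layout and variable names are illustrative.

```python
# Blended $/1M prices for the models priced above Grok 4.3 (from the pricing table).
blended = {
    "Claude Opus 4.8": 4.10,
    "GPT-5.5 xhigh": 4.35,
    "Gemini 3.1 Pro Preview": 1.74,
    "Gemini 3.5 Flash": 1.31,
}
GROK_BLENDED = 0.64  # $/1M, Grok 4.3

# Discount of Grok relative to each pricier model, in percent of that model's price.
discounts = {
    name: (price - GROK_BLENDED) / price * 100.0
    for name, price in blended.items()
}

# Smallest gap first, i.e. the closest pricier competitor leads the list.
for name, pct in sorted(discounts.items(), key=lambda kv: kv[1]):
    print(f"Grok 4.3 vs {name}: {pct:.0f}% cheaper")
```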
Figure: GPT-5.5 intelligence vs. price positioning (Artificial Analysis). [6]

Challenger watch

Kimi K2.6 (Moonshot) holds at AA Index 54, 44.1 t/s, $0.70 blended, still the highest open-weights intelligence score in the league. DeepSeek V4 Pro's MIT-licensed open-weights release puts self-hosting back on the table for teams with the infrastructure to run a 1.6T-parameter MoE with 49B active parameters.
No new challenger entries detected in today's data sweep. [1]

Scoreboard summary — June 6, 2026

| | Intelligence Index | Speed (t/s) | Price ($/1M blended) |
|---|---|---|---|
| 🧠 Top scorer | Claude Opus 4.8 (61) | Grok 4.3 (186.9) | DeepSeek V4 Pro (~$0.18) |
| 📈 Biggest move | ↔ (board locked) | Grok +12.1 ↑ | DeepSeek input tick ↑ |
| 📉 Down | | Claude –2.4, GPT –1.8 | |

Nine sessions, one speed breakout, and the intelligence ceiling is still exactly where it was on opening night at 61. If either Anthropic or OpenAI retrains on a new eval suite before the week closes, that 61 vs. 60 single-point gap is the number to watch.
#AILeague
