Leaderboard
AI Model Rankings 2026
A side-by-side comparison of AI models: ELO ratings from Chatbot Arena, benchmark scores, pricing, and what each model is best at.
The ranked list shows only models with public ELO data. New models without enough Arena votes are listed separately below.
| # | Model | Developer | ELO | MMLU | Context | Input Price (per 1M tokens) | Model ID | Official | Type |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 | Anthropic | 1496 | 91.1 | 1M | $5.00 | claude-opus-4.6 | Claude models docs -> | Closed |
| 2 | Gemini 3 Pro | Google | 1486 | 91.8 | 1M | - | gemini-3-pro | Gemini models docs -> | Closed |
| 3 | Grok 4.1 | xAI | 1483 | - | N/A | - | grok-4.1 | xAI API docs -> | Closed |
| 4 | Claude Opus 4.5 | Anthropic | 1467 | 90.8 | 200K | $5.00 | claude-opus-4.5 | Claude models docs -> | Closed |
| 5 | Gemini 2.5 Pro | Google | 1450 | - | 1M | $1.25 | gemini-2.5-pro | Gemini models docs -> | Closed |
| 6 | Kimi K2.5 | Moonshot AI | 1449 | - | 262K | - | kimi-k2.5 | Moonshot API docs -> | Closed |
| 7 | GPT-4.5 Preview | OpenAI | 1444 | - | 128K | $75.00 | gpt-4.5-preview | OpenAI models docs -> | Closed |
| 8 | GPT-4o | OpenAI | 1442 | 88.7 | 128K | $2.50 | gpt-4o | OpenAI models docs -> | Closed |
| 9 | GPT-5.2 | OpenAI | 1437 | 89.6 | 400K | $1.75 | gpt-5.2 | OpenAI models docs -> | Closed |
| 10 | o3 | OpenAI | 1433 | - | 200K | $10.00 | o3 | OpenAI models docs -> | Closed |
| 11 | DeepSeek R1 | DeepSeek | 1418 | 90.8 | 64K | $0.55 | deepseek-r1 | DeepSeek API docs -> | Open |
| 12 | Claude Opus 4 | Anthropic | 1414 | - | 200K | $5.00 | claude-opus-4 | Claude models docs -> | Closed |
| 13 | Mistral Large 3 | Mistral AI | 1414 | - | 128K | $2.00 | mistral-large-3 | Mistral models docs -> | Open |
| 14 | Grok-3 | xAI | 1411 | 92.7 | 131K | $3.00 | grok-3 | xAI API docs -> | Closed |
| 15 | Gemini 2.5 Flash | Google | 1410 | - | 1M | $0.30 | gemini-2.5-flash | Gemini models docs -> | Closed |
| 16 | Claude Haiku 4.5 | Anthropic | 1404 | - | 200K | $1.00 | claude-haiku-4.5 | Claude models docs -> | Closed |
| 17 | o1 | OpenAI | 1402 | 90.8 | 200K | $15.00 | o1 | OpenAI models docs -> | Closed |
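The gaps in the ELO column can be read as approximate head-to-head preference rates. The following is a minimal sketch, assuming the standard Elo logistic formula on a 400-point scale; Chatbot Arena's own rating fit may differ in detail, so treat the results as rough. `expected_win_rate` is a hypothetical helper, and the ratings are copied from the table above.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the ranked table (assumed to be on a standard Elo scale).
elo = {"Claude Opus 4.6": 1496, "Gemini 3 Pro": 1486, "o1": 1402}

top_vs_second = expected_win_rate(elo["Claude Opus 4.6"], elo["Gemini 3 Pro"])
top_vs_last = expected_win_rate(elo["Claude Opus 4.6"], elo["o1"])

print(f"#1 vs #2:  {top_vs_second:.1%}")
print(f"#1 vs #17: {top_vs_last:.1%}")
```

Under this assumption, the 10-point gap between #1 and #2 works out to roughly a 51% preference rate, and the 94-point gap between #1 and #17 to roughly 63%, so small differences near the top of the table are close to a coin flip.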
New Models (Awaiting Public ELO)
Model Profiles
Claude Opus 4.6
Anthropic
Current #1 on Chatbot Arena (1496 ELO), with 99.8% AIME 2025 and 80.8% SWE-bench, leading in coding and hard prompts.
1496
ELO
1M
Context
$5.00
per 1M tokens
Gemini 3 Pro
Google
Google's latest flagship with 94.3% GPQA and 100% AIME 2025, #2 on Chatbot Arena behind Claude Opus 4.6.
1486
ELO
1M
Context
-
per 1M tokens
Grok 4.1
xAI
xAI's newer flagship family with top-ranked LMArena results for both thinking and non-thinking modes.
1483
ELO
N/A
Context
-
per 1M tokens
Claude Opus 4.5
Anthropic
Major upgrade with 87% GPQA and 80.9% SWE-bench, the highest-rated Anthropic model before the 4.6 generation.
1467
ELO
200K
Context
$5.00
per 1M tokens
Gemini 2.5 Pro
Google
Google's hybrid thinking model combining fast responses with deep reasoning, top performer on coding and math benchmarks.
1450
ELO
1M
Context
$1.25
per 1M tokens
Kimi K2.5
Moonshot AI
Chinese model with the highest HumanEval score ever recorded (99.0%), excelling at code generation and reasoning.
1449
ELO
262K
Context
-
per 1M tokens
GPT-4.5 Preview
OpenAI
OpenAI's largest and most knowledgeable non-reasoning model with broad world knowledge and reduced hallucinations.
1444
ELO
128K
Context
$75.00
per 1M tokens
GPT-4o
OpenAI
OpenAI's flagship multimodal model with native text, vision, and audio capabilities, offering strong all-around performance.
1442
ELO
128K
Context
$2.50
per 1M tokens
GPT-5.2
OpenAI
OpenAI's current-gen flagship with 400K context, 92.4% GPQA and 100% AIME 2025, strong reasoning at reduced cost.
1437
ELO
400K
Context
$1.75
per 1M tokens
o3
OpenAI
Advanced reasoning model succeeding o1, with significantly improved math and coding performance at reduced pricing.
1433
ELO
200K
Context
$10.00
per 1M tokens
DeepSeek R1
DeepSeek
Open-source reasoning model that matches o1 performance, scoring 97.3% on MATH-500; its efficiency disrupted the AI industry.
1418
ELO
64K
Context
$0.55
per 1M tokens
Claude Opus 4
Anthropic
Anthropic's first Opus 4 generation model with extended thinking capabilities and strong agentic coding performance.
1414
ELO
200K
Context
$5.00
per 1M tokens
Mistral Large 3
Mistral AI
Open-weight Apache 2.0 MoE flagship from the Mistral 3 generation, with strong multilingual and multimodal performance.
1414
ELO
128K
Context
$2.00
per 1M tokens
Grok-3
xAI
Trained on xAI's Colossus supercluster, Grok-3 offers top-tier math reasoning with a 93.3% AIME 2025 score.
1411
ELO
131K
Context
$3.00
per 1M tokens
Gemini 2.5 Flash
Google
Fast reasoning model with excellent cost efficiency, balancing speed and intelligence for high-throughput applications.
1410
ELO
1M
Context
$0.30
per 1M tokens
Claude Haiku 4.5
Anthropic
Anthropic's fastest model in the 4.5 generation, offering near-Sonnet quality at Haiku-tier speed and pricing.
1404
ELO
200K
Context
$1.00
per 1M tokens
o1
OpenAI
OpenAI's first reasoning model that uses chain-of-thought to solve complex math, science, and coding problems.
1402
ELO
200K
Context
$15.00
per 1M tokens
Claude Opus 4.7
Anthropic
Anthropic's most capable generally available model, launched in April 2026 with stronger long-horizon agentic performance and the same $5/$25 per MTok (input/output) pricing as Opus 4.6.
-
ELO
1M
Context
$5.00
per 1M tokens
Gemini 3.1 Pro
Google
Google's newer Pro generation surfaced at Cloud Next 2026 as its most capable model for complex workflows, with 1M context and Gemini 3-class multimodal support.
-
ELO
1M
Context
$2.00
per 1M tokens
DeepSeek V4 Flash
DeepSeek
DeepSeek's April 2026 V4-generation API model focused on speed and lower-cost production inference.
-
ELO
128K
Context
-
per 1M tokens
DeepSeek V4 Pro
DeepSeek
DeepSeek's higher-capability V4-generation API model introduced in April 2026 for deeper reasoning and agentic workloads.
-
ELO
128K
Context
-
per 1M tokens
GPT-5.5
OpenAI
OpenAI's latest flagship model for real-world coding and professional workflows, with stronger agentic performance and a 1M API context window.
-
ELO
1M (API) / 400K (Codex)
Context
$5.00
per 1M tokens
GPT-5.4
OpenAI
OpenAI's March 2026 frontier model for professional reasoning and agentic coding, released across ChatGPT, API, and Codex.
-
ELO
1M
Context
$2.50
per 1M tokens
GPT-5.4 Mini
OpenAI
Faster and lower-cost GPT-5.4 variant for high-volume coding, subagents, and computer-use workloads.
-
ELO
400K
Context
$0.75
per 1M tokens
GPT-5.4 Nano
OpenAI
OpenAI's smallest GPT-5.4-class model, optimized for ultra-cheap classification, extraction, and lightweight agent sub-tasks.
-
ELO
400K
Context
$0.20
per 1M tokens
Claude Sonnet 4.6
Anthropic
Latest Sonnet with adaptive reasoning, 89.9% GPQA and 79.6% SWE-bench, excellent balance of speed and intelligence.
-
ELO
200K
Context
$3.00
per 1M tokens
Claude Sonnet 4
Anthropic
Balanced mid-tier model in the Claude 4 generation, offering strong coding and reasoning at competitive pricing.
-
ELO
200K
Context
$3.00
per 1M tokens
Llama 4 Scout
Meta
Open-weight, natively multimodal Llama 4 model designed for efficient deployment and long-context workloads.
-
ELO
N/A
Context
-
per 1M tokens
Llama 4 Maverick
Meta
Higher-capability open-weight Llama 4 model in Meta's multimodal MoE generation, available for download via llama.com.
-
ELO
N/A
Context
-
per 1M tokens
Gemini 2.0 Flash
Google
Ultra-fast and affordable multimodal model with native tool use, image/audio generation, and 1M token context.
-
ELO
1M
Context
$0.10
per 1M tokens
o3-mini
OpenAI
Cost-efficient reasoning model with adjustable effort levels (low/medium/high), matching o1 on STEM tasks at medium effort.
-
ELO
200K
Context
$1.10
per 1M tokens
DeepSeek V3
DeepSeek
Chinese open-source MoE model rivaling GPT-4o at a fraction of the cost; its reported training cost of under $6M shocked the industry.
-
ELO
128K
Context
$0.14
per 1M tokens
Claude 3.5 Sonnet (Oct 2024)
Anthropic
Updated Sonnet with computer use capability and improved coding (93.7% HumanEval), the most popular coding model of late 2024.
-
ELO
200K
Context
$3.00
per 1M tokens
Qwen 2.5 72B Instruct
Alibaba
Alibaba's leading open-source model with strong multilingual and coding capabilities, competitive with Llama 3.1 70B.
-
ELO
128K
Context
$0.30
per 1M tokens
Grok-2
xAI
xAI's second-gen model with real-time X/Twitter data access, competitive with GPT-4o on standard benchmarks.
-
ELO
128K
Context
$2.00
per 1M tokens
GPT-4o Mini
OpenAI
Cost-efficient small model replacing GPT-3.5 Turbo, offering strong performance at a fraction of GPT-4o's cost.
-
ELO
128K
Context
$0.15
per 1M tokens
Llama 3.1 405B
Meta
Largest open-source model at release, competitive with GPT-4o and Claude 3.5 Sonnet across most benchmarks.
-
ELO
128K
Context
$0.80
per 1M tokens
Llama 3.1 70B
Meta
Strong mid-size open-source model offering excellent performance-to-cost ratio for self-hosted deployments.
-
ELO
128K
Context
$0.35
per 1M tokens
Llama 3.1 8B
Meta
Compact open-source model suitable for on-device and edge deployments with surprisingly strong capabilities for its size.
-
ELO
128K
Context
$0.05
per 1M tokens
Mistral Large 2
Mistral AI
Mistral's flagship with 123B parameters, multilingual in 80+ languages, and strong code generation (92% HumanEval).
-
ELO
128K
Context
$2.00
per 1M tokens
Claude 3.5 Sonnet
Anthropic
Anthropic's breakout model that surpassed GPT-4o on most benchmarks at half the cost, excelling at coding and reasoning.
-
ELO
200K
Context
$3.00
per 1M tokens
Gemini 1.5 Pro
Google
Google's first million-token context model (up to 2M), excelling at long-document understanding and multimodal tasks.
-
ELO
2M
Context
$1.25
per 1M tokens
GPT-4 Turbo
OpenAI
Enhanced GPT-4 with vision support, JSON mode, and a 128K context window at reduced pricing versus original GPT-4.
-
ELO
128K
Context
$10.00
per 1M tokens
Mixtral 8x22B
Mistral AI
Sparse MoE model that activates 39B of its 141B parameters per token, Apache 2.0 licensed, with an excellent efficiency-to-performance ratio.
-
ELO
64K
Context
$2.00
per 1M tokens
Command R+
Cohere
Enterprise-focused RAG-optimized model excelling at multi-step tool use, long document analysis, and multilingual tasks.
-
ELO
128K
Context
$2.50
per 1M tokens
Claude 3 Opus
Anthropic
Anthropic's original flagship model, excelling at complex analysis and nuanced writing with strong safety alignment.
-
ELO
200K
Context
$15.00
per 1M tokens
Claude 3 Haiku
Anthropic
Anthropic's fastest and most affordable model, designed for near-instant responses on simple queries and classification.
-
ELO
200K
Context
$0.25
per 1M tokens
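The "per 1M tokens" figures on these cards match the input-price column of the ranked table, so the input cost of a request scales linearly with prompt length. The following is a rough sketch, assuming the listed prices are USD per one million input tokens and ignoring output-token charges, caching discounts, and any tiered pricing. `input_cost_usd` is a hypothetical helper, and the prices are copied from the cards above.

```python
# Input prices in USD per 1M tokens, copied from the model cards.
PRICE_PER_MTOK_USD = {
    "Claude Opus 4.6": 5.00,
    "GPT-5.2": 1.75,
    "DeepSeek R1": 0.55,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Estimated input-side cost of one request (output tokens are not included)."""
    return PRICE_PER_MTOK_USD[model] * input_tokens / 1_000_000


# Compare the input cost of a 200K-token prompt across three listed models.
for model in PRICE_PER_MTOK_USD:
    print(f"{model}: ${input_cost_usd(model, 200_000):.2f}")
```

Output tokens are usually billed at a higher rate than input tokens, so a full cost estimate would need each provider's output price as well.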