Leaderboard

AI Model Rankings 2026

Compares AI models by Chatbot Arena ELO rating, benchmark scores, pricing, and what each model is best at.

Ranked list shows models with public ELO data. New models without enough Arena votes are listed separately below.

| # | Model | Developer | ELO | MMLU | Context | Price (Input) | Model ID | Official | Type |
|---|-------|-----------|-----|------|---------|---------------|----------|----------|------|
| 1 | Claude Opus 4.6 | Anthropic | 1496 | 91.1 | 1M | $5.00 | claude-opus-4.6 | Claude models docs | Closed |
| 2 | Gemini 3 Pro | Google | 1486 | 91.8 | 1M | - | gemini-3-pro | Gemini models docs | Closed |
| 3 | Grok 4.1 | xAI | 1483 | - | N/A | - | grok-4.1 | xAI API docs | Closed |
| 4 | Claude Opus 4.5 | Anthropic | 1467 | 90.8 | 200K | $5.00 | claude-opus-4.5 | Claude models docs | Closed |
| 5 | Gemini 2.5 Pro | Google | 1450 | - | 1M | $1.25 | gemini-2.5-pro | Gemini models docs | Closed |
| 6 | Kimi K2.5 | Moonshot AI | 1449 | - | 262K | - | kimi-k2.5 | Moonshot API docs | Closed |
| 7 | GPT-4.5 Preview | OpenAI | 1444 | - | 128K | $75.00 | gpt-4.5-preview | OpenAI models docs | Closed |
| 8 | GPT-4o | OpenAI | 1442 | 88.7 | 128K | $2.50 | gpt-4o | OpenAI models docs | Closed |
| 9 | GPT-5.2 | OpenAI | 1437 | 89.6 | 400K | $1.75 | gpt-5.2 | OpenAI models docs | Closed |
| 10 | o3 | OpenAI | 1433 | - | 200K | $10.00 | o3 | OpenAI models docs | Closed |
| 11 | DeepSeek R1 | DeepSeek | 1418 | 90.8 | 64K | $0.55 | deepseek-r1 | DeepSeek API docs | Open |
| 12 | Claude Opus 4 | Anthropic | 1414 | - | 200K | $5.00 | claude-opus-4 | Claude models docs | Closed |
| 13 | Mistral Large 3 | Mistral AI | 1414 | - | 128K | $2.00 | mistral-large-3 | Mistral models docs | Open |
| 14 | Grok-3 | xAI | 1411 | 92.7 | 131K | $3.00 | grok-3 | xAI API docs | Closed |
| 15 | Gemini 2.5 Flash | Google | 1410 | - | 1M | $0.30 | gemini-2.5-flash | Gemini models docs | Closed |
| 16 | Claude Haiku 4.5 | Anthropic | 1404 | - | 200K | $1.00 | claude-haiku-4.5 | Claude models docs | Closed |
| 17 | o1 | OpenAI | 1402 | 90.8 | 200K | $15.00 | o1 | OpenAI models docs | Closed |
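
To give a feel for what the ELO gaps in the ranked table mean, here is a minimal sketch that converts a rating difference into an expected head-to-head win rate. It assumes the conventional logistic Elo model on a 400-point scale; the leaderboard does not state which rating model it uses, so treat the function name `expected_win_rate`, the `scale` default, and the results as illustrative only.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of model A against model B under the standard logistic Elo model.

    Assumption: the 400-point base-10 scale used by classic Elo ratings.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings copied from the ranked table above.
ratings = {"Claude Opus 4.6": 1496, "Gemini 3 Pro": 1486, "o1": 1402}

top_vs_second = expected_win_rate(ratings["Claude Opus 4.6"], ratings["Gemini 3 Pro"])
top_vs_last = expected_win_rate(ratings["Claude Opus 4.6"], ratings["o1"])

print(f"#1 vs #2:  {top_vs_second:.3f}")  # 10-point gap  -> 0.514
print(f"#1 vs #17: {top_vs_last:.3f}")    # 94-point gap  -> 0.632
```

Under this assumption, the 10-point lead of #1 over #2 corresponds to winning roughly 51% of head-to-head votes, so adjacent ranks are close to a coin flip.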

New Models (Awaiting Public ELO)

| Model | Developer | Released | Context | Price (Input) | Official |
|-------|-----------|----------|---------|---------------|----------|
| Claude Opus 4.7 | Anthropic | 2026-04 | 1M | $5.00 | Claude models docs |
| Gemini 3.1 Pro | Google | 2026-04 | 1M | $2.00 | Gemini models docs |
| DeepSeek V4 Flash | DeepSeek | 2026-04 | 128K | - | DeepSeek API docs |
| DeepSeek V4 Pro | DeepSeek | 2026-04 | 128K | - | DeepSeek API docs |
| GPT-5.5 | OpenAI | 2026-04 | 1M (API) / 400K (Codex) | $5.00 | OpenAI models docs |
| GPT-5.4 | OpenAI | 2026-03 | 1M | $2.50 | OpenAI models docs |
| GPT-5.4 Mini | OpenAI | 2026-03 | 400K | $0.75 | OpenAI models docs |
| GPT-5.4 Nano | OpenAI | 2026-03 | 400K | $0.20 | OpenAI models docs |
| Claude Sonnet 4.6 | Anthropic | 2026-02 | 200K | $3.00 | Claude models docs |
| Claude Sonnet 4 | Anthropic | 2025-05 | 200K | $3.00 | Claude models docs |
| Llama 4 Scout | Meta | 2025-04 | N/A | - | Llama model hub |
| Llama 4 Maverick | Meta | 2025-04 | N/A | - | Llama model hub |
| Gemini 2.0 Flash | Google | 2025-02 | 1M | $0.10 | Gemini models docs |
| o3-mini | OpenAI | 2025-01 | 200K | $1.10 | OpenAI models docs |
| DeepSeek V3 | DeepSeek | 2024-12 | 128K | $0.14 | DeepSeek API docs |
| Claude 3.5 Sonnet (Oct 2024) | Anthropic | 2024-10 | 200K | $3.00 | Claude models docs |
| Qwen 2.5 72B Instruct | Alibaba | 2024-09 | 128K | $0.30 | Qwen docs |
| Grok-2 | xAI | 2024-08 | 128K | $2.00 | xAI API docs |
| GPT-4o Mini | OpenAI | 2024-07 | 128K | $0.15 | OpenAI models docs |
| Llama 3.1 405B | Meta | 2024-07 | 128K | $0.80 | Llama model hub |
| Llama 3.1 70B | Meta | 2024-07 | 128K | $0.35 | Llama model hub |
| Llama 3.1 8B | Meta | 2024-07 | 128K | $0.05 | Llama model hub |
| Mistral Large 2 | Mistral AI | 2024-07 | 128K | $2.00 | Mistral models docs |
| Claude 3.5 Sonnet | Anthropic | 2024-06 | 200K | $3.00 | Claude models docs |
| Gemini 1.5 Pro | Google | 2024-05 | 2M | $1.25 | Gemini models docs |
| GPT-4 Turbo | OpenAI | 2024-04 | 128K | $10.00 | OpenAI models docs |
| Mixtral 8x22B | Mistral AI | 2024-04 | 64K | $2.00 | Mistral models docs |
| Command R+ | Cohere | 2024-04 | 128K | $2.50 | Cohere docs |
| Claude 3 Opus | Anthropic | 2024-03 | 200K | $15.00 | Claude models docs |
| Claude 3 Haiku | Anthropic | 2024-03 | 200K | $0.25 | Claude models docs |
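
The Price (Input) figures can be turned into a rough per-request estimate. The sketch below assumes they are USD per 1M input tokens (the model profiles label the same figures "per 1M tokens") and covers input cost only. The helper name `input_cost_usd` and the 250,000-token workload are hypothetical, chosen purely for illustration.

```python
from decimal import Decimal

# Input prices in USD per 1M tokens, copied from the table above.
INPUT_PRICE_PER_MTOK = {
    "Claude Opus 4.7": Decimal("5.00"),
    "GPT-5.4 Nano": Decimal("0.20"),
    "Llama 3.1 8B": Decimal("0.05"),
}


def input_cost_usd(model: str, input_tokens: int) -> Decimal:
    """Hypothetical helper: input-side cost only; output tokens are not included."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return INPUT_PRICE_PER_MTOK[model] * Decimal(input_tokens) / Decimal(1_000_000)


# Compare the same illustrative workload across three price points.
for name in INPUT_PRICE_PER_MTOK:
    cost = input_cost_usd(name, 250_000)
    print(f"{name}: ${cost:.4f} for 250,000 input tokens")
```

Decimal is used instead of float so that small per-token prices do not pick up binary rounding error when summed over many requests.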

Model Profiles

#1 Claude Opus 4.6 (Anthropic, Proprietary)

Current #1 on Chatbot Arena (1496 ELO), with 99.8% AIME 2025 and 80.8% SWE-bench, leading in coding and hard prompts.

ELO: 1496 · Context: 1M · Price (input): $5.00 per 1M tokens

#2 Gemini 3 Pro (Google, Proprietary)

Google's latest flagship with 94.3% GPQA and 100% AIME 2025, #2 on Chatbot Arena behind Claude Opus 4.6.

ELO: 1486 · Context: 1M · Price (input): -

#3 Grok 4.1 (xAI, Proprietary)

xAI's newer flagship family with top-ranked LMArena results for both thinking and non-thinking modes.

ELO: 1483 · Context: N/A · Price (input): -

#4 Claude Opus 4.5 (Anthropic, Proprietary)

Major upgrade with 87% GPQA and 80.9% SWE-bench, the highest-rated Anthropic model before the 4.6 generation.

ELO: 1467 · Context: 200K · Price (input): $5.00 per 1M tokens

#5 Gemini 2.5 Pro (Google, Proprietary)

Google's hybrid thinking model combining fast responses with deep reasoning, top performer on coding and math benchmarks.

ELO: 1450 · Context: 1M · Price (input): $1.25 per 1M tokens

#6 Kimi K2.5 (Moonshot AI, Proprietary)

Chinese model with the highest HumanEval score ever recorded (99.0%), excelling at code generation and reasoning.

ELO: 1449 · Context: 262K · Price (input): -

#7 GPT-4.5 Preview (OpenAI, Proprietary)

OpenAI's largest and most knowledgeable non-reasoning model with broad world knowledge and reduced hallucinations.

ELO: 1444 · Context: 128K · Price (input): $75.00 per 1M tokens

#8 GPT-4o (OpenAI, Proprietary)

OpenAI's flagship multimodal model with native text, vision, and audio capabilities, offering strong all-around performance.

ELO: 1442 · Context: 128K · Price (input): $2.50 per 1M tokens

#9 GPT-5.2 (OpenAI, Proprietary)

OpenAI's current-gen flagship with 400K context, 92.4% GPQA and 100% AIME 2025, strong reasoning at reduced cost.

ELO: 1437 · Context: 400K · Price (input): $1.75 per 1M tokens

#10 o3 (OpenAI, Proprietary)

Advanced reasoning model succeeding o1, with significantly improved math and coding performance at reduced pricing.

ELO: 1433 · Context: 200K · Price (input): $10.00 per 1M tokens

#11 DeepSeek R1 (DeepSeek, Open Source)

Open-source reasoning model that matches o1 performance, with 97.3% on MATH-500; its efficiency disrupted the AI industry.

ELO: 1418 · Context: 64K · Price (input): $0.55 per 1M tokens

#12 Claude Opus 4 (Anthropic, Proprietary)

Anthropic's first Opus 4 generation model with extended thinking capabilities and strong agentic coding performance.

ELO: 1414 · Context: 200K · Price (input): $5.00 per 1M tokens

#13 Mistral Large 3 (Mistral AI, Open Source)

Open-weight Apache 2.0 MoE flagship from the Mistral 3 generation, with strong multilingual and multimodal performance.

ELO: 1414 · Context: 128K · Price (input): $2.00 per 1M tokens

#14 Grok-3 (xAI, Proprietary)

Trained on xAI's Colossus supercluster, with top-tier math reasoning (93.3% on AIME 2025).

ELO: 1411 · Context: 131K · Price (input): $3.00 per 1M tokens

#15 Gemini 2.5 Flash (Google, Proprietary)

Fast reasoning model with excellent cost efficiency, balancing speed and intelligence for high-throughput applications.

ELO: 1410 · Context: 1M · Price (input): $0.30 per 1M tokens

#16 Claude Haiku 4.5 (Anthropic, Proprietary)

Anthropic's fastest model in the 4.5 generation, offering near-Sonnet quality at Haiku-tier speed and pricing.

ELO: 1404 · Context: 200K · Price (input): $1.00 per 1M tokens

#17 o1 (OpenAI, Proprietary)

OpenAI's first reasoning model, which uses chain-of-thought to solve complex math, science, and coding problems.

ELO: 1402 · Context: 200K · Price (input): $15.00 per 1M tokens

#18 Claude Opus 4.7 (Anthropic, Proprietary)

Anthropic's most capable generally available model, launched in April 2026 with stronger long-horizon agentic performance and the same $5/$25 per MTok pricing as Opus 4.6.

ELO: - · Context: 1M · Price (input): $5.00 per 1M tokens

#19 Gemini 3.1 Pro (Google, Proprietary)

Google's newer Pro generation surfaced at Cloud Next 2026 as its most capable model for complex workflows, with 1M context and Gemini 3-class multimodal support.

ELO: - · Context: 1M · Price (input): $2.00 per 1M tokens

#20 DeepSeek V4 Flash (DeepSeek, Proprietary)

DeepSeek's April 2026 V4-generation API model focused on speed and lower-cost production inference.

ELO: - · Context: 128K · Price (input): -

#21 DeepSeek V4 Pro (DeepSeek, Proprietary)

DeepSeek's higher-capability V4-generation API model introduced in April 2026 for deeper reasoning and agentic workloads.

ELO: - · Context: 128K · Price (input): -

#22 GPT-5.5 (OpenAI, Proprietary)

OpenAI's latest flagship model for real-world coding and professional workflows, with stronger agentic performance and a 1M API context window.

ELO: - · Context: 1M (API) / 400K (Codex) · Price (input): $5.00 per 1M tokens

#23 GPT-5.4 (OpenAI, Proprietary)

OpenAI's March 2026 frontier model for professional reasoning and agentic coding, released across ChatGPT, API, and Codex.

ELO: - · Context: 1M · Price (input): $2.50 per 1M tokens

#24 GPT-5.4 Mini (OpenAI, Proprietary)

Faster and lower-cost GPT-5.4 variant for high-volume coding, subagents, and computer-use workloads.

ELO: - · Context: 400K · Price (input): $0.75 per 1M tokens

#25 GPT-5.4 Nano (OpenAI, Proprietary)

OpenAI's smallest GPT-5.4-class model, optimized for ultra-cheap classification, extraction, and lightweight agent sub-tasks.

ELO: - · Context: 400K · Price (input): $0.20 per 1M tokens

#26 Claude Sonnet 4.6 (Anthropic, Proprietary)

Latest Sonnet with adaptive reasoning, 89.9% GPQA and 79.6% SWE-bench, excellent balance of speed and intelligence.

ELO: - · Context: 200K · Price (input): $3.00 per 1M tokens

#27 Claude Sonnet 4 (Anthropic, Proprietary)

Balanced mid-tier model in the Claude 4 generation, offering strong coding and reasoning at competitive pricing.

ELO: - · Context: 200K · Price (input): $3.00 per 1M tokens

#28 Llama 4 Scout (Meta, Open Source)

Open-weight, natively multimodal Llama 4 model designed for efficient deployment and long-context workloads.

ELO: - · Context: N/A · Price (input): -

#29 Llama 4 Maverick (Meta, Open Source)

Higher-capability open-weight Llama 4 model in Meta's multimodal MoE generation, available for download via llama.com.

ELO: - · Context: N/A · Price (input): -

#30 Gemini 2.0 Flash (Google, Proprietary)

Ultra-fast and affordable multimodal model with native tool use, image/audio generation, and 1M token context.

ELO: - · Context: 1M · Price (input): $0.10 per 1M tokens

#31 o3-mini (OpenAI, Proprietary)

Cost-efficient reasoning model with adjustable effort levels (low/medium/high), matching o1 on STEM tasks at medium effort.

ELO: - · Context: 200K · Price (input): $1.10 per 1M tokens

#32 DeepSeek V3 (DeepSeek, Open Source)

Chinese open-source MoE model rivaling GPT-4o at a fraction of the cost; its reported training cost of under $6M shocked the industry.

ELO: - · Context: 128K · Price (input): $0.14 per 1M tokens

#33 Claude 3.5 Sonnet (Oct 2024) (Anthropic, Proprietary)

Updated Sonnet with computer use capability and improved coding (93.7% HumanEval), the most popular coding model of late 2024.

ELO: - · Context: 200K · Price (input): $3.00 per 1M tokens

#34 Qwen 2.5 72B Instruct (Alibaba, Open Source)

Alibaba's leading open-source model with strong multilingual and coding capabilities, competitive with Llama 3.1 70B.

ELO: - · Context: 128K · Price (input): $0.30 per 1M tokens

#35 Grok-2 (xAI, Proprietary)

xAI's second-gen model with real-time X/Twitter data access, competitive with GPT-4o on standard benchmarks.

ELO: - · Context: 128K · Price (input): $2.00 per 1M tokens

#36 GPT-4o Mini (OpenAI, Proprietary)

Cost-efficient small model replacing GPT-3.5 Turbo, offering strong performance at a fraction of GPT-4o's cost.

ELO: - · Context: 128K · Price (input): $0.15 per 1M tokens

#37 Llama 3.1 405B (Meta, Open Source)

Largest open-source model at release, competitive with GPT-4o and Claude 3.5 Sonnet across most benchmarks.

ELO: - · Context: 128K · Price (input): $0.80 per 1M tokens

#38 Llama 3.1 70B (Meta, Open Source)

Strong mid-size open-source model offering excellent performance-to-cost ratio for self-hosted deployments.

ELO: - · Context: 128K · Price (input): $0.35 per 1M tokens

#39 Llama 3.1 8B (Meta, Open Source)

Compact open-source model suitable for on-device and edge deployments with surprisingly strong capabilities for its size.

ELO: - · Context: 128K · Price (input): $0.05 per 1M tokens

#40 Mistral Large 2 (Mistral AI, Proprietary)

Mistral's flagship with 123B parameters, multilingual in 80+ languages, and strong code generation (92% HumanEval).

ELO: - · Context: 128K · Price (input): $2.00 per 1M tokens

#41 Claude 3.5 Sonnet (Anthropic, Proprietary)

Anthropic's breakout model that surpassed GPT-4o on most benchmarks at half the cost, excelling at coding and reasoning.

ELO: - · Context: 200K · Price (input): $3.00 per 1M tokens

#42 Gemini 1.5 Pro (Google, Proprietary)

Google's first million-token context model (up to 2M), excelling at long-document understanding and multimodal tasks.

ELO: - · Context: 2M · Price (input): $1.25 per 1M tokens

#43 GPT-4 Turbo (OpenAI, Proprietary)

Enhanced GPT-4 with vision support, JSON mode, and a 128K context window at reduced pricing versus original GPT-4.

ELO: - · Context: 128K · Price (input): $10.00 per 1M tokens

#44 Mixtral 8x22B (Mistral AI, Open Source)

Sparse MoE model that activates 39B of its 141B parameters per token, Apache 2.0 licensed, with an excellent efficiency-to-performance ratio.

ELO: - · Context: 64K · Price (input): $2.00 per 1M tokens

#45 Command R+ (Cohere, Proprietary)

Enterprise-focused RAG-optimized model excelling at multi-step tool use, long document analysis, and multilingual tasks.

ELO: - · Context: 128K · Price (input): $2.50 per 1M tokens

#46 Claude 3 Opus (Anthropic, Proprietary)

Anthropic's original flagship model, excelling at complex analysis and nuanced writing with strong safety alignment.

ELO: - · Context: 200K · Price (input): $15.00 per 1M tokens

#47 Claude 3 Haiku (Anthropic, Proprietary)

Anthropic's fastest and most affordable model, designed for near-instant responses on simple queries and classification.

ELO: - · Context: 200K · Price (input): $0.25 per 1M tokens