Leaderboard

AI Model Rankings 2026

The most comprehensive AI model comparison: ELO ratings from Chatbot Arena, benchmark scores, pricing, and what each model is best at.

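| # | Model | Developer | ELO | MMLU | Context | Price (input, per 1M tokens) | Type |
|---|-------|-----------|-----|------|---------|------------------------------|------|
| 1 | Claude Opus 4.6 | Anthropic | 1496 | 91.1 | 1M | $5.00 | Closed |
| 2 | Gemini 3 Pro | Google | 1486 | 91.8 | 1M | | Closed |
| 3 | Claude Opus 4.5 | Anthropic | 1467 | 90.8 | 200K | $5.00 | Closed |
| 4 | Gemini 2.5 Pro | Google | 1450 | | 1M | $1.25 | Closed |
| 5 | Kimi K2.5 | Moonshot AI | 1449 | | 262K | | Closed |
| 6 | GPT-4.5 Preview | OpenAI | 1444 | | 128K | $75.00 | Closed |
| 7 | GPT-4o | OpenAI | 1442 | 88.7 | 128K | $2.50 | Closed |
| 8 | GPT-5.2 | OpenAI | 1437 | 89.6 | 400K | $1.75 | Closed |
| 9 | o3 | OpenAI | 1433 | | 200K | $10.00 | Closed |
| 10 | DeepSeek R1 | DeepSeek | 1418 | 90.8 | 64K | $0.55 | Open |
| 11 | Claude Opus 4 | Anthropic | 1414 | | 200K | $5.00 | Closed |
| 12 | Mistral Large 3 | Mistral AI | 1414 | | 128K | $2.00 | Closed |
| 13 | Grok-3 | xAI | 1411 | 92.7 | 131K | $3.00 | Closed |
| 14 | Gemini 2.5 Flash | Google | 1410 | | 1M | $0.30 | Closed |
| 15 | Claude Haiku 4.5 | Anthropic | 1404 | | 200K | $1.00 | Closed |
| 16 | o1 | OpenAI | 1402 | 90.8 | 200K | $15.00 | Closed |
| 17 | GPT-4o Mini | OpenAI | | 82 | 128K | $0.15 | Closed |
| 18 | GPT-4 Turbo | OpenAI | | 86.5 | 128K | $10.00 | Closed |
| 19 | o3-mini | OpenAI | | | 200K | $1.10 | Closed |
| 20 | Claude 3.5 Sonnet | Anthropic | | 90.4 | 200K | $3.00 | Closed |
| 21 | Claude 3.5 Sonnet (Oct 2024) | Anthropic | | 90.4 | 200K | $3.00 | Closed |
| 22 | Claude 3 Opus | Anthropic | | 86.8 | 200K | $15.00 | Closed |
| 23 | Claude 3 Haiku | Anthropic | | 75.2 | 200K | $0.25 | Closed |
| 24 | Claude Sonnet 4 | Anthropic | | | 200K | $3.00 | Closed |
| 25 | Gemini 1.5 Pro | Google | | 85.9 | 2M | $1.25 | Closed |
| 26 | Gemini 2.0 Flash | Google | | 76.4 | 1M | $0.10 | Closed |
| 27 | Llama 3.1 405B | Meta | | 86 | 128K | $0.80 | Open |
| 28 | Llama 3.1 70B | Meta | | 83.6 | 128K | $0.35 | Open |
| 29 | Llama 3.1 8B | Meta | | 73 | 128K | $0.05 | Open |
| 30 | Mistral Large 2 | Mistral AI | | 84 | 128K | $2.00 | Closed |
| 31 | Mixtral 8x22B | Mistral AI | | 77.3 | 64K | $2.00 | Open |
| 32 | DeepSeek V3 | DeepSeek | | 88.5 | 128K | $0.14 | Open |
| 33 | Grok-2 | xAI | | | 128K | $2.00 | Closed |
| 34 | Command R+ | Cohere | | | 128K | $2.50 | Closed |
| 35 | Qwen 2.5 72B Instruct | Alibaba | | 85.3 | 128K | $0.30 | Open |
| 36 | Claude Sonnet 4.6 | Anthropic | | 89.3 | 200K | $3.00 | Closed |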
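
The ELO column can be read as a relative win-rate estimate. The following is a minimal sketch, assuming the ratings follow the conventional Elo logistic curve on a 400-point scale; the page does not state how Chatbot Arena computes its ratings, and `expected_score` is an illustrative name rather than anything from the source.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the conventional Elo formula.

    Assumption: a 400-point logistic scale, which is the textbook Elo
    convention and not something the leaderboard itself documents.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Ratings copied from the ranking table.
claude_opus_4_6 = 1496
gemini_3_pro = 1486
o1 = 1402

print(f"#1 vs #2:  {expected_score(claude_opus_4_6, gemini_3_pro):.3f}")
print(f"#1 vs #16: {expected_score(claude_opus_4_6, o1):.3f}")
```

Under that assumption, the 10-point gap between #1 and #2 corresponds to an expected win rate of only about 51%, while the 94-point gap between #1 and o1 corresponds to roughly 63%. Small differences near the top of the table are therefore close to a coin flip.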

Model Profiles

#1 Proprietary

Claude Opus 4.6

Anthropic

Current #1 on Chatbot Arena (1496 ELO), with 99.8% AIME 2025 and 80.8% SWE-bench, leading in coding and hard prompts.

ELO: 1496 · Context: 1M · Price: $5.00 per 1M tokens

#2 Proprietary

Gemini 3 Pro

Google

Google's latest flagship with 94.3% GPQA and 100% AIME 2025, #2 on Chatbot Arena behind Claude Opus 4.6.

ELO: 1486 · Context: 1M

#3 Proprietary

Claude Opus 4.5

Anthropic

Major upgrade with 87% GPQA and 80.9% SWE-bench, the highest-rated Anthropic model before the 4.6 generation.

ELO: 1467 · Context: 200K · Price: $5.00 per 1M tokens

#4 Proprietary

Gemini 2.5 Pro

Google

Google's hybrid thinking model combining fast responses with deep reasoning, top performer on coding and math benchmarks.

ELO: 1450 · Context: 1M · Price: $1.25 per 1M tokens

#5 Proprietary

Kimi K2.5

Moonshot AI

Chinese model with the highest HumanEval score ever recorded (99.0%), excelling at code generation and reasoning.

ELO: 1449 · Context: 262K

#6 Proprietary

GPT-4.5 Preview

OpenAI

OpenAI's largest and most knowledgeable non-reasoning model with broad world knowledge and reduced hallucinations.

ELO: 1444 · Context: 128K · Price: $75.00 per 1M tokens

#7 Proprietary

GPT-4o

OpenAI

OpenAI's flagship multimodal model with native text, vision, and audio capabilities, offering strong all-around performance.

ELO: 1442 · Context: 128K · Price: $2.50 per 1M tokens

#8 Proprietary

GPT-5.2

OpenAI

OpenAI's current-gen flagship with 400K context, 92.4% GPQA and 100% AIME 2025, strong reasoning at reduced cost.

ELO: 1437 · Context: 400K · Price: $1.75 per 1M tokens

#9 Proprietary

o3

OpenAI

Advanced reasoning model succeeding o1, with significantly improved math and coding performance at reduced pricing.

ELO: 1433 · Context: 200K · Price: $10.00 per 1M tokens

#10 Open Source

DeepSeek R1

DeepSeek

Open-source reasoning model that matches o1 performance, scoring 97.3% on MATH-500; its efficiency disrupted the AI industry.

ELO: 1418 · Context: 64K · Price: $0.55 per 1M tokens

#11 Proprietary

Claude Opus 4

Anthropic

Anthropic's first Opus 4 generation model with extended thinking capabilities and strong agentic coding performance.

ELO: 1414 · Context: 200K · Price: $5.00 per 1M tokens

#12 Proprietary

Mistral Large 3

Mistral AI

Mistral's latest flagship, competitive with frontier models, with strong multilingual support and function calling.

ELO: 1414 · Context: 128K · Price: $2.00 per 1M tokens

#13 Proprietary

Grok-3

xAI

Trained on xAI's Colossus supercluster, with top-tier math reasoning (93.3% on AIME 2025).

ELO: 1411 · Context: 131K · Price: $3.00 per 1M tokens

#14 Proprietary

Gemini 2.5 Flash

Google

Fast reasoning model with excellent cost efficiency, balancing speed and intelligence for high-throughput applications.

ELO: 1410 · Context: 1M · Price: $0.30 per 1M tokens

#15 Proprietary

Claude Haiku 4.5

Anthropic

Anthropic's fastest model in the 4.5 generation, offering near-Sonnet quality at Haiku-tier speed and pricing.

ELO: 1404 · Context: 200K · Price: $1.00 per 1M tokens

#16 Proprietary

o1

OpenAI

OpenAI's first reasoning model that uses chain-of-thought to solve complex math, science, and coding problems.

ELO: 1402 · Context: 200K · Price: $15.00 per 1M tokens

#17 Proprietary

GPT-4o Mini

OpenAI

Cost-efficient small model replacing GPT-3.5 Turbo, offering strong performance at a fraction of GPT-4o's cost.

Context: 128K · Price: $0.15 per 1M tokens

#18 Proprietary

GPT-4 Turbo

OpenAI

Enhanced GPT-4 with vision support, JSON mode, and a 128K context window at reduced pricing versus original GPT-4.

Context: 128K · Price: $10.00 per 1M tokens

#19 Proprietary

o3-mini

OpenAI

Cost-efficient reasoning model with adjustable effort levels (low/medium/high), matching o1 on STEM tasks at medium effort.

Context: 200K · Price: $1.10 per 1M tokens

#20 Proprietary

Claude 3.5 Sonnet

Anthropic

Anthropic's breakout model that surpassed GPT-4o on most benchmarks at half the cost, excelling at coding and reasoning.

Context: 200K · Price: $3.00 per 1M tokens

#21 Proprietary

Claude 3.5 Sonnet (Oct 2024)

Anthropic

Updated Sonnet with computer use capability and improved coding (93.7% HumanEval), the most popular coding model of late 2024.

Context: 200K · Price: $3.00 per 1M tokens

#22 Proprietary

Claude 3 Opus

Anthropic

Anthropic's original flagship model, excelling at complex analysis and nuanced writing with strong safety alignment.

Context: 200K · Price: $15.00 per 1M tokens

#23 Proprietary

Claude 3 Haiku

Anthropic

Anthropic's fastest and most affordable model, designed for near-instant responses on simple queries and classification.

Context: 200K · Price: $0.25 per 1M tokens

#24 Proprietary

Claude Sonnet 4

Anthropic

Balanced mid-tier model in the Claude 4 generation, offering strong coding and reasoning at competitive pricing.

Context: 200K · Price: $3.00 per 1M tokens

#25 Proprietary

Gemini 1.5 Pro

Google

Google's first million-token context model (up to 2M), excelling at long-document understanding and multimodal tasks.

Context: 2M · Price: $1.25 per 1M tokens

#26 Proprietary

Gemini 2.0 Flash

Google

Ultra-fast and affordable multimodal model with native tool use, image/audio generation, and 1M token context.

Context: 1M · Price: $0.10 per 1M tokens

#27 Open Source

Llama 3.1 405B

Meta

Largest open-source model at release, competitive with GPT-4o and Claude 3.5 Sonnet across most benchmarks.

Context: 128K · Price: $0.80 per 1M tokens

#28 Open Source

Llama 3.1 70B

Meta

Strong mid-size open-source model offering excellent performance-to-cost ratio for self-hosted deployments.

Context: 128K · Price: $0.35 per 1M tokens

#29 Open Source

Llama 3.1 8B

Meta

Compact open-source model suitable for on-device and edge deployments with surprisingly strong capabilities for its size.

Context: 128K · Price: $0.05 per 1M tokens

#30 Proprietary

Mistral Large 2

Mistral AI

Mistral's flagship with 123B parameters, support for dozens of natural languages and 80+ programming languages, and strong code generation (92% HumanEval).

Context: 128K · Price: $2.00 per 1M tokens

#31 Open Source

Mixtral 8x22B

Mistral AI

Sparse MoE model that activates 39B of its 141B parameters, Apache 2.0 licensed, with excellent performance for its compute cost.

Context: 64K · Price: $2.00 per 1M tokens

#32 Open Source

DeepSeek V3

DeepSeek

Chinese open-source MoE model rivaling GPT-4o at a fraction of the cost; its reported training cost of under $6M shocked the industry.

Context: 128K · Price: $0.14 per 1M tokens

#33 Proprietary

Grok-2

xAI

xAI's second-gen model with real-time X/Twitter data access, competitive with GPT-4o on standard benchmarks.

Context: 128K · Price: $2.00 per 1M tokens

#34 Proprietary

Command R+

Cohere

Enterprise-focused RAG-optimized model excelling at multi-step tool use, long document analysis, and multilingual tasks.

Context: 128K · Price: $2.50 per 1M tokens

#35 Open Source

Qwen 2.5 72B Instruct

Alibaba

Alibaba's leading open-source model with strong multilingual and coding capabilities, competitive with Llama 3.1 70B.

Context: 128K · Price: $0.30 per 1M tokens

#36 Proprietary

Claude Sonnet 4.6

Anthropic

Latest Sonnet with adaptive reasoning, 89.9% GPQA and 79.6% SWE-bench, excellent balance of speed and intelligence.

Context: 200K · Price: $3.00 per 1M tokens
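
The per-1M-token prices listed above translate into a per-request cost by simple proportion. The sketch below is illustrative only: it covers input tokens alone, since the page lists no output prices, and the `INPUT_PRICE_PER_1M` mapping and `input_cost` helper are hypothetical names holding three prices copied from the listings.

```python
from decimal import Decimal

# Input prices in USD per 1M tokens, copied from the listings above.
# Only three models are included to keep the example short.
INPUT_PRICE_PER_1M = {
    "Claude Opus 4.6": Decimal("5.00"),
    "GPT-5.2": Decimal("1.75"),
    "DeepSeek V3": Decimal("0.14"),
}


def input_cost(model: str, input_tokens: int) -> Decimal:
    """Return the USD cost of sending `input_tokens` input tokens to `model`."""
    return INPUT_PRICE_PER_1M[model] * Decimal(input_tokens) / Decimal(1_000_000)


# Cost of a 250,000-token prompt for each model.
for name in INPUT_PRICE_PER_1M:
    print(f"{name}: ${input_cost(name, 250_000)}")
```

`Decimal` is used instead of floats so that small per-token prices do not pick up binary rounding noise when summed over many requests.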