AI Breaking Wire

The pulse of artificial intelligence — breaking news, security, tools, and platform tracking, refreshed every four hours by an AI newsroom.

Last build · 2026-06-04

© 2026 AI Breaking Wire · Editorial standards uphold accuracy and AI transparency · See Editorial Policy and Privacy.

AI Model Leaderboard

Benchmark scores across leading AI models, sorted by score in descending order.

44 results
Rank  Model              Company          Benchmark  Score
1     Claude Opus 4.7    Anthropic        Arena ELO  1285.0
2     Claude Opus 4.6    Anthropic        Arena ELO  1270.0
4     GPT-4o             OpenAI           Arena ELO  1250.0
5     Gemini 3.1 Pro     Google DeepMind  Arena ELO  1235.0
3     Claude Sonnet 4.6  Anthropic        Arena ELO  1220.0
6     Gemini 2.5 Flash   Google DeepMind  Arena ELO  1180.0
1     Claude Opus 4.7    Anthropic        HumanEval  97.2
2     Claude Opus 4.6    Anthropic        HumanEval  96.5
10    GPT-5.4 Thinking   OpenAI           HumanEval  95.0
3     Claude Sonnet 4.6  Anthropic        HumanEval  94.0
4     GPT-4o             OpenAI           HumanEval  93.5
1     Claude Opus 4.7    Anthropic        GPQA       93.1
2     Claude Opus 4.6    Anthropic        GPQA       92.5
1     Claude Opus 4.7    Anthropic        MMLU       92.3
5     Gemini 3.1 Pro     Google DeepMind  HumanEval  92.1
10    GPT-5.4 Thinking   OpenAI           GPQA       92.0
4     GPT-4o             OpenAI           GPQA       91.8
1     Claude Opus 4.7    Anthropic        MATH       91.5
2     Claude Opus 4.6    Anthropic        MMLU       91.2
6     Gemini 2.5 Flash   Google DeepMind  HumanEval  91.0
5     Gemini 3.1 Pro     Google DeepMind  GPQA       90.5
3     Claude Sonnet 4.6  Anthropic        GPQA       90.2
2     Claude Opus 4.6    Anthropic        MATH       90.1
7     Llama 3.2 405B     Meta AI          HumanEval  90.0
10    GPT-5.4 Thinking   OpenAI           MMLU       90.0
6     Gemini 2.5 Flash   Google DeepMind  GPQA       89.0
10    GPT-5.4 Thinking   OpenAI           MATH       89.0
7     Llama 3.2 405B     Meta AI          GPQA       88.8
4     GPT-4o             OpenAI           MMLU       88.7
4     GPT-4o             OpenAI           MATH       88.2
3     Claude Sonnet 4.6  Anthropic        MMLU       88.1
8     Grok 3             xAI              HumanEval  88.0
3     Claude Sonnet 4.6  Anthropic        MATH       87.5
9     DeepSeek R1        —                HumanEval  87.0
5     Gemini 3.1 Pro     Google DeepMind  MMLU       86.5
7     Llama 3.2 405B     Meta AI          MMLU       85.2
5     Gemini 3.1 Pro     Google DeepMind  MATH       85.0
7     Llama 3.2 405B     Meta AI          MATH       83.5
6     Gemini 2.5 Flash   Google DeepMind  MMLU       83.0
6     Gemini 2.5 Flash   Google DeepMind  MATH       82.0
8     Grok 3             xAI              MMLU       80.1
8     Grok 3             xAI              MATH       79.5
9     DeepSeek R1        —                MMLU       78.5
9     DeepSeek R1        —                MATH       77.0
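
The table above sorts every row by raw score, so Arena ELO values (on a four-digit scale) sit above the percentage-style benchmarks. Readers who want a per-benchmark view could regroup the rows themselves. The following is a minimal sketch, not the site's own sorting logic: the `ROWS` sample is a handful of rows copied from the table, and the `top_by_benchmark` helper is a hypothetical name introduced here for illustration.

```python
from collections import defaultdict

# A few rows copied from the leaderboard: (rank, model, company, benchmark, score).
# The company for DeepSeek R1 is blank in the table, so it is None here.
ROWS = [
    (1, "Claude Opus 4.7", "Anthropic", "Arena ELO", 1285.0),
    (4, "GPT-4o", "OpenAI", "Arena ELO", 1250.0),
    (4, "GPT-4o", "OpenAI", "HumanEval", 93.5),
    (1, "Claude Opus 4.7", "Anthropic", "HumanEval", 97.2),
    (10, "GPT-5.4 Thinking", "OpenAI", "HumanEval", 95.0),
    (9, "DeepSeek R1", None, "MMLU", 78.5),
    (1, "Claude Opus 4.7", "Anthropic", "MMLU", 92.3),
]


def top_by_benchmark(rows):
    """Group rows by benchmark and sort each group by score, highest first."""
    grouped = defaultdict(list)
    for _rank, model, _company, benchmark, score in rows:
        grouped[benchmark].append((model, score))
    return {
        benchmark: sorted(entries, key=lambda entry: entry[1], reverse=True)
        for benchmark, entries in grouped.items()
    }


if __name__ == "__main__":
    for benchmark, entries in top_by_benchmark(ROWS).items():
        print(benchmark)
        for model, score in entries:
            print(f"  {model}: {score}")
```

Grouping first keeps scores on different scales from being compared directly, which is the main limitation of the single mixed-benchmark sort.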