Security Leaderboard

21 models ranked by security rating.

#   Model                           Overall
1   Claude Opus 4.6 (Anthropic)     86
2   GPT-5.4 (OpenAI)                82
3   Claude Sonnet 4.6 (Anthropic)   80
4   Gemini 3.1 Pro (Google)         80
5   o3 (OpenAI)                     78
6   Claude Opus 4 (Anthropic)       78
7   GPT-5.4 mini (OpenAI)           76
8   Claude Sonnet 4 (Anthropic)     74
9   Gemini 2.5 Pro (Google)         74
10  o3-mini (OpenAI)                70
11  Claude Haiku 4.5 (Anthropic)    70
12  GPT-5.3 Instant (OpenAI)        68
13  GPT-5.3 Codex (OpenAI)          67
14  GPT-4o (OpenAI)                 64
15  Gemini 2.0 Flash (Google)       62
16  Grok 4.20 (xAI)                 62
17  Llama 4 (Meta)                  62
18  Mistral Large (Mistral)         62
19  Qwen 3.5 (Alibaba)              59
20  DeepSeek V3.2 (DeepSeek)        55
21  Gemini 2.0 Flash-Lite (Google)  49

Scores are estimates based on model architecture, published security research, and documented vulnerabilities; they are not verified benchmark results. Actual security depends on deployment configuration.

Model selection is only the first layer; your system prompt configuration matters more.
