Complex Reasoning on Seal-0
Top result: Claude-4.5-Sonnet, 53.4 Accuracy (Seal-0)
[Chart: Accuracy (Seal-0); date shown: Apr 20, 2026]
Updated 1mo ago
Evaluation Results
Method                 Method details               Date      Accuracy (Seal-0)
Claude-4.5-Sonnet      Context Window=128k, M...    2026.04   53.4
OpenAI-GPT-5-high      Context Window=128k, M...    2026.04   51.4
LiteResearcher-4B      Context Window=128k, M...    2026.04   41.8
AgentCPM-Explore-4B    Context Window=128k, M...    2026.04   40.5
Mirothinker 8B         Context Window=128k, M...    2026.04   40.4
DeepSeek-V3.2          Context Window=128k, M...    2026.04   38.5
Kimi-Researcher        Context Window=128k, M...    2026.04   36
Kimi-K2-0905           Context Window=128k, M...    2026.04   25.2