Reasoning on Multi-Domain Reasoning
[Chart: Accuracy by date. Top entry: Council Mode, 83.9 Accuracy, Apr 3, 2026]
Evaluation Results
Method            Latency (s)   Date      Accuracy
Council Mode      8.4           2026.04   83.9
Claude Opus 4.6   4.1           2026.04   74.1
GPT-5.4           3.2           2026.04   72.4
Gemini 3.1 Pro    2.8           2026.04   70.8
DeepSeek V3.2     5.6           2026.04   67.5
Seed 2.0 Pro      3.8           2026.04   65.2