Conversation Evaluation on Braceval
[Chart: Accuracy over time. Best result: 70.8 (Gemini-3 Pro). Date shown: Mar 10, 2026. Updated 2mo ago.]
Evaluation Results
| Method | Variant / price range | Date | Accuracy |
|---|---|---|---|
| Gemini-3 Pro | Model variant=low | 2026.03 | 70.8 |
| Gemini-3 Pro | Model variant=high | 2026.03 | 68.1 |
| sabiazinho-4 | Price Range=cost-effec... | 2026.03 | 66 |
| Qwen3 | Model variant=235b | 2026.03 | 65.6 |
| deepseek | Model variant=v3.2 | 2026.03 | 60.8 |
| gpt-5.2 | Model variant=high | 2026.03 | 60.2 |
| gpt-5.2 | Model variant=instant | 2026.03 | 59 |
| kimi-k2 | Model variant=thinking | 2026.03 | 56.9 |
| gpt-5-mini | Price Range=cost-effec... | 2026.03 | 56.3 |
| gpt-oss-120b | Price Range=cost-effec... | 2026.03 | 55.8 |
| sabia-4 | | 2026.03 | 53.8 |
| gemini-2.5-flash-lite | Price Range=cost-effec... | 2026.03 | 50.9 |
| gpt-4.1 | | 2026.03 | 50.2 |
| sabia-3.1 | | 2026.03 | 44.6 |
| gpt-4.1-mini | Price Range=cost-effec... | 2026.03 | 32.7 |