Quantitative reasoning and autonomous analysis on BixBench-Verified-50 Full set

Accuracy: 90

Claude Code

[Chart: accuracy axis from 89.3032 to 89.8459; data point dated May 7, 2026]
Updated 26d ago

Evaluation Results

Date      Accuracy   Method   Links
2026.05   90
2026.05   89.33
2026.05   89.33