
Scientific Problem Solving on SciBench (Diff, Stat, Calc metrics)

65.47 Diff Accuracy

Meta-reasoner

Feb 27, 2025
Updated 26d ago

Evaluation Results

Date      Diff    Stat    Calc
2025.02   65.47   79.42   82.77
2025.02   60.32   73.64   80.23
2025.02   57.76   75.92   80.23
2025.02   57.42   70.12   77.93
2025.02   57.32   70.32   78.42
2025.02   55.28   67.32   76.53