Multiple Choice Question (MCQ) on AccidentBench (land)

68.4 Accuracy (Short, Easy)

MAVEN (+ RL)

May 21, 2026
Updated 12d ago

Evaluation Results

Method  Links
2026.05  68.4  50.3  37.7  52.1  53    42.6  38.3  44.6  51    35.4  21.6  36    44.2
2026.05  66.6  46.9  30.2  47.9  47.7  42.1  35    41.6  52.6  35.2  21.3  36.4  42
2026.05  63.1  45.1  31    46.4  51.5  42.7  38.9  44.4  48    40    25.3  37.8  42.8
2026.05  63    42.8  34.8  46.9  54.7  33.9  35.8  41.5  46    32.7  18.7  32.5  40.3
2026.05  60.4  40.8  26.1  42.4  52.3  40.5  34.5  42.4  50.2  37.5  23    36.9  40.6
2026.05  43.6  32.4  21.3  32.4  40    33    29.6  34.2  30.3  24.3  14.7  23.1  29.9