Abductive Reasoning on ART (test)

Accuracy: 58.2

Complementary Steering

[Chart: accuracy, axis ticks 31.2328 to 52.2361; Apr 26, 2026]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2026.04   58.2
2026.04   57.53
2026.04   55.98
2026.04   55.74
2026.04   54.92
2026.04   54.67
2026.04   50.5
2026.04   47.19
2026.04   46.89
2026.04   45.09
2026.04   42.13
2026.04   42.07
2026.04   41.69
2026.04   40.95
2026.04   39.19
2026.04   39.1
2026.04   39.01
2026.04   32.27
2022.02
-0.8816.263.4938.484.573.882.682.7-
2022.02
-1.619.067.8841.744.34.232.832.87-
2022.02
-1.7919.510.6842.674.4443.062.96-