Failure Attribution on Who&When Algorithm-Generated

42.86 Step-level Accuracy

AgenTracer (G)

[Chart: Step-level Accuracy over time; latest data point Mar 12, 2026]
Updated 2mo ago

Evaluation Results

Date      Step-level Accuracy   (second metric)
2026.03   42.86                 69.62
2026.03   37.3                  63.73
2026.03   25.51                 35.2
2026.03   24.75                 46.53
2026.03   23.98                 44.13
2026.03   23.81                 55.56
2026.03   21.72                 -
2026.03   19.24                 -
2026.03   19.06                 29.1
2026.03   16.59                 30.11
2026.03   15.31                 26.02
2026.03   13.52                 51.12
2026.03   12.5                  54.33