Image Classification on Digits (Robustness and Uncertainty Evaluation)
Top result: Task Objective, 75.28 Average Accuracy (May 8, 2026)
[Chart: results over time, selectable by metric]
Metrics: Average Accuracy, Unseen Performance, OOD Detection Score, Path Sensitivity (Logit), Path Sensitivity (Hidden), Expected Calibration Error (ECE)
Updated 22d ago
Evaluation Results
| Method | Configuration | Date | Average Accuracy | Unseen Performance | OOD Detection Score | Path Sensitivity (Logit) | Path Sensitivity (Hidden) | Expected Calibration Error (ECE) |
|---|---|---|---|---|---|---|---|---|
| Task Objective | Objective=task | 2026.05 | 75.28 | 66.33 | 80.89 | 8.4402 | 6.2641 | 3.25 |
| Reveal-path Objective | Objective=reveal | 2026.05 | 52.61 | 42.22 | 59.78 | 3.5193 | 2.1058 | 17.46 |
| Full Field Objective | Objective=full | 2026.05 | 18.94 | 18.33 | 18.44 | 10.455 | 2.0529 | 58.76 |