Confidence-guided retention on Checkerboard Toy
[Chart: AUPRC by date. Top entry: FMwC, AUPRC 83, May 18, 2026.]
Updated 15d ago
Evaluation Results
Method           | Configuration             | Date    | AUPRC
-----------------|---------------------------|---------|------
FMwC             | Scoring=learned (GB),...  | 2026.05 | 83
FMwC             | Scoring=learned (L1-LR... | 2026.05 | 65
MC-Dropout (k=5) | Scoring=endpoint dispe... | 2026.05 | 60
FMwC (k=5)       | Scoring=endpoint dispe... | 2026.05 | 58
Ensemble (k=5)   | Scoring=endpoint dispe... | 2026.05 | 55
FMwC             | Scoring=temporal ratio... | 2026.05 | 55
FMwC             | Scoring=integrated con... | 2026.05 | 33
FMwC             | Scoring=endpoint confi... | 2026.05 | 30
FM               | Scoring=random, Traj.=1   | 2026.05 | 6