Failure Detection on Franka unseen
[Chart: Brier Score. Top entry: SAFE-RNN-TDQC (Ours), 0.215, Apr 22, 2026. Updated 1mo ago.]
Evaluation Results
Method                 VLA Model   Date      Brier Score
SAFE-RNN-TDQC (Ours)   π0-FAST     2026.04   0.215
RNN-TDQC (Ours)        π0-FAST     2026.04   0.228
SAFE-MLP-TDQC (Ours)   π0-FAST     2026.04   0.229
RNN-BCE                π0-FAST     2026.04   0.243
SAFE-MLP BCE           π0-FAST     2026.04   0.248
Avg entropy            π0-FAST     2026.04   0.281
SAFE-RNN               π0-FAST     2026.04   0.288
Avg prob.              π0-FAST     2026.04   0.294
Max prob.              π0-FAST     2026.04   0.323
Running Avg entropy    π0-FAST     2026.04   0.339
Running Avg prob.      π0-FAST     2026.04   0.361
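
For readers unfamiliar with the metric, the sketch below shows the standard Brier score for binary outcomes: the mean squared difference between a predicted failure probability and the 0/1 outcome, where lower is better. It is a minimal illustration only. The function name `brier_score` and the sample inputs are hypothetical, and how this benchmark aggregates predictions (for example per timestep or per episode) is not stated on this page.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    probs  -- predicted probability of failure for each sample (hypothetical input)
    labels -- 1 if the sample actually failed, 0 otherwise
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if len(probs) == 0:
        raise ValueError("at least one sample is required")
    # Sum the squared gap between each prediction and its outcome, then average.
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


if __name__ == "__main__":
    # Illustrative values only, not taken from the benchmark.
    print(round(brier_score([0.9, 0.2, 0.7], [1, 0, 0]), 3))  # 0.18
    # A detector that always outputs 0.5 scores exactly 0.25.
    print(brier_score([0.5, 0.5], [1, 0]))  # 0.25
```

Under this definition a perfect detector scores 0 and a constant prediction of 0.5 scores 0.25, which gives a rough reference point for reading the table.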