Predicting Model Output on Kit Fox vs. Red Fox
[Chart: Accuracy (Session 1) by method; highlighted point: Baseline, 84.1, Mar 13, 2025. Available metrics: Accuracy (Session 1), Accuracy (Session 2), Accuracy (Session 3), Utility.]
Updated 1mo ago
Evaluation Results
Method       Date     Accuracy (Session 1)  Accuracy (Session 2)  Accuracy (Session 3)  Utility
Baseline     2025.03  84.1                  89                    84.1                  1
VISIONLOGIC  2025.03  84.1                  84.5                  82.9                  0.98
Control      2025.03  80.8                  79.2                  79.2                  0.93
ACE          2025.03  80.6                  83.2                  76.2                  0.93
CRAFT        2025.03  76.8                  81.8                  76.8                  0.92