Predicting Model Output on Otter vs. Beaver
[Chart: Accuracy (Session 1); top entry VISIONLOGIC at 96.8, Mar 13, 2025. Metrics: Accuracy (Session 1), Accuracy (Session 2), Accuracy (Session 3), Utility]
Updated 1mo ago
Evaluation Results
Method        Date      Accuracy (Session 1)  Accuracy (Session 2)  Accuracy (Session 3)  Utility
VISIONLOGIC   2025.03   96.8                  98.4                  99.2                  1.1
CRAFT         2025.03   86.3                  90.9                  90.9                  1
Control       2025.03   85.1                  88.3                  92.9                  1
Baseline      2025.03   84.4                  90.3                  92.2                  1
ACE           2025.03   80.4                  85.7                  90.5                  0.96