Program generation on expert-curated Biology dataset
Top result: TaNOS, 70.26 Program Accuracy (Apr 23, 2026). Updated 1mo ago.
Evaluation Results
Method            Setting                    Date     Program Accuracy
TaNOS             Supervision budget=100...  2026.04  70.26
TaNOS w/o SSL     Supervision budget=100...  2026.04  68.85
TaNOS             Supervision budget=10%...  2026.04  63.85
SFT               Supervision budget=100...  2026.04  62.3
TaNOS w/o SSL     Supervision budget=10%...  2026.04  62.18
SFT + SSL         Supervision budget=100...  2026.04  61.48
TaNOS w/o Sketch  Supervision budget=100...  2026.04  59.84
TaNOS w/o Sketch  Supervision budget=10%...  2026.04  58.2
SFT               Supervision budget=10%...  2026.04  57.38
SFT + SSL         Supervision budget=10%...  2026.04  56.56
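
The results above can be compared pairwise at a fixed supervision budget. The following is a minimal sketch of that arithmetic, not part of the benchmark itself: the `scores` dictionary and the `delta` helper are hypothetical names introduced for illustration, the values are copied from the reported results, and the budget labels are kept exactly as truncated on the page ("100..." and "10%...").

```python
# Program Accuracy values copied from the results listed above.
# Budget labels are truncated in the source and are kept verbatim.
scores = {
    ("TaNOS", "100..."): 70.26,
    ("TaNOS w/o SSL", "100..."): 68.85,
    ("TaNOS", "10%..."): 63.85,
    ("SFT", "100..."): 62.3,
    ("TaNOS w/o SSL", "10%..."): 62.18,
    ("SFT + SSL", "100..."): 61.48,
    ("TaNOS w/o Sketch", "100..."): 59.84,
    ("TaNOS w/o Sketch", "10%..."): 58.2,
    ("SFT", "10%..."): 57.38,
    ("SFT + SSL", "10%..."): 56.56,
}


def delta(method_a, method_b, budget):
    """Program Accuracy of method_a minus method_b at the same budget label."""
    return round(scores[(method_a, budget)] - scores[(method_b, budget)], 2)


if __name__ == "__main__":
    # Compare full TaNOS against the baseline and each listed variant.
    for budget in ("100...", "10%..."):
        for other in ("SFT", "TaNOS w/o SSL", "TaNOS w/o Sketch"):
            print(f"budget={budget}  TaNOS - {other}: "
                  f"{delta('TaNOS', other, budget):+.2f}")
```

Under these numbers, the gap between TaNOS and SFT is 7.96 points at the "100..." budget and 6.47 points at the "10%..." budget, and the "w/o Sketch" variant loses more accuracy than the "w/o SSL" variant at both budgets.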