Genetic Circuit Design on Literature-91
Best Task Success Rate (TSR): 44.9, Qwen3-8B RLVF-H-C (May 14, 2026). Updated 9d ago.
Evaluation Results

Method                   Details                      Date      Task Success Rate (TSR)
Qwen3-8B RLVF-H-C        Training Protocol=GenC...    2026.05   44.9
Gemma-3-12B RLVF-H-C     Training Protocol=GenC...    2026.05   39
Qwen3-8B RLVF-HIER       Training Protocol=RLVF...    2026.05   37.8
Llama-3.1-8B RLVF-H-C    Training Protocol=GenC...    2026.05   37.1
Qwen3-8B RLVF-BIN        Training Protocol=RLVF...    2026.05   31.1
Qwen3-8B SFT             Training Protocol=Supe...    2026.05   27.3
Gemma-3-12B SFT          Training Protocol=Supe...    2026.05   23.4
Llama-3.1-8B SFT         Training Protocol=Supe...    2026.05   22.4
OPUS 4.5                 Few-shot Setting=5-shot      2026.05   21.7
OPUS 4.5                 Few-shot Setting=0-shot      2026.05   14.8
Qwen3-8B BASE            Few-shot Setting=0-shot      2026.05   4.4