Binary Classification on AGIN-Rat (test)
[Chart: Macro-F1 over time. Best result: RL-Single, 82.71 Macro-F1 (Mar 12, 2026).]
Evaluation Results
Method         Details                    Date     Macro-F1
RL-Single      training=Single-task RL    2026.03  82.71
MT-RL-Judge    training=Multi-task RL     2026.03  81.58
SFT-Unified    training=Unified multi...  2026.03  81.31
SFT-Single     training=Single-task SFT   2026.03  78.08
Off-the-shelf  -                          2026.03  64.77
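
For readers unfamiliar with the metric, the following is a minimal sketch of how Macro-F1 is commonly computed for a binary task: the F1 score is calculated separately for each class and the two scores are averaged without weighting. It assumes the scores above are reported on a 0-100 scale; the function name `macro_f1` and the toy labels are illustrative only and are not taken from the benchmark's own evaluation code, which is not shown on this page.

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 scores, on a 0-100 scale (assumed)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return 100.0 * sum(scores) / len(scores)


# Toy example (hypothetical labels, not benchmark data)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(macro_f1(y_true, y_pred), 2))
```

Because both classes contribute equally to the average, Macro-F1 penalizes a model that performs well only on the majority class, which is why it is often preferred over accuracy on imbalanced binary tasks.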