Complex reasoning on FEVER (val)

84.67 Macro-F1

EvoPool

Updated 1d ago

Evaluation Results

Method	Links	Macro-F1
EvoPool 2026.06		84.67
LLM annotation 2026.06		79.32
Alchemist 2026.06		16.65
DataSculpt 2026.06		4.64