Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Code-Preference

Benchmarks

Task NameDataset NameSOTA ResultTrend
Causal Variable IdentificationCode-Preference
F1 (X)80.2
7
Outcome ReasoningCode-Preference
M' (F1 Mean)77
7
Showing 2 of 2 rows