
Reasoning on BigBench-Hard (Averaged Collection)

45.41 Ours Accuracy

Trans-LoRA

May 27, 2024
Updated 1mo ago

Evaluation Results

Method Links
2024.05  45.41  -      -
2024.05  44.12  -      -
2024.05  43.61  -      -
2024.05  43.41  -      -
2024.05  -      43.32  -
2024.05  -      31.84  -
2024.05  -      -      37.85
2024.05  -      -      37.75