
Logical Deduction Five Objects on Big-Bench Hard (test)

Accuracy: 52.33

EvoPrompt(DE)-OPTS(TS)

[Chart: Accuracy by date; Mar 3, 2025]
Updated 1mo ago

Evaluation Results

Date      Accuracy   Method   Links
2025.03   52.33
2025.03   48.17
2025.03   2.67
2025.03   1.67