
Average Performance across 10 Task Types on 13 Datasets (test)

75.8 Avg. Accuracy

PROMPTED

Updated 5mo ago

Evaluation Results

Method	Links	Avg. Accuracy
PROMPTED 2023.10		75.8
PROMPTED 2023.10		68.8
Output Refinement 2023.10		68.6
Zero-Shot CoT 2023.10		67.3
Zero-Shot 2023.10		65.7
Output Refinement 2023.10		64.1
Zero-Shot CoT 2023.10		63.4
Zero-Shot 2023.10		62.2