Zero-shot Evaluation on Arc-e, PIQA, Hellaswag, OpenBookQA, Winogrande, MMLU, and BoolQ (test)

Best Arc-e Accuracy: 31.63 (Peri-LN)

[Chart: Arc-e Accuracy by date (Dec 26, 2025)]
Updated 4d ago

Evaluation Results

Date     Arc-e  PIQA   Hellaswag  OpenBookQA  Winogrande  MMLU   BoolQ  Avg
2025.12  31.63  63.07  32.05      32.4        49.41       24.91  58.05  41.65
2025.12  31.39  62.94  32.8       33.5        50.37       24.88  56.67  41.79
2025.12  30.97  62.89  32.77      32.32       50.54       25.7   56.26  41.64
2025.12  30.5   62.42  32.11      33.88       50.15       25.19  61.86  42.3
2025.12  29.28  58.84  26.51      30.85       50.34       24.9   39.08  37.11
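
The final value in each row is consistent with the unweighted mean of the seven task accuracies. A minimal Python check, using the first and last rows; the helper name `mean_score` is hypothetical, and the task order is assumed to follow the page title:

```python
# Task order is an assumption taken from the page title.
TASKS = ["Arc-e", "PIQA", "Hellaswag", "OpenBookQA", "Winogrande", "MMLU", "BoolQ"]

# (per-task accuracies, final value reported in the row), copied from the results.
rows = [
    ([31.63, 63.07, 32.05, 32.4, 49.41, 24.91, 58.05], 41.65),
    ([29.28, 58.84, 26.51, 30.85, 50.34, 24.9, 39.08], 37.11),
]


def mean_score(scores):
    """Unweighted mean of per-task accuracies, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


for scores, reported in rows:
    # Each row must carry exactly one score per task.
    assert len(scores) == len(TASKS)
    print(f"computed={mean_score(scores)} reported={reported}")
```

Because the mean is unweighted, a large swing on a single task (BoolQ in the last row) moves the final value more than small differences on near-chance tasks such as MMLU.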