Question Answering and Reasoning on Macro-average (MMLU, MATH, GSM8K, BBH)
Chart: Cost Reduction and Accuracy Improvement by method. Top entry: CISC, Cost Reduction 46 (Feb 10, 2025).
Updated 3d ago
Evaluation Results
| Method | Configuration | Date | Cost Reduction | Accuracy Improvement |
| --- | --- | --- | --- | --- |
| CISC | Confidence Method=P(Tr... | 2025.02 | 46 | 1.1 |
| CISC | Confidence Method=P(Tr... | 2025.02 | 41 | 1.6 |
| CISC | Confidence Method=Resp... | 2025.02 | 31 | 0.8 |
| CISC | Confidence Method=Verb... | 2025.02 | 30 | 0.4 |
| CISC | Confidence Method=Verb... | 2025.02 | 22 | 0.8 |
| CISC | Confidence Method=Resp... | 2025.02 | 22 | 1.1 |
| CISC | Confidence Method=Verb... | 2025.02 | 18 | 0.4 |
| CISC | Confidence Method=Verb... | 2025.02 | 10 | 0.2 |