LLM Workflow Optimization on Big-Bench Hard (test)
[Chart: BBH Overall Accuracy, BBH NLP Accuracy, and BBH Algorithmic Accuracy over time. Best BBH Overall Accuracy: 78.6 (Trace); date shown on the time axis: Jun 23, 2024.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | BBH Overall Accuracy (%) | BBH NLP Accuracy (%) | BBH Algorithmic Accuracy (%) |
|---|---|---|---|---|---|
| Trace | Evaluation Protocol=0-... | 2024.06 | 78.6 | 75.8 | 80.6 |
| DSPy-PO | Evaluation Protocol=0-... | 2024.06 | 71.6 | 73.9 | 70 |
| DSPy | Evaluation Protocol=0-... | 2024.06 | 70.4 | 73.7 | 68 |
| Trace | Evaluation Protocol=0-... | 2024.06 | 59.5 | 70.9 | 51.1 |
| DSPy-PO | Evaluation Protocol=0-... | 2024.06 | 55.3 | 69 | 45.2 |
| DSPy | Evaluation Protocol=0-... | 2024.06 | 41.6 | 53.8 | 32.6 |