Code Generation on MBPP EvalPlus base
[Figure: Pass@1 by date. Plotted point: Claude 3.5 Sonnet, 90.5 Pass@1, Jul 31, 2024.]
Evaluation Results
Method             Parameters  Date     Pass@1
Claude 3.5 Sonnet  -           2024.07  90.5
Llama 3 405B       405B        2024.07  88.6
GPT-4o             -           2024.07  87.8
Llama 3 70B        70B         2024.07  86
GPT-4              -           2024.07  83.6
GPT-3.5 Turbo      -           2024.07  82
Mixtral 8x22B      8x22B       2024.07  78.6
Llama 3 8B         8B          2024.07  72.8
Nemotron 4 340B    340B        2024.07  72.8
Gemma 2 9B         9B          2024.07  71.7
Mistral 7B         7B          2024.07  49.5