Solution Simulation on Human Evaluation Solution Simulation (test)
[Chart: Score by date. Labeled point: GPT-4o, Score 3.75, May 26, 2025.]
Evaluation Results
Method                   Setting              Links     Score
GPT-4o                   Prototype Mapp...    2025.05   3.75
GPT-4o                   Prototype Mapp...    2025.05   3.7
GPT-3.5                  Prototype Mapp...    2025.05   3.3
GPT-3.5                  Random + Refine      2025.05   3.29
Claude-3.5-Sonnet        Prototype Mapp...    2025.05   3.27
Claude-3.5-Sonnet        Prototype Mapp...    2025.05   3.24
Llama-3.3-70B-Instruct   Prototype Mapp...    2025.05   3.08
Llama-3.3-70B-Instruct   Prototype Mapp...    2025.05   2.88