Code Generation on BigCodeBench (Success Rate Metrics)
[Chart: Last Epoch Success Rate over time. Highlighted point: MemRL, 59.5, Jan 6, 2026. Metrics: Last Epoch Success Rate, Cumulative Success Rate (CSR).]
Evaluation Results
Method      Model    Date      Last Epoch Success Rate   Cumulative Success Rate (CSR)
MemRL       GPT-4o   2026.01   59.5                      0.627
MemP        GPT-4o   2026.01   57.8                      0.602
Self-RAG    GPT-4o   2026.01   49.7                      0.561
Mem0        GPT-4o   2026.01   48.7                      0.495
No Memory   GPT-4o   2026.01   48.5                      -
RAG         GPT-4o   2026.01   47.5                      0.483
Pass@10     GPT-4o   2026.01   -                         0.577
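The results listed above can be compared programmatically. The following is a minimal sketch, not part of the benchmark itself: the `Result` type and the `rank_by` and `gain_over` helpers are hypothetical names introduced here for illustration, and entries shown as "-" are assumed to mean "not reported" and are stored as `None`.

```python
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    """One leaderboard row (hypothetical container, values copied from the table)."""
    method: str
    last_epoch: Optional[float]  # Last Epoch Success Rate; None where listed as "-"
    csr: Optional[float]         # Cumulative Success Rate; None where listed as "-"


RESULTS = [
    Result("MemRL", 59.5, 0.627),
    Result("MemP", 57.8, 0.602),
    Result("Self-RAG", 49.7, 0.561),
    Result("Mem0", 48.7, 0.495),
    Result("No Memory", 48.5, None),
    Result("RAG", 47.5, 0.483),
    Result("Pass@10", None, 0.577),
]


def rank_by(results: List[Result], metric: str) -> List[Result]:
    """Sort rows by a metric, highest first, skipping rows that do not report it."""
    scored = [r for r in results if getattr(r, metric) is not None]
    return sorted(scored, key=lambda r: getattr(r, metric), reverse=True)


def gain_over(results: List[Result], method: str, baseline: str,
              metric: str = "last_epoch") -> Optional[float]:
    """Absolute difference between two methods on one metric, or None if either is missing."""
    by_name = {r.method: r for r in results}
    a = getattr(by_name[method], metric)
    b = getattr(by_name[baseline], metric)
    if a is None or b is None:
        return None
    return round(a - b, 3)


if __name__ == "__main__":
    for row in rank_by(RESULTS, "csr"):
        print(f"{row.method:<10} {row.csr}")
    print("MemRL vs No Memory (last epoch):",
          gain_over(RESULTS, "MemRL", "No Memory"))
```

Skipping unreported values rather than treating them as zero keeps No Memory and Pass@10 from being ranked on a metric they were never scored on.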