Code Generation on BigCodeBench (Success Rate Metrics)
[Chart: Last Epoch Success Rate. Top point: MemRL, 59.5 (Jan 6, 2026). Metrics: Last Epoch Success Rate, Cumulative Success Rate (CSR).]
Updated 1mo ago
Evaluation Results
Method     Model   Date     Last Epoch Success Rate  Cumulative Success Rate (CSR)
MemRL      GPT-4o  2026.01  59.5                     0.627
MemP       GPT-4o  2026.01  57.8                     0.602
Self-RAG   GPT-4o  2026.01  49.7                     0.561
Mem0       GPT-4o  2026.01  48.7                     0.495
No Memory  GPT-4o  2026.01  48.5                     -
RAG        GPT-4o  2026.01  47.5                     0.483
Pass@10    GPT-4o  2026.01  -                        0.577