Code Generation on RepoBench-P Python XF-First
[Chart: Exact Match (EM). Highlighted point: Ours, 52.4, May 18, 2026. Selectable metrics: Exact Match (EM), Exact Score (ES), Time To First Byte (TTFT), End-to-End Time (E2E).]
Updated 15d ago
Evaluation Results
| Method | Backbone | Date | Exact Match (EM) | Exact Score (ES) | Time To First Byte (TTFT) | End-to-End Time (E2E) |
|---|---|---|---|---|---|---|
| Ours | Llama-3.1-8B | 2026.05 | 52.4 | 73.8 | 118 | 3.8 |
| Repoformer | Llama-3.1-8B | 2026.05 | 51.8 | 73.4 | 245 | 5.6 |
| RepoHyper | Llama-3.1-8B | 2026.05 | 51.2 | 73.1 | 268 | 5.9 |
| RepoCoder | Llama-3.1-8B | 2026.05 | 49.8 | 71.5 | 285 | 6.2 |
| Sync-RAG | Llama-3.1-8B | 2026.05 | 48.3 | 70.2 | 312 | 6.8 |
| No-RAG | Llama-3.1-8B | 2026.05 | 38.2 | 62.4 | 45 | 1.8 |