Software Engineering Question Answering on SWE-QA Streamlink
[Chart: Score by date. Top entry: Unpruned, Score 8.74, May 14, 2026. Available metrics: Score, Rounds, Tokens (K).]
Evaluation Results
Method      Backbone           Date     Score  Rounds  Tokens (K)
Unpruned    Claude Opus 4.6    2026.05  8.74   17.4    394.4
LaMR        Claude Opus 4.6    2026.05  8.68   18.5    353.2
SWE-Pruner  Claude Opus 4.6    2026.05  8.6    19      398.6
SWE-Pruner  Claude Sonnet...   2026.05  8.59   23.9    587.1
LaMR        Claude Sonnet...   2026.05  8.53   16.3    467.7
Unpruned    Claude Sonnet...   2026.05  8.44   23.4    629.3