
SWE-QA

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Software Engineering Question Answering | SWE-QA Pro | Rubric Judge Accuracy | 51.4 | 15 |
| Codebase QA | SWE-QA (test) | Score | 80.28 | 9 |
| Software Engineering Question Answering | SWE-QA Conan | Score | 8.71 | 6 |
| Software Engineering Question Answering | SWE-QA Reflex | Overall Score | 8.15 | 6 |
| Software Engineering Question Answering | SWE-QA Streamlink | Score | 8.74 | 6 |