QNP abstraction generation on Heavy domain (evaluation)
[Figure: Coverage Rate chart. Labeled series: GPT. Date shown: Feb 11, 2026.]
Updated 3mo ago
Evaluation Results

| Method   | Automated Debugging | Date    | Coverage Rate |
|----------|---------------------|---------|---------------|
| GPT      | true                | 2026.02 | 100           |
| Gemini   | true                | 2026.02 | 100           |
| Gemini   | false               | 2026.02 | 100           |
| GPT      | false               | 2026.02 | 75            |
| DeepSeek | true                | 2026.02 | 50            |
| Qwen     | true                | 2026.02 | 50            |
| DeepSeek | false               | 2026.02 | 0             |
| Qwen     | false               | 2026.02 | 0             |