QNP Abstraction Generation on Ferry domain (evaluation)
[Chart: Coverage Rate over time. Highlighted entry: GPT, Coverage Rate 1 (Feb 11, 2026). Updated 3mo ago.]
Evaluation Results

Method    Automated Debugging  Date     Coverage Rate
GPT       true                 2026.02  1
Gemini    true                 2026.02  1
DeepSeek  true                 2026.02  0.55
GPT       false                2026.02  0.15
Gemini    false                2026.02  0.15
DeepSeek  false                2026.02  0
Qwen      true                 2026.02  0
Qwen      false                2026.02  0