Question Answering on NQ-Open (Adversarial/Delta Evaluation)
[Chart: PTR Adversarial Success Count for PTR, value 4 on Apr 5, 2026. Metrics available: PTR Adversarial Success Count, ReAct Adversarial Success Count, Average Delta EM.]
Updated 12d ago
Evaluation Results
Method                                   PTR Adversarial Success Count   ReAct Adversarial Success Count   Average Delta EM
PTR (average over 4 models, 2026.04)     4                               0                                 0.292
ReAct (average over 4 models, 2026.04)   4                               0                                 0.292