Legal Reasoning on LegalBench CUAD Cardlytics Buffalo Wild Wings PF Hospitality 2023
Top result: Agentic Adversarial QA, 82.7 Accuracy (Cardl)
[Chart: Accuracy (Cardl), Accuracy (Buffa), Accuracy (Pfhos), Accuracy (Avg); date shown: Feb 20, 2026]
Updated 4d ago
Evaluation Results
| Method | Settings | Date | Accuracy (Cardl) | Accuracy (Buffa) | Accuracy (Pfhos) | Accuracy (Avg) |
|---|---|---|---|---|---|---|
| Agentic Adversarial QA | Tokens=96k, Backbone=L... | 2026.02 | 82.7 | 79.6 | 85.7 | 82.7 |
| Entigraph | Tokens=6.7M, Backbone=... | 2026.02 | 80.4 | 76.5 | 82 | 79.6 |
| Knowledge-Instruct | Tokens=159k, Backbone=... | 2026.02 | 78.6 | 70.4 | 75.8 | 75 |
| Model-indep. QA | Tokens=147k, Backbone=... | 2026.02 | 75 | 74.1 | 78.3 | 75.8 |
| Paraphrase ×6 | Tokens=149k, Backbone=... | 2026.02 | 68.5 | 70.4 | 77 | 71.9 |
| No Extra Data | Tokens=0, Backbone=LLa... | 2026.02 | 67.3 | 69.1 | 72.1 | 69.5 |
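The page does not say how Accuracy (Avg) is computed. A plausible reading, offered here only as an assumption, is that it is the unweighted mean of the three per-contract accuracies. The sketch below checks that reading against the listed values; the names `results`, `mean_accuracy`, and `max_gap` are illustrative and not part of the benchmark.

```python
# Rows copied from the results table: (Cardl, Buffa, Pfhos, listed Avg).
results = {
    "Agentic Adversarial QA": (82.7, 79.6, 85.7, 82.7),
    "Entigraph": (80.4, 76.5, 82.0, 79.6),
    "Knowledge-Instruct": (78.6, 70.4, 75.8, 75.0),
    "Model-indep. QA": (75.0, 74.1, 78.3, 75.8),
    "Paraphrase ×6": (68.5, 70.4, 77.0, 71.9),
    "No Extra Data": (67.3, 69.1, 72.1, 69.5),
}


def mean_accuracy(scores):
    """Unweighted mean of the per-contract accuracies (an assumption)."""
    return sum(scores) / len(scores)


def max_gap(table):
    """Largest absolute difference between the recomputed mean and the listed Avg."""
    return max(abs(mean_accuracy(row[:3]) - row[3]) for row in table.values())


for method, row in results.items():
    print(f"{method:24s} mean={mean_accuracy(row[:3]):.2f} listed={row[3]}")

print(f"largest gap: {max_gap(results):.3f}")
```

Under this assumption every recomputed mean lands within 0.1 point of the listed average. The small residual gaps would be consistent with the site averaging unrounded scores, though the page does not confirm this.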