Knowledge Conflict Resolution on Held-out 30 q
Top result: 76.7 Accuracy (Conflict-Aware)
[Chart: Accuracy by date; data point at Apr 26, 2026]
Updated 1mo ago
Evaluation Results
Method          Backbone          Date     Accuracy
Conflict-Aware  Mistral-7B-In...  2026.04  76.7
SLB             Mistral-7B-In...  2026.04  70
Baseline        Mistral-7B-In...  2026.04  50