Share your thoughts, 1 month free Claude Pro on us
See more
Home
/
Benchmarks
Knowledge Conflict Resolution on Counterfact-style 40 q
Loading...
67.5
Accuracy
Conflict-Aware
41.5
48.25
55
61.75
Apr 26, 2026
Accuracy
Updated 1mo ago
Evaluation Results
Method
Method
Links
Accuracy
Conflict-Aware
Backbone=Mistral-7B-In...
2026.04
67.5
SLB
Backbone=Mistral-7B-In...
2026.04
57.5
Baseline
Backbone=Mistral-7B-In...
2026.04
42.5
Feedback
Search any
task
Search any
task