Novel Knowledge Recall on KID-Bench Category A v2
[Chart: Accuracy. Top result: Conflict-Aware, 97.1 (Apr 26, 2026).]
Updated 1mo ago
Evaluation Results
Method          Details                    Date     Accuracy
Conflict-Aware  Backbone=Gemma-2B, Dec...  2026.04  97.1
D2L Baseline    Backbone=Gemma-2B, Dec...  2026.04  96.7
SLB             Backbone=Gemma-2B, Dec...  2026.04  95.5