Novelty Evaluation on NoveltyBench
[Chart: Overall Dominance by method. Top entry: BACO-P-PUNC, 44 (Nov 7, 2025). Metrics available: Overall Dominance, Overall Coverage.]
Evaluation Results
Method        Details                      Date      Overall Dominance   Overall Coverage
BACO-P-PUNC   Models=Olmo2-7B and Ol...    2025.11   44                  23.6
Aligned       Models=Olmo2-7B and Ol...    2025.11   30.5                20.9
Base          Models=Olmo2-7B and Ol...    2025.11   11.7                9.8
Others        Models=Olmo2-7B and Ol...    2025.11   8.3                 -
Nudging       Models=Olmo2-7B and Ol...    2025.11   5.5                 28.1