Claim-level specificity control on LongFact pilot
[Chart: Claims Emitted. Highlighted point: Calibrated CSS, 724 claims emitted, Apr 19, 2026. Selectable metrics: Claims Emitted, Total Claims, Support Precision, Specificity Retention, Supported Specificity, OAU.]
Updated 1mo ago
Evaluation Results

| Method | Method (details) | Links | Claims Emitted | Total Claims | Support Precision | Specificity Retention | Supported Specificity | OAU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Calibrated CSS | Policy=Calibrated, Mod... | 2026.04 | 724 | 757 | 99.31 | 94.11 | 93.55 | 92.89 |
| Oracle CSS | Policy=Oracle, Model=G... | 2026.04 | 723 | 757 | 100 | 94.4 | 94.4 | 94.4 |
| Uncalibrated CSS | Policy=Uncalibrated, M... | 2026.04 | 721 | 757 | 99.58 | 59 | 58.76 | 58.36 |
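
The three result rows can be compared programmatically. The following is a minimal sketch, not part of the benchmark itself: the `Row` class, `emitted_share`, and `gap_to_oracle` are hypothetical helpers introduced here for illustration, and the "emitted share" (Claims Emitted divided by Total Claims) is a derived quantity that the page does not report. The numbers are copied from the results above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard row (hypothetical container; field names are assumptions)."""
    method: str
    claims_emitted: int
    total_claims: int
    support_precision: float
    specificity_retention: float
    supported_specificity: float
    oau: float


# Values copied from the Evaluation Results listing.
ROWS = [
    Row("Calibrated CSS", 724, 757, 99.31, 94.11, 93.55, 92.89),
    Row("Oracle CSS", 723, 757, 100.0, 94.4, 94.4, 94.4),
    Row("Uncalibrated CSS", 721, 757, 99.58, 59.0, 58.76, 58.36),
]


def emitted_share(row: Row) -> float:
    """Percentage of total claims that were emitted (derived, not reported)."""
    return 100.0 * row.claims_emitted / row.total_claims


def gap_to_oracle(row: Row, oracle: Row, metric: str) -> float:
    """Oracle value minus this row's value for the named metric, rounded to 2 places."""
    return round(getattr(oracle, metric) - getattr(row, metric), 2)


if __name__ == "__main__":
    oracle = next(r for r in ROWS if r.method == "Oracle CSS")
    for r in ROWS:
        print(
            f"{r.method}: emitted share {emitted_share(r):.2f}, "
            f"OAU gap to oracle {gap_to_oracle(r, oracle, 'oau'):.2f}"
        )
```

Comparing each policy against the Oracle row makes the difference between the calibrated and uncalibrated policies easy to read off for any of the reported metrics.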