Relational linear probing on TRUTH
[Chart not reproduced: F1 (GT), axis ticks from 35.04 to 74.73; highlighted point KL-RP at 86, May 21, 2026. Metric tabs: F1 (GT), F1 (LLM), dKL Divergence.]
Updated 12d ago
Evaluation Results
| Method | Model | Layer | Date | F1 (GT) | F1 (LLM) | dKL Divergence |
|--------|-----------|-------|---------|---------|----------|----------------|
| KL-RP  | Llama-3.1 | 16    | 2026.05 | 86      | 94       | 0.04           |
| LRE    | Llama-3.1 | 16    | 2026.05 | 84      | 90       | 0.25           |
| LRE    | Gemma-2   | 13    | 2026.05 | 80      | 92       | 0.1            |
| KL-RP  | Gemma-2   | 13    | 2026.05 | 72      | 90       | 0.42           |
| Random | Llama-3.1 | 16    | 2026.05 | 50      | 49       | 0.49           |
| Random | Gemma-2   | 13    | 2026.05 | 37      | 47       | 0.82           |