Truthful and Informative Generation on TruthfulQA (test)

True*Info (%): 84.7

A-LQR

Updated 3mo ago

Evaluation Results

Method	Links	True*Info (%)	True (%)	Info (%)
A-LQR 2026.04		84.7	86.78	97.6
S-PID 2026.04		78.52	80.51	97.53
A-LQR 2026.04		76.28	80.81	94.39
ODESteer 2026.04		71.56	75.03	95.37
A-LQR 2026.04		67.81	73.17	92.68
Llama3-8B-chat + E/R + Coarse 2024.10		67.14	-	-
Llama3-8B-chat + FactTune-FS 2024.10		64.58	-	-
A-LQR 2026.04		63.63	65.8	96.7
ODESteer 2026.04		63.5	66.24	95.86
Llama3-8B-chat + EVER-Pref 2024.10		63.01	-	-
S-PID 2026.04		62.99	64.99	96.92
ODESteer 2026.04		62.36	64.11	97.26
Llama3-8B-chat + Self-Eval-SKT 2024.10		61.88	-	-
S-PID 2026.04		61.48	63.55	96.74
Llama3-8B-chat + SFT 2024.10		59.17	-	-
ActAdd 2026.04		59.14	65.26	90.76
ActAdd 2026.04		58.98	67.94	86.8
Llama3-8B-chat 2024.10		58.89	-	-
S-PID 2026.04		58.26	62.06	93.88
PID-AcT 2026.04		58.07	62.96	92.26
PID-AcT 2026.04		58.07	62.96	92.26
ITI 2026.04		57.01	59.09	96.47
Llama2-7B-chat + E/R + Coarse 2024.10		56.47	-	-
Linear-AcT 2026.04		55.94	57.6	97.14
ActAdd 2026.04		55.85	62.04	90.28
Mean-AcT 2026.04		55.81	55.32	97.1
Original 2026.04		55.29	57.02	96.96
Linear-AcT 2026.04		53.71	55.32	97.1
PID-AcT 2026.04		53.3	54.92	97.06
ITI 2026.04		53.24	59.07	90.13
ITI 2026.04		53.24	55.91	95.23
Original 2026.04		53.17	55.15	96.4
Llama2-7B-chat + FactTune-FS 2024.10		52.48	-	-
Llama2-7B-chat + EVER-Pref 2024.10		51.07	-	-
Mean-AcT 2026.04		50.12	53.85	93.06
Linear-AcT 2026.04		50.07	53.1	94.29
Linear-AcT 2026.04		50.07	53.1	94.29
Llama2-7B-chat + Self-Eval-SKT 2024.10		48.65	-	-
Original 2026.04		48.64	50.62	96.08
ITI 2026.04		47.54	50.18	94.74
Mean-AcT 2026.04		47.02	48.22	97.52
Original 2026.04		46.22	47.44	97.43
Llama2-7B-chat + SFT 2024.10		45.52	-	-
Llama2-7B-chat 2024.10		38.83	-	-
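
The page does not say how the three numeric columns relate. A minimal sketch, assuming the first column is True*Info (%) and the next two are True (%) and Info (%), checks whether the first is the product of the other two expressed as a percentage. The function name `true_info` and the 0.02 tolerance are illustrative choices, not part of the leaderboard.

```python
# Rows copied from the table above: (method, True*Info, True, Info), in percent.
# The column meanings are an assumption inferred from the numbers.
rows = [
    ("A-LQR", 84.7, 86.78, 97.6),
    ("S-PID", 78.52, 80.51, 97.53),
    ("ODESteer", 71.56, 75.03, 95.37),
    ("ITI", 57.01, 59.09, 96.47),
]


def true_info(true_pct, info_pct):
    """Product of two percentages, expressed as a percentage (assumed definition)."""
    return true_pct * info_pct / 100.0


for method, reported, true_pct, info_pct in rows:
    computed = true_info(true_pct, info_pct)
    # Allow a small tolerance for rounding in the reported values.
    agrees = abs(computed - reported) < 0.02
    print(f"{method:10s} reported={reported:6.2f} computed={computed:6.2f} agrees={agrees}")
```

The sampled rows agree with the product to within rounding. Not every row does: the Mean-AcT entry reporting 55.81 has True and Info values whose product is about 53.72, so the relation is best treated as an observed pattern rather than a stated definition.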