Trustworthiness evaluation

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Updated
LLM Trustworthiness Benchmark	Mi:dm K 2.5 Pro (March '26)	Bias Score	89.58	17	4mo ago
DVDs	LIME	Average F1	96.6	16	4mo ago
Books	LIME	Average F1	96.7	16	4mo ago
Trust-Memevo Tool-use Domain	TAME	No-Memory	81.8	14	4mo ago
Trust-Memevo Math Domain	Reasoningbank+Guard	No-Memory Score	36.7	14	4mo ago
Trust-Memevo Science Domain	Reasoningbank	No-Memory	81.3	14	4mo ago
AraTrust	LLaMA3-Tamed-70B	Accuracy	63.41	8	4mo ago
RagTruth	DeepSeek-V3.2	Score	93.92	5	2mo ago
TruthfulQA		TruthfulQA Score	81.88	5	2mo ago
Trustworthiness Average (human evaluation)	Sparse Activation Control	Control Win Rate	0.88	2	4mo ago
Adv Fact (human evaluation)	Sparse Activation Control	Control Wins	68	1	4mo ago
Privacy (human evaluation)	Sparse Activation Control	Control Wins	100	1	4mo ago
Robust (human evaluation)	Sparse Activation Control	Control Wins	100	1	4mo ago
Exag safety (human evaluation)	Sparse Activation Control	Control Wins	68	1	4mo ago
