Trustworthiness Evaluation on RagTruth
[Chart: Score over time. Top score: 93.92 (DeepSeek-V3.2), May 23, 2026. Updated 8d ago.]
Evaluation Results
Method            Parameters   Date      Score
DeepSeek-V3.2     -            2026.05   93.92
JT-Safe-V2-35B    35B          2026.05   92.17
Qwen3-235B        235B         2026.05   92.09
Qwen3-32B         32B          2026.05   89.04
Qwen3.5-35B       35B          2026.05   69.61