
Safety and Informativeness Evaluation on OrBench Hard

Deception Rate (Safe): 75

Mistral MoE-XL

[Chart: axis ticks 39.952, 49.051, 58.15, 67.249; date May 30, 2026]
Updated 1d ago

Evaluation Results

Method    Links
2026.05
7572.79063.38066.99063.99070.69076.7857390709572.19672.287.170.1
2026.05
656571607067.585675462.48068.4906890658065767073.465.3
2026.05
48.150.854.344.258.848.773.355.251.946.66149.158.851.491.845.470.65136.463.359.4151.4
2026.05
41.341.956.241.849.440.964.144.353.24042.446.762.741.989.339.855.438.244.437.455.8441.3