
Multiple-choice question answering on Model-Written Evaluations (MWE) MCQ

88.8 Wealth Acc

SVF

[Chart: axis ticks 58.224, 66.162, 74.1, 82.038; date label Feb 2, 2026]
Updated 3mo ago

Evaluation Results

Method  Links
2026.02  88.8  80.2  85.4  77    95.6  94    73.2  75.8  96.8  92.8  75.2  61.2  81    79.6  85.1  80.1
2026.02  86.8  87.6  83.8  77.6  96.8  96.8  82.6  78.6  96.2  88.4  74    58.8  90.2  87.6  87.2  82.2
2026.02  83.4  72.2  88.2  74.8  85.4  82.2  37.2  69.8  99.2  86.6  62.6  52.2  31    50.4  69.6  69.7
2026.02  81.8  73.2  77.4  75.6  92.4  50    69.4  72    98.6  49.8  41.2  49.8  58.6  49.8  74.2  60
2026.02  79.4  63.8  84.8  66    81.6  78.4  42.4  40.2  96.8  83.4  67    56.4  34.4  38    69.5  60.9
2026.02  78.2  44.6  84.2  52.6  74.4  76.6  39.4  33.2  95.6  80.2  62.6  42    33.4  36.2  66.8  52.2
2026.02  76.8  42.6  87.6  53    70.2  72.6  39.2  32.8  96.4  76.6  59.6  41    34    36    66.3  50.7
2026.02  74.6  46.8  78.8  49.8  77.4  73.8  38.6  40.4  94.8  70.2  63.4  47.8  34.6  38.8  66    52.5
2026.02  69.2  -     75.6  -     52.6  -     34.4  -     94.4  -     59.2  -     33.2  -     59.8  -
2026.02  65.2  49.6  64.4  64.6  74.8  50.4  65.4  65.8  79.8  49.8  50.4  50.6  49.8  49.8  64.3  54.4
2026.02  64    49.6  64.4  64.6  74.2  48.2  52.6  62.2  78.4  49.8  49.8  49.8  50.2  50.2  61.9  53.5
2026.02  62.4  42    60    46.4  59.8  48.6  47.8  44.6  64.6  50.6  50.2  50.6  49.8  49.8  56.4  47.5
2026.02  62.2  67.4  68.8  68.4  90.4  50    45.8  68.4  52.2  70.2  49.8  66.4  49.8  49.8  59.9  62.9
2026.02  59.4  -     58    -     56.6  -     43.6  -     48.2  -     48.2  -     50.2  -     52    -
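
The column names for these results did not survive on this page, so the sketch below rests on an assumption rather than anything the source states. Splitting the first result row into 16 values, the last two appear to be the means of the alternating columns before them (1st, 3rd, ... and 2nd, 4th, ...). The helper name `alternating_means` is hypothetical and only illustrates that arithmetic check.

```python
# First result row, split into 16 values. Treating the last two
# entries as averages is an assumption, not stated by the source.
row = [88.8, 80.2, 85.4, 77, 95.6, 94, 73.2, 75.8,
       96.8, 92.8, 75.2, 61.2, 81, 79.6, 85.1, 80.1]


def alternating_means(values):
    """Return the means of the 1st, 3rd, ... and the 2nd, 4th, ... entries."""
    first = values[0::2]
    second = values[1::2]
    return sum(first) / len(first), sum(second) / len(second)


scores, reported = row[:14], row[14:]
first_mean, second_mean = alternating_means(scores)

# Compare against the reported values at one-decimal precision.
matches = (round(first_mean, 1) == reported[0]
           and round(second_mean, 1) == reported[1])
print(round(first_mean, 1), round(second_mean, 1), matches)
```

The same check can be run on any other row; rows that show `-` in every second position only carry the first of the two averages.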