Question Answering on PolicyQA (test)

0.484 SAE

GPT-4o-mini Multi-agent-few

Jun 3, 2025

Evaluation Results

Method  Links
2025.06  0.484  0.46   0.475  0.473  0.469  0.467  0.471  0.006  0.024
2025.06  0.478  0.423  0.458  0.452  0.444  0.438  0.449  0.014  0.055
2025.06  0.474  0.476  0.494  0.48   0.487  0.48   0.482  0.006  0.02
2025.06  0.464  0.444  0.451  0.458  0.447  0.445  0.452  0.006  0.02
2025.06  0.455  0.436  0.429  0.437  0.422  0.422  0.434  0.009  0.033
2025.06  0.451  0.48   0.474  0.483  0.463  0.481  0.472  0.01   0.032
2025.06  0.446  0.483  0.468  0.472  0.492  0.477  0.473  0.011  0.046
2025.06  0.412  0.332  0.36   0.357  0.393  0.37   0.371  0.021  0.08
2025.06  0.4    0.38   0.391  0.385  0.394  0.372  0.387  0.008  0.028
2025.06  0.381  0.374  0.368  0.358  0.372  0.368  0.37   0.006  0.023
2025.06  0.352  0.343  0.332  0.338  0.331  0.323  0.337  0.008  0.029
2025.06  0.31   0.26   0.268  0.231  0.237  0.289  0.266  0.023  0.079