
Noise Sensitivity Analysis on 420-item set v4 vs v2 (full)

Delta Performance (v4 vs v2): 6.58

gpt-5.1

Apr 20, 2026
Updated 1mo ago

Evaluation Results

Method  Links
2026.04
6.58
2026.04
5.22
4.35
2026.04
3.65
3.2
2026.04
1.58