
Code Correctness Prediction on MultiPL-E Java (AUROC)

0.705 AUROC

Min. Prob.

[Chart: AUROC over time. Y-axis ticks: 0.5178, 0.5664, 0.615, 0.6636. X-axis: May 27, 2026]
Updated 6d ago
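
For readers unfamiliar with the metric, here is a minimal sketch of how an AUROC for code correctness prediction could be computed. It assumes that "Min. Prob." refers to using the minimum token probability of a generated program as its confidence score; the page does not define the method, so that reading is an assumption. The function names `min_prob_confidence` and `auroc` and the toy data are hypothetical and are not taken from the benchmark.

```python
from typing import Sequence


def min_prob_confidence(token_probs: Sequence[float]) -> float:
    """Confidence for one generated program: its lowest token probability.

    Assumption: this is what "Min. Prob." means; the page does not say.
    """
    return min(token_probs)


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a correct sample outscores an incorrect one.

    labels: 1 = program passed its tests, 0 = program failed.
    Ties between a correct and an incorrect sample count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one correct and one incorrect sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): per-token probabilities of four programs.
samples = [
    [0.9, 0.8, 0.95],
    [0.6, 0.99, 0.7],
    [0.4, 0.9, 0.85],
    [0.97, 0.3, 0.9],
]
labels = [1, 0, 1, 0]  # whether each program passed its unit tests

scores = [min_prob_confidence(s) for s in samples]
print(auroc(scores, labels))  # 0.75
```

An AUROC of 0.5 corresponds to a score that ranks correct and incorrect programs no better than chance, and 1.0 to a perfect ranking. The pairwise loop is quadratic and only meant for illustration; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice on real data.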

Evaluation Results

Date      AUROC   Method   Links
2026.05   0.705
2026.05   0.701
2026.05   0.697
2026.05   0.695
2026.05   0.693
2026.05   0.691
2026.05   0.69
2026.05   0.69
2026.05   0.687
2026.05   0.687
2026.05   0.686
2026.05   0.681
2026.05   0.676
2026.05   0.674
2026.05   0.672
2026.05   0.665
2026.05   0.663
2026.05   0.662
2026.05   0.662
2026.05   0.658
2026.05   0.653
2026.05   0.652
2026.05   0.647
2026.05   0.645
2026.05   0.644
2026.05   0.644
2026.05   0.64
2026.05   0.64
2026.05   0.638
2026.05   0.637
2026.05   0.635
2026.05   0.632
2026.05   0.628
2026.05   0.627
2026.05   0.626
2026.05   0.623
2026.05   0.623
2026.05   0.618
2026.05   0.616
2026.05   0.616
2026.05   0.615
2026.05   0.615
2026.05   0.61
2026.05   0.601
2026.05   0.599
2026.05   0.598
2026.05   0.596
2026.05   0.595
2026.05   0.592
2026.05   0.589
2026.05   0.588
2026.05   0.572
2026.05   0.553
2026.05   0.551
2026.05   0.551
2026.05   0.548
2026.05   0.548
2026.05   0.536
2026.05   0.529
2026.05   0.525