Natural shifts

Benchmarks

Task Name                     Dataset Name                         SOTA Result                   Trend
Model improvement             Natural shifts All-Bins              I-prop: 100                   6
Model improvement             Natural shifts Bin (0.16, 0.319)     Average Improvement: 52.1     6
Model improvement             Natural shifts Bin (-0.001, 0.16)    Average Improvement: 27.189   6
Model improvement             Natural shifts Bin (-0.041, -0.001)  Average Improvement: 18.047   6
Model improvement             Natural shifts Bin (-0.082, -0.041)  Average Improvement: 25.824   6
Model Performance Estimation  Natural shifts All-Bins              Power: 0.963                  4
Model Performance Estimation  Natural shifts bin (0.16, 0.319)     Estimation Inaccuracy: 0.9    4
Model Performance Estimation  Natural shifts bin (-0.001, 0.16)    Estimation Inaccuracy: 0.5    4
Model Performance Estimation  Natural shifts bin (-0.041, -0.001)  Estimation Inaccuracy: 0.034  4
Model Performance Estimation  Natural shifts bin (-0.082, -0.041)  Estimation Inaccuracy: 1.8    4