
Experiment

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Annotation Accuracy | Experiment 1 (test) | F1 Score (Ga): 100 | 40 |
| Touch Pointing Parameter Estimation | Experiment 3 (Leave-One-Out Cross-Validation) | R2: 0.948 | 22 |
| Touch Pointing Parameter Estimation | Experiment 3 Full Aggregate Data (train) | R2: 0.991 | 22 |
| Modeling pointing movement time | Experiment 2 no-correction condition | Adjusted R-squared: 0.9622 | 14 |
| Success Rate and Distribution Parameter Regression | Experiment 2 (LOOCV) | R2 Score: 0.947 | 14 |
| Success Rate and Distribution Parameter Regression | Experiment 2 | R2: 0.999 | 14 |
| Network Security Policy Enforcement | Experiment Thread Network 1 | Requests Sent: 100 | 12 |
| Success Rate Prediction | Experiment 1 (LOOCV) | R2: 2.64 | 10 |
| Success Rate Prediction | Experiment 1 | R2: 2.65 | 10 |
| Trajectory Planning | Experiment Simulation 1 | Trajectory Time (s): 4.46 | 7 |
| Approximate Bayesian Inference | Experiment 5D input Gaussian Process 4.2.1 | Expected NLL (Hyper): 0.31 | 6 |
| Geometric Specificity Measurement | Experiment 2 Small-deformation regime lambda=0.05 | RM (λ=0.05): 161.73 | 6 |
| Lyrics-to-song generation | Experiment 3 (evaluation set) | CLAP: 0.417 | 5 |
| Fairness-constrained classification | Experiment E6 (test) | Best Loss: 2.9 | 4 |
| Fairness-constrained classification | Experiment E4 (test) | Best Loss: 0.41 | 4 |
| Fairness-constrained classification | Experiment E3 (test) | Best Loss: 0.42 | 4 |
| Fairness-constrained classification | Experiment E2 (test) | Best Loss: 0.42 | 4 |
| Fairness-constrained classification | Experiment E1 (test) | Best Loss: 0.41 | 4 |
| box_logsumexp optimization | Experiment d = 20 3 | Test Relative Error: 0.031 | 4 |
| Causal Graph Metric Agreement Analysis | Experiment 3 within-n centered | Pearson Correlation: 0.885 | 4 |
| Debate Generation | Experiment 1 Input Set | Choose Rate: 76.32 | 4 |
| Circuit board assembly | Experiment 4.4.3 | Total Assembly Time (s): 26.7 | 3 |
| Gear assembly | Experiment 4.4.2 | Total Assembly Time: 63.5 | 3 |
| Robotic task execution | Experiment 4.4.1 | Task Completion Time (s): 6.56 | 3 |
| Interaction control | Experiment 4.2.2 | Contact established (s): 1.4 | 3 |
Showing 25 of 36 rows