Chess

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Classification | chess | F1 Score 81.19 | 26 |
| Safe Move Prediction | Chess (Exploratory Learning Phase) | Blunder Rate 24.11 | 16 |
| Anomaly Detection | chess | ROC-AUC 93.22 | 14 |
| Anomaly Detection | chess | PR-AUC 1.86 | 14 |
| Classification | chess | Accuracy 85.4 | 12 |
| Tool Selection | Chess Skill: beginner, intermediate, advanced | Accuracy 100 | 10 |
| Tool Selection | Chess Specialists: opening, midgame, endgame, late-endgame | Accuracy 64.4 | 10 |
| Tabular Classification | chess (test) | F1 Score 85 | 8 |
| Chess Position Evaluation | Chess 1,002 disjoint positions | Mean Evaluation (cp) -243 | 5 |
| Extrapolation | Chess (8, 8) (test) | Accuracy 84.3 | 4 |
| Classification | Chess (test) | Macro F1-score 82.42 | 3 |
| Board state reconstruction | Chess | Coverage 98 | 3 |
| Decision Tree Rashomon Set Calculation | Chess | Runtime 6.12 | 2 |
| Chess | Chess Match vs Stockfish 100 games | Win Rate 25 | 2 |
| Decision Tree Rashomon Set construction | Chess | Construction Time 6.12 | 1 |
| Instruction Recovery | Chess O3 | Extracted Instructions 126,516,521 | 1 |
| Instruction Recovery | Chess O2 | Extracted Instructions Count 1,905,184 | 1 |
| Instruction Recovery | Chess O0 | Extracted Instructions Count 8,568,798 | 1 |