
Zero-shot Evaluation on 7 Tasks

72.79 Mean Accuracy (Zero-shot)

BTC-LLM

[Chart: Mean Accuracy (Zero-shot) over time, May 24, 2025 to Sep 30, 2025]
Updated 13d ago

Evaluation Results

Date      Mean Accuracy   Δ
2025.05   72.79           -
2025.05   72.71           -
2025.05   72.25           -
2025.05   71.5            -
2025.05   69.47           -
2025.05   67.73           -
2025.05   66.53           -
2025.05   65.53           -
2025.05   65.45           -
2025.05   65.24           -
2025.05   62.77           -
2025.05   62.65           -
2025.05   62.11           -
2025.05   60.71           -
2025.09   60              0
2025.05   59.8            -
2025.09   59.28           0
2025.05   59              -
2025.09   58.81           -1.19
2025.09   58.74           -1.26
2025.09   58.57           -1.43
2025.05   58.23           -
2025.09   57.7            -2.3
2025.09   57.4            -1.88
2025.09   57.08           -2.2
2025.05   56.98           -
2025.09   56.95           -2.33
2025.05   55.88           -
2025.09   55.83           -3.45
2025.09   54.84           0
2025.09   54.82           -0.02
2025.09   54.16           -0.68
2025.09   53.64           0
2025.09   53.6            0
2025.09   53.57           -1.27
2025.09   52.93           -0.71
2025.09   52.9            -1.94
2025.09   52.74           -0.86
2025.09   52.59           -1.05
2025.09   52.12           -1.52
2025.09   52.02           -1.58
2025.09   52              -1.6
2025.09   51.96           -1.68
2025.09   51.59           0
2025.09   51.16           -0.43
2025.09   51.11           0.01
2025.09   51.1            0
2025.09   50.34           -0.76
2025.09   50.27           -1.32
2025.09   49.78           -1.81
2025.09   49.48           -1.62
2025.09   49.25           -4.35
2025.09   48.52           -3.07
2025.09   48.42           -2.68
2025.05   46.48           -