Arabic Financial Sentiment Classification on Arabic Financial Sentiment Gold Standard (250 Samples)
Top result: DeepSeek R1 (Chat), 61.6 Accuracy (May 19, 2026)
Metrics: Accuracy, Precision, Recall, F1 Score, Macro F1 Score
Updated 14d ago
Evaluation Results
| Method | Date | Accuracy | Precision | Recall | F1 Score | Macro F1 Score |
| --- | --- | --- | --- | --- | --- | --- |
| DeepSeek R1 (Chat) | 2026.05 | 61.6 | 47.3 | 61.6 | 53.4 | 36 |
| GPT-5 | 2026.05 | 57.2 | 60.2 | 57.2 | 58.3 | 82.9 |
| DeepSeek R1 (Reasoner) | 2026.05 | 56.8 | 57.1 | 56.8 | 55.6 | 73.9 |
| GPT-4o Mini (used to construct gold...) | 2026.05 | 49.6 | 63.4 | 49.6 | 55.6 | 0 |
| GPT-4 Turbo (used to construct gold...) | 2026.05 | 49.6 | 63.4 | 49.6 | 55.6 | 0 |
| Gemini 2.5 Flash (used to construct gold...) | 2026.05 | 49.6 | 63.4 | 49.6 | 55.6 | 0 |