Sentiment and topic classification on Subj
Best result: 74.1 Macro F1 (BC)
[Chart: Macro F1 over time, May 28, 2023 to May 22, 2025]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Macro F1 |
|---|---|---|---|
| BC | Model=Qwen, Shots=4 | 2025.05 | 74.1 |
| GPT-J (DC) | Backbone=GPT-J, Prompt... | 2023.05 | 70.7 |
| GPT-J (Original) | Backbone=GPT-J, Prompt... | 2023.05 | 65.2 |
| SC | Model=Qwen, Shots=4 | 2025.05 | 62.23 |
| GPT-J (CC) | Backbone=GPT-J, Prompt... | 2023.05 | 61.9 |
| SC | Model=Mistral, Shots=4 | 2025.05 | 59.38 |
| SC | Model=Llama, Shots=4 | 2025.05 | 55.79 |
| BC | Model=Llama, Shots=4 | 2025.05 | 54.15 |
| BC | Model=Mistral, Shots=4 | 2025.05 | 48.05 |
| Base LLM | Model=Llama, Shots=4 | 2025.05 | 40.18 |
| CC | Model=Qwen, Shots=4 | 2025.05 | 38.54 |
| DC | Model=Qwen, Shots=4 | 2025.05 | 36.97 |
| Base LLM | Model=Mistral, Shots=4 | 2025.05 | 35.03 |
| Base LLM | Model=Qwen, Shots=4 | 2025.05 | 33.02 |
| CC | Model=Llama, Shots=4 | 2025.05 | 32.36 |
| DC | Model=Llama, Shots=4 | 2025.05 | 32.36 |
| CC | Model=Mistral, Shots=4 | 2025.05 | 31.55 |
| DC | Model=Mistral, Shots=4 | 2025.05 | 31.55 |
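
The page does not say how Macro F1 is computed. Assuming it follows the standard definition (the unweighted mean of per-class F1, reported here on a 0 to 100 scale), a minimal sketch looks like the following. The function name `macro_f1` and the label strings are hypothetical, chosen only for illustration; they are not taken from the benchmark.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, scaled to 0-100.

    Assumption: classes are the union of labels seen in y_true and y_pred,
    and a class with no true or predicted instances contributes F1 = 0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    per_class_f1 = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)

    return 100.0 * sum(per_class_f1) / len(per_class_f1)


# Hypothetical two-class example (label names are illustrative only).
gold = ["subjective", "objective", "subjective", "objective"]
pred = ["subjective", "subjective", "subjective", "objective"]
score = macro_f1(gold, pred)
print(f"Macro F1: {score:.2f}")
```

Because every class is weighted equally, a method that collapses to a single predicted label on a balanced two-class task scores roughly 33, which may be worth keeping in mind when reading the lower rows of the table.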