Large-Scale Aspect-Based Sentiment Analysis with Reasoning-Infused LLMs

About

We introduce Arctic-ABSA, a collection of powerful models for real-life aspect-based sentiment analysis (ABSA). Our models are tailored to commercial needs, trained on a large corpus of public data alongside carefully generated synthetic data, resulting in a dataset 20 times larger than SemEval14. We extend typical ABSA models by expanding the number of sentiment classes from the standard three (positive, negative, neutral) to five, adding mixed and unknown classes, while also jointly predicting overall text sentiment and supporting multiple languages. We experiment with reasoning injection by fine-tuning on Chain-of-Thought (CoT) examples and introduce a novel reasoning pretraining technique for encoder-only models that significantly improves downstream fine-tuning and generalization. Our 395M-parameter encoder and 8B-parameter decoder achieve up to 10 percentage points higher accuracy than GPT-4o and Claude 3.5 Sonnet, while setting new state-of-the-art results on the SemEval14 benchmark. A single multilingual model maintains 87-91% accuracy across six languages without degrading English performance. We release ABSA-mix, a large-scale benchmark aggregating 17 public ABSA datasets across 92 domains.
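To make the task setup concrete, the five-class label space and the joint aspect-plus-overall prediction described above can be sketched as a small data structure. This is an illustrative sketch only: the names `Sentiment`, `ABSAPrediction`, and `aspect_accuracy` are hypothetical and are not taken from the Arctic-ABSA release, and the accuracy function is a plain per-aspect match rate, assumed here for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Sentiment(str, Enum):
    """Five-class label space: the standard three plus mixed and unknown."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    MIXED = "mixed"
    UNKNOWN = "unknown"


@dataclass
class ABSAPrediction:
    """Hypothetical container: one overall label plus one label per aspect."""
    overall: Sentiment
    aspects: Dict[str, Sentiment] = field(default_factory=dict)


def aspect_accuracy(gold: Dict[str, Sentiment], pred: Dict[str, Sentiment]) -> float:
    """Fraction of gold aspects whose predicted label matches exactly.

    Aspects missing from `pred` count as errors (an assumption of this sketch).
    """
    if not gold:
        return 0.0
    correct = sum(1 for aspect, label in gold.items() if pred.get(aspect) == label)
    return correct / len(gold)


# Made-up example: a review that praises the food but criticizes the service.
gold = {"food": Sentiment.POSITIVE, "service": Sentiment.NEGATIVE}
prediction = ABSAPrediction(
    overall=Sentiment.MIXED,
    aspects={"food": Sentiment.POSITIVE, "service": Sentiment.NEUTRAL},
)
score = aspect_accuracy(gold, prediction.aspects)
```

Here one of the two aspects is labeled correctly, so `score` is 0.5. The overall label is stored separately because the abstract describes it as predicted jointly with the aspect labels.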

Paweł Liskowski, Krzysztof Jankowski • 2026

Related benchmarks

Task	Dataset	Result	
Aspect-based Sentiment Analysis	SemEval Task 4 Subtask 2 Restaurant domain 2014 (test)	Accuracy 91.76	30
Aspect-based Sentiment Analysis	SemEval Laptop 2014	--	19
Aspect-based Sentiment Analysis	ABSA-mix	Accuracy 93.03	11
Aspect-based Sentiment Analysis	Overalls (Overall Sentiments Dataset)	Accuracy 90	11
Aspect-based Sentiment Analysis	FABSA	Accuracy 97.17	11
Aspect-based Sentiment Analysis	SENTFIN	Accuracy 91.53	11
Aspect-based Sentiment Analysis	SemEval Restaurant 2014	Accuracy 89.34	11
Aspect-based Sentiment Analysis	SemEval Laptop 2014 (test)	Accuracy 87.16	9

