
Do we still need Human Annotators? Prompting Large Language Models for Aspect Sentiment Quad Prediction

About

Aspect sentiment quad prediction (ASQP) facilitates a detailed understanding of opinions expressed in a text by identifying the opinion term, aspect term, aspect category and sentiment polarity for each opinion. However, annotating a full set of training examples to fine-tune models for ASQP is a resource-intensive process. In this study, we explore the capabilities of large language models (LLMs) for zero- and few-shot learning on the ASQP task across five diverse datasets. We report F1 scores almost up to par with those obtained with state-of-the-art fine-tuned models and exceeding previously reported zero- and few-shot performance. In the 20-shot setting on the Rest16 restaurant domain dataset, LLMs achieved an F1 score of 51.54, compared to 60.39 by the best-performing fine-tuned method MVP. Additionally, we report the performance of LLMs in target aspect sentiment detection (TASD), where the F1 scores were close to fine-tuned models, achieving 68.93 on Rest16 in the 30-shot setting, compared to 72.76 with MVP. While human annotators remain essential for achieving optimal performance, LLMs can reduce the need for extensive manual annotation in ASQP tasks.
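To make the task and its metric concrete, the following is a minimal sketch of how a sentiment quad and an F1 score over quads might be represented. It assumes exact-match, micro-averaged F1 (a prediction counts only if all four elements match a gold quad), which is a common convention for ASQP but is not spelled out in the abstract above. The names `Quad` and `quad_f1`, the example sentence, and the category labels are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """One opinion, described by the four ASQP elements (illustrative type)."""
    aspect_term: str
    aspect_category: str
    opinion_term: str
    polarity: str


def quad_f1(predicted, gold):
    """Micro-averaged exact-match F1 over parallel lists of per-sentence quads.

    Assumption: a predicted quad is correct only if all four fields
    equal those of a gold quad for the same sentence.
    """
    n_pred = n_gold = n_hit = 0
    for pred_quads, gold_quads in zip(predicted, gold):
        pred_set, gold_set = set(pred_quads), set(gold_quads)
        n_pred += len(pred_set)
        n_gold += len(gold_set)
        n_hit += len(pred_set & gold_set)
    precision = n_hit / n_pred if n_pred else 0.0
    recall = n_hit / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence: "The pizza was delicious but the service was slow."
gold = [[
    Quad("pizza", "food quality", "delicious", "positive"),
    Quad("service", "service general", "slow", "negative"),
]]
# The second prediction has the wrong polarity, so it does not count as a hit.
predicted = [[
    Quad("pizza", "food quality", "delicious", "positive"),
    Quad("service", "service general", "slow", "neutral"),
]]

score = quad_f1(predicted, gold)
print(f"F1 = {score:.2f}")
```

Under the same assumptions, dropping the `opinion_term` field would give a triple of aspect term, aspect category and polarity, which is the unit usually scored in TASD.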

Nils Constantin Hellwig, Jakob Fehle, Udo Kruschwitz, Christian Wolff • 2025

Related benchmarks

| Task | Dataset | Result (F1 Score) | Rank |
|---|---|---|---|
| Aspect Sentiment Quad Prediction | Rest15 | 52.07 | 93 |
| Aspect Sentiment Quad Prediction | Rest16 | 58.79 | 93 |
| Target Aspect Sentiment Detection | Rest15 | 65.94 | 63 |
| Target Aspect Sentiment Detection | Rest16 | 72.15 | 42 |
| Target Aspect Sentiment Detection | FlightABSA | 68.11 | 32 |
| Target Aspect Sentiment Detection | Rest 2016 | 68.53 | 31 |
| Target Aspect Sentiment Detection | Coursera | 46.83 | 29 |
| Target Aspect Sentiment Detection | Hotels | 66.92 | 29 |
| Aspect Sentiment Quad Prediction | FlightABSA | 56.9 | 23 |
| Aspect Sentiment Quad Prediction | Coursera | 32.02 | 23 |
Showing 10 of 14 rows
