
Rationale Behind Essay Scores: Enhancing S-LLM's Multi-Trait Essay Scoring with Rationale Generated by LLMs

About

Existing automated essay scoring (AES) has relied solely on essay text without using explanatory rationales for the scores, thereby forgoing an opportunity to capture, in a fine-grained manner, the specific aspects evaluated by rubric indicators. This paper introduces Rationale-based Multiple Trait Scoring (RMTS), a novel approach to multi-trait essay scoring that integrates prompt-engineering-based large language models (LLMs) with a fine-tuning-based essay scoring model built on a smaller large language model (S-LLM). RMTS uses an LLM-based trait-wise rationale generation system in which a separate LLM agent generates trait-specific rationales based on rubric guidelines; the scoring model then uses these rationales to accurately predict multi-trait scores. Extensive experiments on benchmark datasets, including ASAP, ASAP++, and Feedback Prize, show that RMTS significantly outperforms state-of-the-art models and vanilla S-LLMs in trait-specific scoring. By complementing quantitative assessment with fine-grained qualitative rationales, RMTS improves trait-wise reliability and provides partial explanations for the assigned scores. The code is available at https://github.com/BBeeChu/RMTS.git.
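The two-stage pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the rubric snippets, function names, and trait set are hypothetical, and the LLM agent call is stubbed with a placeholder string so the sketch stays self-contained.

```python
# Sketch of an RMTS-style pipeline: (1) an LLM agent produces a
# trait-specific rationale from rubric guidelines, (2) the essay and
# rationales are combined into the input for the S-LLM scorer.
# TRAIT_RUBRICS and all function names are illustrative assumptions.

TRAIT_RUBRICS = {
    "organization": "Logical structure and transitions between ideas.",
    "conventions": "Grammar, spelling, and punctuation accuracy.",
}

def rationale_prompt(essay: str, trait: str, rubric: str) -> str:
    """Prompt sent to the rationale-generating LLM agent for one trait."""
    return (
        f"Rubric for '{trait}': {rubric}\n"
        f"Essay:\n{essay}\n"
        "Explain, with respect to the rubric, how well the essay "
        "satisfies this trait."
    )

def scoring_input(essay: str, rationales: dict[str, str]) -> str:
    """Concatenate the essay with trait-wise rationales for the scorer."""
    parts = [f"Essay:\n{essay}"]
    for trait, rationale in rationales.items():
        parts.append(f"[{trait} rationale] {rationale}")
    return "\n".join(parts)

if __name__ == "__main__":
    essay = "Recycling reduces waste. Also it saves energy."
    # In RMTS this comes from the LLM agent; here it is stubbed.
    rationales = {
        trait: f"(LLM-generated rationale for {trait})"
        for trait in TRAIT_RUBRICS
    }
    print(scoring_input(essay, rationales))
```

The scoring model would be fine-tuned to map such rationale-augmented inputs to per-trait scores; only the input-assembly step is shown here.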

SeongYeub Chu, JongWoo Kim, Bryan Wong, MunYong Yi • 2024

Related benchmarks

Task                                  Dataset                        Result               Rank
Automated essay scoring               ASAP++ full-data setting       Score (P1) 0.716     10
Multi-trait automated essay scoring   ASAP++ full-data setting       Overall score 0.755  10
Automated essay scoring               ASAP++ 32-data setting (test)  QWK (P1) 0.479       6
Multi-trait automated essay scoring   ASAP++ 32-data setting (test)  Overall score 0.494  6
