An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs

About

Word Sense Disambiguation (WSD) remains a key challenge in Natural Language Processing (NLP), especially when dealing with rare or domain-specific senses that are often misinterpreted. While modern high-parameter Large Language Models (LLMs) such as GPT-4-Turbo have shown state-of-the-art WSD performance, their computational and energy demands limit scalability. This study investigates whether low-parameter LLMs (<4B parameters) can achieve comparable results through fine-tuning strategies that emphasize reasoning-driven sense identification. Using the FEWS dataset augmented with semi-automated, rationale-rich annotations, we fine-tune eight small-scale open-source LLMs (e.g., Gemma and Qwen). Our results reveal that Chain-of-Thought (CoT)-based reasoning combined with neighbour-word analysis achieves performance comparable to GPT-4-Turbo in zero-shot settings. Importantly, Gemma-3-4B and Qwen-3-4B models consistently outperform all medium-parameter baselines and state-of-the-art models on FEWS, with robust generalization to unseen senses. Furthermore, evaluation on the unseen "Fool Me If You Can" dataset confirms strong cross-domain adaptability without task-specific fine-tuning. This work demonstrates that with carefully crafted reasoning-centric fine-tuning, low-parameter LLMs can deliver accurate WSD while substantially reducing computational and energy demands.

Deshan Sumanathilaka, Nicholas Micallef, Julian Hough • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Word Sense Disambiguation | 42D | F1 Score | 78.48 | 19 |
| Word Sense Disambiguation | hardEN | F1 Score | 54.19 | 19 |
| Word Sense Disambiguation | FEWS (test) | F1 Score | 76.52 | 19 |
| Word Sense Disambiguation | FEWS | Noun WSD Accuracy | 81 | 12 |
| Binary classification of sense ID | Fool me if you can (Set 4) | F1 Score | 85.2 | 10 |
| Binary classification of sense ID | Fool me if you can (Set 1) | F1 Score | 97 | 10 |
| Binary classification of sense ID | Fool me if you can (Set 2) | F1 Score | 97.2 | 10 |
| Binary classification of sense ID | Fool me if you can (Set 3) | F1 Score | 84.7 | 10 |
