Ideology as a Problem: Lightweight Logit Steering for Annotator-Specific Alignment in Social Media Analysis

About

LLMs internally organize political ideology along low-dimensional structures that are partially, but not fully, aligned with human ideological space. This misalignment is systematic, model-specific, and measurable. We introduce a lightweight linear probe that both quantifies the misalignment and minimally corrects the output layer, giving a simple and efficient way to align models with specific user opinions. Instead of retraining the model, we calculate a bias score from its internal features and directly adjust the final output probabilities. The approach is practical and low-cost, and it preserves the model's original reasoning ability.
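The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea of probe-based logit steering, not the authors' implementation. It assumes a hidden-state vector, a linear probe that maps it to a scalar bias score, and a per-label direction along which the output logits are shifted. All names (`bias_score`, `steer_logits`, `label_direction`, `strength`), the three-label setup, and the sign convention are hypothetical.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bias_score(hidden, probe_weights, probe_bias=0.0):
    """Linear probe: project a hidden-state vector onto one learned direction."""
    return sum(h * w for h, w in zip(hidden, probe_weights)) + probe_bias


def steer_logits(logits, score, label_direction, strength=1.0):
    """Shift each label logit against the measured bias.

    Assumption: label_direction[i] says how label i loads on the bias axis
    (for example +1, 0, -1), and the correction subtracts the bias.
    """
    return [z - strength * score * d for z, d in zip(logits, label_direction)]


# Toy values, invented for illustration only.
hidden = [0.5, -1.0, 2.0]           # hypothetical hidden features
probe_weights = [0.2, 0.1, 0.4]     # hypothetical trained probe
logits = [1.0, 0.5, 0.2]            # hypothetical logits for three stance labels
label_direction = [1.0, 0.0, -1.0]  # hypothetical loading of each label

score = bias_score(hidden, probe_weights)
steered = steer_logits(logits, score, label_direction, strength=0.5)
before = softmax(logits)
after = softmax(steered)
```

Because the correction touches only the final logits, the model weights stay untouched; setting `strength` to zero recovers the original output distribution.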

Wei Xia, Haowen Tang, Luozheng Li • 2025

Related benchmarks

Task	Dataset	Result	Rank
Stance Detection	MITweet 12 Facets	Accuracy 66.83	10

