
Learning Interpretable Style Embeddings via Prompting LLMs

About

Style representation learning builds content-independent representations of author style in text. Stylometry, the analysis of style in text, is often performed by expert forensic linguists, and no large dataset of stylometric annotations exists for training. Current style representation learning uses neural methods to disentangle style from content to create style vectors; however, these approaches result in uninterpretable representations, complicating their use in downstream applications like authorship attribution, where auditing and explainability are critical. In this work, we use prompting to perform stylometry on a large number of texts to create a synthetic dataset, and we train human-interpretable style representations we call LISA embeddings. We release our synthetic stylometry dataset and our interpretable style models as resources.
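The core idea above (prompt an LLM with stylometric questions, then use the answers as named embedding dimensions) can be illustrated with a minimal sketch. Everything here is hypothetical: the style questions, the `mock_llm_answer` stand-in, and the `lisa_style_vector` helper are invented for illustration and are not the paper's actual prompts, attributes, or models.

```python
# Hypothetical sketch: each yes/no style question becomes one
# interpretable, human-readable dimension of the style vector.

STYLE_QUESTIONS = [
    "uses formal language",
    "uses contractions",
    "uses exclamation marks",
]

def mock_llm_answer(text, question):
    # Stand-in for a real LLM prompt call: toy heuristics so the
    # sketch runs locally without any model or API.
    if question == "uses contractions":
        return any("'" in token for token in text.split())
    if question == "uses exclamation marks":
        return "!" in text
    if question == "uses formal language":
        return "!" not in text and "lol" not in text.lower()
    return False

def lisa_style_vector(text, answer=mock_llm_answer):
    """Interpretable style vector: one named dimension per attribute."""
    return {q: float(answer(text, q)) for q in STYLE_QUESTIONS}

vec = lisa_style_vector("I can't wait, this is great!")
```

Because each dimension is a named attribute rather than an opaque neural coordinate, a downstream auditor can read off exactly why two texts were judged stylistically similar.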

Ajay Patel, Delip Rao, Ansh Kothary, Kathleen McKeown, Chris Callison-Burch • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Style Representation Evaluation | STEL-or-Content Multilingual (averaged across languages) | Simplicity Score | 15 | 5
Style Representation Evaluation | STEL-or-Content Cross-lingual (averaged across languages) | Formality | 0.27 | 5
Authorship Verification | PAN AV 2015 (test) | ROC-AUC (Greek) | 0.48 | 4
Authorship Verification | PAN AV 2013 (test) | ROC-AUC (Greek) | 0.51 | 4
Authorship Verification | PAN AV 2014 (test) | ROC-AUC (Greek) | 0.46 | 4
Authorship Verification | PAN Average 2013-2015 (test) | Greek Avg ROC-AUC | 0.48 | 4
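The authorship-verification rows above report ROC-AUC, which can be computed directly from per-pair similarity scores. The sketch below uses the rank (Mann-Whitney) formulation of AUC; the scores and labels are invented for illustration and have no connection to the benchmark numbers in the table.

```python
# Hedged sketch: computing verification ROC-AUC from similarity scores.
# A system would score each document pair (e.g. cosine similarity of the
# two style embeddings); labels mark same-author (1) vs different-author (0).

def roc_auc(scores, labels):
    """ROC-AUC via the rank (Mann-Whitney U) formulation:
    the probability a positive pair outscores a negative pair,
    counting ties as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.4, 0.3, 0.6]  # invented similarity scores
labels = [1, 0, 1, 0, 1]            # invented same-author labels
auc = roc_auc(scores, labels)
```

A score of 0.5 corresponds to chance, which is why the PAN Greek results near 0.48-0.51 indicate the evaluated representation carries little authorship signal on that split.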
