G-Loss: Graph-Guided Fine-Tuning of Language Models
About
Traditional loss functions used for fine-tuning pre-trained language models such as BERT, including cross-entropy, contrastive, triplet, and supervised contrastive losses, operate only within local neighborhoods and fail to account for the global semantic structure. We present G-Loss, a graph-guided loss function that incorporates semi-supervised label propagation to exploit structural relationships within the embedding manifold. G-Loss builds a document-similarity graph that captures global semantic relationships, thereby guiding the model to learn more discriminative and robust embeddings. We evaluate G-Loss on five benchmark datasets covering key downstream classification tasks: MR (sentiment analysis), R8 and R52 (topic categorization), Ohsumed (medical document classification), and 20NG (news categorization). In the majority of experimental setups, G-Loss converges faster and produces semantically coherent embedding spaces, resulting in higher classification accuracy than models fine-tuned with traditional loss functions.
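To make the idea of a label-propagation loss over a document-similarity graph concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the Gaussian kernel, the closed-form propagation F = (I - αS)⁻¹Y, the function name `label_propagation_loss`, and the parameters `sigma` and `alpha` are all assumptions chosen for illustration. The sketch hides the labels of some documents in a batch, propagates the remaining labels across the graph, and scores the hidden documents with cross-entropy.

```python
import numpy as np

def label_propagation_loss(emb, labels, labeled_mask, n_classes,
                           sigma=1.0, alpha=0.9, eps=1e-12):
    """Illustrative graph-based loss (assumed formulation, not the paper's).

    emb:          (n, d) document embeddings
    labels:       (n,) integer class labels
    labeled_mask: (n,) bool, True where the label is visible to propagation
    """
    n = emb.shape[0]
    # Gaussian similarity graph over all pairs of documents (assumed kernel).
    sq_dist = np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops

    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    deg = W.sum(axis=1)
    S = W / np.sqrt(np.outer(deg, deg))

    # One-hot seed matrix: rows of hidden documents stay all-zero.
    Y = np.zeros((n, n_classes))
    seen = np.flatnonzero(labeled_mask)
    Y[seen, labels[seen]] = 1.0

    # Closed-form label propagation: F = (I - alpha * S)^{-1} Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    probs = F / (F.sum(axis=1, keepdims=True) + eps)

    # Cross-entropy on the documents whose labels were hidden.
    hidden = np.flatnonzero(~labeled_mask)
    loss = -np.mean(np.log(probs[hidden, labels[hidden]] + eps))
    return loss, probs
```

Under these assumptions the loss is small when documents of the same class sit close together in embedding space and large when a hidden document sits among labeled documents of another class. In an actual fine-tuning loop the same computation would be written in an autodiff framework so that gradients flow back into the encoder.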
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text Classification | MR | Accuracy | 90.82 | 174 |
| Text Classification | R8 | Accuracy | 98.18 | 91 |
| Text Classification | R52 | Accuracy | 96.65 | 76 |
| Text Classification | Ohsumed | Accuracy | 75.76 | 33 |
| Text Classification | 20NG | Accuracy | 85.33 | 20 |
| Text Classification | MR | Accuracy | 0.9087 | 12 |
| Text Classification | GLUE | SST-2 Accuracy | 95.88 | 9 |