
AlignTune: Modular Toolkit for Post-Training Alignment of Large Language Models

About

Post-training alignment is central to deploying large language models (LLMs), yet practical workflows remain split across backend-specific tools and ad-hoc glue code, making experiments hard to reproduce. We identify backend interference, reward fragmentation, and irreproducible pipelines as key obstacles in alignment research. We introduce AlignTune, a modular toolkit exposing a unified interface for supervised fine-tuning (SFT) and RLHF-style optimization with interchangeable TRL and Unsloth backends. AlignTune standardizes configuration, provides an extensible reward layer (rule-based and learned), and integrates evaluation over standard benchmarks and custom tasks. By isolating backend-specific logic behind a single factory boundary, AlignTune enables controlled comparisons and reproducible alignment experiments.
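The "single factory boundary" idea can be sketched as follows. This is a minimal illustration only; the class and function names below (`TrainerBackend`, `make_backend`, `run_sft`) are assumptions for the sketch, not AlignTune's actual API.

```python
# Hypothetical sketch of isolating backend-specific logic behind a factory.
# Names here are illustrative, not AlignTune's real interface.
from abc import ABC, abstractmethod


class TrainerBackend(ABC):
    """Unified interface: callers never touch backend-specific code."""

    @abstractmethod
    def run_sft(self, model_name: str, dataset: list) -> str: ...


class TRLBackend(TrainerBackend):
    def run_sft(self, model_name, dataset):
        # A real implementation would construct a TRL SFT trainer here.
        return f"TRL SFT on {model_name} ({len(dataset)} examples)"


class UnslothBackend(TrainerBackend):
    def run_sft(self, model_name, dataset):
        # A real implementation would use Unsloth's fast-training path here.
        return f"Unsloth SFT on {model_name} ({len(dataset)} examples)"


def make_backend(name: str) -> TrainerBackend:
    """Factory boundary: the only place backend names are interpreted."""
    backends = {"trl": TRLBackend, "unsloth": UnslothBackend}
    return backends[name]()


backend = make_backend("trl")
print(backend.run_sft("my-llm", ["example prompt"]))
```

Because experiment code depends only on the abstract interface, swapping TRL for Unsloth becomes a one-line configuration change, which is what makes controlled backend comparisons possible.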

Zera Marveen Lyngkhoi, Chirag Chawla, Pratinav Seth, Utsav Avaiya, Soham Bhattacharjee, Mykola Khandoga, Rui Yuan, Vinay Kumar Sankarapu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Wealth Management Chatbot Response Generation | Bitext Wealth Management LLM Chatbot (test) | BLEU | 0.2692 | 6 |
| Conversational Response Generation | Bitext Retail Banking LLM Chatbot (test) | BLEU | 26.85 | 5 |
