FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
About
Large language models (LLMs) fine-tuned on multimodal financial data have demonstrated impressive reasoning capabilities across a variety of financial tasks. However, they often struggle with multi-step, goal-oriented scenarios in interactive financial markets, such as trading, where complex agentic approaches are required to improve decision-making. To address this, we propose FLAG-Trader, a unified architecture that integrates linguistic processing (via LLMs) with gradient-driven reinforcement learning (RL) policy optimization. A partially fine-tuned LLM acts as the policy network, leveraging pre-trained knowledge while adapting to the financial domain through parameter-efficient fine-tuning. Through policy-gradient optimization driven by trading rewards, our framework not only enhances the LLM's trading performance but also improves results on other financial-domain tasks. We present extensive empirical evidence to validate these enhancements.
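To make the core idea concrete, here is a minimal, hedged sketch of the training loop described above: a frozen backbone standing in for the pre-trained LLM, a small trainable policy head standing in for the parameter-efficient fine-tuned part, and a REINFORCE-style policy-gradient update driven by a toy trading reward. This is not the authors' implementation; `TinyBackbone`, `PolicyHead`, the observation features, and the reward function are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

ACTIONS = ["BUY", "HOLD", "SELL"]  # discrete trading actions

class TinyBackbone(nn.Module):
    """Stand-in for a pre-trained LLM; its weights stay frozen."""
    def __init__(self, obs_dim=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, hidden), nn.Tanh())

    def forward(self, x):
        return self.net(x)

class PolicyHead(nn.Module):
    """Small trainable part (analogous to fine-tuning only top layers)."""
    def __init__(self, hidden=64, n_actions=len(ACTIONS)):
        super().__init__()
        self.out = nn.Linear(hidden, n_actions)

    def forward(self, h):
        return torch.distributions.Categorical(logits=self.out(h))

backbone, head = TinyBackbone(), PolicyHead()
for p in backbone.parameters():      # freeze pre-trained knowledge
    p.requires_grad_(False)
opt = torch.optim.Adam(head.parameters(), lr=1e-3)  # update head only

def trading_reward(action_idx, price_change):
    """Toy reward: profit when the position matches the price move."""
    position = {0: 1.0, 1: 0.0, 2: -1.0}[action_idx]  # BUY/HOLD/SELL
    return position * price_change

for step in range(200):
    obs = torch.randn(8)                   # placeholder market features
    price_change = torch.randn(()).item()  # synthetic next-step return
    dist = head(backbone(obs))
    action = dist.sample()
    reward = trading_reward(action.item(), price_change)
    loss = -dist.log_prob(action) * reward  # REINFORCE policy gradient
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the actual framework, the frozen backbone would be a real LLM consuming a textual market-state prompt, and the trainable portion would be its top layers, but the split between frozen pre-trained parameters and a small gradient-updated policy component follows the same pattern.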
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Single Asset Trading | TSLA (test) | Compound Return (CR) % | 50.394 | 24 |
| Stock Trading | JNJ stock | Calmar Ratio | 33.724 | 15 |
| Stock Trading | BTC (test) | Compound Return (CR) | 0.4551 | 15 |
| Stock Trading | UVV stock | Compound Return (CR) | 46.799 | 15 |
| Stock Trading | MSFT stock | Compound Return (CR) | 20.106 | 15 |
| Stock Trading | HON (test) | Calmar Ratio (%) | 34.342 | 15 |