SHARP: A Self-Evolving Human-Auditable Rubric Policy for Financial Trading Agents

About

Large language models (LLMs) are increasingly deployed for autonomous financial trading, a domain requiring continuous adaptation to noisy, non-stationary markets. Existing self-improving agents typically address this through unbounded free-form prompt optimization. However, in low signal-to-noise environments with delayed scalar rewards (P&L), this unstructured approach exacerbates the fundamental credit assignment problem: optimizers cannot reliably distinguish systematic logic flaws from stochastic market variance, which leads to policy drift. To overcome this bottleneck, we introduce the Self-Evolving Human-Auditable Rubric Policy (SHARP), a neuro-symbolic framework that replaces unconstrained text mutation with structured, symbolic policy optimization. SHARP confines the agent's reasoning to a bounded, human-readable rubric of explicit condition-action rules. When sub-optimal trades occur, an attribution agent reasons across multiple samples to isolate specific rule failures. This enables targeted, atomic policy edits that are subsequently regularized through strict walk-forward validation. Evaluated across three diverse equity sectors and four LLM backbones, SHARP consistently transforms generic initial heuristics into highly robust strategies, lifting the empirical performance of compact models (e.g., GPT-4o-mini) by 10 to 20 percentage points on average. Ultimately, SHARP demonstrates that LLMs can achieve dynamic and efficient adaptation while significantly enhancing the structural transparency and auditability demanded by institutional finance.
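The loop described in the abstract (a bounded rubric of condition-action rules, attribution of losses to a specific rule, one atomic edit, acceptance only on later out-of-sample data) can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`Rule`, `decide`, `attribute_failure`, `apply_edit`, the `momentum` feature, the toy data) is hypothetical, and a simple loss counter stands in for the LLM attribution agent.

```python
from collections import Counter
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Sequence, Tuple

Features = Dict[str, float]
Sample = Tuple[Features, float]  # (features at decision time, next-period return)


@dataclass(frozen=True)
class Rule:
    """One human-readable condition-action rule (hypothetical schema)."""
    name: str
    feature: str
    threshold: float
    action: int  # +1 = long, -1 = short

    def fires(self, x: Features) -> bool:
        return x[self.feature] > self.threshold


def decide(rubric: Sequence[Rule], x: Features) -> int:
    """First matching rule wins; if no rule fires, stay flat (0)."""
    for rule in rubric:
        if rule.fires(x):
            return rule.action
    return 0


def pnl(rubric: Sequence[Rule], window: Sequence[Sample]) -> float:
    """Sum of position times realized return over a window."""
    return sum(decide(rubric, x) * r for x, r in window)


def attribute_failure(rubric: Sequence[Rule], window: Sequence[Sample]) -> Optional[str]:
    """Stand-in for the attribution agent: the rule behind the most losing trades."""
    losses: Counter = Counter()
    for x, r in window:
        for rule in rubric:
            if rule.fires(x):
                if rule.action * r < 0:
                    losses[rule.name] += 1
                break  # only the rule that made the decision is blamed
    return losses.most_common(1)[0][0] if losses else None


def apply_edit(rubric: Sequence[Rule], name: str, new_threshold: float) -> List[Rule]:
    """Atomic edit: change one rule's threshold and leave the rest untouched."""
    return [replace(r, threshold=new_threshold) if r.name == name else r for r in rubric]


# Toy data (assumed): the earlier window proposes the edit, the later window validates it.
rubric = [Rule("buy_momentum", "momentum", 0.0, +1)]
train = [({"momentum": 0.1}, -0.02), ({"momentum": 0.2}, -0.01), ({"momentum": 0.8}, 0.03)]
valid = [({"momentum": 0.15}, -0.02), ({"momentum": 0.9}, 0.04)]

culprit = attribute_failure(rubric, train)
candidate = apply_edit(rubric, culprit, 0.5)

# Walk-forward style gate: keep the edit only if it helps on the later window.
if pnl(candidate, valid) > pnl(rubric, valid):
    rubric = candidate
```

Because each edit touches a single named rule, the accepted change can be read and reviewed as a one-line diff of the rubric, which is the auditability property the abstract emphasizes.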

Xiwen Chen, Wenhui Zhu, Songzhu Zheng, Kashif Rasul, Yueyue Deng, Huayu Li • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Financial Trading | AI Tech 16-stock universe, April 2025 to March 2026 | Return | 33.2 | 11 |
| Financial Trading | Cons. Disc. 16-stock universe, April 2025 to March 2026 | Return | 16.8 | 11 |
| Financial Trading | Biotech 16-stock universe, April 2025 to March 2026 | Return | 17.8 | 11 |
| Trading Signal Generation | Three-sector average walk-forward backtest (test) | Return (%) | 9.2 | 7 |
