Future-as-Label: Scalable Supervision from Real-World Outcomes

About

Time creates free supervision: forecasts about real-world events resolve to verifiable outcomes. The passage of time provides labels that require no annotation. To exploit this structure, we extend reinforcement learning with verifiable rewards to real-world prediction over time. We train language models to make probabilistic forecasts from causally masked information, using proper scoring rules as the reward function once events resolve. Learning is driven entirely by realized outcomes, enabling scalable outcome-based supervision in open-world prediction. On real-world forecasting benchmarks, Qwen3-32B trained using Foresight Learning improves Brier score by 27% and halves calibration error relative to its pretrained baseline, and outperforms Qwen3-235B on both constructed future-event prediction tasks and the Metaculus benchmark despite a 7x parameter disadvantage.
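To make the reward and the reported metrics concrete, here is a minimal sketch. It assumes the proper scoring rule is the Brier score (the abstract reports Brier score but does not say which rule is used as the reward) and that calibration error is measured with equal-width probability bins. The function names `brier_reward`, `mean_brier`, and `expected_calibration_error` are illustrative, not taken from the paper.

```python
from typing import Sequence


def brier_reward(p: float, outcome: int) -> float:
    """Reward for one resolved binary event: the negative squared error.

    Assumption: the Brier score is the proper scoring rule used as reward.
    The value lies in [-1, 0]; 0 means a fully confident, correct forecast.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return -((p - outcome) ** 2)


def mean_brier(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Average Brier score over a set of resolved forecasts (lower is better)."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Binned calibration error with equal-width bins (one common definition).

    Assumption: the paper's calibration metric may be defined differently.
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        # Weight each bin's |confidence - frequency| gap by its share of forecasts.
        ece += (len(members) / total) * abs(mean_p - freq)
    return ece
```

Because the Brier score is a proper scoring rule, the expected reward is maximized by reporting the true probability of the event, so a model trained on it has no incentive to overstate or understate confidence.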

Benjamin Turtel, Paul Wilczewski, Danny Franklin, Kris Skothiem • 2026

Related benchmarks

Task	Dataset	Result	Rank
Probabilistic Forecasting	Metaculus and Polymarket (test)	Brier Score: 0.137	30
Forecasting	Kalshi August 2025 resolution filter (OOD)	Brier Score: 0.258	10

