SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning

About

As humans, we are natural any-horizon reasoners, i.e., we can decide whether to iteratively skim long videos or watch short ones in full when necessary for a given task. With this in mind, one would expect video reasoning models to reason flexibly across different durations. However, SOTA models are still trained to predict answers in a single turn while processing a large number of frames, akin to watching an entire long video, requiring significant resources. This raises the question: Is it possible to develop performant any-horizon video reasoning systems? Inspired by human behavior, we first propose SAGE, an agent system that performs multi-turn reasoning on long videos while handling simpler problems in a single turn. Secondly, we introduce an easy synthetic data generation pipeline using Gemini-2.5-Flash to train the orchestrator, SAGE-MM, which lies at the core of SAGE. We further propose an effective RL post-training recipe essential for instilling any-horizon reasoning ability in SAGE-MM. Thirdly, we curate SAGE-Bench with an average duration of greater than 700 seconds for evaluating video reasoning ability in real-world entertainment use cases. Lastly, we empirically validate the effectiveness of our system, data, and RL recipe, observing notable improvements of up to 6.1% on open-ended video reasoning tasks, as well as an impressive 8.2% improvement on videos longer than 10 minutes.

Jitesh Jain, Jialuo Li, Zixian Ma, Jieyu Zhang, Chris Dongjoo Kim, Sangho Lee, Rohun Tripathi, Tanmay Gupta, Christopher Clark, Humphrey Shi • 2025

Related benchmarks

Task             Dataset                  Result                Rank
Video Reasoning  SAGE-Bench 1.0 (test)    Overall Score: 73.4   29
Video Reasoning  MINERVA overall (test)   Accuracy: 32.9        8
Video Reasoning  MINERVA 600+s (test)     Accuracy: 29          8
Video Reasoning  MINERVA 0-600s (test)    Accuracy: 35.6        8
