StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

About

We present StreamBridge, a simple yet effective framework that seamlessly transforms offline Video-LLMs into streaming-capable models. It addresses two fundamental challenges in adapting existing models to online scenarios: (1) limited capability for multi-turn real-time understanding, and (2) lack of proactive response mechanisms. Specifically, StreamBridge incorporates (1) a memory buffer combined with a round-decayed compression strategy, supporting long-context multi-turn interactions, and (2) a decoupled, lightweight activation model that can be effortlessly integrated into existing Video-LLMs, enabling continuous proactive responses. To further support StreamBridge, we construct Stream-IT, a large-scale dataset tailored for streaming video understanding, featuring interleaved video-text sequences and diverse instruction formats. Extensive experiments show that StreamBridge significantly improves the streaming understanding capabilities of offline Video-LLMs across various tasks, outperforming even proprietary models such as GPT-4o and Gemini 1.5 Pro. At the same time, it achieves competitive or superior performance on standard video understanding benchmarks.

Haibo Wang, Bo Feng, Zhengfeng Lai, Mingze Xu, Shiyu Li, Weifeng Ge, Afshin Dehghan, Meng Cao, Ping Huang • 2025
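
The abstract names a memory buffer with round-decayed compression but does not give its details. As a rough illustration only, the sketch below assumes one plausible reading: each interaction round stores its visual tokens, and when the buffer exceeds a token budget, rounds are mean-pooled into a single token starting from the oldest, while the newest round is left intact. The names `MemoryBuffer`, `max_tokens`, `append_round`, and `mean_pool` are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List

Token = List[float]  # one visual token embedding (toy representation)


def mean_pool(tokens: List[Token]) -> Token:
    """Average a non-empty list of equal-length token vectors into one."""
    dim = len(tokens[0])
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(dim)]


@dataclass
class MemoryBuffer:
    """Hypothetical buffer: one list of tokens per interaction round."""
    max_tokens: int  # assumed token budget for the whole buffer
    rounds: List[List[Token]] = field(default_factory=list)

    def total(self) -> int:
        return sum(len(r) for r in self.rounds)

    def append_round(self, tokens: List[Token]) -> None:
        self.rounds.append(list(tokens))
        self._compress()

    def _compress(self) -> None:
        # Assumption: compress oldest rounds first and never touch the
        # newest round. If the newest round alone exceeds the budget,
        # the buffer stays over budget.
        for i in range(len(self.rounds) - 1):
            if self.total() <= self.max_tokens:
                return
            if len(self.rounds[i]) > 1:
                self.rounds[i] = [mean_pool(self.rounds[i])]


buf = MemoryBuffer(max_tokens=6)
buf.append_round([[1.0, 1.0], [3.0, 3.0], [1.0, 1.0], [3.0, 3.0]])
buf.append_round([[0.0, 0.0]] * 4)   # 8 tokens > 6: round 0 is pooled
buf.append_round([[5.0, 5.0]] * 3)   # over budget again: round 1 is pooled
print([len(r) for r in buf.rounds])  # [1, 1, 3]
```

In this toy version, older rounds lose detail first and the most recent round keeps full resolution, which is the general intent suggested by "round-decayed"; the actual pooling granularity and budget used by StreamBridge may differ.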

Related benchmarks

Task	Dataset	Result
Streaming Video Understanding	StreamingBench	Overall 57.12	259
Real-Time Visual Understanding	StreamingBench	Overall Score 73.79	134
Long Video Understanding	VideoMME	Accuracy 64.4	89
Online Video Understanding	OVOBench 1.0 (test)	Real-Time Perception 71.3	27
Real-time Streaming	OVO-Bench	RTVP 71.3	17
Streaming Video Understanding	OVOBench	Accuracy (Proactive Forwarding) 48.4	17
Real-time Streaming	StreamingBench	RTVU 77	15
Readiness-aware streaming understanding	ProReady-QA	SSR Accuracy 72.2	14
Online VideoQA	StreamingBench 1.0 (test)	Real-Time Score 77	14
Dense Video Captioning	E.T.Bench	--	14

Showing 10 of 16 rows
