BCER Agent: Reliable Long-Horizon MRI Workflow Execution via Compilation, Artifact Binding, and Bounded Local Recovery

About

Many recent medical VLM and agent studies are benchmarked on 2D images or comparatively short tool-calling exchanges, whereas real MRI analysis typically demands long, interdependent pipelines that operate on 3D/4D volumetric data. Under these conditions, reactive tool-calling agents are prone to cascading breakdowns triggered by faulty intermediate references, mismatched tool arguments, and limited control over cross-step dependencies. To address this, we introduce BCER (Brain-Cerebellum-Extremity-Reflector), a controller architecture aimed at dependable long-horizon MRI workflow execution. BCER decouples high-level planning from execution and provides bounded local recovery. We assess BCER on a multi-organ MRI benchmark covering brain, prostate, and cardiac tasks with both short- and long-chain workflows, using matched task contracts across controller variants and several backbone models. Relative to reactive baselines, BCER yields consistent improvements in end-to-end execution, with the most pronounced gains observed on long-chain workflows. BCER additionally enables auditability by maintaining explicit links between final outputs and intermediate artifacts and measurements. Code and benchmark are released at https://github.com/Albertlongzi/BCER.
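The three ideas named in the abstract (a plan fixed before execution, explicit artifact binding, and bounded local recovery) can be illustrated with a toy sketch. This is not the BCER implementation. Every name below (`Step`, `RunLog`, `execute`, `trace`, `max_retries`) and the list-of-numbers "volume" are hypothetical stand-ins chosen for illustration, and a plain retry stands in for whatever recovery policy the authors actually use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    tool: Callable[..., object]
    inputs: List[str]   # names of artifacts this step consumes
    output: str         # name of the artifact this step produces


@dataclass
class RunLog:
    artifacts: Dict[str, object] = field(default_factory=dict)
    provenance: Dict[str, List[str]] = field(default_factory=dict)


def execute(plan: List[Step], initial: Dict[str, object], max_retries: int = 2) -> RunLog:
    """Run a fixed plan; each step reads and writes named artifacts only."""
    log = RunLog(artifacts=dict(initial))
    for step in plan:
        # Artifact binding: refuse to run on references that do not exist.
        missing = [k for k in step.inputs if k not in log.artifacts]
        if missing:
            raise KeyError(f"{step.name}: unbound artifacts {missing}")
        args = [log.artifacts[k] for k in step.inputs]

        # Bounded local recovery: retry this step only, a fixed number of times.
        last_error = None
        for _ in range(max_retries + 1):
            try:
                result = step.tool(*args)
                break
            except Exception as exc:
                last_error = exc
        else:
            raise RuntimeError(
                f"{step.name} failed after {max_retries + 1} attempts"
            ) from last_error

        log.artifacts[step.output] = result
        log.provenance[step.output] = list(step.inputs)
    return log


def trace(log: RunLog, name: str) -> List[str]:
    """Return every upstream artifact that `name` depends on."""
    seen: List[str] = []
    stack = list(log.provenance.get(name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(log.provenance.get(current, []))
    return seen


# Toy tools standing in for real MRI processing steps (assumptions).
calls = {"n": 0}

def flaky_denoise(volume):
    calls["n"] += 1
    if calls["n"] == 1:
        raise ValueError("transient failure")  # fails once, then succeeds
    return [max(v, 0) for v in volume]

def segment(volume):
    return [1 if v > 1 else 0 for v in volume]

def measure(mask):
    return sum(mask)

plan = [
    Step("denoise", flaky_denoise, ["raw"], "denoised"),
    Step("segment", segment, ["denoised"], "mask"),
    Step("measure", measure, ["mask"], "voxel_count"),
]
log = execute(plan, {"raw": [-1, 2, 3, 0]})
```

In this sketch the first denoise attempt fails and is retried locally without restarting the workflow, and `trace(log, "voxel_count")` walks the recorded provenance back to the raw input, which is the kind of output-to-artifact link the abstract describes as auditability.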

Ziyang Long, Xinqi Li, Junzhou Chen, Yifan Gao, Debiao Li, Hsin-Jung Yang • 2026

Related benchmarks

Task                       Dataset        Metric             Result  Rank
Cardiac report             MRI Workflows  SR                 93      4
Prostate report            MRI Workflows  SR                 99      4
Super-Resolution           MRI Workflows  SR Score (%)       100     4
Total Overall Performance  MRI Workflows  Success Rate (SR)  99      4
Denoise                    MRI Workflows  Success Rate (SR)  100     4
Segmentation               MRI Workflows  SR                 100     4
Brain grading              MRI Workflows  SR                 100     4
Registration               MRI Workflows  Success Rate (SR)  100     4
Reconstruction             MRI Workflows  SR                 100     4