
Frontiers in Intelligent Colonoscopy

About

Colonoscopy is currently one of the most sensitive screening methods for colorectal cancer. This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications. With this goal, we begin by assessing the current data-centric and model-centric landscapes through four tasks for colonoscopic scene perception, including classification, detection, segmentation, and vision-language understanding. This assessment enables us to identify domain-specific challenges and reveals that multimodal research in colonoscopy remains open for further exploration. To embrace the coming multimodal era, we establish three foundational initiatives: a large-scale multimodal instruction tuning dataset ColonINST, a colonoscopy-designed multimodal language model ColonGPT, and a multimodal benchmark. To facilitate ongoing monitoring of this rapidly evolving field, we provide a public website for the latest updates: https://github.com/ai4colonoscopy/IntelliScope.

Ge-Peng Ji, Jingyi Liu, Peng Xu, Nick Barnes, Fahad Shahbaz Khan, Salman Khan, Deng-Ping Fan • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| CLS | ColonINST (seen) | Accuracy | 94.06 | 17 |
| CLS | ColonINST (unseen) | Accuracy | 83.24 | 17 |
| REC | ColonINST (seen) | IoU | 85.74 | 17 |
| REC | ColonINST (unseen) | IoU | 0.5624 | 17 |
| REG | ColonINST (seen) | Accuracy | 99.96 | 17 |
| REG | ColonINST (unseen) | Accuracy | 80.18 | 17 |
| Capsule endoscopy keyframe detection and diagnostic performance | VideoCAP (test) | Lesion-Level Detection Rate | 34.56 | 11 |
| High-level Diagnostic Reasoning | EndoAgentBench | CAP (CAR) | 52.91 | 8 |
| Fine-Grained Perception | EndoAgentBench | Localization Classification Accuracy | 33.5 | 8 |
