YOLOv11: An Overview of the Key Architectural Enhancements

About

This study presents an architectural analysis of YOLOv11, the latest iteration in the YOLO (You Only Look Once) series of object detection models. We examine the model's architectural innovations, including the introduction of the C3k2 (Cross Stage Partial with kernel size 2) block, SPPF (Spatial Pyramid Pooling - Fast), and C2PSA (Convolutional block with Parallel Spatial Attention) components, which improve the model's performance in several ways, such as enhanced feature extraction. The paper explores YOLOv11's expanded capabilities across various computer vision tasks, including object detection, instance segmentation, pose estimation, and oriented object detection (OBB). We review the model's performance improvements in terms of mean Average Precision (mAP) and computational efficiency compared to its predecessors, with a focus on the trade-off between parameter count and accuracy. Additionally, the study discusses YOLOv11's versatility across different model sizes, from nano to extra-large, catering to diverse application needs from edge devices to high-performance computing environments. Our research provides insights into YOLOv11's position within the broader landscape of object detection and its potential impact on real-time computer vision applications.
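
To make the SPPF component named above more concrete, here is a minimal sketch of its pooling idea only. It assumes the commonly described design in which a 5x5 stride-1 max pool is applied three times in sequence and the input plus the three pooled maps are concatenated; the convolutions around the pooling are omitted, and the names `max_pool_same` and `sppf_pool_branches` are hypothetical helpers, not part of any library. The sketch also checks why the chained form is called "fast": two chained 5x5 pools cover the same window as one 9x9 pool, and three cover a 13x13 window.

```python
import random
from typing import List

Grid = List[List[float]]


def max_pool_same(x: Grid, k: int) -> Grid:
    """Stride-1 max pooling with a k x k window.

    Out-of-bounds cells are ignored (equivalent to padding with -inf),
    so the output has the same shape as the input.
    """
    r = k // 2
    h, w = len(x), len(x[0])
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                x[a][b]
                for a in range(max(0, i - r), min(h, i + r + 1))
                for b in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def sppf_pool_branches(x: Grid, k: int = 5) -> List[Grid]:
    """Return the four maps an SPPF-style block would concatenate:
    the input and three successively pooled versions of it."""
    p1 = max_pool_same(x, k)
    p2 = max_pool_same(p1, k)
    p3 = max_pool_same(p2, k)
    return [x, p1, p2, p3]


# Illustrative single-channel feature map (random values, not real data).
random.seed(0)
feature_map = [[random.random() for _ in range(16)] for _ in range(16)]
branches = sppf_pool_branches(feature_map)

# Chained 5x5 pools match single pools with larger windows.
same_as_9 = branches[2] == max_pool_same(feature_map, 9)
same_as_13 = branches[3] == max_pool_same(feature_map, 13)
print(same_as_9, same_as_13)
```

In a real network the same pooling would run per channel on tensors, and the concatenated branches would be passed through a convolution; this sketch only illustrates the receptive-field equivalence.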

Rahima Khanam, Muhammad Hussain • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Object Detection | CS-positive | mAP | 15 | 25 |
| Object Detection | AI-TOD 1.0 (test) | AP | 30.5 | 24 |
| Object Detection | DTOD | AP50 | 12.2 | 22 |
| Object Detection | FireRescue dataset (test) | mAP50 | 0.6856 | 21 |
| Object Detection | TN3K | AP | 52.9 | 19 |
| Object Detection | BUSI | AP@0.5 (BN) | 70.6 | 19 |
| Object Detection | Thyroid II | AP@0.5 (BN) | 88.1 | 19 |
| Object Detection | Thyroid I (test) | AP@0.5 (BN) | 0.841 | 19 |
| Attention Heatmap Prediction | SurgAtt-SZPH (test) | NSS | 2.195 | 18 |
| Object Detection | PlantDoc (test) | mAP@50 | 0.572 | 16 |
Showing 10 of 22 rows
