Automated robotic systems are evolving rapidly in industrial manufacturing. In robotic seal coating scenarios, RGB-D based perception adapts well to flexible production demands, but it struggles to achieve high accuracy.
To address this limitation, we propose a multi-modal visual detection system for robotic seam gluing, which employs a coarse-to-fine strategy across 2D and 3D data modalities. The proposed framework integrates two main components: a 2D Seam Segmentation & Classification Network (2D-SSCN) for coarse seam identification at the image level, and a 3D Seam Refinement & Transformation Network (3D-SRTN) for precise seam refinement at the point cloud level. Specifically, 3D-SRTN performs local seam point optimization, followed by a global transformation that minimizes cumulative coordinate errors. A post-processing procedure is then designed to generate smooth 6-DoF paths.
Additionally, we introduce the RGB-D Coating Dataset, specifically tailored for this task, which includes both 2D and 3D annotated data for RGB-D coating perception. Ablation studies validate the effectiveness of individual modules, while comparative experiments demonstrate the overall superiority of our system.
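To make the final post-processing step more concrete, the following is a minimal sketch of how refined seam points could be turned into a smooth 6-DoF path. It is an illustration under stated assumptions, not the method used in this work: the moving-average smoothing, the frame convention (x along the seam tangent, z pointing into the surface), and the names `smooth_points` and `seam_to_poses` are all hypothetical. It assumes ordered seam points with one surface normal per point, no duplicate consecutive points, and normals that are not parallel to the seam direction.

```python
import numpy as np


def smooth_points(points, window=5):
    """Moving-average smoothing of an ordered (N, 3) polyline.

    `window` is assumed to be odd; edge padding keeps the output length at N.
    """
    pad = window // 2
    padded = np.pad(points, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    cols = [np.convolve(padded[:, k], kernel, mode="valid") for k in range(3)]
    return np.stack(cols, axis=1)


def seam_to_poses(points, normals, window=5):
    """Build one 4x4 homogeneous pose per seam point (hypothetical convention).

    x axis: seam tangent, z axis: tool approach direction (opposite to the
    surface normal, made orthogonal to the tangent), y axis: z cross x.
    """
    pts = smooth_points(np.asarray(points, dtype=float), window)

    # Tangent from finite differences along the ordered path.
    x = np.gradient(pts, axis=0)
    x /= np.linalg.norm(x, axis=1, keepdims=True)

    # Approach axis: negative normal with its tangent component removed.
    z = -np.asarray(normals, dtype=float)
    z = z - np.sum(z * x, axis=1, keepdims=True) * x
    z /= np.linalg.norm(z, axis=1, keepdims=True)

    # Completing the right-handed frame.
    y = np.cross(z, x)

    poses = np.tile(np.eye(4), (len(pts), 1, 1))
    poses[:, :3, 0] = x
    poses[:, :3, 1] = y
    poses[:, :3, 2] = z
    poses[:, :3, 3] = pts
    return poses
```

Each returned matrix holds a position in its last column and an orientation in its upper-left 3x3 block, which together give the six degrees of freedom of a tool pose. A spline fit or a different frame convention could replace the choices made here.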
Architecture of Our Framework
SFRM
APTM
The results of 2D-SSCN
The results of 3D-SRTN
Real-world Experiments