Deep Learning-based SIFT-like Keypoint Detection for Object Contours in Real Images

K. Krüger

The objective of this work is to develop a Convolutional Neural Network (CNN)-based keypoint detection approach for object contours in real images. A hand-crafted detection method has been developed at GET Lab, but it requires the object contours to be given as input. Similar to the Scale-Invariant Feature Transform (SIFT), it performs a scale-space analysis to obtain scale- and rotation-invariant keypoints together with their characteristic scales. A second method developed at GET Lab extracts object contours from real images by means of image segmentation. However, applying image segmentation and keypoint detection in sequence adds considerable complexity to the overall process and impedes its real-time capability. A CNN-based approach is therefore investigated to address this drawback and to achieve better generalization. Existing CNN-based keypoint detectors such as Key.Net and SobelNet are real-time capable and achieve results comparable to traditional keypoint detection methods, but they are not contour-based.

In this thesis, the complete contour-based keypoint extraction process is to be replaced by a CNN-based approach. The final approach should be executable as a single unit and be real-time capable. The CNN is to be trained in a supervised manner using images from the Segment Anything 1 Billion (SA-1B) dataset. The ground truth data, consisting of keypoint positions and their characteristic scales, will be generated using the detection method from GET Lab.
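
To illustrate how ground truth of this kind could be presented to a CNN, the following minimal sketch renders a list of keypoints into two dense target maps: a score map with a Gaussian peak at each keypoint position, and a scale map that stores the characteristic scale at the keypoint pixel. This encoding is an assumption made for illustration only; the text above does not prescribe a target format. The function name `encode_targets`, the `(x, y, scale)` tuple layout, the peak width `sigma`, and the example keypoints are all hypothetical.

```python
import numpy as np


def encode_targets(keypoints, height, width, sigma=1.5):
    """Render (x, y, scale) keypoints into dense training targets.

    Returns a score map with a Gaussian peak at each keypoint and a
    scale map holding the characteristic scale at the keypoint pixel.
    The target layout is an illustrative assumption, not a fixed design.
    """
    score = np.zeros((height, width), dtype=np.float32)
    scale = np.zeros((height, width), dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]  # per-pixel row and column indices
    for x, y, s in keypoints:
        col, row = int(round(x)), int(round(y))
        if not (0 <= row < height and 0 <= col < width):
            continue  # skip keypoints that fall outside the image
        dist_sq = (xs - col) ** 2 + (ys - row) ** 2
        blob = np.exp(-dist_sq / (2.0 * sigma ** 2)).astype(np.float32)
        score = np.maximum(score, blob)  # overlapping peaks keep the larger value
        scale[row, col] = s
    return score, scale


# Hypothetical keypoints given as (x, y, characteristic scale).
keypoints = [(4.0, 3.0, 2.0), (10.0, 8.0, 4.5)]
score_map, scale_map = encode_targets(keypoints, height=12, width=16)
```

With targets of this form, a network could regress the score map to localize keypoints and read the predicted scale at each detected peak. Other encodings, such as direct coordinate regression, are equally conceivable.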