Semantic Image Segmentation for Search-and-Rescue Scenarios using Deep Learning and Attention Mechanisms
The use of robots in search-and-rescue missions has been steadily increasing. Nevertheless, their effectiveness depends heavily on their ability to process visual information. This work investigates the potential of deep learning architectures that integrate attention mechanisms to enhance semantic image segmentation in search-and-rescue scenarios. Attention mechanisms in transformer architectures are explored because they have demonstrated remarkable performance in Natural Language Processing and Computer Vision by capturing rich contextual information. Since no existing segmentation dataset aligns with the objectives of this work, image classification datasets for disaster response are annotated with segmentation labels for training and testing. Among candidate architectures, SegFormer in particular is considered for implementation, as it achieves state-of-the-art results while remaining computationally efficient. Implementation and training rely on the well-established PyTorch library. After training, the results are evaluated with respect to generalization capability and, in particular, the integration and effects of the attention mechanisms.
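To illustrate the kind of attention mechanism under discussion, the sketch below shows a SegFormer-style efficient self-attention layer in PyTorch: keys and values are spatially reduced by a strided convolution so the attention cost no longer scales quadratically with the full token count. This is a minimal illustration under stated assumptions, not the actual implementation of this work; the class and parameter names (`EfficientSelfAttention`, `reduction_ratio`) are chosen here for clarity.

```python
import torch
import torch.nn as nn


class EfficientSelfAttention(nn.Module):
    """SegFormer-style self-attention sketch: keys and values are spatially
    reduced by a strided convolution (reduction ratio R), cutting the cost of
    attending over a full H x W token grid. Names are illustrative."""

    def __init__(self, dim, num_heads=2, reduction_ratio=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Strided conv shrinks the spatial grid H x W to (H/R) x (W/R).
        self.sr = nn.Conv2d(dim, dim, kernel_size=reduction_ratio,
                            stride=reduction_ratio)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, height, width):
        # x: (batch, H*W, dim) token sequence from one encoder stage.
        b, n, c = x.shape
        kv = x.transpose(1, 2).reshape(b, c, height, width)
        kv = self.sr(kv).reshape(b, c, -1).transpose(1, 2)  # (B, HW/R^2, C)
        kv = self.norm(kv)
        # Queries keep full resolution; only keys/values are reduced.
        out, _ = self.attn(x, kv, kv)
        return out


tokens = torch.randn(1, 32 * 32, 64)   # a 32x32 feature map with 64 channels
layer = EfficientSelfAttention(dim=64)
out = layer(tokens, height=32, width=32)
print(out.shape)  # torch.Size([1, 1024, 64])
```

The output keeps the full spatial resolution of the queries, so the layer can be dropped into a hierarchical encoder stage without changing the feature-map size.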