Enhancing Instance Segmentation with Mask-free Annotations for New Categories

Instance segmentation identifies each object in an image and assigns it its own pixel-level mask, so that multiple objects, including several of the same class, are told apart. It has become more popular in recent years thanks to advances in deep learning, in particular convolutional neural networks (CNNs) and architectures such as Mask R-CNN, which combine object detection with pixel-wise segmentation to locate and mask objects in an image.

However, current detection models are limited in the number of categories they can identify. Models trained on the COCO dataset, for example, detect 80 categories. Adding new categories requires human annotation, which is time-consuming. Open Vocabulary (OV) methods were developed to overcome this limitation, using image-caption pairs and vision-language models to learn new categories. These methods, however, often overfit to the existing categories and generalize poorly to new ones.

To address these issues, Salesforce AI researchers have introduced the Mask-free OVIS pipeline. The pipeline generates bounding-box and instance-mask annotations from image-caption pairs, eliminating the need for manual annotation. In the first stage, a pre-trained vision-language model creates pseudo-mask annotations for the objects of interest, and these pseudo-masks are refined through an iterative masking process. In the second stage, a weakly-supervised segmentation network is trained, using the proposal that best overlaps the refined pseudo-mask. Finally, a Mask R-CNN model is trained on the generated pseudo-annotations.
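The proposal-selection step can be sketched in a few lines. This is only an illustration, not the researchers' implementation: it assumes that "overlap" is measured as intersection-over-union (IoU) between each candidate box and the tight bounding box of a binarized pseudo-mask, and the names `mask_to_box`, `box_iou`, and `select_best_proposal` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in inclusive pixel coordinates
Box = Tuple[int, int, int, int]


def mask_to_box(mask: Sequence[Sequence[int]]) -> Box:
    """Return the tight bounding box around the nonzero pixels of a binary mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        raise ValueError("pseudo-mask is empty")
    return (min(xs), min(ys), max(xs), max(ys))


def box_area(box: Box) -> int:
    """Pixel area of a box with inclusive coordinates."""
    return (box[2] - box[0] + 1) * (box[3] - box[1] + 1)


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (assumed overlap measure)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    inter_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    inter = max(inter_w, 0) * max(inter_h, 0)
    union = box_area(a) + box_area(b) - inter
    return inter / union


def select_best_proposal(mask: Sequence[Sequence[int]], proposals: List[Box]) -> int:
    """Return the index of the proposal that best overlaps the pseudo-mask."""
    target = mask_to_box(mask)
    scores = [box_iou(p, target) for p in proposals]
    return max(range(len(proposals)), key=scores.__getitem__)


# Toy example: a 6x6 pseudo-mask with a 3x3 activated region.
pseudo_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
candidate_boxes = [(0, 0, 5, 5), (2, 1, 4, 4), (3, 3, 5, 5)]
best_index = select_best_proposal(pseudo_mask, candidate_boxes)
```

In this toy case the second box is chosen because it covers the activated region most tightly. A real pipeline would take its proposals from a region proposal network and its pseudo-mask from the vision-language model's activation map.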

The researchers evaluated the pipeline on the MS-COCO and OpenImages datasets, where it outperformed existing open-vocabulary instance segmentation models. Training on pseudo-annotations alone was enough to achieve strong results on both the detection and the instance segmentation tasks.

This vision-language-guided approach to pseudo-annotation generation removes the need for human annotators and paves the way for more advanced and precise instance segmentation models. The work was accepted at the Conference on Computer Vision and Pattern Recognition (CVPR) 2023.
