[ECCV 2024] Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation

snuviplab/NeMo

Seonsu Ha*, Chaeyun Kim*, Donghwa Kim*, Junho Lee, Sangho Lee, Joonseok Lee

Abstract

Referring Image Segmentation (RIS) is a comprehensive task: segmenting the object that a textual query refers to in an image. By nature, the difficulty of this task depends on the presence of similar objects and on the complexity of the referring expression. Recent RIS models still show a significant performance gap between easy and hard scenarios. We posit that the bottleneck lies in the data, and propose a simple but powerful data augmentation method, Negative-mined Mosaic Augmentation (NeMo). This method augments a training image into a mosaic with three other negative images carefully curated by a pretrained multimodal alignment model, e.g., CLIP, to make the sample more challenging. We find that it is critical to adjust the difficulty level properly, so that the sample is neither too ambiguous nor too trivial. The augmented training data encourages the RIS model to recognize subtle differences and relationships between similar visual entities, and to understand the whole expression concretely so that it locates the right target better. Our approach shows consistent improvements across various datasets and models, verified by extensive experiments.
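To make the idea of "neither too ambiguous nor too trivial" concrete, here is a minimal sketch of one way negatives could be chosen from a similarity band. It is not the selection rule used in this repository: the function name `pick_negatives`, the quantile thresholds `lower_q` and `upper_q`, and the assumption that text-image similarity scores (e.g., from CLIP) are already precomputed are all illustrative.

```python
import numpy as np


def pick_negatives(similarity, lower_q=0.5, upper_q=0.9, k=3, rng=None):
    """Pick k candidate indices whose similarity lies in a mid-range band.

    similarity: 1-D array of precomputed similarity scores between the
        referring expression and each candidate image (hypothetical input).
    lower_q, upper_q: hypothetical quantile bounds. Candidates below lower_q
        are treated as too trivial, those above upper_q as too ambiguous.
    """
    rng = np.random.default_rng() if rng is None else rng
    similarity = np.asarray(similarity, dtype=float)

    # Convert the quantile bounds into score thresholds.
    low, high = np.quantile(similarity, [lower_q, upper_q])

    # Keep only candidates inside the band.
    band = np.flatnonzero((similarity >= low) & (similarity <= high))
    if band.size < k:
        raise ValueError("not enough candidates inside the similarity band")

    # Sample k distinct negatives from the band.
    return rng.choice(band, size=k, replace=False)
```

Widening or narrowing the band is the knob that would control difficulty in this sketch; the values above are placeholders, not tuned settings.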

🎙️ News

[2024/09/19] Initial code release

✏️ Note

We release the dataloader code for NeMo, which augments each image by combining it with three hard negatives to create a mosaic. This dataloader code is based on the implementation of CRIS. Please refer to the CRIS repository for more details. Note that LMDB files for each dataset are needed to load the image pool.
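The mosaic step described above can be sketched roughly as follows. This is a simplified illustration, not the released dataloader: the function `make_mosaic` and its `slot` parameter are hypothetical, and it assumes all four images have already been resized to the same height and width and are stored as NumPy arrays.

```python
import numpy as np


def make_mosaic(target_img, target_mask, negative_imgs, slot):
    """Build a 2x2 mosaic from one target image and three negatives.

    target_img: (H, W, 3) array; target_mask: (H, W) array.
    negative_imgs: sequence of three (H, W, 3) arrays (assumed same size).
    slot: quadrant index for the target in row-major order
        (0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right).
    """
    if len(negative_imgs) != 3:
        raise ValueError("exactly three negative images are required")
    if slot not in (0, 1, 2, 3):
        raise ValueError("slot must be 0, 1, 2 or 3")

    h, w = target_mask.shape
    # Insert the target among the negatives at the requested quadrant.
    tiles = list(negative_imgs)
    tiles.insert(slot, target_img)

    mosaic = np.zeros((2 * h, 2 * w, 3), dtype=target_img.dtype)
    mask = np.zeros((2 * h, 2 * w), dtype=target_mask.dtype)

    for i, tile in enumerate(tiles):
        r, c = divmod(i, 2)
        mosaic[r * h:(r + 1) * h, c * w:(c + 1) * w] = tile

    # Only the target quadrant carries a ground-truth mask;
    # the three negative quadrants stay background.
    r, c = divmod(slot, 2)
    mask[r * h:(r + 1) * h, c * w:(c + 1) * w] = target_mask
    return mosaic, mask
```

In this sketch the referring expression is left unchanged, and the model has to find the target among four tiles instead of one.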

🔒 License

This project is released under the MIT license, following the previous works it builds on. See LICENSE for details.

📌 Citation

TBA
