PyTorch implementation of SLiMe: Segment Like Me, a 1-shot image segmentation method based on Stable Diffusion.
Aliasghar Khani1, 2, Saeid Asgari Taghanaki1, 2, Aditya Sanghi2, Ali Mahdavi Amiri1, Ghassan Hamarneh1
1 Simon Fraser University 2 Autodesk Research
To begin using SLiMe, you first need to create a virtual environment and install the dependencies using the following commands:
python -m venv slime_venv
source slime_venv/bin/activate
pip install -r requirements.txt
*** For each image and mask pair used for training, validation, or testing with SLiMe, their names should match. Furthermore, the images should be in PNG
format, while the masks should be in NumPy
format. ***
First, create a new folder (e.g., slime/data/train
) and place the training images along with their corresponding masks in that folder (slime/data/train
). Then, provide the path to the created training data folder (slime/data/train
) as an argument to --train_data_dir
. If you have validation data, which will only be used for checkpoint selection, repeat the same process for the validation data (e.g., place the images and masks in slime/data/val
) and provide the folder's address as an argument to --val_data_dir
. However, if you don't have validation data, use the address of the training data folder as an argument for --val_data_dir
.
Next, place the test images in a separate folder (e.g., slime/data/test
) and specify the path to this folder using --test_data_dir
. Additionally, you should define a name for the segmented parts within the training images to be used with the --parts_to_return
argument, including the background. For instance, if you have segmented the body and head of a dog, you should set --parts_to_return
to "background body head"
.
Finally, execute the following command within the slime folder (the main folder obtained after cloning):
python -m src.main --dataset sample \
--part_names {PARTNAMES} \
--train_data_dir {TRAIN_DATA_DIR} \
--val_data_dir {TRAIN_DATA_DIR} \
--test_data_dir {TEST_DATA_DIR} \
--train \
If you have supplied test images along with their corresponding masks, running this command will display the mean Intersection over Union (mIoU) for each of the segmented parts on the test data. Furthermore, it will save the trained text embeddings in the slime/outputs/checkpoints
folder and log files in the slime/outputs/lightning_logs
folder within the slime
directory.
To use the trained text embeddings for testing, run this command:
python -m src.main --dataset sample \
--checkpoint_dir {CHECKPOINT_DIR} \
--test_data_dir {TEST_DATA_DIR} \
In this command:
- Replace
{CHECKPOINT_DIR}
with the path to the folder where the trained text embeddings are stored. Ensure that only the relevant text embeddings are present in this directory because the code will load all text embeddings from the specified folder. - Make sure you've placed the test images (and their masks, if available, for calculating mIoU) in a new folder, and provide the path to this folder using the
--test_data_dir
argument.
You can test the trained text embeddings in the Colab notebook. After cloning the code, please follow the steps mentioned above and execute the provided command.
To configure the patching of images for validation and testing, you can specify different values for the --patch_size
and --num_patches_per_side
parameters. These settings will be used to divide the image into a grid of patches, calculate individual final attention maps (referred to as WAS-attention maps), aggregate them, and generate the segmentation mask prediction.
Here's an example of how to include these parameters in your command:
python -m src.main --dataset sample \
--checkpoint_dir {CHECKPOINT_DIR} \
--test_data_dir {TEST_DATA_DIR} \
--patch_size {PATCH_SIZE} \
--num_patches_per_side {NUM_PATCHES_PER_SIDE}
- Replace
{PATCH_SIZE}
with the desired size for each patch. - Replace
{NUM_PATCHES_PER_SIDE}
with the number of patches you want per side of the image. By adjusting these values, you can control the patching process for validation and testing, which can be useful for fine-tuning the method's performance on different image sizes or characteristics.
To train and test with the 1-sample setting of SLiMe on the car class of PASCAL-Part, follow these steps:
- Download the data from this link and extract it.
- Navigate to the
slime
folder. - Run the following command, replacing
{path_to_data_folder}
with the path to the folder where you extracted the data (without a backslash at the end):
DATADIR={path_to_data_folder}
python3 -m src.main --dataset_name pascal \
--part_names background body light plate wheel window \
--train_data_dir $DATADIR/car/train_1 \
--val_data_dir $DATADIR/car/train_1 \
--test_data_dir $DATADIR/car/test \
--min_crop_ratio 0.6 \
--num_patchs_per_side 1 \
--patch_size 512 \
--train
For the 10-sample setting, you can modify the command as follows:
DATADIR={path_to_data_folder}
python3 -m src.main --dataset_name pascal \
--part_names background body light plate wheel window \
--train_data_dir $DATADIR/car/train_10 \
--val_data_dir $DATADIR/car/val \
--test_data_dir $DATADIR/car/test \
--min_crop_ratio 0.6 \
--num_patchs_per_side 1 \
--patch_size 512 \
--train
In this case, you should specify car/train_10
for --train_data_dir
and car/val
for --val_data_dir
.
To train and test with the 1-sample setting of SLiMe on the horse class of PASCAL-Part, you can follow these steps:
- Download the data from this link and extract it.
- Navigate to the
slime
folder. - Run the following command, replacing
{path_to_data_folder}
with the path to the folder where you extracted the data (without a backslash at the end):
DATADIR={path_to_data_folder}
python3 -m src.main --dataset_name pascal \
--part_names background head neck+torso leg tail \
--train_data_dir $DATADIR/horse/train_1 \
--val_data_dir $DATADIR/horse/train_1 \
--test_data_dir $DATADIR/horse/test \
--min_crop_ratio 0.8 \
--num_patchs_per_side 1 \
--patch_size 512 \
--train
For the 10-sample setting, you can modify the command as follows:
DATADIR={path_to_data_folder}
python3 -m src.main --dataset_name pascal \
--part_names background head neck+torso leg tail \
--train_data_dir $DATADIR/horse/train_10 \
--val_data_dir $DATADIR/horse/val \
--test_data_dir $DATADIR/horse/test \
--min_crop_ratio 0.8 \
--num_patchs_per_side 1 \
--patch_size 512 \
--train
In this case, you should specify horse/train_10
for --train_data_dir
and horse/val
for --val_data_dir
.
To train and test with the 1-sample setting of SLiMe on CelebAMask-HQ, you can follow these steps:
- Download the data from this link and extract it.
- Navigate to the
slime
folder. - Run the following command, replacing
{path_to_data_folder}
with the path to the folder where you extracted the data (without a backslash at the end):
DATADIR={path_to_data_folder}
python3 -m src.main --dataset_name celeba \
--part_names background skin eye mouth nose brow ear neck cloth hair \
--train_data_dir $DATADIR/celeba/train_1 \
--val_data_dir $DATADIR/celeba/train_1 \
--test_data_dir $DATADIR/celeba/test \
--min_crop_ratio 0.6 \
--train
For the 10-sample setting, you can modify the command as follows:
DATADIR={path_to_data_folder}
python3 -m src.main --dataset_name celeba \
--part_names background skin eye mouth nose brow ear neck cloth hair \
--train_data_dir $DATADIR/celeba/train_10 \
--val_data_dir $DATADIR/celeba/val \
--test_data_dir $DATADIR/celeba/test \
--min_crop_ratio 0.6 \
--train
In this case, you should specify celeba/train_10
for --train_data_dir
and celeba/val
for --val_data_dir
.
At this link, we are uploading the text embeddings that we have trained, including the text embeddings we trained for the paper. You can download these text embeddings and use them for testing on your data using the command in Testing with the trained text embeddings section.
If you have any questions or problems, please create an issue here.
@article{khani2023slime,
title={SLiMe: Segment Like Me},
author={Khani, Aliasghar and Taghanaki, Saeid Asgari and Sanghi, Aditya and Amiri, Ali Mahdavi and Hamarneh, Ghassan},
journal={arXiv preprint arXiv:2309.03179},
year={2023}
}