# CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning
Zhaoheng Zheng, Haidong Zhu and Ram Nevatia
Official implementation of CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning.
We build our model on Python 3.8 and PyTorch 1.13. To prepare the environment, please follow the instructions below:
- Create a conda environment:

  ```bash
  conda create -n caila-release python=3.8.13 pip
  ```

- Enter the environment:

  ```bash
  conda activate caila-release
  ```

- Install the requirements:

  ```bash
  pip install -r requirements.txt
  ```
For MIT-States, C-GQA and UT-Zappos, please run the following script to download the datasets to a directory of your choice (`DATA_ROOT` in our example):

```bash
bash ./utils/download_data.sh DATA_ROOT
```
For VAW-CZSL, please follow the instructions in the official repo. The `DATA_ROOT` folder should be organized as follows:
```
DATA_ROOT/
├── mit-states/
│   ├── images/
│   ├── compositional-split-natural/
├── cgqa/
│   ├── images/
│   ├── compositional-split-natural/
├── ut-zap50k/
│   ├── images/
│   ├── compositional-split-natural/
├── vaw-czsl/
│   ├── images/
│   ├── compositional-split-natural/
```
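As a quick sanity check (not part of the repo), a minimal Python sketch like the one below can verify that each dataset folder contains the expected subdirectories; `DATA_ROOT` here is a placeholder for your data path:

```python
import os

DATA_ROOT = "/path/to/DATA_ROOT"  # placeholder: replace with your data root

# Expected layout, mirroring the tree above
datasets = ["mit-states", "cgqa", "ut-zap50k", "vaw-czsl"]
subdirs = ["images", "compositional-split-natural"]

for name in datasets:
    for sub in subdirs:
        path = os.path.join(DATA_ROOT, name, sub)
        print(f"{path}: {'ok' if os.path.isdir(path) else 'MISSING'}")
```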
After preparing the data, set the `DATA_FOLDER` variable in `flags.py` to your data path.
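For reference, the assignment in `flags.py` could look like the hypothetical excerpt below; the actual file defines additional options, so only the data path needs editing:

```python
# flags.py (hypothetical excerpt) -- only the data path needs to be changed
DATA_FOLDER = "/path/to/DATA_ROOT"  # point this at the directory prepared above
```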
If you encounter any `FileNotFoundError` regarding the split files, please find them here: Link.
We provide pre-trained checkpoints with the following performance:

| Dataset | AUC (Base/Large) | Download |
|---|---|---|
| MIT-States | 16.1 / 23.4 | Base / Large |
| C-GQA | 10.4 / 14.8 | Base / Large |
| UT-Zappos | 39.0 / 44.1 | Base / Large |
| VAW-CZSL* | 17.1 / 19.0 | V / V+L |
*For VAW-CZSL, we provide two variants of the Large model: one with adapters on the vision side only (V) and one with adapters on both the vision and language sides (V+L). The V+L model requires more GPU memory.
To evaluate the model, put the downloaded checkpoint in a folder. We use `mit-base` as an example:

```
checkpoints/
├── mit-base/
│   ├── ckpt_best_auc.t7
```

Then, run the following command to evaluate the model:

```bash
python test.py --config configs/caila/mit.yml --logpath checkpoints/mit-base
```
To train a model, first download the CLIP checkpoints from HuggingFace (ViT-B/32 and ViT-L/14) and put them under `clip_ckpts` as follows:

```
clip_ckpts/
├── clip-vit-base-patch32.pth
├── clip-vit-large-patch14.pth
```
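If the files you download are not already `.pth` state dicts, one possible (unverified) way to produce them is to export the HuggingFace models with `torch.save`, as sketched below; the repo's CLIP loader may expect a different checkpoint format, so please check its loading code before relying on this:

```python
# Hypothetical conversion sketch: export HuggingFace CLIP weights as .pth state dicts.
# The repo's loader may expect a different format; verify against its CLIP loading code.
import os

import torch
from transformers import CLIPModel

os.makedirs("clip_ckpts", exist_ok=True)
for hf_name, out_path in [
    ("openai/clip-vit-base-patch32", "clip_ckpts/clip-vit-base-patch32.pth"),
    ("openai/clip-vit-large-patch14", "clip_ckpts/clip-vit-large-patch14.pth"),
]:
    model = CLIPModel.from_pretrained(hf_name)  # downloads from the HuggingFace Hub
    torch.save(model.state_dict(), out_path)
```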
Then, run the following command to train the model:

```bash
torchrun --nproc_per_node=$N_GPU train.py --config CONFIG_FILE
```

where `CONFIG_FILE` is the path to a config file and `$N_GPU` is the number of GPUs to use. We provide config files for all experiments in the `configs` folder. For example, to train the Base model on MIT-States, run:

```bash
torchrun --nproc_per_node=$N_GPU train.py --config configs/caila/mit.yml
```
If you find CAILA useful in your research, please consider citing:
```bibtex
@InProceedings{Zheng_2024_WACV,
    author    = {Zheng, Zhaoheng and Zhu, Haidong and Nevatia, Ram},
    title     = {CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    month     = {January},
    year      = {2024},
    pages     = {1721-1731}
}
```