# SMART-LLaVa

Code for training and evaluating a prompt-tuned LLaVa model on multimodal SMART-101 data.

## SMART Dataset

Download the dataset using:

```sh
wget https://zenodo.org/records/7775984/files/SMART101-release-v1.zip
unzip SMART101-release-v1.zip
```

## Zero-Shot Model Baselines

Run Model_baselines.ipynb to evaluate each of the models on the test data; these results form the zero-shot generalization baselines.
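For reference, a minimal sketch of what one zero-shot pass looks like with the transformers LLaVa API; the checkpoint id, image path, and question below are placeholders, and the notebook's own setup may differ:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Placeholder checkpoint; the notebook may evaluate several models.
model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Placeholder puzzle image and question from the SMART-101 release.
image = Image.open("SMART101-release-v1/puzzle.png")
prompt = "USER: <image>\nWhat is the answer to this puzzle? ASSISTANT:"

inputs = processor(text=prompt, images=image, return_tensors="pt")
inputs = inputs.to(model.device, torch.float16)
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```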

## LLaVa model prompt tuning

LLaVa_baselines.py prompt-tunes a LLaVa/BakLLaVa model on SMART-101 data. The learnable prompt tokens are injected between the image tokens and the language-instruction tokens.
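The injection itself happens inside LLaVa_baselines.py; as a rough illustration of the idea (not the repo's actual code), a hand-rolled soft-prompt module might splice learnable embeddings between the two token streams like this:

```python
import torch
import torch.nn as nn

class SoftPromptInjector(nn.Module):
    """Hypothetical sketch: learnable prompt embeddings spliced between
    the image embeddings and the instruction-text embeddings."""

    def __init__(self, num_tokens: int, hidden_size: int):
        super().__init__()
        # Randomly initialized soft prompt; only these weights are trained.
        self.prompt = nn.Parameter(torch.randn(num_tokens, hidden_size) * 0.02)

    def forward(self, image_embeds: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        # image_embeds: (batch, n_img, d); text_embeds: (batch, n_txt, d)
        batch = image_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        # Sequence layout: [image tokens] [soft prompt] [instruction tokens]
        return torch.cat([image_embeds, prompt, text_embeds], dim=1)
```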

The script trains the model and uploads it to the Hugging Face Hub. Export your Hub token as $HUGGINGFACE_TOKEN before running, and install the following packages first:

```sh
pip install transformers
pip install peft
pip install bitsandbytes==0.41.3 accelerate==0.25.0
python3 LLaVa_baselines.py
```
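A hedged sketch of how the token might be consumed, assuming the script reads it from the environment; the repo id shown is a placeholder, not the name the script actually uses:

```python
import os
from huggingface_hub import login

# Hypothetical usage: authenticate with the token expected in $HUGGINGFACE_TOKEN.
login(token=os.environ["HUGGINGFACE_TOKEN"])

# After training, the tuned model can be pushed to the Hub;
# "your-username/smart-llava" is a placeholder repo id.
# model.push_to_hub("your-username/smart-llava")
```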

## Evaluating puzzle types

For the per-puzzle-type evaluation, see LLaVa_types.ipynb and resnet.py.
