TagGPT: Large Language Models are Zero-shot Multimodal Taggers

TagGPT is a fully automated system capable of tag extraction and multimodal tagging in a completely zero-shot fashion, produced by QQ-ARC Joint Lab at Tencent PCG.

🔧 Dependencies

Python >= 3.7
PyTorch == 2.0.0
transformers == 4.27.4

pip install -r requirements.txt

💻 How to use TagGPT

Step 1: Tagging system construction

You need a batch of data to build your tagging system. This walkthrough uses the Kuaishou open-source data, which you can download here (password: ihc2).

First, place the data in the './data/' folder and format it with the following command.

python ./scripts/main.py --data_path ./data/222k_kw.ft --func data_format

Then, generate candidate tags with an LLM using the following command.

python ./scripts/main.py --data_path ./data/sentences.txt --func tag_gen --openai_key "put your own key here" --gen_feq 5

Next, post-process the candidate tags to obtain the tagging system.

python ./scripts/main.py --data_path ./data/tag_gen.txt --func posterior_process
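The three Step 1 commands must run in order, so it can help to script them. The following is a minimal sketch, not part of the repository: the helper names `step1_commands` and `run_step1` are hypothetical, while the paths and flags are copied verbatim from the commands above. It assumes you run it from the repository root.

```python
import subprocess


def step1_commands(openai_key, gen_feq=5):
    """Return the three Step 1 commands as argument lists, in run order.

    Paths and flags mirror the README commands; nothing is executed here.
    """
    main = "./scripts/main.py"
    return [
        ["python", main, "--data_path", "./data/222k_kw.ft",
         "--func", "data_format"],
        ["python", main, "--data_path", "./data/sentences.txt",
         "--func", "tag_gen",
         "--openai_key", openai_key, "--gen_feq", str(gen_feq)],
        ["python", main, "--data_path", "./data/tag_gen.txt",
         "--func", "posterior_process"],
    ]


def run_step1(openai_key, gen_feq=5):
    """Run the commands one by one and stop at the first failure."""
    for cmd in step1_commands(openai_key, gen_feq):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a later stage never runs on missing input.
        subprocess.run(cmd, check=True)
```

Calling `run_step1("put your own key here")` would execute the whole construction stage. Passing the commands as argument lists rather than one shell string avoids quoting problems with the key.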

Step 2: Data tagging

TagGPT can assign tags to the given samples based on the built tagging system. Format your data to match './data/examples.csv'.

TagGPT provides two tagging paradigms. The commands below use paths relative to the 'scripts' folder, so run them from there.

Generative tagger

python main.py --data_path ../data/examples.csv --tag_path ../data/final_tags.csv --func generative_tagger --openai_key "put your own key here"

Selective tagger

python main.py --data_path ../data/examples.csv --tag_path ../data/final_tags.csv --func selective_tagger --openai_key "put your own key here"
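To make the difference between the two paradigms concrete, here is a toy sketch. It follows the description in the technical report's abstract, where the generative option matches against the tag set late (after the LLM has answered) and the selective option matches early (candidates are placed in the prompt). Everything else is an assumption for illustration: `llm` is a hypothetical callable that takes a prompt and returns a list of strings, the prompt wording is invented, and `difflib` string similarity stands in for whatever semantic matching TagGPT actually uses.

```python
import difflib


def generative_tagging(sample, tag_set, llm):
    """Late matching: the LLM proposes free-form tags, which are then
    mapped onto the closest entries of the tag set."""
    proposed = llm(f"Suggest tags for: {sample}")
    matched = []
    for tag in proposed:
        # Keep the single closest tag, if it is similar enough.
        hits = difflib.get_close_matches(tag, tag_set, n=1, cutoff=0.6)
        if hits and hits[0] not in matched:
            matched.append(hits[0])
    return matched


def selective_tagging(sample, tag_set, llm, k=3):
    """Early matching: candidate tags are retrieved first and shown to
    the LLM, which only has to choose among them."""
    # cutoff=0.0 keeps the k best-ranked tags regardless of score.
    candidates = difflib.get_close_matches(sample, tag_set, n=k, cutoff=0.0)
    prompt = f"Choose tags for: {sample}\nCandidates: {', '.join(candidates)}"
    picked = llm(prompt)
    # Discard anything the LLM returned that was not a candidate.
    return [tag for tag in picked if tag in candidates]
```

In this sketch both functions only ever return tags from `tag_set`. The generative variant gives the LLM more freedom at the cost of a matching step afterwards, while the selective variant constrains the LLM up front.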

🤗 Acknowledgements

We thank the following open-source projects: Kuaishou, Hugging Face, LangChain.

📖 Citation

If you find this work useful for your research or applications, please cite our technical report:

@article{li2023taggpt,
  title={TagGPT: Large Language Models are Zero-shot Multimodal Taggers},
  author={Li, Chen and Ge, Yixiao and Mao, Jiayong and Li, Dian and Shan, Ying},
  journal={arXiv preprint arXiv:2304.03022},
  year={2023}
}

📧 Contact Information

For help or issues using TagGPT, please submit a GitHub issue.
