📢: For convenience, we have built BiLLM, a bidirectional LLM toolkit for language understanding. You are welcome to use it.
Our implementation currently supports the following sequence classification benchmarks:
- SST2 (2 classes) / SST5 (5 classes)
- AGNews (4 classes)
- Twitter Financial News Sentiment (twitterfin, 3 classes)
It also supports token classification benchmarks for named entity recognition (NER): CoNLL2003 and OntonotesV5.
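
For a quick look at the raw data, these benchmarks can be inspected with the Hugging Face `datasets` library. The sketch below is illustrative only; the Hub dataset IDs used here (`ag_news`, `conll2003`) are assumptions and are not necessarily the loaders used inside our training scripts.

```python
# Illustrative only: peek at two of the supported benchmarks via Hugging Face `datasets`.
# The Hub IDs (`ag_news`, `conll2003`) are assumptions, not the repo's internal loaders.
from datasets import load_dataset

agnews = load_dataset('ag_news')       # sequence classification, 4 classes
conll = load_dataset('conll2003')      # token classification (NER)

print(agnews['train'][0])              # {'text': ..., 'label': ...}
sample = conll['train'][0]
print(sample['tokens'][:5], sample['ner_tags'][:5])
```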
Commands for training LS-LLaMA and LS-unLLaMA on different tasks follow the template below:

```console
foo@bar:~$ CUDA_VISIBLE_DEVICES=0 python file_name.py dataset_name model_size
```
- `file_name.py` can be one of `unllama_seq_clf.py`, `unllama_token_clf.py`, `llama_seq_clf.py`, and `llama_token_clf.py`, for training LS-LLaMA and LS-unLLaMA on sequence- and token-level classification.
- `dataset_name` can be one of `sst2`, `sst5`, `agnews`, `twitterfin`, `conll03`, and `ontonotesv5`.
- `model_size` can be `7b` or `13b`, corresponding to LLaMA-2-7B and LLaMA-2-13B.
For example, the following command will train LS-unLLaMA based on LLaMA-2-7B on AGNews for sequence classification:
```console
foo@bar:~$ CUDA_VISIBLE_DEVICES=0 python unllama_seq_clf.py agnews 7b
```
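
The same template covers token-level tasks; for example, the following command trains LS-LLaMA based on LLaMA-2-7B on CoNLL2003 for NER:

```console
foo@bar:~$ CUDA_VISIBLE_DEVICES=0 python llama_token_clf.py conll03 7b
```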
## Load Pretrained Models
```python
from transformers import AutoTokenizer
from modeling_llama import (
    LlamaForSequenceClassification, LlamaForTokenClassification,
    UnmaskingLlamaForSequenceClassification, UnmaskingLlamaForTokenClassification,
)

model_id = 'meta-llama/Llama-2-7b'
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Pick the class that matches your setup: LS-LLaMA (causal mask kept) vs. LS-unLLaMA
# (unmasked, bidirectional attention), for sequence- or token-level classification.
model = LlamaForSequenceClassification.from_pretrained(model_id).bfloat16()
model = LlamaForTokenClassification.from_pretrained(model_id).bfloat16()
model = UnmaskingLlamaForSequenceClassification.from_pretrained(model_id).bfloat16()
model = UnmaskingLlamaForTokenClassification.from_pretrained(model_id).bfloat16()
```
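
Once loaded, the classifiers are assumed to follow the standard Hugging Face `*ForSequenceClassification` interface (a forward pass returning logits). The minimal sketch below shows single-example inference under that assumption; the `num_labels=4` value and the pad-token handling are illustrative choices (e.g. 4 classes for AGNews), not repo defaults.

```python
# A minimal sketch, assuming the HF *ForSequenceClassification-style interface.
import torch
from transformers import AutoTokenizer
from modeling_llama import UnmaskingLlamaForSequenceClassification

model_id = 'meta-llama/Llama-2-7b'
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token   # LLaMA tokenizers ship without a pad token

# num_labels is assumed to be configurable as in standard HF classification heads;
# it must match the benchmark (4 classes for AGNews in this example).
model = UnmaskingLlamaForSequenceClassification.from_pretrained(
    model_id, num_labels=4).bfloat16()
model.config.pad_token_id = tokenizer.pad_token_id
model.eval()

inputs = tokenizer(["Wall St. rebounds as tech shares rally."],
                   return_tensors='pt', padding=True)
with torch.no_grad():
    logits = model(**inputs).logits     # shape: (batch_size, num_labels)
print(logits.argmax(dim=-1))            # predicted class index
```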
For more usage, please refer to `unllama_seq_clf.py`, `unllama_token_clf.py`, `llama_seq_clf.py`, and `llama_token_clf.py`.
## Citation

```bibtex
@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}
```