Changelog
- Use the tokenizer chat template found in the tokenizer_config.json file instead of FastChat. This change eliminates the need for users to specify a chat template for each model and ensures the correct template is always used, as sketched below.
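  For reference, a minimal sketch of how such a template is applied via the Hugging Face transformers API; the example model and messages are illustrative:

  ```python
  from transformers import AutoTokenizer

  # Any chat model whose tokenizer_config.json ships a chat_template
  # works the same way.
  tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

  messages = [{"role": "user", "content": "What is the capital of France?"}]

  # apply_chat_template reads the template from tokenizer_config.json,
  # so no per-model FastChat template has to be specified.
  prompt = tokenizer.apply_chat_template(
      messages,
      tokenize=False,              # return the formatted prompt string
      add_generation_prompt=True,  # append the assistant turn header
  )
  print(prompt)
  ```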
- Implement the few-shot setting, in which the model is prompted with 4 examples from the training dataset of each split, totaling 44 few-shot examples. The 4 examples for each split correspond to the following combinations (see the sketch after this list):
  - Distractor: False, Negation: False
  - Distractor: False, Negation: True
  - Distractor: True, Negation: False
  - Distractor: True, Negation: True
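  As an illustration, a minimal sketch of how one such 4-example block could be selected; the boolean fields `distractor` and `negation` and the helper name `build_few_shot_block` are assumptions about the dataset schema, not the repository's actual code:

  ```python
  from itertools import product

  def build_few_shot_block(train_examples):
      """Pick one training example per (distractor, negation) combination.

      train_examples is assumed to be a list of dicts with boolean
      "distractor" and "negation" fields; the real schema may differ.
      """
      block = []
      for distractor, negation in product([False, True], repeat=2):
          # next() raises StopIteration if a split's training set lacks
          # one of the four combinations.
          block.append(next(
              ex for ex in train_examples
              if ex["distractor"] == distractor and ex["negation"] == negation
          ))
      return block
  ```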
- Implement a method to evaluate multiple models without the need to create a configuration for each one:

  ```bash
  # Launch one evaluation run per model, reusing the same base config.
  for model_name in \
      meta-llama/Llama-2-70b-chat-hf \
      meta-llama/Llama-2-70b-hf \
      meta-llama/Llama-2-13b-chat-hf \
      meta-llama/Llama-2-13b-hf \
      meta-llama/Llama-2-7b-chat-hf \
      meta-llama/Llama-2-7b-hf \
      mistralai/Mistral-7B-Instruct-v0.2 \
      mistralai/Mixtral-8x7B-Instruct-v0.1
  do
      # $model_name contains a slash, so each model's results land in a
      # nested directory, e.g. results/zero-shot/meta-llama/Llama-2-7b-hf.
      accelerate launch run.py \
          --config configs/zero-shot/base.yaml \
          --model_name_or_path "$model_name" \
          --output_dir results/zero-shot/"$model_name"
  done
  ```
- Other minor fixes to improve compatibility with additional models and overall performance.