Code for the 2023 ACL Short Paper "With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness"
For more details, see the paper.
To derive scores, use score.py. By default it will use a model trained using our data augmentation procedure with the --mc flag.
The script expects to receive the dataset (e.g. TRUE) in JSONL format. Each instance should have the following fields:
label: A binary faithfulness label
corpus: Used for grouping results
grounding: The grounding you want to check faithfulness on
generation: The generation to score
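
As a rough illustration of the input format described above, the following sketch writes a small JSONL file with the four fields. The field names come from this README. Everything else is an assumption made for illustration: the example texts, the corpus name, the 0/1 encoding of the binary label, the output file name, and the helper functions write_jsonl and read_jsonl, which are hypothetical and not part of this repository.

```python
import json
import os
import tempfile

# Field names as listed in the README.
REQUIRED_FIELDS = ("label", "corpus", "grounding", "generation")


def write_jsonl(instances, path):
    """Write one JSON object per line, checking the required fields first."""
    with open(path, "w", encoding="utf-8") as f:
        for i, inst in enumerate(instances):
            missing = [k for k in REQUIRED_FIELDS if k not in inst]
            if missing:
                raise ValueError(f"instance {i} is missing fields: {missing}")
            f.write(json.dumps(inst, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up instances; the 0/1 label encoding is an assumption.
examples = [
    {
        "label": 1,
        "corpus": "example_corpus",
        "grounding": "The cat sat on the mat all afternoon.",
        "generation": "The cat was on the mat.",
    },
    {
        "label": 0,
        "corpus": "example_corpus",
        "grounding": "The cat sat on the mat all afternoon.",
        "generation": "The dog chased the cat away.",
    },
]

path = os.path.join(tempfile.gettempdir(), "example_dataset.jsonl")
write_jsonl(examples, path)
```

Checking the fields before writing catches malformed instances early, before the file is handed to the scoring script.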
Use the -o flag to determine where to save scores.
All relevant code can be found in the nlifactspush module.
To train a new augmented model, first run augment_dataset.py, followed by train.py.