TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness [paper]
Environment: exported with conda env export > trustscore_environment.yml; to recreate it, run conda env create -f trustscore_environment.yml
Behavior_consistency_example.ipynb: Example code showing how
qa_human_check.json: Contains the mixed QA data used in this project, the predictions of Flan-T5-XXL, LLAMA-7B, and GPT-3.5-turbo, and the human evaluations of those predictions.
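
The layout of qa_human_check.json is not documented here, so the following is only a minimal sketch for inspecting it before writing any analysis code. The helper name describe_structure is hypothetical (it is not part of this repository), and the file is assumed to sit in the current working directory. The sketch makes no assumption about field names; it only reports the top-level shape of whatever JSON it finds.

```python
import json
from pathlib import Path


def describe_structure(obj, max_keys=10):
    """Return a short, schema-agnostic description of a parsed JSON value.

    Hypothetical helper for illustration; not part of the TrustScore code.
    """
    if isinstance(obj, dict):
        # Report how many keys there are and show the first few.
        keys = list(obj.keys())[:max_keys]
        return {"type": "dict", "size": len(obj), "sample_keys": keys}
    if isinstance(obj, list):
        # Report the length and the type of the first element, if any.
        first = type(obj[0]).__name__ if obj else None
        return {"type": "list", "size": len(obj), "first_item_type": first}
    return {"type": type(obj).__name__}


# Assumption: the file is in the working directory; skip quietly if it is not.
path = Path("qa_human_check.json")
if path.exists():
    with path.open(encoding="utf-8") as f:
        data = json.load(f)
    print(describe_structure(data))
```

Once the top-level shape is known, the same helper can be applied to a single record to find where the questions, model predictions, and human judgments are stored.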