cleanlab-tools

Miscellaneous cookbooks and code made available for purposes of education, reproducibility, and transparency.

| Example | Description |
| --- | --- |
| TLM-Demo-Notebook | Showcasing various applications of the Trustworthy Language Model. |
| TLM-PII-Detection | Find and mask PII with the Trustworthy Language Model. |
| TLM-Record-Matching | Using the Trustworthy Language Model to reliably match records between two different data tables. |
| TLM-SimpleQA-Benchmark | Benchmarking TLM and OpenAI LLMs on the SimpleQA dataset. |
| benchmarking_hallucination_metrics | Notebook that compares the performance of popular hallucination detection metrics on a set of hallucination benchmarks. |
| fine_tuning_data_curation | Notebook showing how to use Cleanlab TLM and Cleanlab Studio to detect bad data in instruction-tuning LLM datasets. |
| Detecting GDPR Violations with TLM | Notebook showing the code used to analyze application logs with TLM to detect GDPR violations. |
| Customer Support AI Agent with NeMo Guardrails | Reliable customer support AI agent with Guardrails and trustworthiness scoring. |
| few_shot_prompt_selection | Notebook showing how to clean a pool of few-shot examples to improve the prompt template for an OpenAI LLM. |
| fine_tuning_classification | Notebook showing how to use Cleanlab Studio to improve the accuracy of fine-tuned LLMs for classification tasks. |
| generate_llm_response | Notebook showing how to generate LLM responses for customer service requests using Llama 2 and OpenAI's API. |
| gpt4-rag-logprobs | Notebook showing how to obtain logprobs from a GPT-4-based RAG system. |
| fine_tuning_mistral_beavertails | Analyze human-annotated AI-safety labels (such as toxicity) using Cleanlab Studio to generate safer responses from LLMs. |
| Evaluating_Toxicity_Datasets_Large_Language_Models | Notebook on analyzing toxicity annotations in the Jigsaw dataset using Cleanlab Studio. |
| time_series_automl | Notebook showing how to model time series data in a tabular format and use AutoML with Cleanlab Studio to improve out-of-sample accuracy. |