SurvAI Short Course: Finetuning LLMs for Data Augmentation and Synthesis

In this short course, we present a fine-tuning pipeline for LLMs that generate synthetic survey responses. Because many generic fine-tuning tutorials are already available, we focus on the parts of the pipeline that matter for research in general, such as reproducibility, and for work with survey data in particular, such as answer extraction.
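To make the two research-specific concerns named above concrete, here is a minimal sketch in plain Python. It is not taken from the course notebook: the helper names `seed_everything` and `extract_answer`, the example options, and the word-boundary matching rule are all illustrative assumptions. The sketch seeds the random number generators that happen to be installed and maps a free-text model completion back to exactly one survey answer option.

```python
import random
import re
from typing import Optional, Sequence


def seed_everything(seed: int) -> None:
    """Seed the RNGs available in this environment (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np  # optional dependency, seeded only if installed
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional dependency, seeded only if installed
        torch.manual_seed(seed)
    except ImportError:
        pass


def extract_answer(completion: str, options: Sequence[str]) -> Optional[str]:
    """Return the one option named in the completion, or None.

    Assumption: an answer counts only if exactly one option appears as a
    whole word or phrase (case-insensitive). Ambiguous or empty
    completions are treated as missing rather than guessed.
    """
    text = completion.lower()
    hits = [
        option
        for option in options
        if re.search(rf"\b{re.escape(option.lower())}\b", text)
    ]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    seed_everything(42)
    options = ["Option A", "Option B"]  # hypothetical answer options
    print(extract_answer("I would choose Option A.", options))
    print(extract_answer("Either Option A or Option B.", options))
```

Returning `None` for ambiguous completions keeps invalid responses visible so they can be counted and reported. Note that this simple rule also returns `None` when one option is contained in another (for example "agree" and "strongly agree"), so nested option labels would need a stricter matching scheme.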

The notebook can be run directly in Google Colab or Kaggle, or locally using the uv package manager.
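Because the same notebook may run in three different environments, it can help to detect which one is active before choosing paths or install steps. The following is a rough sketch, not code from the notebook: `detect_runtime` is a hypothetical helper, and the checks rely on commonly used conventions (the `google.colab` module being loaded in Colab, the `KAGGLE_KERNEL_RUN_TYPE` environment variable being set on Kaggle) that are assumptions here and may change.

```python
import os
import sys
from typing import Mapping, Optional


def detect_runtime(
    env: Optional[Mapping[str, str]] = None,
    modules: Optional[Mapping[str, object]] = None,
) -> str:
    """Guess whether the code runs in Colab, Kaggle, or locally.

    Both arguments default to the live process state and exist only so
    the function can be exercised with fake values.
    """
    env = os.environ if env is None else env
    modules = sys.modules if modules is None else modules
    if "google.colab" in modules:  # assumption: Colab preloads this module
        return "colab"
    if "KAGGLE_KERNEL_RUN_TYPE" in env:  # assumption: set by Kaggle kernels
        return "kaggle"
    return "local"


if __name__ == "__main__":
    print(detect_runtime())
```

A notebook cell could branch on the returned string, for example to skip dependency installation when running locally in an environment that uv has already set up.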

The slides of the presentation preceding this practical course can be found at https://socialdatascience.umd.edu/survai-workshop/schedule/.

This work has received funding through the DFG (No. 504226141).

Repository files:

- .gitattributes
- .gitignore
- .python-version
- 2016_anes_argyle.pkl
- README.md
- demo.ipynb
- pyproject.toml
- uv.lock

tobihol/survai-finetuning
