In this short course, we present a pipeline for fine-tuning LLMs to generate synthetic survey responses. Since many generic fine-tuning tutorials are already available, we focus on the parts of the pipeline that matter for research in general, such as reproducibility, and for work with survey data in particular, such as answer extraction.
The notebook can be run directly in Google Colab or Kaggle, or locally using the uv package manager.
The slides from the presentation that precedes this practical course are available at https://socialdatascience.umd.edu/survai-workshop/schedule/.
This work has received funding through the DFG (No. 504226141).