Addressing LLM-related Measurement Error in Social Science Modeling Research

With the advent of large language models (LLMs), collecting measurements of social science constructs (e.g., personality traits, political attitudes, human values) has become easier, faster and more affordable. Social scientists then use these measurements to model societal and group processes, and to draw inferences from samples to populations. Valid modelling and inference, however, require high-quality measurements or, at the very least, methods to deal with measurement error. Just like traditional questionnaire-based measurements, LLM-based measurements have been shown to suffer from validity and reliability issues.

While there is abundant research literature on dealing with measurement error, it focuses on questionnaire-based measurements. How to deal with measurement issues arising from LLMs is still relatively new to social scientists.

This project has three primary objectives.

First, we review existing literature to identify methods for addressing LLM-related measurement error in social science modelling. Second, we conduct simulation studies to compare existing methods. Lastly, we synthesise these findings with existing measurement modelling literature to propose a practical framework for making valid social science inferences using LLM-based measurements. By bridging the gap between LLM prediction capabilities and social science inference requirements, our framework aims to enhance the reliability and validity of social science research outcomes in the era of LLMs.

Literature Overview

Current literature can be sorted into four groups:

  1. Inferences with LLM-based predictions;
  2. Inferences with general machine learning-based predictions;
  3. Inferences with general measurement error in the social sciences;
  4. Others, such as missing data imputation, conformal prediction and semi-supervised learning.

Existing methods can be distinguished by which variables in the downstream model (typically a regression model) are LLM- or machine learning-based predictions: the predictors, the outcome variable, or both.
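To illustrate why this distinction matters, the following minimal sketch simulates a simple linear regression in which either the predictor or the outcome is replaced by a noisy prediction. It is not taken from any of the papers listed below. The function names (`ols_slope`, `simulate`), the true slope of 2.0 and the assumption of classical (independent, zero-mean) prediction error are illustrative choices only.

```python
import random


def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=2.0, error_sd=1.0, seed=0):
    """Compare slopes when the predictor or the outcome is a noisy prediction.

    Assumption: prediction error is classical, i.e. independent noise
    added to the true value. Real LLM errors may not behave this way.
    """
    rng = random.Random(seed)
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    y_true = [beta * x + rng.gauss(0, 1) for x in x_true]
    # Hypothetical "predicted" versions of each variable.
    x_pred = [x + rng.gauss(0, error_sd) for x in x_true]
    y_pred = [y + rng.gauss(0, error_sd) for y in y_true]
    return {
        "true_variables": ols_slope(x_true, y_true),
        "predicted_predictor": ols_slope(x_pred, y_true),
        "predicted_outcome": ols_slope(x_true, y_pred),
    }


slopes = simulate()
for name, value in slopes.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, error in the predictor shrinks the slope towards zero (here by a factor of about one half, since the error variance equals the predictor variance), while error in the outcome leaves the slope roughly unbiased but less precise. When the error in a predicted outcome is correlated with the predictors, which is plausible for LLM predictions, the slope can be biased as well.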

Inferences with LLMs

| Year | Title | Predicted Variable(s) |
| --- | --- | --- |
| 2023 | Using Imperfect Surrogates for Downstream Inference: Design-based Supervised Learning for Social Science Applications of Large Language Models | Outcome |
| 2024 | Inference for Regression with Variables Generated from Unstructured Data | Predictor |
| 2024 | From Narratives to Numbers: Valid Inference Using Language Model Predictions from Verbal Autopsies | Outcome |
| 2024 | Using Large Language Model Annotations for the Social Sciences: A General Framework of Using Predicted Variables in Downstream Analyses | Predictor and outcome |

Inferences with Machine Learning Predictions

| Year | Title | Predicted Variable(s) |
| --- | --- | --- |
| 2020 | Methods for correcting inference based on outcomes predicted by machine learning | Outcome |
| 2022 | How Using Machine Learning Classification as a Variable in Regression Leads to Attenuation Bias and What to Do about It | Predictor or outcome |
| 2023 | Prediction-powered inference | Outcome |
| 2024 | PPI++: Efficient Prediction-Powered Inference | Outcome |
| 2024 | Cross-prediction-powered inference | Outcome |
| 2024 | A Note on the Prediction-Powered Bootstrap | Outcome |
| 2024 | Assumption-Lean and Data-Adaptive Post-Prediction Inference | Predictor and outcome |
| 2024 | ipd: An R Package for Conducting Inference on Predicted Data | Outcome |
| 2024 | Task-Agnostic Machine-Learning-Assisted Inference | Outcome |
| 2024 | Prediction De-Correlated Inference: A safe approach for post-prediction inference | Outcome |
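
As a rough illustration of the idea behind the prediction-powered inference entries above, the sketch below estimates a population mean from a small labeled sample and a large set of predictions. It implements only the basic point estimate (mean of predictions on unlabeled data plus the average prediction error on labeled data), without confidence intervals. The data-generating process, the sample sizes and the helper names (`ppi_mean`, `draw`) are assumptions made for this example and are not the API of any package listed in this document.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ppi_mean(y_labeled, pred_labeled, pred_unlabeled):
    """Prediction-powered point estimate of a mean.

    The mean prediction on the unlabeled data is corrected by the
    "rectifier": the average gap between true labels and predictions
    on the labeled data.
    """
    rectifier = mean([y - f for y, f in zip(y_labeled, pred_labeled)])
    return mean(pred_unlabeled) + rectifier


rng = random.Random(1)


def draw(n):
    """Simulate true outcomes (mean 1.0) and systematically biased predictions."""
    y = [rng.gauss(1.0, 1.0) for _ in range(n)]
    pred = [0.5 + 0.8 * v + rng.gauss(0, 0.3) for v in y]
    return y, pred


y_lab, pred_lab = draw(2000)      # small sample with gold-standard labels
_, pred_unlab = draw(50000)       # large sample with predictions only

naive = mean(pred_unlab)          # treats predictions as if they were truth
corrected = ppi_mean(y_lab, pred_lab, pred_unlab)
print(f"naive: {naive:.3f}, corrected: {corrected:.3f}")
```

In this simulated setting the naive estimate inherits the systematic bias of the predictions, whereas the corrected estimate lands close to the true mean of 1.0. The listed papers extend this idea to other estimands and to valid uncertainty quantification.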

Inferences with Measurement Error

TBA.

Other Approaches

e.g., Missing data imputation, semi-supervised learning, conformal prediction. TBA.

Datasets

TBA.

Software Packages

| Name | Method | Language | Estimators | Predicted Variables |
| --- | --- | --- | --- | --- |
| PostPI | Post-Prediction Inference | R | Means, quantiles and GLMs | Outcome |
| PPI, PPI++, Cross-PPI, PPBoot | Prediction-powered inference and its extensions | Python | Any arbitrary estimator | Outcome |
| PSPA | PoSt-Prediction Adaptive inference | R | Means, quantiles, linear regression, logistic regression | Predictor and outcome |
| ipd | Implements PostPI, PPI, PPI++ and PSPA | R | Means, quantiles, linear regression, logistic regression | Outcome |
| PSPS | PoSt-Prediction Summary-statistics-based (PSPS) inference | R and Python | M-estimators | Outcome |
| DSL | Design-based Supervised Learning | R | Moment-based estimators | Predictor and outcome |

Simulation Studies

TBA.

Contact

This project is developed and maintained by the ODISSEI Social Data Science (SoDa) team.

Do you have questions, suggestions, or remarks? File an issue in the issue tracker or feel free to contact the team at odissei-soda.nl