An empirical study where some of many possible items are selected
This standard applies to empirical research in which the researcher selects a smaller group of items to study (a sample) from a larger group of items of interest (the population), typically using an imperfect population list (the sampling frame). Common items in software engineering research include people (e.g. software developers), code artifacts (e.g. source code files), and non-code artifacts (e.g. online discussions, user stories).
- explains the goal of sampling (e.g. aiming for representativeness, identifying exceptional cases)
- explains the sampling strategy, in particular the filtering steps involved and the reasons for selecting certain objects
- explains why the sampling strategy is reasonable (not necessarily optimal) for the sampling goal
- explains the reasoning behind the selection of study objects (especially for qualitative studies)
- reports the sample size
- states the theoretical population (what would the researcher like to generalize to?)
- presents a replicable, concise, algorithmic account of how other researchers could derive the same sample (see the sampling-script sketch after this list)
- explicitly argues for representativeness (e.g. compares sample and population parameters, provides a confidence interval and confidence level for the sample size; see the sample-size sketch after this list)
- explains how the sample could be biased along the sampling steps
- reports the approximate or exact sizes of populations and sampling frames
- provides the sample, sampling frame, and sampling scripts as supplementary material (unless the collected data contains sensitive or protected information)
- uses more sophisticated sampling strategies where appropriate, e.g.:
- exploratory research: using purposive rather than convenience sampling for the unit of analysis
- case study: using purposive rather than convenience sampling for site selection
- repository mining: using probability rather than convenience or purposive sampling (if a sampling frame is available)
- online survey: using respondent-driven rather than snowball sampling
- study with identifiable strata: using stratified random rather than simple random sampling (see the sampling-script sketch after this list)
- theory building: using theoretical rather than convenience sampling
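To make the representativeness argument concrete, a sample size can be derived from an explicit confidence level and margin of error. The following is a minimal sketch using Cochran's formula with a finite population correction; the population size, confidence level, and margin of error are illustrative placeholders, not values prescribed by this standard.

```python
import math
from statistics import NormalDist

def cochran_sample_size(population: int, confidence: float = 0.95,
                        margin_of_error: float = 0.05,
                        proportion: float = 0.5) -> int:
    """Sample size for estimating a proportion at a given confidence level,
    with finite population correction. proportion=0.5 is the conservative
    worst-case choice when the true proportion is unknown."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))  # finite population correction

# e.g. a sampling frame of 10,000 repositories at 95% confidence, 5% margin of error
print(cochran_sample_size(10_000))  # -> 370
```

Reporting these parameters alongside the resulting sample size is precisely what the "confidence interval and confidence level" attribute above asks for.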
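Likewise, a replicable, algorithmic account of the sampling can often be given as a short, seeded script published as supplementary material. The sketch below draws a proportional stratified random sample from a sampling frame, as suggested for studies with identifiable strata; the frame file, the stratum column, and the sampling fraction are hypothetical.

```python
import csv
import random

SEED = 20240101          # fixed seed so others can derive the exact same sample
SAMPLE_FRACTION = 0.10   # sample 10% of each stratum (illustrative choice)

def stratified_sample(frame_path: str, stratum_key: str) -> list[dict]:
    """Draw a proportional stratified random sample from a CSV sampling frame."""
    with open(frame_path, newline="") as f:
        rows = list(csv.DictReader(f))

    # group frame items by stratum
    strata: dict[str, list[dict]] = {}
    for row in rows:
        strata.setdefault(row[stratum_key], []).append(row)

    # sample the same fraction from each stratum, at least one item per stratum
    rng = random.Random(SEED)
    sample = []
    for items in strata.values():
        k = max(1, round(len(items) * SAMPLE_FRACTION))
        sample.extend(rng.sample(items, k))
    return sample

# Hypothetical usage: 'frame.csv' lists candidate repositories with a
# 'language' column defining the strata.
# sample = stratified_sample("frame.csv", "language")
```

Fixing the seed and sharing the script lets other researchers re-derive the sample, which also satisfies the supplementary-material attribute above.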
Acceptable deviations include:
- omitting a detailed account of the sampling strategy because it is explained in previous work using the same data set
- using a very simple sampling strategy in exceptional circumstances where expediency outweighs representativeness (e.g. research during a disaster)
Antipatterns include:
- making claims about a population, based on a sample, without providing an argument for representativeness
- claiming that a sample is representative of a population because it was randomly selected from a sampling frame, without considering bias in the sampling frame
- conducting underpowered research (see the power-analysis sketch after this list); i.e.:
- quantitative research with a sample size insufficient to detect effects of the expected size¹
- qualitative research with too little data for plausible saturation
- justifying the selection of items merely by stating that they come from a "real-world" context, without explaining why the selected items are suitable for the study context
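An a priori power analysis guards against the underpowered-research antipattern. Below is a minimal sketch, assuming a two-sided independent-samples t-test, a medium standardized effect (Cohen's d = 0.5), and the conventional α = 0.05 and 80% power targets; it uses the `statsmodels` power API, and all parameter values are illustrative.

```python
from statsmodels.stats.power import TTestIndPower

# Assumed design: two independent groups, two-sided t-test,
# expected effect d = 0.5, alpha = 0.05, target power = 0.80.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8,
                                   alternative="two-sided")
print(f"required participants per group: {n_per_group:.0f}")  # ~64
```

Because required sample size grows roughly with 1/d², halving the expected effect to d = 0.25 raises the requirement to roughly 250 per group; consistent with the footnote below, the expected effect size itself must be plausible.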
The following criticisms are invalid:
- complaining about lack of representativeness or low external validity in studies where representativeness is not a goal
- abstractly criticizing generalizability rather than pointing to best practices, e.g.:
- invalid: 'as most respondents work in app development, the results may not generalize to other settings'
- valid: 'the researchers should have sent participation reminders to mitigate response bias'
- for qualitative research, claiming that the sample size is too small without considering how the items were selected (e.g. theoretical sampling) or the authors' argument for saturation.
Suggested readings:
Sebastian Baltes and Paul Ralph. 2020. Sampling in Software Engineering Research: A Critical Review and Guidelines. Empirical Software Engineering. https://arxiv.org/abs/2002.07764
William G. Cochran. 2007. Sampling Techniques. Wiley.
Alexander Coppock, Thomas J. Leeper, and Kevin J. Mullinix. 2018. Generalizability of heterogeneous treatment effect estimates across samples. Proceedings of the National Academy of Sciences 115, 49. 12441-12446. DOI: 10.1073/pnas.1808083115
Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. 2008. Selecting Empirical Methods for Software Engineering Research. In Shull, F., Singer, J., Sjøberg, D.I.K. (eds) Guide to Advanced Empirical Software Engineering, Springer, London. 285-311.
Gary T. Henry. 1990. Practical Sampling. Sage, Newbury Park.
Barbara Kitchenham and Shari Lawrence Pfleeger. 2002. Principles of survey research: part 5: populations and samples. SIGSOFT Softw. Eng. Notes 27, 5 (September 2002), 17–20. DOI: 10.1145/571681.571686
Meiyappan Nagappan, Thomas Zimmermann, and Christian Bird. 2013. Diversity in software engineering research. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2013). Association for Computing Machinery, New York, NY, USA, 466–476. DOI: 10.1145/2491411.2491415
¹ Expected effect sizes should be plausible. For instance, expecting any single factor (e.g. programming language) to explain 50% of the variance in software project success is not plausible.