Reproducible and reusable social science with R: a field guide
2026-02-20
Introduction
About
This website is intended as a quick reference for some techniques that I think people may need to keep their code reproducible when cleaning, analyzing, or presenting data. For a more basic intro to R, try the R for Social Science Data Carpentry Workshop, on which some of this website is based.
The site first covers advice on workflows and general R techniques, such as loops and functions, that facilitate reproducibility. The following chapters each cover a part of the research cycle: design, data cleaning, and data analysis (including reporting the results). Each includes specific code examples using survey data.
Setting up
This book uses the SAFI data set and a large number of packages. Everything is available on GitHub, and cloning that repo lets you install all packages through renv (see Chapter 1).
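As a minimal sketch of what that looks like in practice, the snippet below restores the project library after cloning. It assumes that the repo ships an renv.lock file at its root and that R is started from that folder; neither detail is spelled out here, so check the repo itself.

```r
# Assumption: the working directory is the root of the cloned repo,
# and that folder contains an renv.lock file.

# Install renv itself if it is not available yet.
if (!requireNamespace("renv", quietly = TRUE)) {
  install.packages("renv")
}

# Install the package versions recorded in renv.lock into the project library.
renv::restore()
```

Because renv::restore() installs the versions recorded in the lockfile rather than the latest releases, everyone who clones the repo should end up with the same package versions.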
Throughout this book, I will mostly make use of tidyverse packages.
While there are alternatives that have advantages over it,
such as data.table, which is faster,
tidyverse makes for easily readable code,
which is important from a reproducibility perspective.
Moreover, we are not so concerned with performance,
since household survey datasets are almost never large enough for speed to matter.