R for the Humanities

To set up R for humanities research, install R first, then RStudio (from Posit) on top of it, set your default text encoding to UTF-8, create an RStudio Project for each piece of research, and install the tidyverse plus a handful of humanities packages. That five-step base typically gives you a clean, reproducible environment in under twenty minutes, and it sidesteps the encoding and working-directory problems that derail most newcomers working with archival sources.

What do you actually install first?

Install in this order, because RStudio depends on R:

  1. R from CRAN (cran.r-project.org). Pick the binary for your operating system.
  2. RStudio Desktop from Posit. The free open-source build is all you need.
  3. On Windows, also install Rtools if you ever compile packages from source; macOS users may be prompted to install the Xcode command line tools.

You do not need a paid licence, a server, or admin rights to a shared university machine for most tasks. If you cannot install software locally, Posit Cloud runs the same stack in a browser.
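
Once both installers have finished, a quick console check can confirm which R you are actually running. The sketch below uses only base R. The 4.1.0 threshold is an assumption made for this page, chosen because the native pipe `|>` used in the smoke test further down needs at least that version.

```r
# Report the running R version and platform (base R only).
r_version <- getRversion()
cat("R version:", as.character(r_version), "\n")
cat("Platform: ", R.version$platform, "\n")

# Assumption: 4.1.0 is used as the floor because the native pipe |> needs it.
supports_native_pipe <- r_version >= "4.1.0"
if (!supports_native_pipe) {
  message("Consider updating R from CRAN before continuing.")
}
```

If the message appears, reinstalling R from CRAN and restarting RStudio is usually enough; RStudio picks up the newer R on its next launch in most setups.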

Which settings should you change immediately?

Two defaults trip up historians, so fix them on day one. First, check which locale R is running under:

r
# Check your encoding from the console
Sys.getlocale("LC_CTYPE")
# You want a UTF-8 locale, e.g. "en_GB.UTF-8" or "English_United Kingdom.utf8"

In Tools, Global Options:

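  • Code, Saving: set default text encoding to UTF-8. This controls how RStudio saves and reopens your scripts, so é, ß or the long s (ſ) typed into them do not come back as mojibake. Data files are a separate matter: state their encoding when you read them.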
  • General: untick Restore .RData into workspace at startup and set Save workspace to .RData on exit to Never. This forces a clean session each time, so your results are reproducible rather than dependent on hidden state.
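
The RStudio setting covers your scripts, but an archival export may still arrive in an older encoding. The following is a minimal base-R sketch of spotting and repairing that. It assumes the file is Latin-1, which in practice you would need to confirm with whoever produced it, and the surname and place are made-up sample values.

```r
# Build a small Latin-1 file so the example is self-contained.
# "\u00fc" is ü and "\u00e7" is ç; the escapes keep this script pure ASCII.
tmp <- tempfile(fileext = ".csv")
utf8_lines <- c("surname,place", "M\u00fcller,Besan\u00e7on")
latin1_lines <- iconv(utf8_lines, from = "UTF-8", to = "latin1")
writeLines(latin1_lines, tmp, useBytes = TRUE)

# Read the bytes as they are and test whether each line is valid UTF-8.
raw_lines <- readLines(tmp, warn = FALSE)
is_utf8 <- validUTF8(raw_lines)

# Assumption: the source encoding is Latin-1. Convert it to UTF-8.
fixed <- iconv(raw_lines, from = "latin1", to = "UTF-8")
fixed_ok <- all(validUTF8(fixed))
```

Here `is_utf8` flags the second line as invalid, and `fixed` holds the repaired text. When reading with readr instead, the same intent is expressed as `read_csv(path, locale = locale(encoding = "latin1"))`.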

How should you organise a humanities project?

Never use setwd() with an absolute path, because the script breaks as soon as the project moves to another folder or machine. Instead, create an RStudio Project (File, New Project) and let the here package resolve paths relative to the project root:

r
install.packages("here")
library(here)
read.csv(here("data", "parish-registers.csv"))

A reliable folder skeleton:

my-project/
  my-project.Rproj
  data/        # raw, read-only sources
  data-clean/  # tidied outputs
  R/           # scripts
  output/      # figures, tables
  README.md
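
The skeleton above can also be created from the console. This is a sketch only: `create_skeleton()` is a hypothetical helper written for this page, not part of any package, and it leaves the .Rproj file to RStudio's New Project dialog.

```r
# Hypothetical helper: create the folder skeleton under a given root.
create_skeleton <- function(root) {
  dirs <- file.path(root, c("data", "data-clean", "R", "output"))
  for (d in dirs) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  # Add a stub README only if none exists yet.
  readme <- file.path(root, "README.md")
  if (!file.exists(readme)) {
    writeLines("# my-project", readme)
  }
  invisible(dirs)
}

# Demonstrated in a temporary directory so no real project is touched.
demo_root <- file.path(tempdir(), "my-project")
create_skeleton(demo_root)
created <- list.files(demo_root)
```

In a real project you would pass the project root instead of `demo_root`. Existing folders and an existing README are left alone, so the helper is safe to rerun.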

Which packages do humanities researchers need?

Start small, add as you go. This base covers tabular wrangling, text, dates and visualisation:

Package     What it does for historians
tidyverse   Data wrangling (dplyr), reading (readr), plotting (ggplot2)
here        Robust, project-relative file paths
readxl      Reads .xlsx spreadsheets from archives without Java
lubridate   Parses messy and partial historical dates
tidytext    Tokenising and mining transcribed text
sf          Mapping and historical GIS
renv        Records exact package versions per project

r
install.packages(c("tidyverse", "here", "readxl",
                   "lubridate", "tidytext", "sf", "renv"))

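Before reinstalling everything on a machine that already has some of these, it may help to see what is missing. The sketch below uses base R only and leaves the install call commented out, so running it changes nothing. The variable names are illustrative.

```r
# The package set from the table above.
wanted <- c("tidyverse", "here", "readxl",
            "lubridate", "tidytext", "sf", "renv")

# Compare against what is already in the library.
installed <- rownames(installed.packages())
missing_pkgs <- setdiff(wanted, installed)

if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install only the gaps
} else {
  message("All listed packages are already installed.")
}
```
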
Should you turn on renv from the start?

Yes, for anything you will cite or return to. Inside your project run:

r
renv::init()      # sets up a project-local library and writes renv.lock
renv::snapshot()  # call again whenever you add packages

The resulting renv.lock file records every package version, along with the R version in use, so a reviewer or your future self can rebuild the same package library with renv::restore(). It does not install R itself or system libraries, but it is still one of the biggest differences between a setup that rots in two years and one that survives.
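
Because renv.lock is a JSON file, its contents can be inspected directly. The sketch below assumes the jsonlite package is installed (it normally arrives as a tidyverse dependency). `lockfile_versions()` is a hypothetical helper, and the sample lockfile is a cut-down stand-in with placeholder version numbers, not the output of a real project.

```r
library(jsonlite)  # assumption: installed alongside the tidyverse

# Hypothetical helper: named character vector of the package versions
# recorded in an renv lockfile.
lockfile_versions <- function(path) {
  lock <- fromJSON(path, simplifyVector = FALSE)
  vapply(lock$Packages, function(p) p$Version, character(1))
}

# Illustrative stand-in for a real renv.lock; versions are placeholders.
sample_lock <- tempfile(fileext = ".lock")
writeLines('{
  "R": {"Version": "4.3.2"},
  "Packages": {
    "here": {"Package": "here", "Version": "1.0.1"},
    "readxl": {"Package": "readxl", "Version": "1.4.3"}
  }
}', sample_lock)

versions <- lockfile_versions(sample_lock)
```

In a project that has run renv::init(), calling `lockfile_versions("renv.lock")` from the project root would list what a collaborator will get from renv::restore().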

How do you confirm everything works?

Run a short smoke test that builds a small table and plots it (the |> pipe needs R 4.1 or later):

r
library(tidyverse)
tibble(year = 1801:1810, baptisms = c(34, 41, 29, 38, 45, 33, 50, 47, 39, 52)) |>
  ggplot(aes(year, baptisms)) +
  geom_line() +
  labs(title = "Smoke test: it works")

If a chart appears, your installation, the tidyverse and the graphics device are all healthy.
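
A slightly fuller check can exercise reading and wrangling as well. The sketch below writes a few rows to a temporary CSV, reads them back with readr, filters with dplyr and builds a plot object. The rows reuse the first four sample figures from the test above, and the threshold of 35 is arbitrary.

```r
library(readr)
library(dplyr)
library(ggplot2)

# Sample rows written to a temporary CSV so the read step is exercised too.
tmp <- tempfile(fileext = ".csv")
writeLines(c("year,baptisms",
             "1801,34", "1802,41", "1803,29", "1804,38"), tmp)

registers <- read_csv(tmp, show_col_types = FALSE)

# Wrangling step: keep the busier years, largest first.
busy_years <- registers |>
  filter(baptisms > 35) |>
  arrange(desc(baptisms))

# Plotting step: build the chart object; type p at the console to draw it.
p <- ggplot(registers, aes(year, baptisms)) +
  geom_line()
```

If `busy_years` shows 1802 and 1804 and typing `p` draws a line chart, then reading, wrangling and plotting are all working.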

Key Takeaways

  • Install R first, then RStudio; neither replaces the other.
  • Set default encoding to UTF-8 before you open a single archival file.
  • Use RStudio Projects plus the here package; never hard-code working directories.
  • Disable .RData restore so every session starts clean and reproducible.
  • A small package set (tidyverse, lubridate, tidytext, sf) covers most humanities tasks.
  • Run renv::init() early so your analysis still runs years later.
  • Verify with a short plotting smoke test before trusting the setup.

Frequently Asked Questions

Do I install R and RStudio, or just one of them?

Install both. R is the language and engine; RStudio, made by Posit (the company formerly called RStudio), is the editor that makes R comfortable to use. RStudio will not run without R, so install R first, then RStudio.

Which encoding should I use for historical sources?

Always UTF-8. Set it as your default in RStudio under Tools, Global Options, Code, Saving. UTF-8 handles accented names, the long s, currency marks and non-Latin scripts; ASCII cannot represent any of these, and Windows-1252 covers only some accented letters and currency signs.

Should humanities researchers use renv from the start?

Yes, if you plan to share or revisit a project. Running renv::init() in each project records exact package versions so the analysis still runs years later, which matters for citation and reproducibility.

How big a machine do I need for R in the humanities?

Most archival tabular and text work runs comfortably on 8 GB of RAM. Large corpora or geospatial rasters benefit from 16 GB, but you rarely need specialist hardware to start.

What is the fastest way to get help when R errors out?

Copy the exact error message into a search, and check the package vignette with browseVignettes('packagename'). The rOpenSci and Programming Historian communities are humanities-aware and welcoming.