To pin software dependencies, you record the exact resolved version of every package your project uses, direct and transitive, in a committed lock file, then install only from that file. The practical goal is simple: a colleague (or you, in two years) runs one install command and gets the same package versions, verified by hash, that produced your published figures. For humanities computing, where a re-run might happen years after a project closes, pinning is what makes a result defensible rather than a happy accident.
Why pinning matters more in long-lived humanities projects
A digital humanities project is rarely a sprint. You might OCR a corpus in 2024, analyse it in 2025, and answer a peer reviewer in 2026. In that window, scikit-learn may change a default, spaCy may ship a new model, and a NumPy update may alter the result of floating-point reductions. Without pins, the same notebook silently produces different topic clusters or entity counts, and you cannot tell whether your conclusion changed or just your toolchain. Pinning freezes the toolchain so that only intentional changes move your numbers.
How do I pin dependencies in Python?
Do not hand-edit requirements.txt. Use a resolver that produces a hash-locked file. With pip-tools:
```bash
# requirements.in holds your abstract asks, e.g. pandas>=2,<3
pip-compile --generate-hashes --output-file=requirements.txt requirements.in
pip install --require-hashes -r requirements.txt
```
uv does the same far faster and writes a cross-platform uv.lock:
```bash
uv add pandas spacy
uv lock            # resolves and pins the full tree
uv sync --frozen   # installs exactly what uv.lock records, without re-resolving
```
Use uv sync --locked instead if the install should fail when uv.lock has drifted from pyproject.toml. In the pip-tools workflow, --generate-hashes and --require-hashes are the load-bearing flags: they reject any package whose download does not match the recorded SHA-256, defeating tampering and registry mutation.
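To make the hash check concrete, here is a minimal Python sketch of the comparison that hash-checking conceptually performs: compute the SHA-256 of a downloaded file and accept it only if the digest equals the recorded one. The helper names sha256_of and verify_artifact are hypothetical and the wheel file is a stand-in; this is an illustration, not pip's or uv's internal code.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """True only if the file's digest equals the recorded one."""
    return sha256_of(path) == expected_sha256.lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a downloaded wheel (hypothetical file name).
        wheel = Path(tmp) / "example-1.0-py3-none-any.whl"
        wheel.write_bytes(b"pretend wheel contents")
        recorded = sha256_of(wheel)  # what a lock file would store

        print(verify_artifact(wheel, recorded))  # digest matches

        wheel.write_bytes(b"tampered contents")
        print(verify_artifact(wheel, recorded))  # digest no longer matches
```

The point of the sketch is that the recorded digest, not the version number, is what ties an install to the exact bytes you originally used.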
What about conda, R and the rest of the stack?
| Ecosystem | Abstract spec | Concrete lock | Notes |
|---|---|---|---|
| pip | requirements.in | requirements.txt (hashes) | use pip-tools or uv |
| conda | environment.yml | conda-lock *.lock | per-platform locks |
| R | DESCRIPTION | renv.lock | renv::snapshot() |
| Node | package.json | package-lock.json | npm ci for installs |
For R, renv::init() then renv::snapshot() writes renv.lock with package versions and source repositories; renv::restore() reinstalls the project library from that file. Mixed-language projects should commit one lock file per ecosystem rather than trying to unify them.
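One way to enforce the "one lock file per ecosystem" rule is a small check that runs before a commit or in CI. The sketch below, in Python, looks for each abstract spec from the table and reports any that has no lock file beside it. The function name missing_locks and the lock file patterns (in particular the conda ones) are assumptions for illustration; adjust them to what your tooling actually writes.

```python
from pathlib import Path

# Abstract spec -> glob patterns for an acceptable lock file.
# These patterns are assumptions for this sketch, not a standard.
LOCKS_FOR_SPEC = {
    "requirements.in": ["requirements.txt"],
    "environment.yml": ["conda-lock.yml", "conda-*.lock"],
    "DESCRIPTION": ["renv.lock"],
    "package.json": ["package-lock.json"],
}


def missing_locks(project_dir: Path) -> list[str]:
    """Return the spec files in project_dir that have no matching lock file."""
    missing = []
    for spec, patterns in LOCKS_FOR_SPEC.items():
        if not (project_dir / spec).is_file():
            continue  # this ecosystem is not used in the project
        has_lock = any(any(project_dir.glob(p)) for p in patterns)
        if not has_lock:
            missing.append(spec)
    return missing


if __name__ == "__main__":
    for spec in missing_locks(Path(".")):
        print(f"{spec} has no lock file next to it")
```

The check only confirms that a lock file is present; it does not verify that the lock is up to date or committed, which the resolver and your version control still have to cover.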
Should I pin Python and the OS too?
Packages are only half the environment. Pin the interpreter with a .python-version file or in your container base image, and record the platform. The most reproducible setups wrap a hash-locked lock file inside a pinned container:
```dockerfile
FROM python:3.11.9-slim
COPY requirements.txt .
RUN pip install --require-hashes --no-deps -r requirements.txt
```
This makes the pinned set portable across macOS, Windows and Linux, which raw pip install cannot guarantee because many wheels are platform-specific.
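Recording the platform can be as simple as saving a few facts about the interpreter and machine next to your results. The following Python sketch uses only the standard library; the function name environment_record and the suggested output file name are hypothetical.

```python
import json
import platform
import sys


def environment_record() -> dict:
    """Collect interpreter and platform facts worth storing with the lock file."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    # Print as JSON; redirect to a file such as environment-record.json
    # (a hypothetical name) and commit it alongside your outputs.
    json.dump(environment_record(), sys.stdout, indent=2, sort_keys=True)
    print()
```

A record like this does not pin anything by itself, but it tells a later reader which interpreter and operating system the pinned packages were actually run on.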
How do I keep pins current without chaos?
Treat updates as a reviewable event. Re-run pip-compile --upgrade (or uv lock --upgrade) on a branch, install the new set, run your test suite and your headline analysis, and diff the outputs. If the numbers move, investigate before merging. Tools like Dependabot or Renovate can open these branches automatically, but a human still signs off because in research a "minor" bump can change a published figure.
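The "diff the outputs" step can be sketched in a few lines of Python. This assumes your headline analysis can emit its key numbers as a flat dictionary; the function name changed_metrics, the tolerance, and the sample values are all hypothetical.

```python
import math


def changed_metrics(old: dict, new: dict, rel_tol: float = 1e-9) -> dict:
    """Return {name: (old_value, new_value)} for metrics that differ.

    Numeric values are compared with a relative tolerance, anything else
    with plain equality. A metric missing on one side is reported as None.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        both_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (before, after)
        )
        if both_numeric:
            same = math.isclose(before, after, rel_tol=rel_tol)
        else:
            same = before == after
        if not same:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Hypothetical headline numbers from the old and the upgraded environment.
    baseline = {"n_topics": 20, "coherence": 0.4312, "n_entities": 15872}
    upgraded = {"n_topics": 20, "coherence": 0.4298, "n_entities": 15872}
    for name, (before, after) in changed_metrics(baseline, upgraded).items():
        print(f"{name}: {before} -> {after}")
```

The tolerance is a judgment call: too tight and harmless floating-point noise blocks every upgrade, too loose and a real shift in a published figure slips through.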
Key Takeaways
- Pinning records exact, hash-locked versions of the entire resolved dependency tree, not just direct packages.
- Distinguish abstract specs (requirements.in) from concrete locks (requirements.txt) and commit the lock.
- Pin the interpreter and platform too; a container makes pinned environments portable.
- Use pip-tools/uv for Python, conda-lock for conda, renv for R, npm ci for Node.
- Always use --generate-hashes/--require-hashes to defend against registry mutation.
- Update on a deliberate schedule, re-run your analysis, and record the change in a changelog.
Frequently Asked Questions
What does it mean to pin a dependency?
Pinning means recording the exact version (and ideally the cryptographic hash) of every package your code uses, so the same versions install every time. It is the difference between pandas and pandas==2.2.2.
What is the difference between abstract and concrete dependencies?
Abstract dependencies are the direct packages you ask for, usually with loose ranges. Concrete (pinned) dependencies are the fully resolved set, including transitive packages, frozen to exact versions in a lock file.
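The distinction can be checked mechanically. The Python sketch below flags requirement lines that are not frozen to a single exact version. It uses a deliberately simple regular expression rather than a full requirement parser, and the function name unpinned_requirements is hypothetical.

```python
import re

# Matches "name==version", optionally with extras and an environment marker.
# A simplification for this sketch, not a complete PEP 508 parser.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;*]+\s*(;.*)?$"
)


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not frozen to one exact version."""
    loose = []
    for raw in text.splitlines():
        # Drop comments and a trailing line-continuation backslash.
        line = raw.split("#", 1)[0].strip().rstrip("\\").strip()
        if not line or line.startswith("-"):
            continue  # blank, comment, or an option such as --hash=...
        if not PINNED.match(line):
            loose.append(line)
    return loose


if __name__ == "__main__":
    abstract = "pandas>=2,<3\nspacy\n"
    concrete = (
        "pandas==2.2.2 \\\n"
        "    --hash=sha256:<digest>\n"  # placeholder, not a real digest
    )
    print(unpinned_requirements(abstract))  # both lines are loose
    print(unpinned_requirements(concrete))  # nothing is loose
```

Run against a lock file, an empty result suggests every listed package is pinned; it says nothing about whether the list includes all transitive packages, which is the resolver's job.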
Do I need to pin transitive dependencies too?
Yes. A package you never imported directly can still change a result, so a real lock file pins the entire resolved tree, not just your top-level requirements.
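To see how much larger the resolved tree is than your top-level list, you can enumerate everything installed in the current environment. This Python sketch uses the standard library's importlib.metadata and produces lines similar in spirit to pip freeze output; the function name installed_pins is hypothetical.

```python
from importlib import metadata


def installed_pins() -> list[str]:
    """Return sorted 'name==version' lines for every installed distribution."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip installs whose metadata has no readable name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    pins = installed_pins()
    print(f"{len(pins)} distributions installed")
    for line in pins:
        print(line)
```

Comparing the count here with the handful of packages you asked for directly is a quick way to see why only a resolver-generated lock file covers the whole tree.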
Should I pin Python itself, or just the packages?
Pin both. A different interpreter version can change hashing, ordering and floating-point behaviour, so record it in your environment file and ideally in a container or .python-version.
How often should I update pinned versions?
On a deliberate schedule, not silently. Re-resolve, re-run your tests and your analysis, confirm the outputs match, then commit the new lock file with a dated note in the changelog.
Will pinning make my project break on a new machine?
Sometimes, because pinned binaries are platform-specific. Use hash-locked, multi-platform lock files or a container image to make the pinned set portable across operating systems.