When to Use Jupyter Notebooks for History

Use Jupyter notebooks for history when you are exploring — asking open questions of a corpus, prototyping a chart, or weaving code and argument together for teaching or a methods appendix. Avoid them for production: reusable libraries, scheduled harvesting, and anything several people edit. The deciding factor is whether the work is a one-time investigation or a repeatable pipeline.

Notebooks are not a worse Python; they are a different medium with a specific sweet spot. Knowing where that spot ends is what separates a tidy project from an unreproducible mess.

What makes notebooks genuinely good for history?

Three properties matter for our work:

Literate analysis. You can argue in Markdown, show the code that supports the claim, and render the resulting chart inline. That is close to how historians already write.
Tight feedback loops. Loading a 50,000-row register and trying five groupings interactively is far faster than edit-run-edit on a script.
Shareable narrative. Exported to HTML, a notebook becomes a self-documenting methods appendix a reviewer can read top to bottom.

For a question like "how did recorded burial counts shift across the 1665 plague year?", a notebook is the right tool.
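
To give a feel for that kind of interactive regrouping, here is a minimal sketch using only the standard library. The rows, the parish labels, and the `count_by` helper are all invented for illustration; a real register would be loaded from a file, and many projects would reach for pandas instead.

```python
from collections import Counter
from datetime import date

# Invented sample rows standing in for a transcribed burial register.
# The dates and parish labels are placeholders, not historical data.
burials = [
    {"date": date(1665, 6, 12), "parish": "Parish A"},
    {"date": date(1665, 6, 30), "parish": "Parish A"},
    {"date": date(1665, 7, 4), "parish": "Parish B"},
    {"date": date(1665, 7, 19), "parish": "Parish A"},
    {"date": date(1665, 7, 23), "parish": "Parish B"},
]


def count_by(records, key):
    """Count records under whatever grouping the function `key` returns."""
    return Counter(key(record) for record in records)


# In a notebook, each of these would be its own cell: run one, look, try the next.
by_month = count_by(burials, lambda r: r["date"].month)
by_parish = count_by(burials, lambda r: r["parish"])
by_parish_month = count_by(burials, lambda r: (r["parish"], r["date"].month))

print(by_month)
print(by_parish)
print(by_parish_month)
```

Each grouping is one line to change and rerun, which is the feedback loop a script makes slower.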

When should you NOT reach for a notebook?

The same flexibility becomes a liability when the code needs to last:

Recurring pipelines — a nightly scrape of a catalogue API belongs in a script run by cron, not a notebook someone must click through.
Shared libraries — functions used by several projects should live in importable .py modules with tests.
Collaborative editing — two people editing one .ipynb produces brutal merge conflicts because the file is JSON with embedded outputs.

If you find yourself copying the same cell into a third notebook, that logic wants to be a module.
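
As a rough sketch of what that move can look like, the cell logic becomes a small function with a test beside it. The file names `src/cleaning.py` and `tests/test_cleaning.py`, the function `normalise_parish`, and its cleaning rules are hypothetical examples, not a prescribed layout.

```python
# Hypothetical contents of src/cleaning.py: one function several notebooks import.
import re


def normalise_parish(name: str) -> str:
    """Collapse whitespace, expand a leading 'St' or 'St.' to 'Saint', title-case."""
    cleaned = re.sub(r"\s+", " ", name).strip()
    cleaned = re.sub(r"^st\.?\s+", "Saint ", cleaned, flags=re.IGNORECASE)
    return cleaned.title()


# Hypothetical contents of tests/test_cleaning.py, written so pytest can collect it.
def test_normalise_parish():
    assert normalise_parish("  st.  giles   cripplegate ") == "Saint Giles Cripplegate"
    assert normalise_parish("ST OLAVE") == "Saint Olave"
    assert normalise_parish("All Hallows") == "All Hallows"


test_normalise_parish()
print(normalise_parish("st. giles"))
```

Each notebook then imports the function instead of carrying its own copy, so a fix made once reaches every analysis that uses it.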

Why do notebook results change when I rerun?

The notorious failure mode is hidden, order-dependent state. A cell can reference a variable defined in a cell you later deleted, so the notebook "works" only because of history that is not in the file.

The discipline that fixes this:

Before trusting any result, run Kernel → Restart and Run All.
If it errors, your notebook depended on stale state — fix it now, not at submission.
Keep cells short and top-to-bottom; resist editing cell 3 after running cell 20.
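
The failure can be imitated in plain Python by treating the kernel as one shared namespace. This is a simplified model rather than how Jupyter is implemented, and the cell contents are made up, but it shows why a live session and a fresh run can disagree.

```python
# Model a kernel as one shared namespace that every cell executes into.
saved_cells = [
    "total = sum(counts)",  # the only cell still present in the saved notebook
]
deleted_cell = "counts = [3, 5, 8]"  # was run once, then deleted from the file


def run_all(cell_sources, namespace):
    """Execute each cell in order inside the given namespace, like Run All."""
    for source in cell_sources:
        exec(source, namespace)
    return namespace


# Live session: the deleted cell ran earlier, so its variable lingers in memory.
live_kernel = {}
exec(deleted_cell, live_kernel)
run_all(saved_cells, live_kernel)
live_total = live_kernel["total"]

# Restart and Run All: a fresh namespace sees only the cells still in the file.
try:
    run_all(saved_cells, {})
    fresh_error = None
except NameError as err:
    fresh_error = err

print("live session total:", live_total)
print("fresh run error:", fresh_error)
```

The live session produces a total, while the fresh run raises a `NameError` for `counts`. That gap is what a restart exposes.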

Notebook or script: a decision table

Signal	Lean notebook	Lean script/module
Goal	Explore, teach, narrate	Automate, reuse
Lifespan	One-off question	Runs repeatedly
Editors	Solo	Team
Output	Charts + prose	Files, database rows
Testing needs	Light	Unit tests wanted
Version control	Tolerable with care	Native and clean

Many projects use both: notebooks to discover the method, then graduate the stable logic into a src/ package the notebooks import.

How do I keep notebooks reproducible and archivable?

Notebooks can be perfectly reproducible if you treat them as artefacts:

```bash
# strip volatile outputs so Git diffs stay readable
nbstripout --install
# or pair to plain text for clean diffs
jupytext --set-formats ipynb,py:percent analysis.ipynb
```
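
To make "stripping outputs" concrete: an .ipynb file is JSON, and each code cell carries an `outputs` list and an `execution_count`. The sketch below clears both using only the standard library. It is a simplified illustration of the idea, not nbstripout's actual implementation, and the tiny notebook dict is hand-written for the example.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat 4 notebook dict with code-cell outputs cleared."""
    stripped = json.loads(json.dumps(notebook))  # simple deep copy of JSON data
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# A hand-written stand-in for a real notebook loaded with json.load().
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Burials"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["len(rows)"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["50000"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

clean = strip_outputs(notebook)
print(json.dumps(clean["cells"][1], indent=1))
```

What remains is only source and metadata, so a Git diff shows changes to the analysis rather than churn in rendered results.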

Pin dependencies in a requirements.txt or environment.yml, export an HTML copy for the fixed record, and deposit the .ipynb with its data when you publish. A notebook with unpinned packages and embedded outputs is not an archive; it is a screenshot.
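
One way to see what pinning captures is to list the exact versions installed in the current environment. The sketch below uses the standard library's `importlib.metadata`; the helper name `pinned_requirements` is made up here, and the result is similar in spirit to what `pip freeze` reports. It lists everything installed, so in practice you would trim it to the packages the notebook actually imports.

```python
from importlib import metadata


def pinned_requirements() -> list[str]:
    """List every installed distribution as an exact 'name==version' pin."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that report no name
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


requirements = pinned_requirements()
print("\n".join(requirements))
```

Saving that output as requirements.txt alongside the notebook records the environment the results were produced in.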

What about teaching historians to code?

Start learners in notebooks for the instant feedback, but introduce scripts within the first few sessions. The goal is for students to understand that the notebook is the workshop, while shareable, citable code usually ends up as modules. Leaving them with only notebooks teaches a habit they later have to unlearn.

Key Takeaways

Notebooks suit exploration, teaching, and literate analysis; scripts suit automation and reuse.
Out-of-order execution causes phantom results — always Restart and Run All before trusting output.
Move repeated logic into .py modules with tests once it is stable.
Strip outputs or pair with Jupytext so notebooks version-control cleanly.
Archive notebooks by pinning dependencies and exporting a fixed HTML/PDF record.
Use both media deliberately: discover in a notebook, productionise in a module.

Frequently Asked Questions

When are Jupyter notebooks the right tool for a history project?

Notebooks shine for exploratory analysis, teaching, and narrative outputs where you interleave prose, code, and charts. They suit one-off questions about a corpus far better than recurring production pipelines.

When should a historian avoid notebooks?

Avoid them for reusable libraries, scheduled scrapers, or anything multiple people edit at once. Out-of-order execution and poor version-control diffs make notebooks fragile for production code; move that logic into plain .py modules.

Why do my notebook results change when I rerun cells?

Because cells share hidden state and can be run out of order. Always run Kernel → Restart and Run All before trusting results. If it fails, the notebook was relying on stale variables.

How do I version-control a Jupyter notebook properly?

Strip outputs before committing with `nbstripout` or `jupyter nbconvert --clear-output`, or pair the notebook to a plain-text format with Jupytext so Git diffs are readable.

Can I cite or archive a notebook as part of my research?

Yes. Export it to HTML or PDF for a fixed record, deposit the .ipynb with its data in a repository like Zenodo, and pin every package version so the analysis remains rerunnable.

Should students learn notebooks or scripts first?

Notebooks first for immediate feedback and visible output, but introduce scripts early so learners understand that reproducible, shareable code usually lives in modules, not cells.
