To use Git for humanities research, install Git, run git init in your project folder, write a .gitignore that excludes large source images, and then commit your text files and notes in small, well-described chunks. Git gives you a complete, dated history of every change to your transcriptions, datasets and scripts, so you can always see what changed, when, and why, and roll back mistakes without fear.
This guide assumes you work with manuscripts, datasets, TEI files or analysis notebooks and have never used version control before. By the end you will have a working repository and a sustainable daily habit.
Why does a historian need version control at all?
Most humanities projects die from filename chaos: transcription_final_v3_REALfinal_edited.docx. Git replaces that with one file and a timeline of every saved state. The concrete payoffs are recoverability (any past version is one command away), accountability (every change is stamped with who made it, when, and a message explaining it), and collaboration (two people can edit the same TEI file and merge their work, with Git flagging any overlapping edits for you to resolve).
For funded work, version control is increasingly an expectation of data management plans, and it makes your eventual deposit to a repository far easier to assemble.
How do I set up my first repository?
Install Git from git-scm.com, then set your identity once so commits are attributed correctly:
```bash
git config --global user.name "Your Name"
git config --global user.email "you@example.org"
```
Now turn a project folder into a repository and make your first commit:
```bash
cd my-edition-project
git init
git add notes.md transcriptions/
git commit -m "Add initial notes and folio transcriptions 1r-4v"
```
If git init defaulted to a master branch, rename it to the modern default:
```bash
git branch -m main
```
What should I keep out of Git?
Committing large binary files is one of the most common mistakes humanists make. Git is wonderful for text and miserable for large binaries; a folder of 400 MB TIFFs will make your repository slow and unwieldy. Create a .gitignore file before your first commit:
```gitignore
# Large source media: track elsewhere
*.tif
*.tiff
*.wav
*.mov
# Derived and temporary files
__pycache__/
.ipynb_checkpoints/
*.tmp
~$*.docx
```
Keep images in a separate, backed-up store or use Git LFS for the handful you must version. The rule of thumb: version the things you author (transcriptions, code, metadata), reference the things you merely hold (scans, audio).
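One way to confirm that ignore rules behave as intended is to ask Git directly. The sketch below builds a throwaway repository in a temporary directory (the folder and file names are hypothetical, and the rule list is shortened), then uses git check-ignore and git status to show what is excluded.

```bash
# Throwaway demo; all paths and file names here are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q

# A shortened version of the ignore rules above.
printf '%s\n' '*.tif' '*.tiff' '*.tmp' > .gitignore

mkdir scans transcriptions
touch scans/folio-01r.tif transcriptions/folio-01r.xml

# Prints the rule that matches an ignored path; exits non-zero if nothing matches.
git check-ignore -v scans/folio-01r.tif

# Lists untracked items; the scans folder should not appear.
git status --short
```

Running the same two commands in a real project before the first commit is a cheap check that no scans are about to be added by accident.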
What goes in a good commit message?
A commit message is a note to your future self and your collaborators. Use the imperative mood and say what and why, not how. Compare:
| Weak message | Strong message |
|---|---|
| update | Correct misread abbreviations in folios 12r-15v |
| stuff | Add gazetteer lookup for place names in chapter 3 |
| fixed it | Fix encoding error that broke TEI validation |
Small, single-purpose commits make the history readable and let you revert one change without losing others.
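To see why single-purpose commits pay off, here is a minimal sketch in a throwaway repository. The file names, contents and identity are hypothetical. It makes three small commits, then uses git revert to undo only the middle one while the later work stays in place.

```bash
# Throwaway demo repository; names, contents and messages are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.org"

echo "Project notes" > notes.md
git add notes.md
git commit -q -m "Add project notes"

# A change that later turns out to be wrong, committed on its own.
echo "Eboracum = York" > gazetteer.txt
git add gazetteer.txt
git commit -q -m "Add gazetteer entry for Eboracum"

# An unrelated change made afterwards.
echo "In nomine domini" > folio-12r.txt
git add folio-12r.txt
git commit -q -m "Transcribe opening line of folio 12r"

# Undo only the gazetteer commit (HEAD~1) by recording a new commit.
git revert --no-edit HEAD~1

# The transcription and notes remain; the gazetteer file is gone.
ls
git log --oneline
```

Because git revert adds a new commit rather than erasing the old one, the history still shows both the original change and its reversal.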
How do I undo mistakes?
Two everyday situations:
```bash
# Discard uncommitted edits to one file
git restore transcriptions/folio-07r.xml

# See the history and inspect any past version
git log --oneline
git show a1b2c3d:transcriptions/folio-07r.xml
```
Because every committed state is preserved, it is very hard to lose work you have committed. That safety net is the whole point.
Do I have to use the command line?
No. GitHub Desktop and the Git panel built into VS Code give you buttons for staging, committing and viewing history. Many humanists never touch the terminal. Learn two or three commands when you are ready; until then a graphical client is perfectly legitimate and will not corrupt anything.
Key Takeaways
- Git records a dated, attributed history of every text file so you can recover and explain any past state.
- Set `user.name` and `user.email` once, then `init`, `add`, `commit` to start.
- Write a `.gitignore` first to keep large scans, audio and temp files out of the repository.
- Commit small, coherent units with descriptive imperative messages.
- Version what you author; reference (do not embed) what you merely hold.
- A graphical client such as GitHub Desktop is a valid alternative to the command line.
Frequently Asked Questions
Do I need to learn the command line to use Git?
No. You can stage, commit and browse history through GitHub Desktop or VS Code's Git panel. The command line is faster once you commit daily, but it is optional for getting started.
Should I put my primary sources and images in Git?
Generally no. Git is built for text and bloats badly with large binaries. Keep TIFFs and audio out of the repository and use Git LFS or a data repository for them instead.
What is the difference between Git and GitHub?
Git is the version-control software that runs on your machine. GitHub is a hosted service that stores a copy of your repository online and adds collaboration features on top.
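The split between the two can be sketched without a network connection. In the example below a local bare repository stands in for the hosted copy. This is an assumption made purely for illustration: with GitHub the remote would be the URL shown on your repository page, and the identity and file names are hypothetical.

```bash
# Throwaway demo; a local bare repository stands in for a hosted copy.
work=$(mktemp -d)
hosted=$(mktemp -d)
git init -q --bare "$hosted"

# Git itself: a complete repository on your own machine.
cd "$work"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.org"
echo "Project notes" > notes.md
git add notes.md
git commit -q -m "Add project notes"
git branch -M main

# The hosted side: register the remote copy and publish the history to it.
git remote add origin "$hosted"
git push -q -u origin main

# The local history and the remote branch now point at the same commit.
git log --oneline
git ls-remote --heads origin
```

Everything before git remote add works with no hosting service at all, which is the practical meaning of "Git runs on your machine".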
How often should I commit?
Commit whenever you finish a coherent unit of work, such as transcribing a folio or cleaning a column. Aim for several small commits a day rather than one giant weekly commit.
Can Git recover work I accidentally deleted?
Yes, as long as you had committed it. Any committed state can be restored with git checkout or git restore, which is the main reason to commit often.
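As a minimal sketch of both recovery cases, the example below uses a throwaway repository with a hypothetical file and identity. It first restores a file deleted from the working folder, then restores one whose deletion was itself committed. It assumes a Git version recent enough to include git restore.

```bash
# Throwaway demo repository; the file and its contents are hypothetical.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.org"

mkdir transcriptions
echo "<p>Folio 7r text</p>" > transcriptions/folio-07r.xml
git add transcriptions
git commit -q -m "Add transcription of folio 7r"

# Case 1: the file was deleted, but the deletion was never committed.
rm transcriptions/folio-07r.xml
git restore transcriptions/folio-07r.xml

# Case 2: the deletion itself was committed by mistake.
git rm -q transcriptions/folio-07r.xml
git commit -q -m "Remove folio 7r by mistake"

# Bring the file back from the commit just before the deletion.
git restore --source=HEAD~1 -- transcriptions/folio-07r.xml
cat transcriptions/folio-07r.xml
```

In the second case the file returns as an uncommitted change, so add and commit it again to record the recovery.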