DH Project Management

Collaborating across disciplines in digital humanities succeeds when you treat shared vocabulary, decision rights and credit as deliverables, not afterthoughts. The teams that produce consistent, defensible results across a whole collection write a glossary in week one, agree authorship and data ownership in a one-page memorandum, and keep a living data dictionary in version control. Tooling matters less than these three social contracts.

Why do interdisciplinary DH projects break down?

Most failures are not technical. They trace to four predictable gaps: words that mean different things to each discipline, unclear decision-making when humanist intuition meets engineering constraint, invisible labour that never reaches the credit line, and divergent quality standards. A historian's "good enough" transcription and a developer's "validated dataset" can be months apart in effort. Name these gaps explicitly and you defuse most of them.

How do you build a shared vocabulary fast?

Start a glossary in your repository on day one and make every recurring noun an entry. Capture both the disciplinary meaning and the project-specific meaning.

```markdown
# GLOSSARY.md
- **record**: (archival) a document of evidential value; (database) one row in `people.csv`. We use the database sense in code, archival sense in prose.
- **model**: (history) an explanatory framework; (engineering) a serialised ML artefact. Always qualify: "conceptual model" vs "trained model".
- **circa**: encoded as EDTF `1640~` (approximate) in `date_qualified`, not `1640?` (which marks an uncertain date); never freetext.
```

Review it at every monthly meeting. A glossary that stops changing in month two is usually being ignored, not finished.
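
Between reviews, a small lint step can catch malformed or duplicated entries. The sketch below assumes every entry follows the `- **term**: definition` shape used in the example glossary; the function name `lint_glossary` is hypothetical and not part of any existing tool.

```python
import re

# Assumed entry shape, taken from the example glossary: "- **term**: definition"
ENTRY = re.compile(r"^- \*\*(?P<term>[^*]+)\*\*:\s*(?P<definition>.+)$")


def lint_glossary(text: str) -> list[str]:
    """Return human-readable problems found in the text of a glossary file."""
    problems = []
    seen = {}  # lower-cased term -> line number of its first definition
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.startswith("- "):
            continue  # headings, blank lines and free notes are ignored
        match = ENTRY.match(line)
        if match is None:
            problems.append(f"line {lineno}: entry is not '- **term**: definition'")
            continue
        term = match.group("term").strip().lower()
        if term in seen:
            problems.append(
                f"line {lineno}: '{term}' already defined on line {seen[term]}"
            )
        else:
            seen[term] = lineno
    return problems
```

Terms are compared case-insensitively, so "Record" and "record" count as one entry; a team that wants the two kept apart would drop the `.lower()` call.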

Who decides when historians and developers disagree?

Write decision rights down before the first disagreement. A lightweight RACI-style grid, reduced here to an owner and the people consulted and scoped to decision types, works better than role titles.

| Decision type | Owner | Consulted | Notes |
|---|---|---|---|
| Research question / scope | PI / lead scholar | whole team | drives everything else |
| Data model & schema | Data curator | scholar + dev | scholar signs off on semantics |
| Tooling & infrastructure | Lead developer | curator | must meet sustainability rule |
| Public interpretation | Lead scholar | dev (feasibility) | dev flags what is buildable |

The principle: the question owner sets priorities, the technologist owns feasibility, and you never let a tooling preference quietly redefine the research question.

How should you handle credit and authorship?

Adopt a contributor-roles taxonomy. CRediT lists 14 roles including Software, Data Curation, Conceptualization and Visualization, so a developer who wrote no prose still appears with a named contribution. Agree authorship order and data ownership in a one-page MOU at kick-off covering: who can reuse the data afterwards, what licence the outputs carry (CC BY 4.0 for text, MIT or Apache-2.0 for code is a common pairing), and what happens to the repository if a contributor leaves the institution.
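
If contributor roles are kept as data in the repository, they can be checked against the taxonomy mechanically. The following is a minimal sketch: the 14 role names are written as I understand the CRediT list and should be checked against the published taxonomy, and the `check_contributors` function and the contributor names in the usage example are hypothetical.

```python
# The 14 CRediT roles; spellings should be verified against the published list.
CREDIT_ROLES = frozenset({
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
})


def check_contributors(contributors: dict[str, list[str]]) -> list[str]:
    """Report people with no roles and roles that are not in CREDIT_ROLES."""
    known = {role.casefold() for role in CREDIT_ROLES}
    problems = []
    for person, roles in contributors.items():
        if not roles:
            problems.append(f"{person}: no roles listed")
        for role in roles:
            # Case-insensitive so "Data Curation" and "Data curation" both pass.
            if role.casefold() not in known:
                problems.append(f"{person}: '{role}' is not a CRediT role")
    return problems
```

For example, `check_contributors({"A. Developer": ["Software", "Data Curation"]})` returns an empty list, while a role such as "Coding" would be flagged.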

What keeps results consistent across the whole collection?

A living data dictionary committed to the repo. It pins every field name, its controlled vocabulary, and its format so a manually entered date and a scripted import agree.

```yaml
# data-dictionary.yml
place_name:
  source: getty_tgn          # authority list
  required: true
date_qualified:
  format: EDTF               # e.g. 1640~, 1640?, 164X, 1640/1645
  required: false
confidence:
  enum: [certain, probable, conjectural]
```

Pair it with validation: a CI check that fails the build when a CSV row violates the dictionary catches inconsistency before it spreads across a thousand records.
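
A validation step of this kind might look like the sketch below. It makes several assumptions: the dictionary is mirrored as a Python dict so that no YAML parser is needed, a field with no `required` key is treated as optional, the EDTF check is a simplified year-level subset rather than the full specification, and `place_name` is only checked for presence, not looked up in the authority list. The names `validate_rows` and `main` are hypothetical.

```python
import csv
import io
import re

# Mirror of data-dictionary.yml as a plain dict (assumption: avoids a YAML dependency).
DICTIONARY = {
    "place_name": {"required": True},
    "date_qualified": {"required": False, "format": "EDTF"},
    "confidence": {"enum": ["certain", "probable", "conjectural"]},
}

# Simplified EDTF subset, NOT the full specification: a four-character year
# (trailing digits may be X), an optional ?, ~ or % qualifier, and an
# optional start/end interval.
_DATE = r"(?:\d{4}|\d{3}X|\d{2}XX)[?~%]?"
EDTF_SUBSET = re.compile(_DATE + r"(?:/" + _DATE + r")?")


def validate_rows(csv_text: str, dictionary: dict = DICTIONARY) -> list[str]:
    """Return one message per dictionary violation found in the CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for rownum, row in enumerate(reader, start=2):  # row 1 is the header
        for field, rule in dictionary.items():
            value = (row.get(field) or "").strip()
            if not value:
                if rule.get("required", False):
                    problems.append(f"row {rownum}: {field} is required")
                continue
            if "enum" in rule and value not in rule["enum"]:
                problems.append(f"row {rownum}: {field}={value!r} not in {rule['enum']}")
            if rule.get("format") == "EDTF" and not EDTF_SUBSET.fullmatch(value):
                problems.append(f"row {rownum}: {field}={value!r} is not valid EDTF")
    return problems


def main(path: str) -> int:
    """Validate one CSV file; a non-zero return value should fail the build."""
    with open(path, newline="", encoding="utf-8") as handle:
        problems = validate_rows(handle.read())
    for problem in problems:
        print(problem)
    return 1 if problems else 0
```

A CI job could then run something like `sys.exit(main("people.csv"))`, so a row containing freetext such as "circa 1640" stops the build instead of reaching the collection.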

A practical kick-off checklist

  1. Glossary and data dictionary created in the repo (day 1).
  2. One-page MOU: authorship order, data ownership, licences, exit plan.
  3. CRediT roles drafted for every named person.
  4. Decision-rights grid agreed and pinned to the wiki.
  5. Cadence set: weekly standup, monthly scope review.
  6. A shared "definition of done" per artefact type.
  7. One neutral shared workspace everyone can actually use.

Key Takeaways

  • Interdisciplinary friction is mostly social, not technical: vocabulary, decision rights and credit cause more delays than code.
  • Write a glossary and data dictionary into version control in week one and review them monthly.
  • Use a decision-rights grid scoped to decision types, not job titles.
  • Adopt CRediT so non-prose contributions are visibly credited.
  • A one-page MOU on authorship, ownership and exit prevents publication-time disputes.
  • Validate data against the dictionary in CI so consistency is enforced, not hoped for.

Frequently Asked Questions

What is the single most common cause of interdisciplinary friction in DH?

Unstated vocabulary. Historians, librarians and developers use the same word (record, source, model, tag) to mean different things. Writing a shared glossary in week one prevents most downstream rework.

Do we need a memorandum of understanding for a small project?

Even two-person teams benefit from a one-page MOU covering authorship order, data ownership and what happens to the repo if someone leaves. It takes an hour and prevents the disputes that surface at publication.

How do I credit a programmer who wrote no prose?

Use a contributor-roles taxonomy such as CRediT, which has 14 named roles including Software and Data Curation. List each person's roles explicitly rather than forcing everyone into 'author' or 'acknowledgements'.

Should the humanist or the technologist lead the project?

Neither by default. Lead with the research question, and let whoever owns that question set priorities while the technologist owns feasibility. Co-leadership with clear decision rights works better than a single discipline dominating.

How often should an interdisciplinary team meet?

A short weekly standup plus a monthly deep-dive works for most grant-scale DH projects. Weekly catches drift early; monthly is where you renegotiate scope and review the shared glossary.

What artefact best keeps a mixed team consistent?

A living data dictionary in the repository. It pins field names, controlled vocabularies and date formats so a historian's entry and a developer's import script agree on what 'circa 1640' means.