Appearance
To batch rename archive files in Python safely, the golden rule is: do a dry run first. Build the new name for every file, print the planned old-to-new pairs, check them by eye, and only then perform the rename — ideally on a copy of the folder and while writing a CSV log you can reverse. The renaming itself is one line; the safety around it is what matters for archival material.
This guide assumes no prior Python and walks through a small, real example: tidying a folder of scanned register images.
What does a batch rename actually do?
Imagine a folder of scans named IMG_0481.jpg, IMG_0482.jpg, and so on, that you want to become reg-1841-001.jpg, reg-1841-002.jpg. A batch rename is just a loop that, for each file, works out a new name and applies it. Python's pathlib module gives us clean tools for this.
python
from pathlib import Path
folder = Path("scans")
for path in sorted(folder.glob("IMG_*.jpg")):
print(path.name)That snippet only lists files. We add the renaming once we trust the names.
Why should I always do a dry run first?
A rename is destructive — the old name is gone. A dry run prints what would happen so you can catch mistakes before any file changes:
python
for n, path in enumerate(sorted(folder.glob("IMG_*.jpg")), start=1):
new_name = f"reg-1841-{n:03d}.jpg"
print(f"{path.name} -> {new_name}") # nothing is renamed yetRead the output. Do the numbers run in the right order? Are the years correct? Only when it looks right do you swap print for the real rename. For archival files, also work on a copy of the folder so the originals survive any error.
How do I zero-pad numbers so files sort correctly?
The :03d inside the f-string means "format this integer with at least 3 digits, padding with zeros". This matters because file browsers sort text, so file2 comes after file10 without padding. With padding, 001 through 999 always sort correctly.
| Code | Result |
|---|---|
f"{7}" | 7 |
f"{7:03d}" | 007 |
f"{42:03d}" | 042 |
f"{1841}-{7:03d}" | 1841-007 |
How do I keep a record of every rename?
For archives, losing the link between the original camera filename and the new catalogue name can quietly break provenance. Log every change to a CSV as you rename:
python
import csv
from pathlib import Path
folder = Path("scans")
with open("rename_log.csv", "w", newline="", encoding="utf-8") as log:
writer = csv.writer(log)
writer.writerow(["original", "new"])
for n, path in enumerate(sorted(folder.glob("IMG_*.jpg")), start=1):
new_path = path.with_name(f"reg-1841-{n:03d}.jpg")
writer.writerow([path.name, new_path.name])
path.rename(new_path) # the actual renameThat rename_log.csv is your audit trail and your undo button.
Why does it fail with 'file already exists'?
If two source files would get the same new name, or the target name is already taken, the rename errors. Guard against it by collecting proposed names and checking for duplicates before touching anything:
python
planned = {}
for n, path in enumerate(sorted(folder.glob("IMG_*.jpg")), start=1):
new = f"reg-1841-{n:03d}.jpg"
if new in planned.values():
raise ValueError(f"Collision: {new}")
planned[path.name] = newCan I undo a batch rename?
Yes — but only because you kept the log. A short reverse script reads rename_log.csv and swaps the columns:
python
for row in csv.DictReader(open("rename_log.csv", encoding="utf-8")):
Path(folder / row["new"]).rename(folder / row["original"])Without the log, there is no automatic way back, which is exactly why logging is non-negotiable for archival work.
Key Takeaways
- Always dry-run a rename by printing planned names before changing any file.
- Prefer
pathliboveros.renamefor clearer, cross-platform code. - Zero-pad numbers with
f"{n:03d}"so files sort correctly everywhere. - Write a CSV log of original-to-new names; it is your provenance record and undo button.
- Check for name collisions before renaming to avoid 'file already exists' errors.
- Work on a copy of an archival folder so the originals survive any mistake.
Frequently Asked Questions
What is the safest way to test a batch rename before running it?
Do a dry run: print the old and new name for every file without renaming anything. Only call the rename once the printed output looks correct, ideally working on a copy of the folder first.
Should I use os.rename or pathlib in Python?
For new code prefer pathlib, where path.rename(new) reads more clearly and handles paths across operating systems. os.rename works too, but pathlib is the modern, more readable choice.
How do I keep a record of what I renamed?
Write a small CSV log mapping each original filename to its new name as you go. For archival work this audit trail is essential, because a rename that loses the original identifier can break provenance.
Why does my rename fail with 'file already exists'?
Two source files mapped to the same new name, or the target already exists. Check for collisions before renaming by collecting proposed names in a set, and add a counter suffix when duplicates appear.
How do I add zero-padded numbers like 001, 002 to filenames?
Use an f-string with format specifiers, for example f"{n:03d}", which pads the number to three digits. Padding keeps files sorting in the correct order in every file browser.
Can I undo a batch rename?
Only if you kept a log. Save the old-to-new mapping to a CSV, then a second script can read it backwards to restore the originals. Without that log, there is no automatic undo.