To get started with Gephi, download version 0.10 from gephi.org, prepare two CSV files (a nodes list and an edges list), import them through the Data Laboratory, then run the ForceAtlas 2 layout while sizing nodes by degree. That four-step path takes about fifteen minutes and produces a readable network from your own sources without any scripting.
Gephi is free, open-source desktop software for exploring and visualising networks. For historians and archivists it is the fastest way to see who corresponds with whom, which records cite which, or how people move between institutions — all without writing Python.
What do I need before I open Gephi?
You need your relationships as a table. The cleanest starting point is two CSV files:
- Nodes: one row per entity, with at least an `Id` column and a human-readable `Label`.
- Edges: one row per relationship, with `Source` and `Target` columns matching the node `Id` values.
A minimal edges file looks like this:
```csv
Source,Target,Weight,Type
voltaire,frederick,12,Directed
voltaire,emilie,34,Directed
emilie,frederick,3,Directed
```

The `Weight` column lets you record how many letters or citations link two people; `Type` is either `Directed` or `Undirected`. Keep IDs short, lowercase and stable, because you will reuse them everywhere.
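If your relationships already live in a spreadsheet or catalogue export, a few lines of Python can produce both tables for you. The sketch below is optional and rests on assumptions: the `letters` list and the `build_tables` helper are hypothetical, the label is a placeholder derived from the Id, and every edge is treated as directed.

```python
import csv
import io
from collections import Counter

# Hypothetical input: one (sender, recipient) pair per letter in your catalogue.
letters = [
    ("voltaire", "frederick"),
    ("voltaire", "emilie"),
    ("voltaire", "emilie"),
    ("emilie", "frederick"),
]

def build_tables(letters):
    """Return (nodes_csv, edges_csv) as strings in the column layout described above."""
    weights = Counter(letters)  # number of letters per directed pair
    ids = sorted({person for pair in letters for person in pair})

    nodes = io.StringIO()
    writer = csv.writer(nodes, lineterminator="\n")
    writer.writerow(["Id", "Label"])
    for node_id in ids:
        writer.writerow([node_id, node_id.capitalize()])  # placeholder label

    edges = io.StringIO()
    writer = csv.writer(edges, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight", "Type"])
    for (source, target), weight in sorted(weights.items()):
        writer.writerow([source, target, weight, "Directed"])

    return nodes.getvalue(), edges.getvalue()

nodes_csv, edges_csv = build_tables(letters)
print(edges_csv)
```

Counting repeated pairs gives you the `Weight` column for free; write the two strings to `nodes.csv` and `edges.csv` (file names of your choosing) before importing.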
How do I import my data?
Open Gephi, choose New Project, then go to the Data Laboratory tab. Click Import Spreadsheet, select your nodes CSV, and tell the wizard it is a Nodes table. Repeat for the edges CSV as an Edges table. On the final screen, choose Append to existing workspace so the two tables join into one graph.
Watch the import report. If Gephi says it created phantom nodes, that means an edge references an Id your nodes file never declared — usually a typo or trailing whitespace. Fix the source CSV rather than patching it in Gephi.
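You can catch these mismatches before Gephi does. The following is a minimal sketch, assuming the column names used above; `find_undeclared_ids` is a hypothetical helper, and the sample data contains one deliberate typo and one trailing space.

```python
import csv
import io

def find_undeclared_ids(nodes_csv, edges_csv):
    """Return edge endpoints (sorted) that have no exact match in the nodes Id column."""
    declared = {row["Id"] for row in csv.DictReader(io.StringIO(nodes_csv))}
    missing = set()
    for row in csv.DictReader(io.StringIO(edges_csv)):
        for column in ("Source", "Target"):
            if row[column] not in declared:
                missing.add(row[column])
    return sorted(missing)

# Sample data: "frederick " has a trailing space, "emile" is a typo.
nodes_csv = "Id,Label\nvoltaire,Voltaire\nfrederick,Frederick\nemilie,Emilie\n"
edges_csv = "Source,Target\nvoltaire,frederick \nvoltaire,emile\n"

for value in find_undeclared_ids(nodes_csv, edges_csv):
    hint = " (stray whitespace)" if value != value.strip() else ""
    print(repr(value) + hint)
```

To check real files, read each one into a string first and pass the two strings to the function. An empty result means every edge endpoint is declared.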
Why does the first view look like a black ball?
Because Gephi drops every node at a random coordinate. The graph only becomes legible once you apply a layout from the bottom-left panel. Sensible defaults for a first pass:
- Run ForceAtlas 2 with Prevent Overlap ticked and Scaling around 10.
- Let it run until the motion settles, then Stop.
- Open the Statistics panel and run Average Degree and Modularity.
- In Appearance, size nodes by Degree (min 8, max 40) and colour them by Modularity Class.
| Layout | Best for | Watch out for |
|---|---|---|
| ForceAtlas 2 | General exploration, communities | Slow above ~50k nodes |
| Fruchterman-Reingold | Small, tidy graphs | Cramps large networks |
| Yifan Hu | Large networks, fast | Less cluster separation |
| OpenOrd | Very large, clustered data | Hard-edged, less organic |
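To see what sizing by degree amounts to, the sketch below counts each node's connections and scales the counts onto the 8 to 40 range suggested above. It assumes simple linear interpolation and an undirected count; Gephi's own ranking may interpolate differently, and `degree_sizes` is a hypothetical helper, not a Gephi function.

```python
from collections import Counter

def degree_sizes(edges, min_size=8.0, max_size=40.0):
    """Map each node's degree linearly onto [min_size, max_size].

    Assumes a non-empty edge list of (source, target) pairs.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    low, high = min(degree.values()), max(degree.values())
    span = high - low
    sizes = {}
    for node, d in degree.items():
        # If every node has the same degree, give them all the minimum size.
        fraction = (d - low) / span if span else 0.0
        sizes[node] = min_size + fraction * (max_size - min_size)
    return sizes

edges = [("voltaire", "frederick"), ("voltaire", "emilie"), ("voltaire", "diderot")]
sizes = degree_sizes(edges)
print(sizes)
```

Here the hub receives the maximum size and its three correspondents the minimum, which is the visual contrast the Appearance panel produces.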
How do I turn metrics into something I can read?
Gephi's power is mapping numbers to visual channels. After running Statistics, use the Appearance panel: map Betweenness Centrality to node colour to spot brokers, and Degree to node size to spot hubs. Avoid mapping more than two metrics at once: a graph in which size, colour and label all compete for attention becomes unreadable.
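If you want to know what the betweenness figure measures, the sketch below counts how many shortest paths between other pairs pass through each node, using a plain version of Brandes' algorithm. It assumes an undirected, unweighted graph and reports unnormalised scores, so Gephi's numbers may differ depending on its direction and normalisation settings; the `betweenness` function is illustrative only.

```python
from collections import deque

def betweenness(edges):
    """Unnormalised betweenness for an undirected, unweighted edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each end.
    return {v: value / 2 for v, value in score.items()}

edges = [("voltaire", "emilie"), ("emilie", "frederick")]
scores = betweenness(edges)
print(scores)
```

In this three-person chain only the middle correspondent sits on a shortest path between the other two, so she is the sole broker.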
How do I export something publishable?
Do not screenshot the main window. Switch to the Preview tab, click Refresh, then adjust edge opacity, node borders and label fonts. Export as SVG or PDF for print, or PNG only for quick web use. Vector files let you finish typography in Inkscape and stay sharp at poster size.
What pitfalls trip up beginners?
- Mixed directed/undirected edges inflate degree counts — pick one model per project.
- Running the layout indefinitely wastes time; on graphs under a few thousand nodes the structure stabilises in seconds, so stop once the motion settles.
- Forgetting to save as `.gephi` loses your layout positions; CSV export only keeps data.
- Labelling everything at once clutters the canvas; filter to high-degree nodes first.
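The first pitfall is easy to check before import. This sketch, assuming an edges table with a `Type` column as described earlier, tallies the edge types so a mixed file stands out; `edge_type_counts` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

def edge_type_counts(edges_csv):
    """Count the values found in the Type column, normalised to lowercase."""
    reader = csv.DictReader(io.StringIO(edges_csv))
    return Counter((row.get("Type") or "").strip().lower() for row in reader)

# Sample data with one stray undirected edge.
edges_csv = (
    "Source,Target,Weight,Type\n"
    "voltaire,frederick,12,Directed\n"
    "voltaire,emilie,34,Directed\n"
    "emilie,frederick,3,Undirected\n"
)

counts = edge_type_counts(edges_csv)
if len(counts) > 1:
    print("Mixed edge types:", dict(counts))
```

More than one key in the result means the file mixes models and should be made consistent before you trust any degree figure.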
Key Takeaways
- Gephi 0.10 bundles Java; download it from gephi.org and you are ready in one install.
- Prepare two CSVs: nodes (`Id`, `Label`) and edges (`Source`, `Target`, `Weight`).
- Import via the Data Laboratory and fix phantom-node warnings at the source.
- ForceAtlas 2 with Prevent Overlap is the most reliable first layout.
- Map degree to size and modularity to colour for an instantly readable graph.
- Export from the Preview tab as SVG or PDF, never a screenshot.
- Save the `.gephi` project file to preserve node positions.
Frequently Asked Questions
What file format should I import into Gephi first?
Import two CSV files via the Data Laboratory: a nodes table with an Id column and an edges table with Source and Target columns. CSV is the most forgiving format for historians starting out, and it round-trips cleanly to spreadsheets.
Why does my Gephi graph look like a black ball of spaghetti?
That is the default random layout with no spatialisation applied. Run ForceAtlas 2 with 'Prevent Overlap' enabled, then resize nodes by degree, and the structure will emerge within seconds for graphs under a few thousand nodes.
Does Gephi need Java to run?
Gephi 0.10 bundles a compatible JDK on Windows and macOS, so most users no longer install Java separately. If you hit a launch error, install a 64-bit JDK 11 or 17 and point Gephi to it via the gephi.conf file.
How many nodes can Gephi handle?
Gephi visualises tens of thousands of nodes comfortably on a laptop, and up to a few hundred thousand if you raise the JVM heap. Above that, pre-filter your data, switch to a faster layout such as Yifan Hu or OpenOrd, and disable real-time label rendering.
How do I export a publication-quality image from Gephi?
Switch to the Preview tab, click Refresh, tune node and edge settings, then export as SVG or PDF rather than PNG. Vector output stays crisp at any print size and lets you edit labels in Inkscape afterwards.
Can I calculate centrality inside Gephi?
Yes. The Statistics panel runs degree, betweenness, closeness, eigenvector and modularity with one click each, writing the results back as node attributes you can then map to size or colour.