A short guide to what this project tracks, how the data is built, and what each category is for. Everything here is a draft and is offered as such; the underlying data is open and source-linked.
What this is
A research project mapping the infrastructure behind AI in Australia (data centres, the minerals, energy and water they draw, and the lines of ownership through them) and the public debate forming around it.
Its guiding question is not whether Australia has "AI sovereignty" but what kind it is building: sovereignty over the buildings the intelligence runs in, or over the intelligence itself. It pairs a live map with a set of open databases and a light, auditable AI-assisted pipeline.
The components
- Critical Infrastructure Tracker (Australia). One row per site, with ownership, governance, public funding, water and energy profile, and location. It feeds the live map.
- Contestation Tracker. One row per source-event (a news article, editorial, planning submission, council motion or forum statement) recording the structure of public debate around those sites.
- Synoptics. Survey and overview sources that span many sites at once (news roundups, parliamentary inquiries, NGO and government reviews, moratorium calls).
- Cost of Compute. A self-accounting log estimating the environmental cost of the project's own AI use. The research holds itself to the same question it asks of others.
- Map (sovereignties.civicinterplay.io). A live read of the Critical Infrastructure Tracker, which can be coloured by several lenses.
How the data is built (methods)
- Sources. Planning portals and inquiry submissions (public and explicitly framed), local and national press, an open global news index (GDELT), council and forum material, and licensed databases such as Factiva via the RMIT library, which are handled manually under licence rather than scraped.
- AI-assisted classification. A pipeline reads each source and proposes a structured record against a fixed controlled vocabulary; it never writes free-form prose. It is conservative in resolving which site a source concerns: uncertain matches are flagged for human review rather than guessed, because misattribution is the most damaging error.
- Human verification. Every machine-classified item carries a confidence score. Anything below the confidence threshold, and anything destined for publication, is checked by a person and marked human-verified. Nothing is used in print before that step.
- Provenance and ethics. Every record links to its original source. Licensed text stays out of the database; only the structured codes, a short summary and one short quote, attributed to the original publisher, are kept.
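
The review rule described above (uncertain site matches, low confidence scores and anything headed for publication all go to a person) can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's pipeline: the field names, the `ProposedRecord` type and the threshold value of 0.8 are hypothetical, since this guide does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value: this guide does not state the actual threshold.
REVIEW_THRESHOLD = 0.8


@dataclass
class ProposedRecord:
    """A machine-proposed record (field names are illustrative only)."""
    source_url: str             # provenance: link back to the original source
    site_id: Optional[str]      # None when the site match is uncertain
    confidence: float           # classifier confidence, assumed to lie in [0, 1]
    for_publication: bool = False
    human_verified: bool = False


def needs_human_review(rec: ProposedRecord,
                       threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True if a person still has to check this record."""
    if rec.human_verified:
        return False
    # Uncertain site match, low confidence, or destined for publication.
    return rec.site_id is None or rec.confidence < threshold or rec.for_publication


def usable_in_print(rec: ProposedRecord) -> bool:
    """Nothing is used in print before human verification."""
    return rec.human_verified


# A confident classification with an unresolved site is still sent to review.
rec = ProposedRecord("https://example.org/article", site_id=None, confidence=0.95)
print(needs_human_review(rec))  # True
```

Treating an unresolved site as an automatic review case, whatever the confidence score, mirrors the point that misattribution is the error to avoid.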
The categories and what they are for
- Sovereignty register (Locational, Financial, Operational, Productive). The analytical core. It asks what kind of sovereignty a site embodies: onshore but foreign-owned and governed (Locational), public capital with a commercial mandate (Financial), government-operated (Operational), or domestically built and owned (Productive). Read across the map, it shows where real control sits. Applied to compute it reads one way; applied to the minerals supply chain, through ownership of onshore value-add, it reads another.
- Tenant / model served. Who actually runs in each building: which clouds and which AI models. It makes visible whether "sovereign compute" is hosting a domestic or a foreign intelligence.
- Stance (Opposing, Conditional, Supporting, Neutral). The tracker records the whole debate: the case for as well as the case against, not only the opposition.
- Grounds, Actor type, Intensity. The structure of a position: on what basis it is made (water, energy, jobs, foreign ownership, consultation, climate), by whom (resident, community group, council, MP, industry, environmental body, Traditional Owner), and how strongly (a passing mention through to an organised campaign).
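
As a rough illustration of what a fixed controlled vocabulary means in practice, the sketch below checks a proposed record against the category values named in this guide. It is a sketch under assumptions rather than the project's schema: the field names, the `vocabulary_errors` helper and the choice to treat grounds as multi-valued are hypothetical, the lists of grounds and actor types may not be exhaustive, and Intensity is left out because its scale is not enumerated here.

```python
# Category values are taken from this guide; field names are hypothetical.
VOCAB = {
    "sovereignty_register": {"Locational", "Financial", "Operational", "Productive"},
    "stance": {"Opposing", "Conditional", "Supporting", "Neutral"},
    "grounds": {"water", "energy", "jobs", "foreign ownership",
                "consultation", "climate"},
    "actor_type": {"resident", "community group", "council", "MP", "industry",
                   "environmental body", "Traditional Owner"},
}

# Assumption: one source-event can cite several grounds at once.
MULTI_VALUED = {"grounds"}


def vocabulary_errors(record):
    """Return a list of problems; an empty list means every coded field
    present in the record uses a value from the controlled vocabulary."""
    errors = []
    for field, allowed in VOCAB.items():
        if field not in record:
            continue  # fields are optional in this sketch
        value = record[field]
        values = value if field in MULTI_VALUED else [value]
        for v in values:
            if v not in allowed:
                errors.append(f"{field}: {v!r} is not in the controlled vocabulary")
    return errors


ok = {"stance": "Conditional", "grounds": ["water", "energy"], "actor_type": "council"}
bad = {"stance": "Angry", "grounds": ["water", "noise"]}
print(vocabulary_errors(ok))        # []
print(len(vocabulary_errors(bad)))  # 2
```

Rejecting anything outside the vocabulary, rather than accepting a near match, is one way to keep a pipeline from drifting into free-form output.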