
TGIS — Strategy & Direction


The Core Idea

Build a Transport Global Intelligence System (TGIS): an AI-powered service that makes the world's fragmented transport data platforms queryable together, letting anyone ask questions in plain language across sources that were never designed to connect.

The system connects dozens of existing platforms, datasets, and knowledge sources that today require specialist skills to use individually and are almost never used together. It builds on the Transport Data Commons (TDC) as the foundational data catalogue and the SDMX standard as the semantic backbone for statistical data, extending agent-accessibility into the geospatial, financial, and policy domains that these frameworks don't yet cover.

Why This Matters Now

Three forces have converged:

  1. The UN Decade of Sustainable Transport (2026–2035) launched in December 2025 with an Implementation Plan that calls for progress monitoring across six priority areas. Proposals for a global tracking framework exist (from SuM4All, TDC, and SLOCAT among others) but the data infrastructure to underpin the Decade's accountability goals remains largely unbuilt.
  2. AI/LLM capabilities have reached the point where natural language querying of complex, heterogeneous data is technically feasible and affordable. This was not possible three years ago.
  3. The data already exists. Our audit of 25 sources found 14 with open REST/SDMX APIs requiring no authentication, and another 6 with free access. The problem is fragmentation, not availability.

What Makes This Different

Transport data already has strong foundations. TDC is building the shared data commons, a trusted metadata and standards layer above individual data sources, built on SDMX (the international standard for statistical data exchange). The SDMX community (eight major international organisations including the World Bank, IMF, OECD, and Eurostat) issued a joint statement on AI-readiness in September 2025, committing to make official statistics discoverable and queryable by AI agents. The IMF has already built StatGPT 2.0 on SDMX-structured data. TDC has an early-stage AI assistant that helps users discover datasets, reports, and dashboards.

What doesn't yet exist is the connective layer across data types and sources. Each tool queries a single organisation's data or a single data category. Transport information spans at least three fundamentally different worlds:

  1. Structured statistical data (country-level indicators, time series, transport statistics). This is SDMX territory, and TDC is building the standards. AI agents can accelerate SDMX adoption, generate draft data structures, and translate natural language queries into SDMX calls.

  2. Geospatial and infrastructure data: road networks, port locations, climate risk layers, infrastructure condition, satellite-derived indicators. These live in spatial formats (GeoJSON, GeoPackage, ArcGIS, OGC services) that SDMX is not designed to carry. SDMX 3.0 added support for attaching spatial references to statistical observations, but native geospatial data (topological networks, feature-level attributes, raster layers) needs its own access patterns.

  3. Unstructured knowledge: policy reports, NDC commitments, project evaluations, research analyses, transport news, institutional knowledge locked in PDFs and documents. This is the largest and least accessible category. No data standard covers it. AI agents with retrieval-augmented generation (RAG) can search, summarise, and cross-reference this knowledge alongside structured data, connecting a country's stated policy commitments with its actual spending and infrastructure outcomes.
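To make the first of these categories concrete, here is a minimal sketch of the last step of "natural language to SDMX": turning an already-parsed question into an SDMX REST data query. It assumes the common SDMX REST pattern of /data/{dataflow}/{key} with startPeriod and endPeriod parameters. The base URL, the dataflow ID, the dimension order, and the shape of the intent dictionary are all hypothetical, and the language-model step that would produce the intent is not shown.

```python
from urllib.parse import urlencode


def build_sdmx_data_url(base_url, dataflow, key_parts,
                        start_period=None, end_period=None):
    """Build an SDMX REST data query URL.

    key_parts lists one value per dimension, in the order the data
    structure defines them. An empty string leaves that dimension
    unconstrained (wildcard). An empty list selects everything.
    """
    key = ".".join(key_parts) if key_parts else "all"
    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    url = f"{base_url.rstrip('/')}/data/{dataflow}/{key}"
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical structured intent, e.g. extracted from the question
# "annual road transport indicators for Ghana since 2015".
intent = {
    "dataflow": "TRANSPORT_ROAD",        # assumed dataflow ID
    "dimensions": ["A", "GHA", ""],      # assumed order: frequency, area, indicator
    "start": "2015",
    "end": "2023",
}

url = build_sdmx_data_url(
    "https://sdmx.example.org/rest",     # placeholder endpoint
    intent["dataflow"],
    intent["dimensions"],
    intent["start"],
    intent["end"],
)
print(url)
```

Keeping the URL construction deterministic and separate from the model means the agent only has to choose codes from known lists, which is easier to validate than free-form query generation.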

TGIS works across all three. It builds on TDC and SDMX for structured data, provides agent-accessible interfaces for geospatial sources, and brings unstructured knowledge into the same query flow. The hard part is bridging data that was never meant to talk to each other:

  • Policy documents + GIS data: A question like "what road investments are planned in regions with the highest climate vulnerability?" requires combining narrative policy documents with spatial infrastructure data. Today that takes a specialist weeks. An AI agent could synthesise it in minutes.
  • Financial data + infrastructure data: Comparing cost-per-kilometre of tarred road across countries means joining IATI/DAC spending data with road network inventories. Nobody does this routinely.
  • Real-time monitoring + static indicators: Overlaying port disruption alerts (PortWatch) with trade dependency indicators (World Bank) gives early warning capability that neither dataset provides alone.
  • Research evidence + live data: Connecting HVT/RIDE/RECaP study findings with current country-level indicators grounds policy recommendations in both published research and up-to-date statistics.

The principle: show that data which was never designed to be connected can be connected through AI, and that the combination yields answers no single platform can deliver.
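As a rough illustration of the financial + infrastructure case above, the join itself is small once both sources share a key. The sketch below assumes both datasets have already been reduced to totals per country code. The codes and figures are placeholders, not real data, and in practice the hard part is matching spending to the right kilometres rather than the division.

```python
def cost_per_km(spending_by_country, road_km_by_country):
    """Join two sources on country code and return spending per kilometre.

    Countries missing from the road inventory (or with zero length)
    are skipped rather than guessed.
    """
    result = {}
    for code, spend in spending_by_country.items():
        km = road_km_by_country.get(code)
        if km:  # None or 0 means no usable inventory record
            result[code] = spend / km
    return result


# Placeholder values for illustration only (not real figures).
spending_usd = {"AAA": 120_000_000, "BBB": 45_000_000, "CCC": 10_000_000}
road_km = {"AAA": 400, "BBB": 300}

ratios = cost_per_km(spending_usd, road_km)

# Countries with spending data but no inventory match: a data gap to report.
unmatched = sorted(set(spending_usd) - set(ratios))

print(ratios)
print(unmatched)
```

Reporting the unmatched countries alongside the ratios matters as much as the ratios themselves, because it shows users where the sources fail to connect.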

Governance and Ownership

TGIS will be led and developed by RIDE, with support from the Frontier Tech Hub, in partnership with multilateral development banks (ADB, AfDB, EBRD, EIB, CAF, IDB), UN agencies (UNECE, UNDESA), TDC, SLOCAT, and other organisations. The vision is a system that is inclusive from the start, drawing on data and expertise from across the sector.

The relationship with TDC is particularly important. TDC is the initiative building trusted metadata, SDMX-based data standards, and shared guidance for global transport data. It sits above individual data sources, working to make them discoverable and interoperable. TGIS does not duplicate or compete with TDC. It uses agentic AI to accelerate TDC's mission in several ways:

  • Accelerate SDMX adoption: AI agents can help onboard new data providers to TDC's SDMX standards faster by automating format conversion, validating compliance, and generating draft data structure definitions that expert committees can review rather than build from scratch.
  • Extend reach beyond structured data: TDC's strength is structured, SDMX-formatted statistical data. TGIS extends agent-accessibility into geospatial infrastructure data, financial flows, and unstructured policy knowledge that SDMX doesn't cover.
  • Enable cross-source queries: TDC catalogues what exists. TGIS connects those catalogued datasets with data from other platforms (OPSIS, PortWatch, IATI) in a single analytical flow, making TDC's holdings more useful to a wider range of tools and agents.
  • Complement TDC's AI assistant: TDC's early-stage AI assistant helps users discover datasets and reports within TDC. TGIS could serve as the cross-source query engine underneath, the layer that lets TDC's assistant (or any other transport AI tool) draw on multiple sources in a single query.
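The "validating compliance" step in the first point could start from something as simple as checking each incoming row against a draft data structure. The sketch below is heavily simplified: a real SDMX data structure definition has codelists, attributes, and constraints that are not modelled here. The dimension names, the allowed codes, and the dictionary layout of the draft structure are assumptions for illustration, not TDC's actual definitions.

```python
def validate_row(row, dsd):
    """Return a list of problems found in one observation row.

    dsd["dimensions"] maps each required dimension to a set of allowed
    codes, or None when any non-empty value is accepted.
    dsd["measure"] names the column that must hold a number.
    """
    problems = []
    for dim, allowed in dsd["dimensions"].items():
        value = row.get(dim)
        if value is None or value == "":
            problems.append(f"missing dimension: {dim}")
        elif allowed is not None and value not in allowed:
            problems.append(f"invalid code for {dim}: {value}")
    try:
        float(row.get(dsd["measure"], ""))
    except (TypeError, ValueError):
        problems.append(f"non-numeric measure: {dsd['measure']}")
    return problems


# Hypothetical draft structure for review by an expert committee.
draft_dsd = {
    "dimensions": {
        "FREQ": {"A", "Q", "M"},
        "REF_AREA": None,
        "TRANSPORT_MODE": {"ROAD", "RAIL", "AIR", "MARITIME"},
    },
    "measure": "OBS_VALUE",
}

good = {"FREQ": "A", "REF_AREA": "GHA", "TRANSPORT_MODE": "ROAD", "OBS_VALUE": "12.5"}
bad = {"FREQ": "A", "REF_AREA": "", "TRANSPORT_MODE": "BUS", "OBS_VALUE": "n/a"}

print(validate_row(good, draft_dsd))
print(validate_row(bad, draft_dsd))
```

A check like this gives a new data provider a concrete list of fixes before anything reaches human review.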

Our Contribution

Our role is the how. The domain expertise lives with Oxford, UNECE, WRI, UK International Development / FCDO, GIZ, MDBs, country governments and the wider coalition. Our contribution is:

  • AI architecture expertise: how do you make heterogeneous data AI-queryable? How do you combine unstructured text (policy docs) with structured data (indicators) and geospatial data (maps)?
  • Data integration patterns: what does it mean to get data "AI-ready"? What are the practical steps from "25 data sources with different APIs" to "one conversational interface"?
  • Proof of concept: demonstrating the art of the possible with working examples
  • Honest brokerage: reality-checking what is actually feasible

We bring a toolkit and a way of working. The sector brings the knowledge, the data, and the users.

The Key Design Questions

1. Who is this for?

Two user archetypes:

  • Senior policymakers in partner governments: not tech-savvy, they need plain-language answers to complex questions about transport investment, safety, and infrastructure condition
  • Technical staff in road agencies, port authorities, and rail authorities: more data-literate but siloed in their mode, they currently navigate multiple disconnected platforms

A critical point from the Frontier Tech Hub: behaviour change. Giving people a tool does not mean they will use it. If they do not interrogate data today, a new tool alone will not change that. The system must be designed around how people actually make decisions.

2. Global or local?

Both. A transport minister in Ghana cares about Ghana, not global averages. A UNDESA analyst tracking the Decade needs the global view. The system must work at both scales: a "global-local intelligence system" that lets users zoom to their context while drawing on the full breadth of available data.

3. One mode or multi-modal?

Most users will approach through a single mode (roads, maritime, rail, air). But the real value is in the connections between modes: port capacity affecting road freight costs, rail alternatives reducing road damage, climate vulnerability across networked infrastructure. The interface should allow mode-specific entry points while enabling cross-modal insight.

4. What are the first ten data connections?

For any proof of concept, we need to start narrow and deep rather than broad and shallow. The initial selection should:

  • Cover different transport modes (not all roads)
  • Include different data types (GIS + structured + narrative)
  • Use the most accessible, well-documented sources
  • Demonstrate the principle of cross-source synthesis

From the data access audit, the most integration-ready candidates:

Source | Type | Mode Coverage | Access
OPSIS / open-gira | Geospatial infrastructure networks | Multi-modal (road, rail, air, maritime) | 3 REST APIs, no auth
Transport Data Commons (TDC) | Multiple datasets via single API | Multi-modal | CKAN API, emerging
World Bank Open Data | Indicators, LPI, spending | Cross-cutting | REST, no auth, CC-BY
PortWatch (IMF/Oxford) | Maritime trade monitoring | Maritime/ports | ArcGIS REST, no auth
IATI / DAC | Financial flows, aid spending | Cross-cutting | REST, no auth
ADB Asian Transport Outlook | Statistical observatory (450+ indicators, 51 economies) | Multi-modal | Reports, ADB Data API
Africa Transport Observatory | Continental transport data | Multi-modal | Web portal, limited API
SLOCAT NDC Transport Tracker | Climate policy commitments for transport | Cross-cutting | Excel download
AfDB Data Portal | Africa Information Highway, infrastructure + economic indicators | Cross-cutting (Africa) | REST API, open data

With a policy document corpus (RAG) layered on top, this gives us structured data, geospatial data, financial data, climate policy data, and narrative knowledge, spanning Asia-Pacific, Africa, and global sources rather than relying on any single institution.
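One way to keep the selection criteria checkable as the shortlist changes is to hold the audit results in a small registry. The sketch below encodes a subset of the sources listed above. The no_auth_api flags follow the Access column, but the data_type labels (geospatial, structured, narrative) are our own assumed classification, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Data types the shortlist should span, per the selection criteria.
REQUIRED_TYPES = {"geospatial", "structured", "narrative"}


@dataclass(frozen=True)
class Source:
    name: str
    data_type: str     # assumed label, not taken from the audit
    no_auth_api: bool  # True where the audit lists an API with no auth


def coverage_gaps(sources):
    """Return the required data types that no source in the list provides."""
    return REQUIRED_TYPES - {s.data_type for s in sources}


def integration_ready(sources):
    """Return names of sources with an open API needing no authentication."""
    return [s.name for s in sources if s.no_auth_api]


shortlist = [
    Source("OPSIS / open-gira", "geospatial", True),
    Source("World Bank Open Data", "structured", True),
    Source("IATI / DAC", "structured", True),
    Source("SLOCAT NDC Transport Tracker", "structured", False),  # Excel download
]

# Without the policy corpus, narrative knowledge is missing.
gaps_before = coverage_gaps(shortlist)

with_corpus = shortlist + [Source("Policy document corpus (RAG)", "narrative", False)]
gaps_after = coverage_gaps(with_corpus)

print(gaps_before, gaps_after)
print(integration_ready(with_corpus))
```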

Two Ways of Working with Geospatial

A distinction that matters for the tool's design:

  1. Querying maps: the user asks a natural language question; the AI agent queries geospatial data in the background and returns a text answer. ("Which East African road corridors have the highest climate risk?")
  2. Showing maps: the user asks a question and the answer IS a map, with the right layers applied. The AI agent interprets spatial data and presents it visually.

Both are needed. The first is technically more straightforward (text-to-GIS-query). The second requires the system to generate or compose visual outputs, which is harder but often more useful. Sometimes a map with red-and-green corridors says more than a paragraph.
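The two modes can share the same spatial query and differ only in how the result is presented. The sketch below assumes the geospatial source returns GeoJSON and that each feature carries a climate_risk property between 0 and 1. That property name, the corridor names, the coordinates, and the scores are placeholders, not real data.

```python
def select_features(feature_collection, threshold):
    """Return features whose climate_risk exceeds threshold, highest first."""
    hits = [
        f for f in feature_collection["features"]
        if f["properties"].get("climate_risk", 0) > threshold
    ]
    return sorted(hits, key=lambda f: f["properties"]["climate_risk"], reverse=True)


def as_text(features):
    """'Querying maps': turn the selection into a plain-language answer."""
    if not features:
        return "No corridors exceed the threshold."
    listed = ", ".join(
        f"{f['properties']['name']} ({f['properties']['climate_risk']:.2f})"
        for f in features
    )
    return f"Highest-risk corridors: {listed}."


def as_map_layer(features):
    """'Showing maps': return the selection as GeoJSON for a map client."""
    return {"type": "FeatureCollection", "features": features}


def corridor(name, risk):
    """Build a placeholder line feature (coordinates are dummy values)."""
    return {
        "type": "Feature",
        "properties": {"name": name, "climate_risk": risk},
        "geometry": {"type": "LineString", "coordinates": [[0.0, 0.0], [1.0, 1.0]]},
    }


corridors = {
    "type": "FeatureCollection",
    "features": [corridor("Corridor A", 0.82), corridor("Corridor B", 0.35),
                 corridor("Corridor C", 0.91)],
}

hits = select_features(corridors, threshold=0.5)
print(as_text(hits))
print(len(as_map_layer(hits)["features"]))
```

The text path ends at as_text. The map path still needs a renderer and styling on top of as_map_layer, which is where the extra difficulty described above comes from.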

Multilingual Access

TGIS should support simultaneous translation: users query in any language and receive responses in that language. Current LLMs handle this well for major languages. This matters because transport policymakers in partner countries work in French, Spanish, Arabic, Portuguese, and many other languages besides English. Building multilingual support from the start avoids the trap of anglophone-only tools that exclude most of the intended users.

Scope

  • Built on existing platforms, not a replacement for them
  • An intelligence and planning tool, designed for analysis rather than real-time control
  • A proof of concept, building momentum toward a fuller system
  • AI and integration capability from our side; domain expertise from the sector

Timelines

Near-term (now through March 2026)

  • Data source audit and mapping (done)
  • Validate data priorities and fill gaps with domain experts (Oxford, FCDO, UNECE)
  • Build working examples of cross-source data synthesis
  • Aim toward something demonstrable for Transforming Transportation (second week of March 2026)

Medium-term (next financial year, from April 2026)

  • More capacity available for deeper work
  • Move from examples to a coherent proof-of-concept prototype
  • Refine user journeys based on feedback from the February/March events
  • Engage with WRI and other partners on sustained development

Long-term (aligned with the UN Decade, 2026–2035)

  • Position the system as the data backbone of the Decade
  • Expand from 10 to 50+ data sources
  • Build a federated, sustainable platform with proper governance
  • Move from UK International Development seed funding toward institutional sustainability

Open Questions

  • What are the top priority data gaps that partners can help identify?
  • TDC's AI assistant MVP guides users to datasets and reports. How can TGIS serve as the cross-source engine underneath?
  • What does IE Connect's real-time data collection methodology look like, and can it feed in?
  • Which OECD/DAC indicator set is FCDO working to update, and can it inform the taxonomy?
  • What is the right institutional home for something like this long-term?
  • WB Road Safety Calculator: WB have agreed to provide access. What form does that take (API, data tables, embedded tool)?
  • Infrastructure resilience GIS: How do we integrate the GIRI/GMTRA prioritised investment layer into TGIS as a live service rather than academic outputs?
  • TRiP network: How does the UK university network for transport research connect to TGIS's evidence base?

Working document. Captures direction, not decisions.