AI Readiness for Transport Data: Making the World's Transport Data Agent-Accessible
The shift from human-facing portals to agent-facing data services that deliver insights for human decision-making
From Portals to Agent-Accessible Data
Transport data platforms have made real progress. The Transport Data Commons (TDC) is building SDMX-based standards for structured transport data. The World Bank, OECD, and ADB expose APIs. OPSIS serves global infrastructure data through clean REST endpoints. These platforms work: they serve human users, power dashboards, and feed policy analysis.
What's changing is who consumes the data. AI agents are increasingly acting on behalf of humans: a policymaker asks a question in plain language, an agent works out which datasets to query, how to query them, what to combine, and returns a synthesised answer. The human never downloads a CSV. This doesn't replace existing portals (they keep serving their users) but it changes what's needed at the data serving layer: machine-readable discovery, semantic descriptions, and consistent access patterns that agents can use across sources.
The concept: use agentic AI to extend what's possible with transport data, building on TDC and SDMX as the foundation while connecting the geospatial, financial, and unstructured knowledge that these standards don't yet cover.
The goal is infrastructure that makes any chatbot, any AI workflow, any automated analysis pipeline able to tap into global transport data. A chatbot is one possible application. The service layer is the platform. It builds on the Transport Data Commons (TDC) as the shared data catalogue and the SDMX standard as the semantic backbone for statistical data, extending agent-accessibility into the geospatial, financial, and policy domains that these frameworks don't reach.
What AI Readiness Means in Practice
An AI agent trying to answer "What is the climate risk to road corridors in East Africa?" today faces this reality:
- It does not know what datasets exist about road corridors in Africa
- It does not know that OPSIS has a global infrastructure resilience API
- Even if it did, it does not know the API schema, query parameters, or what fields mean
- It does not know that PortWatch has related trade disruption data accessible via ArcGIS REST
- It cannot combine a geospatial bounding-box query with a statistical indicator query without understanding both query paradigms
- It has no way to verify that it is using the right data at the right granularity
AI readiness means solving all six of these problems at the data serving layer, so that any agent, regardless of who built it, can discover, understand, query, and combine transport data.
The Three Layers of AI Readiness
Layer 1: Discovery — "What data exists? What can it tell me?" Metadata that lets agents find relevant datasets for any question, without prior knowledge.

Layer 2: Comprehension — "What does this data look like? What do the fields mean?" Schema descriptions, semantic annotations, and query pattern documentation, written for machines.

Layer 3: Access — "How do I get the data I need?" Standardised, thin query interfaces that agents can call with consistent patterns.
Layer 1: Discovery — Telling Agents What Exists
Today, a human discovers transport data by browsing portals, reading reports, or asking colleagues. An AI agent needs structured, machine-readable metadata that answers:
- What datasets exist, and what topics, geographies, and modes they cover
- What questions each dataset can answer (semantic capability in natural language)
- How fresh the data is and how often it updates
- What format and access method each dataset uses
- Which other datasets it can be combined with, and how
What This Looks Like
A data registry: a lightweight, structured catalogue that agents consult before making any query. Think of it as the table of contents an agent reads before deciding which chapters to open.
```yaml
# Example registry entry
- id: opsis-global-infrastructure
  name: "OPSIS Global Infrastructure Resilience"
  provider: "University of Oxford"
  description: "Global road, rail, air, and maritime infrastructure networks with climate risk exposure analysis"
  capabilities:
    - "Infrastructure network topology for any country"
    - "Climate hazard exposure for transport corridors"
    - "Multi-modal connectivity analysis"
  geographic_coverage: "Global"
  transport_modes: ["road", "rail", "air", "maritime"]
  data_types: ["geospatial", "network", "risk-indicators"]
  update_frequency: "Quarterly"
  access_method: "rest_api"
  api_base: "https://global.infrastructureresilience.org/api/v1"
  combinable_with:
    - id: portwatch
      relationship: "Port disruptions can be overlaid on OPSIS maritime network nodes"
    - id: world-bank-indicators
      relationship: "Country-level transport indicators can contextualise infrastructure data"
  license: "ODbL + CC-BY-4.0"
```
The same principle exists elsewhere: STAC for earth observation, CKAN for open data portals, tool manifests for AI agents. But nothing equivalent exists for global transport data in an agent-friendly form.
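To make the registry idea concrete, here is a minimal sketch of how an agent could filter registry entries before deciding which sources to query. The field names follow the example entry above; the two-entry registry and the matching logic are illustrative, not a proposed implementation.

```python
# Illustrative sketch: an agent filters registry entries by transport mode and
# data type before choosing which sources to query. Field names follow the
# example registry entry above; in practice the registry is loaded from the
# shared catalogue rather than defined inline.

REGISTRY = [
    {
        "id": "opsis-global-infrastructure",
        "transport_modes": ["road", "rail", "air", "maritime"],
        "data_types": ["geospatial", "network", "risk-indicators"],
        "geographic_coverage": "Global",
    },
    {
        "id": "world-bank-indicators",
        "transport_modes": ["road"],
        "data_types": ["tabular"],
        "geographic_coverage": "Global",
    },
]

def find_sources(registry, mode, data_type):
    """Return ids of registry entries covering a transport mode and data type."""
    return [
        entry["id"]
        for entry in registry
        if mode in entry["transport_modes"] and data_type in entry["data_types"]
    ]

print(find_sources(REGISTRY, "maritime", "geospatial"))
# → ['opsis-global-infrastructure']
```

The point is not the filter itself but that the registry is cheap to consult: one structured lookup replaces the agent guessing which portal to browse.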
Skills and MCP Servers
In current AI agent tooling, capabilities are delivered in two complementary forms:
Skills are packaged units of knowledge and behaviour that an agent can be given. A skill bundles together the context an agent needs (what a data source contains, how to query it, what the fields mean, what patterns to follow) so the agent can act competently on a domain it has never seen before. Skills can be as simple as a prompt with structured instructions, or as rich as a multi-step workflow that guides an agent through discovery, querying, and synthesis. Each transport data source becomes a skill an agent can pick up.
MCP servers (Model Context Protocol) are the programmatic counterpart: live services that expose tools an agent can call. An MCP server wraps an API and presents it as a set of typed functions: query this dataset, resolve this geography, fetch these indicators. OpenAI function calling schemas and Anthropic tool definitions serve the same purpose in their respective ecosystems.
These two forms work together. A skill tells the agent what to do and why; an MCP server gives it the tools to do it. A skill might say "to answer questions about port disruptions, use the PortWatch tool with a spatial filter for the region of interest." The MCP server provides the `query_port_disruptions` function the agent calls.
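As an illustration of the MCP side, a sketch of what such a tool definition could look like in the JSON Schema style that MCP servers and function-calling APIs share. The tool name `query_port_disruptions` echoes the example above; its parameters are hypothetical, not a real PortWatch contract.

```python
# Hypothetical tool definition in the JSON Schema style used by MCP servers
# and function-calling APIs. The name and parameters are illustrative; a real
# PortWatch wrapper would define its own contract.
import json

QUERY_PORT_DISRUPTIONS = {
    "name": "query_port_disruptions",
    "description": (
        "Return port disruption events within a bounding box and date range. "
        "Use for questions about trade interruptions at specific ports."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "bbox": {
                "type": "array",
                "items": {"type": "number"},
                "minItems": 4,
                "maxItems": 4,
                "description": "min_lon, min_lat, max_lon, max_lat (WGS84)",
            },
            "start_date": {"type": "string", "format": "date"},
            "end_date": {"type": "string", "format": "date"},
        },
        "required": ["bbox"],
    },
}

print(json.dumps(QUERY_PORT_DISRUPTIONS, indent=2))
```

Note how the description doubles as comprehension metadata: it tells the agent not just the signature but when the tool is the right choice.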
The discovery layer should be expressible in both forms: skills that any agent can learn from, and tool definitions that any major AI framework can consume directly. Some sources need only a skill (the underlying API is simple enough for the agent to call directly with guidance). Others need an MCP server (the underlying API is too complex or requires translation). Many benefit from both.
Layer 2: Comprehension — Progressive Discovery
Discovery tells the agent what exists. Comprehension tells it what the data means. How much work this requires varies enormously by source.
SDMX sources arrive with comprehension built in. Every dimension has a formal definition, every value is constrained to a codelist, and structural metadata travels with the data. An agent querying OECD transport statistics via SDMX already knows what each field means, what values are valid, and how observations relate to each other. LLM hallucinations drop significantly when grounded in this kind of structured semantic context. For SDMX sources, the comprehension layer is largely free. The standard was designed for exactly this kind of machine interpretability.
For non-SDMX sources, comprehension requires more work but can happen incrementally. Most APIs already describe themselves to some degree: OpenAPI specs, JSON Schema, GraphQL introspection, OGC capabilities documents. The information exists, but it is scattered, inconsistent, and not presented in a way an agent can consume progressively.
Let the APIs Speak for Themselves
The comprehension layer works through progressive disclosure:
Level 1 — Tool description. The MCP tool definition or skill already carries a natural-language description of what the source does and typed parameter signatures. This is often enough for an agent to make a first call. A well-written tool description is schema documentation.
Level 2 — Contract exposure. For sources with OpenAPI specs, JSON Schema, or GraphQL schemas, expose the existing contract directly. The agent can inspect parameter types, enum values, and response shapes on demand, no translation needed. For sources without machine-readable contracts, a minimal description in the skill covers the gap.
Level 3 — Semantic hints. These fill in what contracts leave ambiguous: what `country_iso3` means, that OPSIS spatial data uses WGS84, that World Bank and OPSIS can be joined on country code. They live in the skill or tool descriptions, not a parallel metadata system.
Level 4 — Examples. A few concrete query/response pairs often teach an agent more than abstract schema. Skills can embed these directly.
Design constraints
- No bespoke YAML schema for every dataset
- No replicating information that already exists in API contracts
- No single metadata standard required across all sources
- Agents can act before comprehension is complete
An agent that calls an endpoint and inspects the response learns the schema faster than one that reads a 200-line YAML definition. Start with what the source already provides. Fill gaps with lightweight annotations. Let the agent learn through interaction.
Layer 3: Access — Skills and Servers Across a Spectrum
The question for each data source is: what does an agent need to use it? The answer varies enormously, and each source needs different treatment.
The Access Spectrum
The insight from the skills + MCP servers model is that access is a spectrum, and each source lands at a different point on it:
Skill only — the API is already good enough. Sources like the World Bank REST API or OPSIS return clean JSON, have sensible query parameters, and are well-documented. An agent given a skill that explains what the source contains, how to construct queries, and what the response looks like can call these APIs directly. No middleware needed. The skill is the access layer.
Skill + light guidance — the API is usable but quirky. GraphQL endpoints (Transitland), OGC API Features, or APIs with unusual pagination or auth patterns. The agent can call them directly, but the skill needs to teach specific patterns: how to structure a GraphQL query, how to handle cursor-based pagination, how to pass an API key. Still no server. Just a smarter skill.
MCP server — the API needs translation for agents. ArcGIS REST services (deeply nested query model, Esri-specific spatial encoding), SDMX (complex dimension-based query syntax, XML responses), OGC WFS/WMS (XML capabilities negotiation, GML payloads). These need a server that translates between the source's native interface and something an agent can call. The MCP server exposes simple typed tools; the complexity stays behind it. For SDMX specifically, this aligns with the SDMX community's own direction. The sponsors' September 2025 joint statement on AI-readiness explicitly explores MCP as a way for agents to access official statistics, and the IMF's StatGPT 2.0 already demonstrates AI querying of SDMX-structured data. An MCP server for SDMX transport data would contribute to that effort, not build around it.
MCP server + ingestion — there is no API at all. Static downloads on Zenodo, CSV dumps, PDF reports. These need pre-ingestion into a queryable store, with an MCP server in front. This is the most infrastructure-heavy option and should be used sparingly.
What This Means in Practice
| Source | API Type | Access Approach |
|---|---|---|
| World Bank | REST/JSON | Skill only — agent calls API directly |
| OPSIS | REST/JSON | Skill only — agent calls API directly |
| OGC API Features | REST/JSON | Skill with query pattern guidance |
| Transitland | GraphQL | Skill with GraphQL examples |
| PortWatch | ArcGIS REST | MCP server — translates spatial queries |
| OECD/ITF | SDMX | MCP server — translates dimension queries (aligns with SDMX AI-readiness initiative) |
| Overture Maps | S3/Parquet | MCP server — DuckDB queries behind simple tools |
| African Transport DB | Static download | MCP server + ingestion |
The ratio matters: most sources need a skill rather than a server. This keeps the infrastructure footprint small and puts the intelligence in the agent's understanding of the domain.
The Agent Decides
Each skill or MCP server follows the conventions natural to its domain. JSON and GeoJSON are preferred where possible, but an agent equipped with the right skill can handle a GraphQL response or a paginated REST API without everything being normalised into a single pattern.
The skills provide the consistency. An agent with skills for five different transport data sources has a consistent mental model (discovery, comprehension, query) even though the underlying calls are different.
Key Considerations: Regular APIs and Geospatial APIs Working Together
Transport data lives in two fundamentally different worlds (tabular/statistical and geospatial) and an AI agent needs to work across both in a single reasoning flow.
How Regular (Tabular) APIs Work
Tabular APIs (World Bank, OECD/SDMX, IATI) deal in indicators, time series, and categorical data. You filter by dimension (country, year, indicator code) and get back rows of values. Spatial references are abstract: country or region codes (ISO 3166, UN M.49), not coordinates. Responses are typically small (hundreds to thousands of rows). An agent can say "get road fatality rates for East African countries, 2020-2025" and receive a clean table.
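As a sketch of what that tabular query looks like on the wire, the following builds a World Bank v2 indicator URL (semicolon-separated ISO3 codes, a `date` range, `format=json`); `SH.STA.TRAF.P5` is the road traffic mortality indicator. The helper function is illustrative.

```python
# Sketch of a tabular query against the World Bank Indicators API (v2).
# The URL pattern (country list, indicator code, date range, format=json)
# follows the public API; the helper function itself is illustrative.
from urllib.parse import urlencode

def worldbank_url(countries, indicator, start, end):
    """Build a World Bank v2 indicator query URL for a set of ISO3 codes."""
    base = "https://api.worldbank.org/v2/country"
    params = urlencode({"date": f"{start}:{end}", "format": "json", "per_page": 500})
    return f"{base}/{';'.join(countries)}/indicator/{indicator}?{params}"

# "Road fatality rates for East African countries, 2020-2025"
url = worldbank_url(["KEN", "TZA", "UGA", "ETH"], "SH.STA.TRAF.P5", 2020, 2025)
print(url)
```

The response is a small JSON array of rows keyed by country and year: exactly the shape an agent can hold in context and reason over directly.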
How Geospatial APIs Work
Geospatial APIs (OPSIS, PortWatch ArcGIS, OGC API Features, Overture) deal in geometries and features. You query with spatial filters (bounding box, point+radius, polygon) and attribute filters (feature type, hazard class), and get back features with coordinates. Spatial references are literal: WGS84 lat/lon, bounding boxes, full geometries. Responses can be very large. A road network for a country means thousands of LineString features and megabytes of GeoJSON. An agent can say "get all road segments within 50km of Mombasa port that have flood risk > 0.7" and receive a set of map features.
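The Mombasa example can be sketched with a great-circle (haversine) filter over feature attributes. The road segments and their risk scores below are invented; a real query would push the spatial filter to the server rather than filter client-side.

```python
# Sketch of the geospatial query pattern: filter features to those within a
# radius of a point and above a risk threshold. The segment data is invented;
# real APIs apply the spatial filter server-side.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

MOMBASA = (-4.04, 39.67)

features = [  # invented road segments: (id, lat, lon, flood_risk)
    ("seg-001", -3.95, 39.60, 0.82),
    ("seg-002", -1.29, 36.82, 0.91),  # near Nairobi, ~440 km away
    ("seg-003", -4.20, 39.55, 0.35),
]

# "Road segments within 50km of Mombasa port with flood risk > 0.7"
nearby_high_risk = [
    fid for fid, lat, lon, risk in features
    if haversine_km(MOMBASA[0], MOMBASA[1], lat, lon) <= 50 and risk > 0.7
]
print(nearby_high_risk)
# → ['seg-001']
```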
The Bridge: How an Agent Uses Both in One Flow
A real question almost always needs both. Consider: "Which East African countries have the worst road safety outcomes relative to their road infrastructure investment?"
An agent needs to:
- Tabular query: Get road safety indicators (WHO/World Bank) for East African countries
- Tabular query: Get transport infrastructure spending (IATI/DAC) for the same countries
- Geospatial query (optional but enriching): Get road network extent (OPSIS) to normalise by network size
- Synthesis: Combine the results, compute ratios, rank countries
Steps 1 and 2 are pure tabular. Step 3 is geospatial but the agent only needs an aggregate (total road km per country), not the full geometry. Step 4 is reasoning.
The key design insight: the agent needs to be able to move between tabular and geospatial worlds using shared reference frames, primarily geography (country codes, region names, bounding boxes) and time.
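The synthesis step can be sketched as a join on ISO3 codes followed by a ranking. All figures below are invented for illustration; the point is that the country code is the shared reference frame between the two tabular results.

```python
# Sketch of step 4: join results from two tabular queries on ISO3 country
# codes and rank outcomes relative to investment. All figures are invented.

fatalities_per_100k = {"KEN": 27.8, "TZA": 29.2, "UGA": 28.9, "ETH": 26.7}
spend_musd = {"KEN": 410.0, "TZA": 225.0, "UGA": 140.0, "ETH": 380.0}

# Fatalities relative to spending: higher means worse outcomes per dollar.
ratio = {
    iso3: fatalities_per_100k[iso3] / spend_musd[iso3]
    for iso3 in fatalities_per_100k.keys() & spend_musd.keys()
}
ranked = sorted(ratio, key=ratio.get, reverse=True)
print(ranked)
# → ['UGA', 'TZA', 'ETH', 'KEN']
```

The set intersection on keys is deliberate: countries missing from either source drop out of the comparison rather than producing spurious ratios.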
Critical Considerations
1. Geographic Reference Translation
An agent thinks in terms of "East Africa" or "Kenya" or "Mombasa." Different APIs need this expressed differently:
| API Type | How Geography is Expressed |
|---|---|
| World Bank | country=KEN;TZA;UGA;ETH (ISO codes) |
| OECD/SDMX | REF_AREA=KEN+TZA+UGA+ETH |
| OPSIS | bbox=28.8,-11.7,51.4,5.0 (bounding box) |
| PortWatch ArcGIS | geometry={"rings":...}&geometryType=esriGeometryPolygon |
| OGC API Features | bbox=28.8,-11.7,51.4,5.0 or CQL2 spatial filter |
| Overture Maps | Parquet partition by S2 cell or DuckDB spatial filter |
The service layer must provide geographic reference translation. The agent says "East Africa"; the layer knows this means ISO codes KEN, TZA, UGA, ETH, RWA, BDI, SSD, SOM, DJI, ERI for tabular APIs and bounding box [28.8, -11.7, 51.4, 5.0] for spatial APIs. This is a lookup table plus boundary geometries: straightforward to build, essential for agent usability.
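The lookup table can be sketched directly: one region entry yields the parameter encodings each API family expects. The values follow the example in the text; the function is illustrative.

```python
# Sketch of geographic reference translation: a single region entry is
# rendered into the parameter encodings each API family expects. Values
# follow the "East Africa" example in the text.

REGIONS = {
    "East Africa": {
        "iso3": ["KEN", "TZA", "UGA", "ETH", "RWA", "BDI", "SSD", "SOM", "DJI", "ERI"],
        "bbox": [28.8, -11.7, 51.4, 5.0],  # min_lon, min_lat, max_lon, max_lat
    },
}

def resolve_region(name):
    """Translate a region name into API-ready spatial references."""
    entry = REGIONS[name]
    return {
        "worldbank": ";".join(entry["iso3"]),              # country=KEN;TZA;...
        "sdmx": "+".join(entry["iso3"]),                   # REF_AREA=KEN+TZA+...
        "bbox": ",".join(str(v) for v in entry["bbox"]),   # bbox=28.8,-11.7,...
    }

refs = resolve_region("East Africa")
print(refs["bbox"])
# → 28.8,-11.7,51.4,5.0
```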
2. Response Size Management
Tabular APIs return kilobytes. Geospatial APIs can return megabytes or gigabytes. An agent's context window cannot hold the full road network of Kenya.
Several strategies help:
- The thin wrapper can aggregate geospatial results before returning them ("total road km by region" instead of every road segment).
- Return a summary first (counts, totals, statistics) and let the agent request detail only if needed.
- Reduce geometry precision for agent consumption. An agent does not need 15-decimal-point coordinates.
- Never return unbounded result sets. Pagination with feature count limits is a baseline requirement.
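Two of these strategies can be sketched together: summarise-before-returning and coordinate precision reduction. The feature records and their lengths are invented.

```python
# Sketch of two response-size strategies: aggregate features into a summary
# before returning them to the agent, and round coordinates to a precision
# an agent actually needs. Feature data is invented.

def summarise(features):
    """Return counts and totals instead of raw geometry."""
    total_km = sum(f["length_km"] for f in features)
    by_region = {}
    for f in features:
        by_region[f["region"]] = by_region.get(f["region"], 0.0) + f["length_km"]
    return {
        "feature_count": len(features),
        "total_km": round(total_km, 1),
        "km_by_region": by_region,
    }

def thin_coords(coords, places=4):
    """~4 decimal places is ~11 m at the equator: enough for agent use."""
    return [[round(lon, places), round(lat, places)] for lon, lat in coords]

features = [
    {"region": "Coast", "length_km": 120.4},
    {"region": "Coast", "length_km": 88.1},
    {"region": "Nairobi", "length_km": 45.0},
]
print(summarise(features)["total_km"])
# → 253.5
```

A kilobyte summary like this answers most agent questions; the full GeoJSON stays behind the service layer until the agent explicitly asks for detail.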
3. Temporal Alignment
Different sources have different temporal models:
- World Bank indicators: annual snapshots, often 1-2 years lag
- OPSIS: quarterly updates, point-in-time infrastructure state
- PortWatch: near-real-time disruption events with timestamps
- IATI: spending by fiscal year, variable reporting lag
The service layer needs temporal metadata: when was this data last updated, what period does it cover, what is the expected lag. Agents need this to know whether they are combining comparable time periods or mixing 2023 infrastructure data with 2025 spending data.
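A sketch of what that temporal metadata and the agent-side check could look like. The frequencies and latest-period values are illustrative, not actual source characteristics.

```python
# Sketch of temporal metadata the service layer could attach to each source,
# plus a check an agent can run before combining results. Frequencies and
# latest periods are illustrative.

TEMPORAL = {
    "world-bank-indicators": {"frequency": "annual", "latest_period": 2023},
    "opsis": {"frequency": "quarterly", "latest_period": 2025},
    "iati": {"frequency": "annual", "latest_period": 2025},
}

def comparable(source_a, source_b, max_gap_years=1):
    """Flag combinations mixing periods further apart than max_gap_years."""
    gap = abs(TEMPORAL[source_a]["latest_period"] - TEMPORAL[source_b]["latest_period"])
    return gap <= max_gap_years

print(comparable("opsis", "iati"))                    # same vintage: fine
print(comparable("world-bank-indicators", "opsis"))   # 2023 vs 2025: flag it
```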
4. Geospatial Query Patterns That Agents Need
Not all geospatial queries are the same. The service layer should support a small, well-defined set of spatial query patterns:
| Pattern | Description | Example |
|---|---|---|
| Bounding box | Features within a rectangle | "Road network in this map extent" |
| Point + radius | Features within distance of a point | "Ports within 100km of Dar es Salaam" |
| Region lookup | Features within a named administrative boundary | "Infrastructure in Kenya" |
| Corridor query | Features along a route/line | "Climate risk along the Mombasa-Nairobi corridor" |
| Intersection | Features that overlap with another geometry | "Road segments that cross flood zones" |
Of these, region lookup and bounding box cover 80% of agent use cases. Corridor and intersection queries are more advanced but critical for transport-specific analysis.
The agent should not need to construct WKT geometries or encode ArcGIS spatial reference objects. It should be able to say `region=KEN` or `bbox=36.6,-1.5,37.1,-1.1` and get results.
5. Combining Results Across APIs
The trickiest part: an agent queries two sources and needs to join the results. This requires shared identifiers or spatial joining.
Shared identifiers (preferred where available):
- ISO country codes link most tabular datasets
- Port LOCODE links port-related data
- IATA/ICAO codes link aviation data
- No universal identifier exists for roads, rail lines, or transit stops
Spatial joining (when identifiers do not align):
- "Find the World Bank transport spending for the country that contains this OPSIS road segment"
- This requires the agent (or the service layer) to do a point-in-polygon or feature-in-region lookup
- The service layer should provide a spatial reference resolver: given a coordinate or geometry, return the containing country/region/district
Practical approach: The service layer provides a small set of reference resolution tools:
```
/resolve/country?lat=-1.3&lon=36.8 → {"iso3": "KEN", "name": "Kenya", "region": "East Africa"}
/resolve/region?name=East Africa → {"countries": ["KEN","TZA",...], "bbox": [28.8,-11.7,51.4,5.0]}
/resolve/port?name=Mombasa → {"locode": "KEMBA", "lat": -4.04, "lon": 39.67, "country": "KEN"}
```
These become the agent's spatial vocabulary. Simple tools that translate between human geography and machine-queryable spatial references.
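A coordinate-to-country resolver can be sketched with a coarse lookup. Real implementations use boundary polygons; the bounding boxes below are approximate stand-ins (and overlap, so the first match wins), which is why this is an illustration rather than a proposed design.

```python
# Sketch of the /resolve/country tool: coordinate to ISO3 lookup. Real
# implementations do point-in-polygon against boundary geometries; coarse,
# approximate bounding boxes stand in here, so results are indicative only.

COUNTRY_BBOXES = {  # iso3: (min_lon, min_lat, max_lon, max_lat), approximate
    "KEN": (33.9, -4.7, 41.9, 5.5),
    "TZA": (29.3, -11.7, 40.4, -1.0),
}

def resolve_country(lat, lon):
    """Return the ISO3 code of the first bounding box containing the point."""
    for iso3, (min_lon, min_lat, max_lon, max_lat) in COUNTRY_BBOXES.items():
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            return {"iso3": iso3}
    return None

print(resolve_country(-1.3, 36.8))  # Nairobi
# → {'iso3': 'KEN'}
```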
6. Agentic Multi-Pass Orchestration
This becomes concrete when agents chain queries across sources:
```
User: "Which transport corridors in Sub-Saharan Africa are most
vulnerable to climate disruption relative to their trade importance?"

Agent reasoning:
Pass 1 (Discovery): Check registry → need OPSIS (infrastructure + climate risk),
                    PortWatch (trade/port data), World Bank (trade indicators)
Pass 2 (Reference): Resolve "Sub-Saharan Africa" → list of countries + bounding box
Pass 3 (Parallel queries):
  - OPSIS: Get major transport corridors in SSA bbox with climate risk scores
  - PortWatch: Get trade volumes for SSA ports
  - World Bank: Get trade-to-GDP ratios for SSA countries
Pass 4 (Synthesis): Join corridor risk data with trade importance,
                    rank corridors by vulnerability * trade impact,
                    return top 10 with narrative explanation
Pass 5 (Optional): Generate a map showing the top corridors
                   colour-coded by combined risk-trade score
```
The agent handles the orchestration. The service layer's job is to make each individual step trivial to discover, understand, and call.
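The multi-pass flow can be sketched with stubbed tools: each pass is a function call, and the synthesis is a join-and-rank. Every name, score, and bounding box below is invented purely to show the shape of the orchestration.

```python
# Sketch of the multi-pass flow with stubbed tools. Corridor names, risk
# scores, trade weights, and the bounding box are all invented.

def opsis_corridors(bbox):   # pass 3a: corridors with climate risk scores
    return {"Mombasa-Nairobi": 0.8, "Dar-Morogoro": 0.6, "Lagos-Ibadan": 0.9}

def portwatch_trade(bbox):   # pass 3b: trade importance per corridor
    return {"Mombasa-Nairobi": 0.9, "Dar-Morogoro": 0.7, "Lagos-Ibadan": 0.5}

def synthesise(risk, trade, top=2):  # pass 4: vulnerability * trade impact
    scores = {c: risk[c] * trade[c] for c in risk.keys() & trade.keys()}
    return sorted(scores, key=scores.get, reverse=True)[:top]

bbox = [-17.5, -35.0, 51.4, 27.3]  # pass 2: "Sub-Saharan Africa" (illustrative)
ranking = synthesise(opsis_corridors(bbox), portwatch_trade(bbox))
print(ranking)
# → ['Mombasa-Nairobi', 'Lagos-Ibadan']
```

Each stub would, in the real system, be a skill-guided API call or an MCP tool; the orchestration logic itself stays in the agent, exactly as the text describes.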
Unstructured Knowledge: The Third Data World
The sections above cover structured data (tabular APIs, SDMX) and geospatial data (spatial APIs, feature services). But transport information has a third component, probably larger than both: unstructured knowledge.
Policy reports, NDC commitments, project evaluations, corridor studies, master plans, research papers, conference proceedings, HVT and RECaP study outputs, World Bank project completion reports, news articles about transport developments. This is where institutional knowledge lives. It's what gives the numbers context. A road safety statistic for Kenya is more useful when an agent can also surface the HVT study that evaluated the interventions tried there, or the NDC commitment that pledged a shift to non-motorised transport, or the project evaluation that found maintenance budgets were diverted.
No data standard covers this. SDMX is for structured statistical data. CKAN catalogues datasets. Neither indexes the content of a 200-page transport master plan or a research paper on rural access.
How agents handle unstructured knowledge
Retrieval-augmented generation (RAG) is the established pattern. Documents are indexed (chunked, embedded, stored in a vector database) and when an agent receives a question, it retrieves the most relevant passages before generating a response. The agent's answer is grounded in actual source material rather than whatever it absorbed during training.
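The retrieval step can be sketched with a toy bag-of-words similarity standing in for a real embedding model and vector database. The document ids and snippets are invented; the structure (index chunks, score against the query, return the best with citations) is the RAG pattern described above.

```python
# Sketch of RAG retrieval with a toy bag-of-words cosine similarity standing
# in for real embeddings and a vector database. Document ids and snippets
# are invented.
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity over whitespace-tokenised, lowercased text."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [  # (document id for citation, chunk text)
    ("hvt-study-12", "road safety interventions evaluated in kenya corridors"),
    ("ndc-ken-2020", "kenya ndc pledges modal shift to non-motorised transport"),
    ("wb-pcr-881", "port dredging project completion report for tanzania"),
]

def retrieve(query, k=1):
    """Return the k chunk ids most similar to the query."""
    scored = sorted(chunks, key=lambda c: cosine(query, c[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

print(retrieve("what road safety interventions work in Kenya"))
# → ['hvt-study-12']
```

The returned ids are what make the citation requirement below workable: every retrieved passage carries its document id back into the agent's answer.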
For TGIS, the practical architecture:
- Index key document collections: HVT/RIDE/RECaP research outputs, SLOCAT policy tracker documents, World Bank transport project documents, NDC submissions with transport components, ITF policy papers.
- Cross-reference with structured data. An agent answering "What's working for road safety in East Africa?" should pull both the WHO/World Bank fatality statistics (structured, via SDMX or REST API) and the relevant HVT research evaluations (unstructured, via RAG). The structured data says what happened. The unstructured knowledge says why, and what was tried.
- Every claim from an unstructured source should carry a citation back to the original document. The agent always points to the evidence. That's where trust comes from.
This is the part of the transport information space that's most naturally suited to AI and least served by existing standards work. TDC catalogues datasets. SDMX structures statistics. Nobody makes the sector's accumulated knowledge searchable and synthesisable alongside live data. That's a genuine gap.
Serving the UN Decade of Sustainable Transport
The UN Decade of Sustainable Transport (2026–2035) launched in December 2025 with monitoring goals but no data infrastructure to track progress across its six priority areas. Decade monitoring is exactly the kind of cross-source problem TGIS is designed for.
Tracking whether a country is making progress on sustainable transport requires combining data that currently lives in completely separate systems:
- Policy commitments: what did the country's NDC pledge for transport? (SLOCAT NDC Tracker)
- Actual investment: how much aid and domestic spending is going to transport? (IATI/DAC, national budgets)
- Infrastructure outcomes: is the road network expanding, and where? What's the climate exposure? (OPSIS, World Bank)
- Statistical indicators like road safety rates, modal split, emissions intensity (ITF/OECD via SDMX, World Bank)
- Research evidence from HVT/RIDE/RECaP studies on what works in this context (unstructured knowledge)
No single platform holds all of this. An agent with access to all five can answer "Is Kenya on track for its transport NDC commitments, and what does the evidence say about the interventions it's chosen?", a question that currently takes a research team weeks to assemble.
Scope
TGIS uses agentic AI across multiple opportunities in transport data:
- Accelerate standards adoption: help build SDMX data structures for transport, automate onboarding to TDC standards, contribute to the SDMX community's AI-readiness initiative
- Bridge the geospatial gap: provide agent-accessible interfaces for spatial infrastructure data, climate risk layers, and trade disruption feeds that SDMX doesn't cover
- Make unstructured knowledge searchable: make policy reports, research findings, NDC commitments, and institutional knowledge searchable and synthesisable alongside structured data
- Enable cross-source orchestration: connect tabular statistics, geospatial networks, financial flows, and policy documents in single analytical flows that no existing tool provides
A chatbot is one possible consumer. The service layer serves any AI workflow.
Any organisation building AI tools for transport can plug in. The service layer makes the data investment of 25+ existing platforms more valuable by making them composable. It accelerates TDC and SDMX rather than building around them. And it starts with 10 sources to prove the approach works before expanding.
What TGIS is not. TGIS does not create new data standards; it uses SDMX and TDC's existing ones. It does not centralise data; it queries sources where they live. It does not replace any existing platform or tool. It does not compete with TDC. It builds on TDC as the foundational data commons and uses agentic AI to extend TDC's reach into domains that structured data standards don't cover.
Governance
This work is led by the RIDE programme (Research on Infrastructure in Developing Economies), funded by UK International Development / FCDO, with technical delivery supported by the Frontier Tech Hub. RIDE provides the strategic direction, partner relationships, and domain expertise; Frontier Tech Hub provides the AI and data architecture capability. The aim is a system built in partnership with MDBs (ADB, AfDB, World Bank), UN agencies (UNECE, UNDESA), TDC, SLOCAT, and other organisations. No single institution owns it.
From Concept to Deliverable
Minimum Viable Service Layer
Start with 10 sources that represent the range of API types, data types, and institutions (deliberately not WB-centric):
| Source | API Type | Data Type | Wrapper Needed? |
|---|---|---|---|
| World Bank | REST/JSON | Tabular indicators | Minimal — add agent metadata |
| OPSIS | REST/JSON | Geospatial infrastructure | Minimal — add agent metadata |
| PortWatch | ArcGIS REST | Geospatial + trade events | Yes — translate ArcGIS patterns to simple REST |
| OECD/ITF | SDMX | Transport statistics | MCP server — aligns with SDMX sponsors' AI-readiness initiative |
| IATI/DAC | REST/XML | Financial flows | Moderate — simplify query model, return JSON |
| ADB Asian Transport Outlook | REST/JSON | Statistical observatory (450+ indicators, 51 economies) | Moderate — ADB Data API wrapper |
| Africa Transport Observatory | Web portal | Continental transport data | Yes — scrape/ingest structured data |
| TDC | CKAN API | Multi-modal datasets (SDMX-structured) | Minimal — CKAN is agent-friendly; SDMX metadata makes data semantically self-describing |
| AfDB Data Portal | REST/JSON | African infrastructure + economic indicators | Moderate — Africa Information Highway API |
| SLOCAT NDC Transport Tracker | Excel/CSV | Climate policy commitments for transport | Yes — ingest + structured query layer |
Relationship with TDC and SDMX
TDC is the initiative building trusted metadata, SDMX-based data standards, and shared guidance for global transport data. It is a layer above individual data sources. TGIS builds on that foundation across several opportunities:
- Accelerate SDMX standards development: Transport has no formal global SDMX domain package. Building one through traditional international committee processes takes years. AI agents can compress the preparatory work (analysing existing national data files, comparing structures across TDC's datasets, generating draft Data Structure Definitions) so expert committees review rather than build from scratch. This contributes directly to TDC's standards mission.
- Extend into non-SDMX territory: SDMX is the right standard for statistical data. But transport decisions also require geospatial infrastructure data (road networks, climate risk layers, port locations), financial flows (aid spending, investment), and unstructured knowledge (policy reports, research findings, NDC commitments). These live outside SDMX's scope. TGIS provides agent-accessible interfaces for these sources and can combine their outputs with SDMX statistical data in a single analytical flow.
- Complement TDC's AI assistant: TDC's early-stage AI assistant helps users discover datasets, reports, and dashboards within TDC. TGIS could serve as the cross-source query engine underneath, enabling TDC's assistant (or any transport AI tool) to draw on multiple sources in a single query.
- Align with the SDMX AI-readiness initiative: The eight SDMX sponsor organisations issued a joint statement in September 2025 committing to AI-readiness, including exploration of MCP (Model Context Protocol) for agent access to official statistics. The IMF's StatGPT 2.0 already demonstrates this pattern. TGIS MCP servers for SDMX sources would contribute to this effort rather than building a parallel one.
Plus:
- A data registry with entries for all 25+ audited sources (even if only 10 have active wrappers)
- A geographic reference resolver for region/country/coordinate translation
- Agent tool definitions in MCP-compatible format
- Multilingual query support: users should be able to ask questions in any language
What This Demonstrates
With these 10 sources wired up, an agent can answer questions that span infrastructure condition and climate risk, country-level transport indicators, trade and port dynamics, OECD and non-OECD transport statistics, aid and investment flows, African continental transport corridors, and climate policy commitments mapped to transport actions across 190+ countries.
That is enough to show the principle: data that was never designed to be connected becomes connectable through an AI service layer, drawing on a source mix broad enough that no single institution appears to dominate.