UK Explr (alt. uk-explr) is a scalable ETL pipeline for UK census data that transforms bulk "raw" data (CSV and JSON files) into a unified statistical lookup table with multi-resolution querying capabilities.
To get started, see the Usage file.
A deployment can be accessed at https://uk-explr.up.railway.app (test query: https://uk-explr.up.railway.app/v1/query-result?mode=oa&pageSize=50)
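As a rough illustration of calling the deployment from client code, the sketch below rebuilds the test query URL shown above. The route and the `mode` and `pageSize` parameters come from that test query; the names `buildQueryUrl` and `fetchQueryResult` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed here.

```typescript
// Base URL and route are taken from the test query above.
const BASE_URL = "https://uk-explr.up.railway.app";

interface QueryOptions {
  mode: string;     // "oa" is the only value confirmed by the test query
  pageSize: number; // number of rows requested per page
}

// Builds the query URL with properly encoded parameters.
function buildQueryUrl(options: QueryOptions): string {
  const url = new URL("/v1/query-result", BASE_URL);
  url.searchParams.set("mode", options.mode);
  url.searchParams.set("pageSize", String(options.pageSize));
  return url.toString();
}

// Hypothetical helper: fetches one page and returns the parsed body.
// The response shape is not documented here, so it stays `unknown`,
// and a JSON body is an assumption.
async function fetchQueryResult(options: QueryOptions): Promise<unknown> {
  const response = await fetch(buildQueryUrl(options));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

Calling `buildQueryUrl({ mode: "oa", pageSize: 50 })` reproduces the test query URL exactly.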
Look up statistics by:
- ✅ Output Area (OA)
- ✅ Lower/Middle Layer Super Output Areas (LSOA/MSOA)
- ✅ Local Authority Districts (LAD)
- ✅ Postal Codes
🚀 Production-ready REST API for web applications
⚡ MCP (Model Context Protocol) server for system integration — TO DO
- Efficient handling of large datasets (e.g. UK census data)
- Normalised output structure for consistent analysis
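To make "normalised output structure" concrete, here is a minimal sketch of how CSV and JSON inputs could be reduced to one row shape. It is not the project's actual ETL code: the `NormalisedRow` shape, the function names, and the assumption that the first CSV column holds the area code are all hypothetical, and the CSV parsing ignores quoted fields for brevity.

```typescript
// Hypothetical unified shape: one row per (area, metric) pair.
interface NormalisedRow {
  areaCode: string;
  metric: string;
  value: number;
}

// Assumes a simple CSV (no quoted fields) whose first column is the area
// code and whose remaining columns are numeric metrics.
function normaliseCsv(csv: string): NormalisedRow[] {
  const lines = csv.trim().split(/\r?\n/);
  const headers = (lines[0] ?? "").split(",").map((h) => h.trim());
  const rows: NormalisedRow[] = [];
  for (const line of lines.slice(1)) {
    const cells = line.split(",").map((c) => c.trim());
    const areaCode = cells[0] ?? "";
    if (areaCode === "") continue;
    for (let i = 1; i < headers.length; i++) {
      const cell = cells[i];
      const metric = headers[i];
      // Skip missing or empty cells instead of recording them as 0.
      if (cell === undefined || cell === "" || metric === undefined) continue;
      const value = Number(cell);
      if (Number.isFinite(value)) rows.push({ areaCode, metric, value });
    }
  }
  return rows;
}

// Assumes each JSON record is a flat object; `areaKey` names the field
// holding the area code, and every other numeric field is a metric.
function normaliseJson(
  records: Array<Record<string, string | number>>,
  areaKey: string
): NormalisedRow[] {
  const rows: NormalisedRow[] = [];
  for (const record of records) {
    const areaCode = String(record[areaKey] ?? "");
    if (areaCode === "") continue;
    for (const [metric, raw] of Object.entries(record)) {
      if (metric === areaKey || raw === "") continue;
      const value = Number(raw);
      if (Number.isFinite(value)) rows.push({ areaCode, metric, value });
    }
  }
  return rows;
}
```

Because both functions return the same row shape, their results can be concatenated into a single lookup table regardless of the source format.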
Use cases:
- Real estate investment
- Policy analysis & demographic research
- Location-based service development
- Academic studies requiring unified datasets
Built with reliability and scalability in mind, this pipeline serves as a robust foundation for applications requiring granular UK geospatial statistics.
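One way to picture multi-resolution querying is as a roll-up over a lookup table in which every row carries the codes of its enclosing areas. The sketch below is illustrative only: the row shape, the `population` statistic, and the placeholder area codes are hypothetical and do not reflect the real table or real census values.

```typescript
type Resolution = "oa" | "lsoa" | "msoa" | "lad";

// Hypothetical lookup row: one Output Area with its parent area codes.
interface LookupRow {
  oa: string;
  lsoa: string;
  msoa: string;
  lad: string;
  population: number; // placeholder statistic
}

// Sums the statistic over all rows that share a code at the chosen resolution.
function aggregateBy(rows: LookupRow[], resolution: Resolution): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const key = row[resolution];
    totals.set(key, (totals.get(key) ?? 0) + row.population);
  }
  return totals;
}

// Made-up sample data with placeholder codes.
const sampleRows: LookupRow[] = [
  { oa: "OA-1", lsoa: "LSOA-A", msoa: "MSOA-X", lad: "LAD-1", population: 10 },
  { oa: "OA-2", lsoa: "LSOA-A", msoa: "MSOA-X", lad: "LAD-1", population: 20 },
  { oa: "OA-3", lsoa: "LSOA-B", msoa: "MSOA-X", lad: "LAD-1", population: 5 },
];

const byLsoa = aggregateBy(sampleRows, "lsoa"); // LSOA-A → 30, LSOA-B → 5
const byLad = aggregateBy(sampleRows, "lad");   // LAD-1 → 35
```

The same table answers queries at any level, which is why a single unified lookup table can serve OA, LSOA, MSOA, and LAD requests.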
data/
├─ raw/ # Source data files before processing (CSV, JSON, etc.)
etl-pipeline/ # Scripts for schema generation and data transformation
libs/ # Shared utilities and helper functions (project-wide)
web-api/ # REST API server implementation
├─ controllers/ # Business logic for handling requests/responses
├─ libs/ # API-specific utilities and helpers
├─ middleware/ # Express/HTTP middleware functions
├─ models/ # Data validation schemas (request/response shapes)
├─ services/ # Business logic and external service integrations
├─ types/ # TypeScript interfaces and type definitions
├─ index.ts # web server entry point
node_modules/ # Installed npm dependencies
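As an illustration of the kind of request validation that could live under models/ and types/, the sketch below validates the two query parameters seen in the test query. It is framework-free and hypothetical: only the mode value "oa" is confirmed by this README, and the other mode values, the default of 50, and the upper limit of 500 are assumptions chosen for the example.

```typescript
// Only "oa" is confirmed; the remaining values are assumed from the
// list of supported lookups.
type QueryMode = "oa" | "lsoa" | "msoa" | "lad" | "postcode";

interface QueryRequest {
  mode: QueryMode;
  pageSize: number;
}

type ParseResult =
  | { ok: true; value: QueryRequest }
  | { ok: false; error: string };

const MODES: readonly QueryMode[] = ["oa", "lsoa", "msoa", "lad", "postcode"];
const DEFAULT_PAGE_SIZE = 50; // assumed default
const MAX_PAGE_SIZE = 500;    // assumed limit

// Validates raw query-string values and returns a typed request or an error.
function parseQueryRequest(raw: Record<string, string | undefined>): ParseResult {
  const mode = raw.mode;
  if (mode === undefined || !MODES.includes(mode as QueryMode)) {
    return { ok: false, error: `mode must be one of: ${MODES.join(", ")}` };
  }
  const pageSize = raw.pageSize === undefined ? DEFAULT_PAGE_SIZE : Number(raw.pageSize);
  if (!Number.isInteger(pageSize) || pageSize < 1 || pageSize > MAX_PAGE_SIZE) {
    return { ok: false, error: `pageSize must be an integer from 1 to ${MAX_PAGE_SIZE}` };
  }
  return { ok: true, value: { mode: mode as QueryMode, pageSize } };
}
```

Returning a tagged result rather than throwing lets a controller decide how to turn a validation failure into an HTTP response.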
Major features in the current roadmap:
Key:
🔴 — scheduled: very important
🟠 — scheduled: important
🟡 — scheduled
🔘 — backlogged
- Implement MCP (Model Context Protocol) Server for AI integration. 🔴
- Create a schema-derived .../help route (e.g. GET /v1/query-result/help) as documentation. 🟠
- Apply HATEOAS (Hypermedia as the Engine of Application State) best practices in the RESTful API. 🟠
- Ingest street names to associate with postal codes. 🟡
- Ingest point geocoordinates for streets. 🔘
- Ingest boundary geocoordinates for Output Area (OA), Lower-layer Super Output Area (LSOA), Middle-layer Super Output Area (MSOA), and Local Authority District (LAD). 🔘
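For the HATEOAS item above, one common approach is to attach navigation links to each paginated response so that clients follow links instead of constructing URLs themselves. The sketch below shows that idea only; it is not a planned design. The `page` parameter, the link names, and the `buildPageLinks` function are hypothetical, while `mode` and `pageSize` come from the existing test query.

```typescript
interface Link {
  href: string;
}

// Hypothetical link set; "self" is always present, "prev" and "next"
// appear only when such a page exists.
interface PageLinks {
  self: Link;
  prev?: Link;
  next?: Link;
}

function buildPageLinks(
  basePath: string,
  mode: string,
  page: number,       // 1-based page index (assumed parameter)
  pageSize: number,
  totalItems: number
): PageLinks {
  const totalPages = Math.max(1, Math.ceil(totalItems / pageSize));
  const hrefFor = (p: number): string => {
    const params = new URLSearchParams({
      mode,
      page: String(p),
      pageSize: String(pageSize),
    });
    return `${basePath}?${params.toString()}`;
  };
  const links: PageLinks = { self: { href: hrefFor(page) } };
  if (page > 1) links.prev = { href: hrefFor(page - 1) };
  if (page < totalPages) links.next = { href: hrefFor(page + 1) };
  return links;
}
```

For example, `buildPageLinks("/v1/query-result", "oa", 1, 50, 120)` yields a `self` link for page 1, a `next` link for page 2, and no `prev` link.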
Contributions are welcome!
Feel free to discuss improvements by opening an issue, or to make them directly by submitting a pull request (PR).
MIT; please see the LICENSE file for more information.