Case study · Spring 2026
A production-pattern data lakehouse pipeline that synthesises scattered public property data into a unified analytical layer. I then added an AI-powered natural language interface: built to quantify something that no existing tool measures: the impact of the Building Safety Crisis on London flat prices.
Stack
Kafka · Spark · Delta Lake · DuckDB · MCP · Claude
Data
39,000+ transactions, 2015–2025
Areas
N16, E5, E8, E9, N5
GitHub
andrewfieldwmg19%
of purpose-built blocks frozen since BSA
0%
of conversion and older apartment blocks frozen
178
named blocks with zero sales since April 2022
30%
of sampled high rise apartments sold at a loss
01
Property data is everywhere. Natural language AI analytics on it are not.
The UK Building Safety Crisis, triggered by Grenfell in 2017 and codified into law by the Building Safety Act 2022, has fundamentally restructured the London flat market. Purpose-built blocks with external cladding concerns require EWS1 certification before lenders will issue mortgages. Thousands of leaseholders are trapped in properties they cannot sell.
No existing data product measures this. Rightmove shows you listings. Land Registry shows you completed transactions. Neither classifies buildings by cladding risk, identifies frozen blocks, or quantifies the price divergence between affected and unaffected property types.
This platform builds that analytical layer from scratch: ingesting listing data, enriching with sold prices, classifying buildings by type, and exposing the results through a natural language (NLP) interface powered by Claude.
02
72 lives lost. National review of external wall safety triggered.
External wall fire review required for mortgage lending on affected buildings.
Lenders begin refusing mortgages on uncertified buildings of any height.
Market temporarily frozen. Stamp duty holiday follows in July 2020.
Statutory framework. Responsible parties identified for remediation costs.
Many blocks remain frozen. Market bifurcation deepening.
03
Seven stages. Three repositories. Zero managed services.
Ingress
node.js + Puppeteer + Express + kafkajs
Kafka
AJV validation → dead letter queue
Bronze
Spark Structured Streaming → Delta / S3
Silver
PySpark classify, clean, dedup
Gold
8 analytical Delta tables
DuckDB
read_parquet → FastMCP
Claude
9 MCP tools · natural language
rm-report-kafka
Apache Kafka in Kraft mode. Single-node, no Zookeeper. Kafdrop for topic inspection.
rm-report-ingress
Puppeteer, Express API, kafkajs producer with AJV JSON Schema validation at publish time.
rm-report-mcp
FastMCP server, PySpark Bronze/Silver/Gold pipeline, DuckDB query layer over Delta tables on S3.
Bronze streaming (PySpark)
MCP tool (Python)
04 · Finding 1
Building Safety Act created a two-tier market. 19% of purpose-built blocks have effectively stopped transacting. Every conversion block in the same postcodes continues to sell normally.
19%
of purpose-built blocks frozen
178 of 916 blocks · zero sales since April 2022
Highrise
18% frozen
92 of 505
Midrise
21% frozen
86 of 411
0%
of conversion blocks frozen
Victorian / Edwardian house splits - unaffected
Of the 178 frozen blocks
68%
EWS1-triggered
Froze 2019–2021 and never recovered. These blocks were already illiquid before the BSA.
32%
BSA-triggered
Survived EWS1 but froze after April 2022. The Building Safety Act caught them.
Named frozen blocks (N16 + E5)
pre-EWS1 active · zero post-BSA sales
| Block | Class | Pre-EWS1 | EWS1 period | Post-BSA | Avg price |
|---|---|---|---|---|---|
| Bellevalia CourtE5 9PD | highrise | 18 | 1 | 0 | £395k |
| 78 Cedarwood CourtE5 9FU | highrise | 16 | 4 | 0 | £429k |
| 43 Pembury EstateE5 8AY | midrise | 10 | 1 | 0 | £522k |
| Indigo MewsN16 9AE | highrise | 9 | 1 | 0 | £563k |
| St Andrews MewsN16 5HR | highrise | 9 | 2 | 0 | £249k |
| Moore HouseE5 0DD | midrise | 7 | 0 | 0 | £580k |
| 41 Smithies CourtN16 8AY | highrise | 7 | 0 | 0 | £644k |
| 20 Dehavilland StudiosE5 9NY | midrise | 4 | 2 | 0 | £824k |
| 3 Red SquareN16 9AW | midrise | 3 | 4 | 0 | £678k |
| Nonet HouseE5 0NP | midrise | 6 | 3 | 0 | £506k |
05 · Finding 2
Building Safety Act created a two-tier market — purpose-built vs conversion. Within the flat market itself, conversions have dramatically outperformed purpose-built types since 2015.
Indexed price growth · N16 flats · 2015 = 100
Conversions
+36%
since 2015
index 100 → 136
Mid-Rise
+13%
since 2015
index 100 → 113
High-Rise
+9%
since 2015
index 100 → 109
Insight: Conversions have outperformed all purpose-built flat types by a widening margin since 2015 (+36% vs +9–13%). The COVID stamp duty holiday (2020–21) created a temporary volume spike across all classes, but conversions recovered their premium while purpose-built types plateaued. The gap between conversions and highrise flats now stands at 27 index points — the widest since records began.
06 · Finding 3
Freehold and leasehold properties tracked together until 2020: both barely moved from their 2017 baseline. The divergence begins in 2021 and accelerates sharply from 2022, as EWS1 enforcement and the Building Safety Act fed through into completed transactions. Freehold has since gained nearly three times as much value as leasehold newer-build flats. The absolute price gap has more than doubled: from £102K to £229K.
Indexed price growth · 2017 = 100
Freehold
+28.7%
since 2017
E11+E17 terraced · 3,900+ transactions
Leasehold
+8.2%
since 2017 · now declining
N5+N16 newer flats · lowrise/mid/highrise
Gap widened
£102k → £229k
Absolute price gap, 2017 vs 2025
Insight: Tenure compounds the BSA impact. Leaseholders cannot initiate remediation independently: they are dependent on freeholders and management companies to engage with the certification process. The leasehold line peaked in 2024 at 112.7 and is now declining: newer-build flats are losing value in absolute terms, not just underperforming relative to houses.
07
Nine MCP tools expose the pipeline to Claude. Ask questions in English.
Based on the quarterly price cube and building class gap tables, mid-rise and high-rise flats in N16 have significantly underperformed conversions since 2017:
The cladding freeze detector identifies 178 mid-rise and high-rise blocks in N16+E5 with zero transactions since the Building Safety Act (April 2022). Top frozen blocks include Bellevalia Court (E5 9PD, 18 pre-EWS1 sales) and Indigo Mews (N16 9AE, 9 pre-EWS1 sales).
Of 3,347 high-rise flat sales tracked, 30.5% sold below asking price, vs 26.7% for conversions. The absolute price gap freehold-to-leasehold has widened from £102k to £229k.
quarterly_cube
postcode_analytics
street_analytics
building_class_gap
cladding_freeze
listing_efficiency
seasonal_patterns
beds_analytics
sold_properties
08
Architecture is about trade-offs, not best practices.
Kafka over direct-to-S3
Chosen
Apache Kafka (Kraft mode)
Alternative
Direct S3 write / Kinesis
Governance, replay capability, and fan-out to multiple consumers. Kraft mode removes Zookeeper dependency and is forward-compatible with Kafka 4.0+.
AJV over Schema Registry
Chosen
AJV at producer
Alternative
Confluent Schema Registry
Producer-side validation without a Confluent dependency. Schema violations route to a dead letter queue rather than crashing the pipeline.
DuckDB over Spark for serving
Chosen
DuckDB + read_parquet
Alternative
Spark ThriftServer / Trino
Sub-100ms queries on 39K rows, zero infrastructure, perfect MCP backend. read_parquet glob pattern used instead of delta_scan due to Delta Kernel credential limitations outside EC2.
FastMCP over REST API
Chosen
Python FastMCP (stdio)
Alternative
Express REST API
Native Claude Desktop integration via MCP protocol. Nine typed tools with docstrings that Claude uses to select the right analytical approach for each question.
Always interested in discussing data, AI, and enterprise architecture: get in touch.