AF
← Back to work

Case study · Spring 2026

London Property Data & AI Platform

A production-pattern data lakehouse pipeline that synthesises scattered public property data into a unified analytical layer. I then added an AI-powered natural language interface: built to quantify something that no existing tool measures: the impact of the Building Safety Crisis on London flat prices.

Stack

Kafka · Spark · Delta Lake · DuckDB · MCP · Claude

Data

39,000+ transactions, 2015–2025

Areas

N16, E5, E8, E9, N5

19%

of purpose-built blocks frozen since BSA

0%

of conversion and older apartment blocks frozen

178

named blocks with zero sales since April 2022

30%

of sampled high rise apartments sold at a loss

01

The Problem

Property data is everywhere. Natural language AI analytics on it are not.

The UK Building Safety Crisis, triggered by Grenfell in 2017 and codified into law by the Building Safety Act 2022, has fundamentally restructured the London flat market. Purpose-built blocks with external cladding concerns require EWS1 certification before lenders will issue mortgages. Thousands of leaseholders are trapped in properties they cannot sell.

No existing data product measures this. Rightmove shows you listings. Land Registry shows you completed transactions. Neither classifies buildings by cladding risk, identifies frozen blocks, or quantifies the price divergence between affected and unaffected property types.

This platform builds that analytical layer from scratch: ingesting listing data, enriching with sold prices, classifying buildings by type, and exposing the results through a natural language (NLP) interface powered by Claude.


02

Regulatory Context

2017JunGrenfell Tower fire

72 lives lost. National review of external wall safety triggered.

2019AprEWS1 form introduced

External wall fire review required for mortgage lending on affected buildings.

2020JanEWS1 extended to sub-18m buildings

Lenders begin refusing mortgages on uncertified buildings of any height.

2020MarCOVID-19 lockdown

Market temporarily frozen. Stamp duty holiday follows in July 2020.

2022AprBuilding Safety Act

Statutory framework. Responsible parties identified for remediation costs.

2024+Remediation ongoing

Many blocks remain frozen. Market bifurcation deepening.


03

The Pipeline

Seven stages. Three repositories. Zero managed services.

Ingress

node.js + Puppeteer + Express + kafkajs

Kafka

AJV validation → dead letter queue

Bronze

Spark Structured Streaming → Delta / S3

Silver

PySpark classify, clean, dedup

Gold

8 analytical Delta tables

DuckDB

read_parquet → FastMCP

Claude

9 MCP tools · natural language

rm-report-kafka

Apache Kafka in Kraft mode. Single-node, no Zookeeper. Kafdrop for topic inspection.

rm-report-ingress

Puppeteer, Express API, kafkajs producer with AJV JSON Schema validation at publish time.

rm-report-mcp

FastMCP server, PySpark Bronze/Silver/Gold pipeline, DuckDB query layer over Delta tables on S3.

Bronze streaming (PySpark)

# Kafka → Delta Lake Bronze df = (spark.readStream .format("kafka") .option("subscribe", topic) .load()) bronze = df.select( from_json(col("value"), schema) .alias("data"), col("timestamp").alias("kafka_ts") ) bronze.writeStream .format("delta") .outputMode("append") .start(BRONZE_PATH)

MCP tool (Python)

@mcp.tool() def get_cladding_freeze( postcode_district: str = None, building_class: str = None, ) -> str: """Identify frozen blocks - zero post-BSA sales, pre-EWS1 active.""" # DuckDB query against Gold # Delta tables on S3 result = con.execute(sql) return rows_to_json(result)

04 · Finding 1

The Cladding Freeze

Building Safety Act created a two-tier market. 19% of purpose-built blocks have effectively stopped transacting. Every conversion block in the same postcodes continues to sell normally.

19%

of purpose-built blocks frozen

178 of 916 blocks · zero sales since April 2022

Highrise

18% frozen

92 of 505

Midrise

21% frozen

86 of 411

0%

of conversion blocks frozen

Victorian / Edwardian house splits - unaffected

The absence of a price penalty in transacting newer builds reflects survivorship bias. Only EWS1-compliant blocks continue to trade — the legislation has not reduced prices; it has removed supply.

Of the 178 frozen blocks

68%

EWS1-triggered

Froze 2019–2021 and never recovered. These blocks were already illiquid before the BSA.

32%

BSA-triggered

Survived EWS1 but froze after April 2022. The Building Safety Act caught them.

Named frozen blocks (N16 + E5)

pre-EWS1 active · zero post-BSA sales

BlockClassPre-EWS1EWS1 periodPost-BSAAvg price
Bellevalia CourtE5 9PDhighrise1810£395k
78 Cedarwood CourtE5 9FUhighrise1640£429k
43 Pembury EstateE5 8AYmidrise1010£522k
Indigo MewsN16 9AEhighrise910£563k
St Andrews MewsN16 5HRhighrise920£249k
Moore HouseE5 0DDmidrise700£580k
41 Smithies CourtN16 8AYhighrise700£644k
20 Dehavilland StudiosE5 9NYmidrise420£824k
3 Red SquareN16 9AWmidrise340£678k
Nonet HouseE5 0NPmidrise630£506k

05 · Finding 2

Building Class Price Divergence

Building Safety Act created a two-tier market — purpose-built vs conversion. Within the flat market itself, conversions have dramatically outperformed purpose-built types since 2015.

Indexed price growth · N16 flats · 2015 = 100

9010011012013014020152016201720182019202020212022202320242025GrenfellEWS1COVIDBSA13611310999
Conversions (+36%)
Low-rise (-1%)
Mid-rise (+13%)
High-rise (+9%)

Conversions

+36%

since 2015

index 100 → 136

Mid-Rise

+13%

since 2015

index 100 → 113

High-Rise

+9%

since 2015

index 100 → 109

Insight: Conversions have outperformed all purpose-built flat types by a widening margin since 2015 (+36% vs +9–13%). The COVID stamp duty holiday (2020–21) created a temporary volume spike across all classes, but conversions recovered their premium while purpose-built types plateaued. The gap between conversions and highrise flats now stands at 27 index points — the widest since records began.


06 · Finding 3

Price Divergence by Tenure (Freehold vs Leasehold)

Freehold and leasehold properties tracked together until 2020: both barely moved from their 2017 baseline. The divergence begins in 2021 and accelerates sharply from 2022, as EWS1 enforcement and the Building Safety Act fed through into completed transactions. Freehold has since gained nearly three times as much value as leasehold newer-build flats. The absolute price gap has more than doubled: from £102K to £229K.

Indexed price growth · 2017 = 100

90100110120130201720182019202020212022202320242025GrenfellEWS1BSA128.7108.2
Freehold proxy (E11+E17 terraced)
Leasehold proxy (N5+N16 newer flats)

Freehold

+28.7%

since 2017

E11+E17 terraced · 3,900+ transactions

Leasehold

+8.2%

since 2017 · now declining

N5+N16 newer flats · lowrise/mid/highrise

Gap widened

£102k → £229k

Absolute price gap, 2017 vs 2025

Insight: Tenure compounds the BSA impact. Leaseholders cannot initiate remediation independently: they are dependent on freeholders and management companies to engage with the certification process. The leasehold line peaked in 2024 at 112.7 and is now declining: newer-build flats are losing value in absolute terms, not just underperforming relative to houses.


07

Natural Language Analytics

Nine MCP tools expose the pipeline to Claude. Ask questions in English.

Claude Desktop · property-analytics MCP
How have mid-rise flat prices in N16 changed since Grenfell, compared to conversions and houses?
Called get_building_class_gap, get_quarterly_cube, get_cladding_freeze

Based on the quarterly price cube and building class gap tables, mid-rise and high-rise flats in N16 have significantly underperformed conversions since 2017:

  • Conversions: +36% price growth (2017–2025)
  • Mid-Rise flats: +13% over the same period
  • High-Rise flats: +9% (index 100 → 109)

The cladding freeze detector identifies 178 mid-rise and high-rise blocks in N16+E5 with zero transactions since the Building Safety Act (April 2022). Top frozen blocks include Bellevalia Court (E5 9PD, 18 pre-EWS1 sales) and Indigo Mews (N16 9AE, 9 pre-EWS1 sales).

Of 3,347 high-rise flat sales tracked, 30.5% sold below asking price, vs 26.7% for conversions. The absolute price gap freehold-to-leasehold has widened from £102k to £229k.

quarterly_cube

postcode_analytics

street_analytics

building_class_gap

cladding_freeze

listing_efficiency

seasonal_patterns

beds_analytics

sold_properties


08

Technical Decisions

Architecture is about trade-offs, not best practices.

Kafka over direct-to-S3

Chosen

Apache Kafka (Kraft mode)

Alternative

Direct S3 write / Kinesis

Governance, replay capability, and fan-out to multiple consumers. Kraft mode removes Zookeeper dependency and is forward-compatible with Kafka 4.0+.

AJV over Schema Registry

Chosen

AJV at producer

Alternative

Confluent Schema Registry

Producer-side validation without a Confluent dependency. Schema violations route to a dead letter queue rather than crashing the pipeline.

DuckDB over Spark for serving

Chosen

DuckDB + read_parquet

Alternative

Spark ThriftServer / Trino

Sub-100ms queries on 39K rows, zero infrastructure, perfect MCP backend. read_parquet glob pattern used instead of delta_scan due to Delta Kernel credential limitations outside EC2.

FastMCP over REST API

Chosen

Python FastMCP (stdio)

Alternative

Express REST API

Native Claude Desktop integration via MCP protocol. Nine typed tools with docstrings that Claude uses to select the right analytical approach for each question.

Interested in this work?

Always interested in discussing data, AI, and enterprise architecture: get in touch.