The Bartlett Centre for Advanced Spatial Analysis


CASA Working Paper 227 | Modeling Clusters From The Ground Up: A Web Data Approach



28 May 2021


This paper proposes a new methodological framework to identify economic clusters over space and time. We employ a unique open-source dataset of geolocated and archived business webpages and interrogate them using Natural Language Processing to build bottom-up classifications of economic activities. We validate our method on an iconic UK tech cluster – Shoreditch, East London. We benchmark our results against existing case studies and administrative data, replicating the main features of the cluster and providing fresh insights. As well as overcoming limitations in conventional industrial classification, our method addresses some of the spatial and temporal limitations of the clustering literature.

Keywords: clusters, cities, technology industry, machine learning

Authors: Christoph Stich, Emmanouil Tranos, Max Nathan

Download CASA Working Paper 227 (file size 3.2MB, file format PDF)