CASA Working Paper 227 | Modeling Clusters From The Ground Up: A Web Data Approach
28 May 2021
Abstract
This paper proposes a new methodological framework to identify economic clusters over space and time. We employ a unique open-source dataset of geolocated and archived business webpages and interrogate them using Natural Language Processing to build bottom-up classifications of economic activities. We validate our method on an iconic UK tech cluster – Shoreditch, East London. We benchmark our results against existing case studies and administrative data, replicating the main features of the cluster and providing fresh insights. As well as overcoming limitations in conventional industrial classification, our method addresses some of the spatial and temporal limitations of the clustering literature.
Keywords: clusters, cities, technology industry, machine learning
Authors: Christoph Stich, Emmanouil Tranos, Max Nathan