Machine learning for informal settlement health
Tom Fisher
Mohammad Mamouei
Samuel Yutong Cai
Kazem Rahimi

Over a billion people live in informal settlements worldwide and this number is increasing across the Global South. 

Populations in informal settlements are exposed to health risks due to inadequate infrastructure and services, including safe drinking water, waste collection, sanitation and healthcare access. Residents may also be disproportionately exposed to other environmental health hazards, including indoor and outdoor air pollution, heavy traffic, road accidents, pesticides, soil pollution and flooding.

A combination of administrative, funding and infrastructural shortcomings hinders data-driven policy making. For example, we cannot yet accurately determine the location and size of informal settlements or how they change over time. Nor do we know how environmental health around these settlements is changing and how it affects population health. Decision makers need both kinds of information to make informed policy changes that help many of the world's poorest people.

Tackling these complex population and environmental health challenges will require novel analytical approaches that draw on promising data sources of greater depth and breadth than those used before.


This project is aided by Descartes Labs, which provides high-quality satellite images and processing infrastructure.

We first use different types of satellite imagery to train a deep learning model to accurately map informal settlements. We focus on multispectral and very high resolution visual spectrum images of Mumbai, India and Bogota, Colombia, where accurate ground truth maps exist. These maps allow us to train the model well and to confirm that it performs well on an unseen test set.
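One common way to check test set performance for a mapping model of this kind is intersection over union (IoU) between the predicted settlement mask and the ground truth mask. The sketch below is illustrative only: the project does not state which metric it uses, and the function name and the small example masks are hypothetical.

```python
def iou(predicted, truth):
    """Intersection over union for two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1  # pixel marked as settlement in both masks
            if p or t:
                union += 1         # pixel marked as settlement in either mask
    # Two empty masks agree perfectly, so define IoU as 1.0 in that case.
    return intersection / union if union else 1.0


# Hypothetical 3x3 masks: 1 = informal settlement, 0 = anything else.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0],
                  [0, 0, 0]]
ground_truth_mask = [[1, 1, 0],
                     [0, 0, 0],
                     [0, 0, 1]]

score = iou(predicted_mask, ground_truth_mask)
print(f"IoU on held-out tile: {score:.2f}")
```

A score close to 1 on tiles the model never saw during training suggests that the predicted map generalises beyond the training area.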

After obtaining ground-level air quality data for the two cities, we will use satellite imagery to develop robust machine learning models that generate air quality maps at a fine local level (<1 km). We will then scale this approach to estimate ground-level air quality in other cities in the Global South where air quality monitoring networks are lacking. We will also be able to determine to what extent informal settlements are exposed to air pollution.

Finally, we will conduct a health impact assessment of air quality in the two cities, and specifically look at the health burden separately in populations residing in informal and formal settlements.
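As a rough illustration of what such an assessment can involve, the sketch below applies the standard log-linear attributable fraction calculation separately to two populations. It is not the project's stated method: the function names are hypothetical, and every number (concentrations, counterfactual level, risk coefficient, baseline case counts) is a made-up placeholder rather than real data.

```python
import math


def attributable_fraction(concentration, counterfactual, beta):
    """Share of cases attributable to exposure above a counterfactual level.

    Assumes a log-linear concentration-response function:
    relative risk = exp(beta * excess concentration).
    """
    excess = max(concentration - counterfactual, 0.0)
    relative_risk = math.exp(beta * excess)
    return 1.0 - 1.0 / relative_risk


def attributable_cases(baseline_cases, concentration, counterfactual, beta):
    """Number of baseline cases attributable to the excess exposure."""
    return baseline_cases * attributable_fraction(concentration, counterfactual, beta)


# Placeholder inputs for illustration only (not measured values).
BETA = 0.01            # hypothetical risk coefficient per unit concentration
COUNTERFACTUAL = 10.0  # hypothetical reference concentration

populations = {
    "informal": {"concentration": 40.0, "baseline_cases": 1000.0},
    "formal": {"concentration": 25.0, "baseline_cases": 1000.0},
}

for name, values in populations.items():
    cases = attributable_cases(
        values["baseline_cases"], values["concentration"], COUNTERFACTUAL, BETA
    )
    print(f"{name}: {cases:.1f} attributable cases")
```

Running the calculation per settlement type is what allows the burden in informal and formal areas to be compared directly.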


Our model is able to detect changes in slum areas over time, tracking the increasing number of dwellings appearing as populations grow. Having verified that our model achieves good test set performance, we are now deploying it to further applications such as slum monitoring over time.
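Monitoring over time can be pictured as comparing the model's predicted masks for the same area on two dates. The following is a minimal sketch under assumptions of our own: the masks are already aligned pixel for pixel, the function name is hypothetical, and the 10 m pixel size is an illustrative value, not a property of the imagery used in the project.

```python
def settlement_change(mask_earlier, mask_later, pixel_area_m2):
    """Compare two aligned binary masks and report settlement area change."""
    new_pixels = 0
    lost_pixels = 0
    for row_a, row_b in zip(mask_earlier, mask_later):
        for earlier, later in zip(row_a, row_b):
            if later and not earlier:
                new_pixels += 1   # settlement appeared between the two dates
            elif earlier and not later:
                lost_pixels += 1  # settlement no longer detected
    return {
        "new_area_m2": new_pixels * pixel_area_m2,
        "lost_area_m2": lost_pixels * pixel_area_m2,
        "net_change_m2": (new_pixels - lost_pixels) * pixel_area_m2,
    }


# Hypothetical predictions for one tile on two dates (1 = informal settlement).
mask_year_1 = [[1, 0, 0],
               [1, 0, 0]]
mask_year_2 = [[1, 1, 0],
               [1, 1, 1]]

# Assumed 10 m x 10 m pixels, so each pixel covers 100 square metres.
change = settlement_change(mask_year_1, mask_year_2, pixel_area_m2=100.0)
print(change)
```

In practice, differences between two predictions also include model error, so small changes should be interpreted with care.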

We have also developed a model training process that significantly improves training on datasets with low-quality ground truth labels, resulting in better test performance.

Future work will deploy the model alongside environmental and resident health data to comprehensively map the well-being of cities. The second work plan, on mapping air quality across the two cities, will start in July 2021.

The insights that policy makers and local governments unlock from their data will inform better urban policy making, for example on optimising the construction of water and sanitation infrastructure.