Exploring the complex relationship between environmental factors and health outcomes using machine learning approaches
Researchers: Yajie Zhu, Kazem Rahimi, Andy Hong, Dexter Canoy, Reza Khorshidi, Jose Roberto Ayala Solares, Mariagrazia Zottoli
Description of work
Disparities in the burden of chronic disease between and within populations are well-recognised. Whilst many of the modifiable, individual-level disease risk factors have been identified, understanding the wider determinants of health, such as the physical (as well as socio-political) environment, is likely to play a key role in addressing inequities in population health.
Intervening on or modifying upstream environmental determinants of health could potentially improve health for a greater number of people in the population. Although the impact of the environment on health is well-recognised, large-scale and detailed phenotypic information on environmental exposures, linked at an individual level, has been limited. The combined impact and relative contributions of these different environmental factors on health, and on the geospatial clustering of risk factors and disease at high resolution, remain understudied.
In this project, we hope to elucidate the importance of environmental factors in influencing health and disease and their geospatial distribution using a range of data science and deep learning approaches to extract, infer, and validate knowledge using a large, well-characterised UK cohort. We will also investigate the feasibility of using other big data sources to widen environmental phenotypic information and increase spatial resolution to allow assessments at smaller geographical units, particularly in urban areas.
Our data-driven approach will apply specific domain knowledge and theories in big data science and urban health, and use inductive reasoning to generate novel insights. Through this approach, we hope to identify multiple modifiable environmental factors relevant to population health, and to infer relationships between the environment and health that allow relevant policies to be applied even in areas where local health data might be limited.
Aims and research questions
Our aim is to explore the relationship between environment and health using a wide range of objective measures representing different environmental domains and several indicators of chronic disease and relevant risk factors. Using these findings, we will create a ‘health map’, develop risk prediction models, and apply their predictions to various ‘what-if’ scenarios. Finally, we will explore how we can enhance risk predictions at higher resolution by incorporating other data sources to improve environmental phenotypic information, and develop methods to enable handling and analysis of big, complex data for research and policy development.
- Using individual-level data, what are the separate and combined impacts of environmental factors on health and disease risk factors? Which environmental factors are most predictive of major causes of morbidity and mortality?
- Is it feasible to incorporate other sources of environmental exposures to enhance phenotypic information of the environment in this cohort? Will the addition of these data improve the predictive models and offer high resolution information about the environment and health even at smaller geographical units?
- How applicable are various predictive models for different health outcomes in other settings/populations where data on environment and health indicators are likely to be sparse?
- Are remote sensing-based measures valid indicators of the environment, and if so, are they predictive of health outcomes?