es1c02204_si_001.pdf (1.44 MB)

Machine Learning-Aided Causal Inference Framework for Environmental Data Analysis: A COVID-19 Case Study

Journal contribution, posted on 2021-09-24, 17:38, authored by Qiao Kang, Xing Song, Xiaying Xin, Bing Chen, Yuanzhu Chen, Xudong Ye, Baiyu Zhang
Links between environmental conditions (e.g., meteorological factors and air quality) and COVID-19 severity have been reported worldwide. However, existing data-analysis frameworks are insufficient or inefficient for investigating the potential causality behind these associations, which involve multidimensional factors and complicated interrelationships. A causal inference framework built on the structural causal model and aided by machine learning methods was therefore proposed and applied to examine the potential causal relationships between COVID-19 severity and 10 environmental factors (NO2, O3, PM2.5, PM10, SO2, CO, average air temperature, atmospheric pressure, relative humidity, and wind speed) in 166 Chinese cities. The cities were grouped into three clusters based on their socio-economic features, and time-series data from the cities in each cluster were analyzed for different pandemic phases. The robustness check refuted 89 of the 90 estimated potential causal relationships. Only one relationship, involving air temperature, passed the final test, with a causal effect of 0.041 under a specific cluster-phase condition. The results indicate that the environmental factors are unlikely to cause noticeable aggravation of the COVID-19 pandemic. The study also demonstrates the high value and potential of the proposed method for investigating causal problems with observational data in environmental and other fields.
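
The estimate-then-refute idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a linear model, a single observed confounder, and purely synthetic data, and every name in it (`adjusted_effect`, `placebo_effect`, the coefficients 0.3, 0.8, and 1.5) is hypothetical. It compares a naive association with a confounder-adjusted estimate, then applies a placebo-treatment check, one common kind of robustness test in which the treatment is replaced by a shuffled copy and the estimated effect is expected to fall to roughly zero.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x (model includes an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def adjusted_effect(treatment, outcome, confounder):
    """Linear effect of treatment on outcome, adjusted for one confounder.

    Uses residualization (Frisch-Waugh): remove the confounder from both
    variables, then regress the outcome residuals on the treatment residuals.
    """
    rt = residuals(confounder, treatment)
    ry = residuals(confounder, outcome)
    return slope(rt, ry)


def placebo_effect(treatment, outcome, confounder, rng):
    """Placebo check: re-estimate the effect with a shuffled treatment."""
    placebo = treatment[:]
    rng.shuffle(placebo)
    return adjusted_effect(placebo, outcome, confounder)


# Synthetic data (assumption): z confounds both t and y; true effect of t is 0.3.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
t = [0.8 * zi + rng.gauss(0, 1) for zi in z]
y = [0.3 * ti + 1.5 * zi + rng.gauss(0, 1) for ti, zi in zip(t, z)]

naive = slope(t, y)                      # biased upward by the confounder
adjusted = adjusted_effect(t, y, z)      # should land near the true 0.3
placebo = placebo_effect(t, y, z, rng)   # should land near 0

print(f"naive={naive:.3f} adjusted={adjusted:.3f} placebo={placebo:.3f}")
```

In this toy setting the naive slope overstates the effect, the adjusted estimate recovers it, and the placebo estimate collapses toward zero. An estimate that stayed large under the placebo would be refuted, which is the sense in which most candidate relationships in a study like this one can fail a robustness check.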
