Developing quantitative tools for asthma forecast in London using weather and air quality

2017-02-06T06:09:03Z (GMT) by Soyiri, Ireneous Ngmenlanaa
The thesis examines approaches to the forecasting of respiratory events, principally hospital admissions for asthma, but also mortality. The focus of the thesis is forecasting accuracy rather than model specification per se. The thesis is a compilation of eight papers (seven published, one "under review"), divided into four sections, with a brief narrative drawing the themes together. The topic is introduced with a review of asthma, the main condition examined in the thesis, and the known relationships between environmental conditions and asthma events. An empirical study is then presented that examines the factors affecting length of stay (LOS) in hospital following an asthma admission. The paper relies on National Health Service (NHS) England data for London from 2001 to 2006. The idea was to demonstrate a burden of disease, measured in this case by LOS, as a motivation for forecasting asthma events. If asthma events have no consequence for the health system, then there may be no point in proceeding to forecasting. Negative binomial regression was used to model the effects of demographic, temporal and diagnostic factors on LOS, taking into account the cluster effect of each patient's hospital attendance in London. The median and mean asthma LOS over the period of study were 2 and 3 days respectively. Admissions increased over the years from 8,308 (2001) to 10,554 (2006), but LOS consistently declined within the same period. Younger individuals were more likely to be admitted than the elderly, but the latter had significantly higher LOS (p<0.001). Respiratory-related secondary diagnoses, age and gender of the patient, as well as day of the week and year of admission, were important predictors of LOS. Having established the burden of asthma on the health system, health forecasting as an approach is introduced in a series of three closely related, published papers.
In the first paper a general overview of health forecasting is provided (Soyiri and Reidpath, 2012a). In the second paper, there is a greater emphasis on the specific modelling approaches used in forecasting, and on the measures of forecasting accuracy (Soyiri and Reidpath, 2012b). The final published paper in this series introduces, in a general sense, a "semi-structured black-box approach" to forecasting. Two modelling techniques are described: Negative Binomial Models for modelling the conditional mean, and Quantile Regression Models for modelling more extreme quantiles. These are illustrated using London data from 2005-2006 (Soyiri and Reidpath, 2012c). In the next section of the thesis, four empirical studies are presented, each looking at an approach to health forecasting in greater detail. The first paper examines the use of negative binomial regression to forecast asthma-related admissions to London hospitals (2005-2006) using weather and air quality as predictive factors (Soyiri et al., 2013). The data were split in two, with one year's data used for model development and the second year's data used for cross-validation. Three models were contrasted: a historical average model, a seasonal average model, and a model using selected weather and air quality factors. The seasonal model outperformed both the historical model and the weather and air quality model. Given the known causal effect of weather and air quality on asthma, this was somewhat surprising, and led to an alternative approach. The second paper describes the use of humans as animal sentinels in the forecasting of asthma events (Soyiri and Reidpath, 2012d). In effect, the sensitive lung is "the canary in the coal mine" for the less sensitive lung.
Without having to measure any particular environmental trigger or determine the causal relationships between environmental exposures and asthma events, the potential exists to use the frequency of asthma events in the population today to predict the frequency of asthma events in the future. The lungs of the population are seen as "processors of the information" about weather and air quality, avoiding the need to estimate the effects independently. Negative binomial regressions were used in the modelling, allowing for non-contiguous autoregressive components. Lags of previous days' admissions were selected from a partial autocorrelation function (PACF) plot, with a maximum lag of 7 days. The model was contrasted with naïve historical and seasonal models. All models were cross-validated, with a clear indication of the superiority of the lag (human sentinel) model over the seasonal or historical model. One of the issues with the approaches described so far is that they rely on modelling the conditional mean, and yet it is often more useful to be able to forecast a more extreme quantile. Knowing the conditional 90th percentile of asthma admissions, for instance, provides information about the high end of resources that should be made available. The third paper examines the use of quantile regression to forecast higher than expected numbers of asthma events in London (Soyiri et al., 2012). Appropriate lags of weather and air quality factors were selected, and then pooled to form multivariate predictive models, selected through a systematic backward stepwise reduction approach. Models were cross-validated using a hold-out sample of the data, and their respective root mean square error measures, sensitivity, specificity and predictive values were compared. The results indicate that associations between asthma and environmental factors, including temperature, ozone and carbon monoxide, can be exploited in predicting future events using quantile regression models.
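As a rough illustration of the validation measures named above, the following Python sketch computes the root mean square error of a set of hold-out forecasts, together with the sensitivity, specificity and predictive values obtained when each day is classed as a "higher than expected" day or not. It is not the thesis code. The daily counts, the threshold of 30 admissions, the rule that a day counts as high when its count exceeds the threshold, and the function names are all assumptions made for the example.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and forecast daily counts."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def _ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def exceedance_metrics(observed, predicted, threshold):
    """Classify each day as high (count > threshold) and score the forecasts.

    The threshold rule is a hypothetical stand-in for whatever definition of
    a "higher than expected" day a given study adopts.
    """
    tp = fp = tn = fn = 0
    for o, p in zip(observed, predicted):
        actual_high = o > threshold
        forecast_high = p > threshold
        if actual_high and forecast_high:
            tp += 1  # high day correctly forecast
        elif actual_high:
            fn += 1  # high day missed
        elif forecast_high:
            fp += 1  # false alarm
        else:
            tn += 1  # ordinary day correctly forecast
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),  # positive predictive value
        "npv": _ratio(tn, tn + fn),  # negative predictive value
    }


# Made-up hold-out data: eight days of observed and forecast admissions.
observed = [20, 35, 28, 41, 19, 33, 45, 22]
predicted = [22, 29, 27, 38, 32, 36, 40, 21]
threshold = 30  # hypothetical cut-off for a "high" day

hold_out_rmse = rmse(observed, predicted)
metrics = exceedance_metrics(observed, predicted, threshold)
print(round(hold_out_rmse, 2), metrics)
```

Reporting the classification measures alongside the error measure reflects the purpose of a peak forecast: a model can have a modest average error and still miss the days on which extra resources are needed.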
Two criticisms of this paper arose, one during the review process and one after it. The criticism that arose during the review process was that the number of years (2005-2006) was small and it would be better to have more years of data. The second criticism was that the quantile regression approach could be improved upon by using the more unusual quantile regression for count data. The final paper re-examines the quantile regression model (QRM) methodology, taking account of the two criticisms. A larger dataset was identified that contained 70,830 respiratory-related deaths that occurred between 1987 and 2000 in New York City (Soyiri and Reidpath, Under Review). Quantile regression models with seasonal and weather/air quality predictors showed improvements over seasonal models alone. Health forecasting is in the early stages of development; however, the indications are that relatively simple models may be able to provide information to health systems that will improve service delivery and resource allocation. There remains considerable work to be done in this area, both in refining the modelling approaches and in testing the models in different settings.

References
Soyiri IN, Reidpath DD, Sarran C. (2011). Asthma length of stay in hospitals in London 2001-2006: demographic, diagnostic and temporal factors. PLoS One. 6(11):e27184.
Soyiri IN, Reidpath DD. (2012a). An overview of health forecasting. Environ Health Prev Med. 2013. DOI: 10.1007/s12199-012-0294-6.
Soyiri IN, Reidpath DD. (2012b). Evolving forecasting classifications and applications in health forecasting. Int J Gen Med. 5:381-9.
Soyiri IN, Reidpath DD. (2012c). Semistructured black-box prediction: proposed approach for asthma admissions in London. Int J Gen Med. 5:693-705.
Soyiri IN, Reidpath DD, Sarran C. (2012). Forecasting peak asthma admissions in London: an application of quantile regression models. Int J Biometeorol. 2012 Aug 12. DOI: 10.1007/s00484-012-0584-0.
Soyiri IN, Reidpath DD. (2012d). Humans as animal sentinels for forecasting asthma events: helping health services become more responsive. PLoS One. 7(10):e47823.
Soyiri IN, Reidpath DD, Sarran C. (2013). Forecasting asthma related hospital admissions in London using negative binomial models. Chronic Respiratory Disease. 10(2):85-94. DOI: 10.1177/1479972313482847.
Soyiri IN, Reidpath DD. (Under Review). The use of quantile regression to forecast higher than expected respiratory deaths in a daily time series: a study of New York City data 1987-2000.