%0 Journal Article
%A Mohan, Sankaralingam
%A Saranya, Packiam
%D 2018
%T A novel bagging ensemble approach for predicting summertime ground-level ozone concentration
%U https://tandf.figshare.com/articles/journal_contribution/A_novel_bagging_ensemble_approach_for_predicting_summer_time_ground_level_ozone_concentration/7189517
%R 10.6084/m9.figshare.7189517.v2
%2 https://ndownloader.figshare.com/files/13757984
%K study scaffolded
%K summer time ground level O 3 concentration
%K Comparison study
%K error measures
%K research gap
%K model results
%K air pollutant prediction
%K ensemble classifiers
%K test data
%K Nash-Sutcliffe coefficient 0.93.
%K data analysis
%K input variables
%K Random forest
%K ensemble model
%K PEP
%K base learners
%K surface level O 3 concentration
%K summertime ground-level ozone concentration Ozone pollution
%K base classifiers
%K Industrial area
%K peak concentrations
%K air quality issue
%K summer time ground level ozone
%K novel bagging ensemble approach
%K ground level ozone
%K ensemble bagging approach
%K Multilayer perceptron
%K waste management facility
%X

Ozone pollution is a major air quality issue, particularly for the protection of human health and vegetation. The formation of ground-level ozone is a complex photochemical phenomenon involving numerous factors, most of which are interrelated. Machine learning techniques can be adopted to predict ground-level ozone. The main objective of the present study is to develop a state-of-the-art bagging ensemble approach to model summertime ground-level ozone in an industrial area containing a hazardous waste management facility. The study examines the feasibility of using an ensemble model with seven meteorological parameters as input variables to predict the surface-level O₃ concentration. Multilayer perceptron, RTree, REPTree, and Random forest were employed as the base learners. The error measures used to check the performance of each model include IoAd, R², and PEP. The model results were validated against an independent test data set. The bagged random forest predicted ground-level ozone better than the other models, with a higher Nash-Sutcliffe coefficient of 0.93. This study helps bridge the current research gap in big data analysis for air pollutant prediction.

Implications: The main focus of this paper is to model the summertime ground-level O₃ concentration in an industrial area containing a hazardous waste management facility. A comparison was made between the base classifiers and the ensemble classifiers. Most conventional models predict average concentrations well. In this case, however, the peak concentrations matter most, because they have serious effects on human health and the environment. The models developed should also be homoscedastic.

%I Taylor & Francis