Machine learning meteorological normalization models for trend analysis of air quality time series
Price
Free (open access)
Volume
Volume 4 (2021), Issue 4
Pages
12
Page Range
375 - 387
Paper DOI
10.2495/EI-V4-N4-375-387
Copyright
WIT Press
Author(s)
Roberta Valentina Gagliardi & Claudio Andenna
Abstract
Air pollution is a major environmental cause of morbidity and mortality worldwide, making it a top public health priority, especially in areas affected by anthropogenic emission sources. Correctly assessing how pollutant emissions influence air quality is therefore crucial for designing and implementing effective measures from a public health perspective. The impact of local emission sources on air quality is strongly modulated by meteorological conditions, which can mask the real trends in the observed pollutant concentrations. However, the confounding effect of meteorology in air quality time series can be accounted for by meteorological normalisation techniques. In this study, the performance of a meteorological normalisation technique based on machine learning (ML) algorithms was investigated. To this end, two ML models (gradient boosted regression (GBM) and random forest (RF)) were developed and subsequently used to calculate meteorologically normalised trends in time series of nitrogen oxides (NOx) concentrations. Both models were trained on daily averaged NOx concentrations and meteorological parameters, as well as on temporal variables; the data were acquired over the 2013–2019 period in a rural area affected by anthropogenic sources of air pollutants. The results show that both models explain more than 70% of the variance in the observed NOx concentrations and that meteorological normalisation based on either algorithm is a robust method to account for the confounding effect of meteorology in air quality time series. Moreover, the GBM and RF models made it possible to analyse the dependence of the observed concentrations on each explanatory variable used in the models, shedding light on the role of local meteorological processes in the observed pollutant concentrations. This knowledge can help in defining air pollution control strategies that are increasingly effective in preventing and/or mitigating the health damage associated with exposure to atmospheric pollution.
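The normalisation idea summarised in the abstract can be illustrated with a minimal sketch. It assumes a commonly used resampling scheme: a model is fitted on a trend term plus meteorological variables, the meteorological columns are then repeatedly resampled at random from the observed record, and the predictions are averaged. This is not confirmed to be the authors' exact procedure. The data are synthetic, and the column layout, the function name `normalise_meteorology`, and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Synthetic daily data (assumption: not the paper's dataset).
n_days = 1000
trend = np.arange(n_days, dtype=float)        # days since start (trend term)
wind_speed = rng.gamma(2.0, 1.5, n_days)      # m/s
temperature = rng.normal(12.0, 6.0, n_days)   # degrees C

# NOx falls slowly over time and is diluted by wind; noise is added on top.
nox = (40.0 - 0.01 * trend - 3.0 * wind_speed
       - 0.2 * temperature + rng.normal(0.0, 2.0, n_days))

# Feature matrix: column 0 is the trend term, columns 1-2 are meteorology.
X = np.column_stack([trend, wind_speed, temperature])
met_cols = [1, 2]

model = RandomForestRegressor(
    n_estimators=100, min_samples_leaf=5, oob_score=True, random_state=0
)
model.fit(X, nox)


def normalise_meteorology(model, X, resample_cols, n_samples=50, seed=0):
    """Average predictions over randomly resampled meteorological rows."""
    sampler = np.random.default_rng(seed)
    n = X.shape[0]
    total = np.zeros(n)
    for _ in range(n_samples):
        X_resampled = X.copy()
        # Draw whole meteorological rows at random (with replacement) so that
        # each day is paired with weather taken from some other day.
        idx = sampler.integers(0, n, size=n)
        X_resampled[:, resample_cols] = X[idx][:, resample_cols]
        total += model.predict(X_resampled)
    return total / n_samples


normalised = normalise_meteorology(model, X, met_cols)

# Linear trend (per day) of the normalised series.
slope = np.polyfit(trend, normalised, 1)[0]
print(f"Out-of-bag R^2: {model.oob_score_:.2f}")
print(f"Normalised trend: {slope:.4f} per day")
```

Because the weather attached to each day is randomised, the averaged prediction retains mainly the dependence on the trend term, so the normalised series varies far less than the raw observations. A gradient boosted model could be substituted for the random forest without changing the normalisation function.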
Keywords
air pollution, boosted regression trees, machine learning, meteorology, random forest, trend analysis.