WIT Press

Machine learning meteorological normalization models for trend analysis of air quality time series

Price

Free (open access)

Volume

Volume 4 (2021), Issue 4

Pages

12

Page Range

375 - 387

Paper DOI

10.2495/EI-V4-N4-375-387

Copyright

WIT Press

Author(s)

Roberta Valentina Gagliardi & Claudio Andenna

Abstract

Air pollution is a major environmental cause of morbidity and mortality worldwide and a top public health concern, especially in areas affected by anthropogenic emission sources. Correctly assessing how pollutant emissions influence air quality is therefore crucial for designing and implementing effective public health measures. The impact of local emission sources on air quality is strongly modulated by meteorological conditions, which can mask the real trends in observed pollutant concentrations. However, the confounding effect of meteorology in air quality time series can be accounted for by meteorological normalisation techniques. In this study, the performance of a meteorological normalisation technique based on machine learning (ML) algorithms was investigated. To this end, two ML models, a gradient boosted regression model (GBM) and a random forest (RF), were developed and subsequently used to calculate meteorologically normalised trends in time series of nitrogen oxides (NOx) concentrations. Both models were trained on daily averaged NOx concentrations and meteorological parameters, as well as on temporal variables; the data were acquired over the 2013–2019 period in a rural area affected by anthropogenic sources of air pollutants. The results show that both models explain more than 70% of the variance in the observed NOx concentrations and that meteorological normalisation based on either algorithm represents a robust method to account for the confounding effect of meteorology in air quality time series. Moreover, the GBM and RF models made it possible to analyse the dependence of the observed concentrations on each explanatory variable used in the models, shedding light on the role of local meteorological processes in the observed pollutant concentrations. This knowledge can help in defining air pollution control strategies that are increasingly effective in preventing and/or mitigating the health damage associated with exposure to atmospheric pollution.
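
As an illustration of the normalisation idea summarised in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes scikit-learn's RandomForestRegressor as the RF model, uses synthetic data in place of the measured NOx series, and assumes that only the meteorological columns are resampled while the temporal variables are held fixed. The helper names (make_synthetic_series, normalise), the chosen features, and all numeric settings are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def make_synthetic_series(n_days=730, seed=0):
    """Build synthetic daily data (NOT the paper's data): a declining
    emission-driven signal that is masked by wind and temperature."""
    rng = np.random.default_rng(seed)
    day = np.arange(n_days)
    wind = rng.gamma(shape=2.0, scale=1.5, size=n_days)  # wind speed, m/s
    temp = 15 + 10 * np.sin(2 * np.pi * day / 365.25) + rng.normal(0, 3, n_days)
    trend = 40.0 - 0.02 * day  # assumed underlying emission trend
    nox = trend * np.exp(-0.15 * wind) - 0.1 * (temp - 15) + rng.normal(0, 2, n_days)
    # Columns: trend term, day of year, day of week, wind speed, temperature
    X = np.column_stack([day, day % 365, day % 7, wind, temp])
    return X, nox


def normalise(model, X, met_cols, n_samples=50, seed=0):
    """Average the model predictions over repeated random resamplings of
    the meteorological columns, keeping the temporal columns as observed."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    total = np.zeros(n)
    for _ in range(n_samples):
        X_resampled = X.copy()
        idx = rng.integers(0, n, size=n)  # draw weather from random days
        X_resampled[:, met_cols] = X[idx][:, met_cols]
        total += model.predict(X_resampled)
    return total / n_samples


X, nox = make_synthetic_series()
model = RandomForestRegressor(n_estimators=100, min_samples_leaf=3, random_state=0)
model.fit(X, nox)

# Columns 3 and 4 hold the meteorological variables (wind, temperature)
normalised = normalise(model, X, met_cols=[3, 4])

print("std of observed series:  ", round(float(nox.std()), 2))
print("std of normalised series:", round(float(normalised.std()), 2))
```

Averaging the predictions over many weather draws removes most of the day-to-day variability attributable to meteorology, so the normalised series mainly reflects the trend and temporal terms. A gradient boosting model (for example scikit-learn's GradientBoostingRegressor) could be passed to the same normalise function in place of the random forest.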

Keywords

air pollution, boosted regression trees, machine learning, meteorology, random forest, trend analysis.