Enhancing PM2.5 modeling with reinforcement learning: dynamic ensembling of multi-graph attention networks and deep recurrent models
Abstract
Modeling PM2.5 concentrations in urban environments is difficult because air pollution monitoring (APM) stations are irregularly distributed, spatiotemporal relationships are uncertain, and urban environments are dynamic and heterogeneous. To address these challenges, this study proposes a novel three-stage framework to improve PM2.5 modeling accuracy. First, a graph attention network (GAT) handles the irregular station distribution and the uncertainty in spatiotemporal relationships by using multiple graphs to capture both spatial and temporal correlations between APM stations. The GAT's attention mechanism adaptively assigns greater weights to more relevant inputs, improving both interpretability and prediction precision. In the final stage, a Deep Q-Network (DQN), a reinforcement learning algorithm, optimizes the ensemble of the GAT with two deep recurrent networks, long short-term memory (LSTM) and gated recurrent unit (GRU), dynamically adjusting the model weights to adapt to rapidly changing urban environments. The framework significantly outperforms thirteen state-of-the-art models, demonstrating superior adaptability and accuracy in capturing PM2.5 dynamics. These findings offer a robust and scalable solution for air pollution prediction, with direct implications for public health interventions and urban policy planning. © The Author(s) under exclusive licence to Iranian Society of Environmentalists (IRSEN) and Science and Research Branch, Islamic Azad University 2025.
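The dynamic ensembling idea in the final stage can be illustrated with a minimal sketch. It is not the authors' implementation: a linear Q-function stands in for the deep Q-network, and the discrete action set (`ACTIONS`), the state definition (the previous absolute error of each base model plus a bias term), the reward (negative absolute error of the ensemble), and all hyperparameters are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical discrete action set: each row is a weight vector over the
# base-model predictions in the order (GAT, LSTM, GRU). Rows sum to 1.
ACTIONS = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1 / 3, 1 / 3, 1 / 3],
])


def ensemble(preds, action):
    """Combine the three base-model predictions with the chosen weight vector."""
    return float(ACTIONS[action] @ preds)


def train_selector(base_preds, targets, alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Q-learning with a linear Q-function (a stand-in for a DQN).

    base_preds: array of shape (n_steps, 3) holding GAT, LSTM, GRU predictions.
    targets:    array of shape (n_steps,) holding observed PM2.5 values.
    Returns the weight matrix W, where Q(state, action) = W[action] @ state.
    """
    rng = np.random.default_rng(seed)
    n_steps, n_models = base_preds.shape
    W = np.zeros((len(ACTIONS), n_models + 1))
    # Assumed state: previous absolute error of each base model, plus a bias.
    state = np.append(np.zeros(n_models), 1.0)
    for t in range(n_steps):
        # Epsilon-greedy choice among the candidate weight vectors.
        if rng.random() < epsilon:
            action = int(rng.integers(len(ACTIONS)))
        else:
            action = int(np.argmax(W @ state))
        # Assumed reward: negative absolute error of the ensembled prediction.
        reward = -abs(ensemble(base_preds[t], action) - targets[t])
        next_state = np.append(np.abs(base_preds[t] - targets[t]), 1.0)
        # One-step temporal-difference update for the chosen action only.
        td_target = reward + gamma * np.max(W @ next_state)
        W[action] += alpha * (td_target - W[action] @ state) * state
        state = next_state
    return W
```

At prediction time, the weight vector with the highest Q-value for the current state would be applied to the three base-model outputs. A full DQN would replace the linear map with a neural network and typically add experience replay and a target network.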