Abstract

By capturing the physical complexity of the interactions between atmospheric flows and the built environment, Large-Eddy Simulations (LES) could provide detailed information for risk assessment and mitigation in the event of an environmental emergency. However, supporting decision making requires accounting for LES uncertainties and covering the range of plausible scenarios, which means going beyond deterministic simulation. This study introduces a novel ensemble-based data assimilation (DA) algorithm that corrects the LES meteorological forcing using available measurements and thereby improves LES spatial predictions of pollutant concentration.

Figure 1: Schematic of the surrogate-based ensemble data assimilation system.

This approach is demonstrated on the MUST field-scale experiment. Results show that the ensemble smoother with multiple data assimilation (ESMDA) [1] algorithm improves parameter estimation compared to the standard Ensemble Kalman Filter (EnKF). This is because its iterative nature addresses parameter interaction effects in the relationship between the uncertain meteorological forcing and LES field quantities. The substantial speed-up provided by the POD–GPR surrogate model [2] enables large ensemble sizes, thereby improving DA estimation accuracy while keeping the DA system tractable for real-time applications. This speed-up also makes a detailed validation of the DA system feasible.
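To make the iterative update concrete, here is a minimal NumPy sketch of an ES-MDA loop in the spirit of Emerick and Reynolds (2013). It is an illustration under stated assumptions, not the code from the authors' notebook: the function `esmda`, the two-parameter `toy_forward` model (a stand-in for the surrogate, loosely read as wind speed and direction), the diagonal observation error covariance, and the constant inflation factor are all hypothetical choices made for this example.

```python
import numpy as np


def esmda(forward, prior, obs, obs_var, n_steps=4, seed=0):
    """Minimal ES-MDA sketch (hypothetical helper, not the authors' code).

    forward : maps a parameter vector (n_params,) to predictions (n_obs,)
    prior   : (n_members, n_params) background parameter ensemble
    obs     : (n_obs,) measurements
    obs_var : (n_obs,) observation error variances (diagonal covariance assumed)
    """
    rng = np.random.default_rng(seed)
    params = np.array(prior, dtype=float)
    obs_var = np.asarray(obs_var, dtype=float)
    n_members = params.shape[0]
    # Constant inflation factor: the n_steps values of 1/alpha sum to 1.
    alpha = float(n_steps)

    for _ in range(n_steps):
        # Run the (cheap) forward model for every ensemble member.
        preds = np.array([forward(p) for p in params])
        # Ensemble anomalies and sample covariances.
        dp = params - params.mean(axis=0)
        dd = preds - preds.mean(axis=0)
        c_pd = dp.T @ dd / (n_members - 1)  # parameter/prediction covariance
        c_dd = dd.T @ dd / (n_members - 1)  # prediction covariance
        # Perturb the observations with inflated observation noise.
        noise = rng.standard_normal(preds.shape) * np.sqrt(alpha * obs_var)
        innovation = obs + noise - preds
        # Kalman-type gain K = C_pd (C_dd + alpha * R)^-1, via a linear solve.
        gain = np.linalg.solve(c_dd + alpha * np.diag(obs_var), c_pd.T).T
        params = params + innovation @ gain.T
    return params


def toy_forward(p):
    """Toy nonlinear model in which the two parameters interact."""
    speed, direction = p
    return np.array([speed * np.cos(direction), speed * np.sin(direction)])


truth = np.array([2.0, 0.5])
rng = np.random.default_rng(1)
# Deliberately biased background ensemble of 200 members.
prior = rng.normal(loc=[3.0, 1.0], scale=[0.5, 0.3], size=(200, 2))
obs = toy_forward(truth)
obs_var = np.full(2, 0.05 ** 2)

posterior = esmda(toy_forward, prior, obs, obs_var)
prior_err = np.linalg.norm(prior.mean(axis=0) - truth)
post_err = np.linalg.norm(posterior.mean(axis=0) - truth)
print(f"prior error: {prior_err:.3f}, posterior error: {post_err:.3f}")
```

With `n_steps=1` the loop reduces to a single perturbed-observation Kalman update of the parameters, which is one way to compare a one-shot correction against the iterated one on this toy problem. Because each iteration calls the forward model once per member, the cost grows with both ensemble size and number of steps, which is where a fast surrogate becomes useful.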

Figure 2: Parameter estimation errors for the ESMDA and the EnKF under varying background biases.

The DA system improves LES plume predictions near the source and along the vertical, but a persistent underestimation remains near the ground and downstream. These errors cannot be explained solely by meteorological forcing biases or internal variability. This indicates significant structural uncertainties in LES that should be investigated and accounted for in future developments.

Figure 3: Estimated vertical profiles of concentration.

Finally, it is shown that DA performance is highly sensitive to sensor placement, and this sensitivity is underestimated when assessed with synthetic measurements. The combination of model and real measurement biases leads to contradictory corrections depending on sensor locations. This shows that sensor optimization strategies based solely on idealized twin experiments may be overly optimistic.


Code

A notebook describing the ESMDA code and its use with the POD-GPR surrogate model is openly available at https://github.com/eliott-lumet/esmda_ppmles


How to cite?

Lumet, E., Rochoux, M. C., Jaravel, T., and Lacroix, S. (2025). Surrogate-Based Ensemble Data Assimilation for Reducing Uncertainty in Large-Eddy Simulation of Microscale Pollutant Dispersion. In press in Building and Environment. Preprint DOI: 10.2139/ssrn.5354492


  1. Emerick, A. A., and Reynolds, A. C. (2013). Ensemble smoother with multiple data assimilation. Computers & Geosciences, 55:3. DOI: 10.1016/j.cageo.2012.03.011

  2. Lumet, E., Rochoux, M. C., Jaravel, T., and Lacroix, S. (2024). Uncertainty-aware surrogate modeling for urban air pollutant dispersion prediction. Building and Environment, 267:112287. DOI: 10.1016/j.buildenv.2024.112287