The NOAA Unified Forecast System (UFS) Global Ensemble Forecast System version 13 (GEFSv13) replay dataset supports the next implementation of NOAA’s medium-range forecast system (GEFSv13 / GFSv17). NOAA’s Physical Sciences Laboratory (PSL) developed the UFS replay dataset, which provides initial conditions for the retrospective forecast archive.
This retrospective technique, known as replay (Orbe et al., 2017), combines historical weather data with a modern model to improve the accuracy and reliability of forecast simulations. Because the result resembles a reanalysis, the dataset can also be used to train machine learning (ML) models. The GEFSv13 replay dataset was produced by replaying the coupled version of the new UFS model to external reanalyses: ERA5 for the atmosphere and ORAS5 for the ocean and ice. The replay experiment used the HR1 tag of the NOAA UFS coupled model, which includes atmosphere, ocean, ice, land, and wave components.
NOAA’s Earth Prediction Innovation Center (EPIC) assisted PSL in converting the replay dataset into a machine-learning-ready format suitable for training new emulators. The conversion involved moving about 1 PB of NetCDF files from AWS to Google Cloud Platform (GCP) and converting them from the model’s native NetCDF format to the ML-friendly Zarr format. As part of the conversion, a 1-degree dataset was created by subsampling every fourth grid point from the original ¼-degree grid. Mariah Pope of EPIC carried out the conversions using the “ufs2arco” framework, developed by PSL. A Jupyter notebook example is provided for user convenience.
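The subsampling step described above can be illustrated with a minimal sketch. It is not the ufs2arco implementation: it assumes a regular ¼-degree latitude/longitude grid purely for illustration (the model's native grid may differ), and the `subsample` helper and all variable names are hypothetical.

```python
def subsample(field, step=4):
    """Keep every `step`-th row and column of a 2-D grid (list of lists)."""
    return [row[::step] for row in field[::step]]


# Hypothetical regular 1/4-degree grid, assumed for illustration only.
lats = [-90 + 0.25 * i for i in range(721)]   # -90.0 ... 90.0
lons = [0.25 * j for j in range(1440)]        # 0.0 ... 359.75

# Toy field: each value encodes its own flattened (row, column) index.
n_lon = len(lons)
field = [[i * n_lon + j for j in range(n_lon)] for i in range(len(lats))]

# Taking every 4th point turns 0.25-degree spacing into 1-degree spacing.
coarse_lats = lats[::4]
coarse_lons = lons[::4]
coarse_field = subsample(field, step=4)

print(len(lats), "x", len(lons), "->", len(coarse_lats), "x", len(coarse_lons))
```

Subsampling simply selects existing grid points; it does not average neighboring values, so each retained point keeps its original ¼-degree value. With a labeled-array library such as xarray, the equivalent would be a strided `isel` selection, with dimension names depending on the dataset.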
Read more: NOAA Unified Forecast System (UFS) Replay