NOAA models have a long-standing reputation in the academic community as being challenging to obtain, and even more challenging to work with. The research team at the Center for Ocean-Land-Atmosphere Studies (COLA), now at George Mason University (GMU), is well-acquainted with these challenges. This year marks our 40th anniversary of using NOAA operational models in a research setting, dating back to 1985 and the Medium Range Forecast model. Recently, however, there has been a sea change in the accessibility and portability of much of the NOAA modeling suite, beginning with the transition to the community-based Unified Forecast System (UFS).
Through our ongoing engagement and collaboration with our NOAA Earth Prediction Innovation Center (EPIC) partners, we are now making production runs in support of the Seasonal Forecast System (SFS) development effort. These runs are being managed and configured using the same global-workflow system as our NOAA colleagues, eliminating one of the longest-standing pain points in the Research to Operations (R2O) process. Here we describe the steps and resources we are using to perform novel numerical experiments using the UFS. It is our hope that the community will find this useful in setting up their own experiments with the UFS, particularly if they have found getting started challenging in the past.
Obtain Code and Model Files: Until relatively recently, one of the greatest challenges in running a NOAA model was simply getting the code! It used to be that this required a personal connection, a tarball on an FTP site, and a certain amount of luck. Now the code can be obtained from GitHub via a one-line command and a few minutes of waiting. While it is easy to take this for granted, it’s worth emphasizing how thoroughly this problem has been solved.
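As a rough illustration of that one-line command, the sketch below assembles a git clone invocation in Python. The repository URL is our assumption about where the UFS weather model is hosted, and the helper names (clone_command, clone) are hypothetical; the --recursive option is included on the assumption that the model's components are tracked as git submodules. Please verify both against the official documentation before relying on them.

```python
import subprocess

# Assumed repository location for the UFS weather model; verify before use.
UFS_REPO = "https://github.com/ufs-community/ufs-weather-model.git"


def clone_command(repo_url, dest, recursive=True):
    """Return the git argument list that clones repo_url into dest."""
    cmd = ["git", "clone"]
    if recursive:
        # Also fetch git submodules (assumed to hold the model components).
        cmd.append("--recursive")
    cmd += [repo_url, dest]
    return cmd


def clone(repo_url, dest, dry_run=True):
    """Build the clone command and, unless dry_run is set, execute it."""
    cmd = clone_command(repo_url, dest)
    if not dry_run:
        # Requires git on the PATH and network access.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: only print the command that would be executed.
    print(" ".join(clone(UFS_REPO, "ufs-weather-model")))
```

Passing dry_run=False performs the actual clone, which is where the few minutes of waiting come in.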
In addition to the model code, many other files are needed before production can begin (e.g., boundary conditions, grid definitions, and initial conditions). Here again the barrier to access used to be extremely high, with most files available only from a handful of NOAA systems that were inaccessible to the community. Now the majority of these files, including initial conditions and fix files, are readily available for download from the cloud.
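To give a sense of what downloading from the cloud can look like, here is a minimal sketch that assumes the files sit in a publicly readable Amazon S3 bucket and are fetched over plain HTTPS. The bucket name and object key shown are hypothetical placeholders, not real locations, and the helper names (s3_public_url, fetch) are our own.

```python
import urllib.request
from urllib.parse import quote

# Hypothetical placeholders; substitute the real bucket and key you need.
EXAMPLE_BUCKET = "example-ufs-bucket"
EXAMPLE_KEY = "fix/example_grid_file.nc"


def s3_public_url(bucket, key):
    """Return the virtual-hosted-style HTTPS URL for a public S3 object."""
    # quote() percent-encodes unsafe characters but keeps '/' separators.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def fetch(bucket, key, dest, dry_run=True):
    """Build the object URL and, unless dry_run is set, download it to dest."""
    url = s3_public_url(bucket, key)
    if not dry_run:
        # Only works for objects that allow anonymous read access.
        urllib.request.urlretrieve(url, dest)
    return url


if __name__ == "__main__":
    # Dry run: print the URL that would be downloaded.
    print(fetch(EXAMPLE_BUCKET, EXAMPLE_KEY, "example_grid_file.nc"))
```

For large sets of initial conditions, a dedicated transfer tool is likely a better fit than one request per file; the sketch is only meant to show that no special credentials or NOAA system access are assumed.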
Build the Code: Once you can access the code, you need to be able to build and run it on resources that are available to you. Building NOAA models outside of NOAA High-Performance Computing (HPC) systems has long been one of the most formidable obstacles to their use, and the UFS has been no exception. The models require an extensive list of specialized libraries that system administrators are unlikely to take on the responsibility of building and maintaining, often leaving researchers with no viable path forward. While there have been efforts to make these libraries easier to install via the spack-stack project, installation can still be quite challenging, even for experts.
In recognition of this issue, we have collaborated with EPIC to make the spack-stack libraries available via containers. In our experience, this has been a game changer for the portability of the UFS, making the entirety of the required library stack easily and quickly available on any system that supports containers. For our group this has primarily been the Frontera system at the Texas Advanced Computing Center, although we have also run on our local cluster at GMU. Nearly identical procedures can now be used to build the model across a variety of platforms, and we have literally shipped executables built on Frontera to our local cluster and other systems for testing.
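The general pattern of running a build step inside a container can be sketched as follows. This assumes an Apptainer-style runtime, which is our assumption rather than a statement of the exact setup used; the image file name, bind path, and helper name (container_command) are hypothetical.

```python
import shlex
import subprocess


def container_command(image, command, binds=(), runtime="apptainer"):
    """Return the argument list that runs command inside a container image."""
    cmd = [runtime, "exec"]
    for path in binds:
        # Make a host directory (e.g. scratch space) visible in the container.
        cmd += ["--bind", path]
    cmd.append(image)
    # Split the command string the way a POSIX shell would.
    cmd += shlex.split(command)
    return cmd


if __name__ == "__main__":
    # Hypothetical image name and bind path, for illustration only.
    cmd = container_command("ufs-stack.sif", "cmake --version", binds=["/scratch"])
    print(" ".join(cmd))
    # Uncomment to execute (requires the runtime and the image to exist):
    # subprocess.run(cmd, check=True)
```

Because the library stack travels with the image, the same argument list should work unchanged on any machine that provides the runtime, which is what makes nearly identical build procedures across platforms possible.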
Configure and Run: Once you have all of the input files and have built the executable, the experiment still needs to be configured and run, typically managed via a workflow system. Workflow has been, and to a certain extent remains, the least Unified element of the UFS. Researchers, including our group, have often found themselves writing their own workflow solutions after attempts to port the official NOAA solutions failed. This leads to significant difficulties in comparing results with colleagues and in the transition to Operations.
As part of our ongoing collaboration with EPIC, and with particular thanks to Mark Potts, it is now possible for the Environmental Modeling Center (EMC) global-workflow to be built and run via containers as well. While some manual updates are still required for specific machines, the difficulty of this step has been vastly reduced. In our experience the remaining challenges are significantly outweighed by the benefits of using the official workflow, both in terms of debugging and in comparing experiments with NOAA colleagues.
Summary: If you have not tried running the UFS on your own recently, you may be in for a pleasant surprise! From GitHub to AWS to containers, building and running NOAA models has come a long way from the tarballs and custom hardware of yesteryear.