Issue #49 - Add METplus instructions to the Jetstream2 tutorial #56

Merged Sep 9, 2024 (39 commits)

Commits
e340cb7
Roll back change to usecases.rst
Trumbore May 20, 2024
b1ccada
Draft of the Jetstream2-Matthew tutorial instructions
Trumbore May 20, 2024
ab76b18
Tests to see what Markdown format I should really use
Trumbore May 20, 2024
7b63228
Crazy-ass markdown language
Trumbore May 20, 2024
af0c1ff
Sigh
Trumbore May 20, 2024
b27166c
Switch the incorrect markdown to the correct, yet horrible, version
Trumbore May 20, 2024
9108559
Edit text, fix typos and links
Trumbore May 21, 2024
aa27524
Make sure all commands work properly, add more info.
Trumbore May 22, 2024
01b5d1a
A few more tweaks before others view it
Trumbore May 22, 2024
3d47fc2
Revisions based on feedback and some procedural changes
Trumbore May 28, 2024
099d462
Final edits before initial publication
Trumbore May 29, 2024
672f400
merge main into jetsream and resolved conflicts
georgemccabe May 29, 2024
703af04
fixed URL links, fixed header underline lengths, added link to matthe…
georgemccabe May 29, 2024
bdcc262
added ID so heading can be linked using :ref:
georgemccabe May 29, 2024
e194514
use double underscores to create anonymous reference for links that h…
georgemccabe May 29, 2024
54c3b29
updated versions of actions to prevent node.js deprecated warnings
georgemccabe May 29, 2024
26ba3de
added orphan identifier to prevent warning that this page is not incl…
georgemccabe May 29, 2024
1bf8a00
fixed typos
georgemccabe May 29, 2024
bc4e97d
Refactor the existing doc in preparation for adding METPlus doc
Trumbore Jun 26, 2024
14b22a4
Add initial version of METPlus instructions
Trumbore Jun 26, 2024
36f08fe
Tweaks to documentation after full testing
Trumbore Jun 26, 2024
3acf669
Get METPlus working
Trumbore Jun 27, 2024
c409ad6
Edits from final testing pass on new and revised content
Trumbore Jun 27, 2024
88504c0
Final tweaks before creating a pull request
Trumbore Jun 27, 2024
993df6d
Add text about viewing the output of METPlus
Trumbore Jun 28, 2024
3a507e4
Merge branch 'main' into jetstream-metplus
Trumbore Jun 28, 2024
62c6966
changed METPlus to METplus
georgemccabe Jun 28, 2024
1c3a23c
split long docker run commands into multiple lines for better readabi…
georgemccabe Jun 28, 2024
7aa55a2
ignore auto-generated file
georgemccabe Jun 28, 2024
601d40a
use env var to reference obs data volume to more easily adapt to othe…
georgemccabe Jun 28, 2024
76dbd96
rewording and avoid using analysis to describe the METplus verificati…
georgemccabe Jun 28, 2024
e606907
A few formatting tweaks after the code review
Trumbore Jul 1, 2024
b441065
ignore auto-generated file
georgemccabe Jul 1, 2024
5264e7b
Refactor instructions in preparation for edits related to changes in …
Trumbore Jul 17, 2024
e416156
Add a run script and config file and update the existing config file …
Trumbore Jul 18, 2024
b23b297
change run.sh permissions
Trumbore Jul 18, 2024
62121c3
Remove data downloading from run.sh
Trumbore Jul 18, 2024
6e894e8
Finalize edits of the tutorial to use config files from new location …
Trumbore Jul 18, 2024
064179f
Merge branch 'main' into jetstream-metplus
Trumbore Jul 18, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
*~
.vs
/.DS_Store
Binary file added docs/.DS_Store
Binary file not shown.
2 changes: 1 addition & 1 deletion docs/Users_Guide/configuration.rst
@@ -64,7 +64,7 @@ Use this for the vars_io.txt file for the Hurricane Matthew case, from the Githu
https://github.com/NCAR/i-wrf/blob/main/use_cases/Hurricane_Matthew/WRF/vars_io.txt

^^^^^^^^^^^^^^^^^^^
METPlus Config File
METplus Config File
^^^^^^^^^^^^^^^^^^^
For the METplus configuration file for the Hurricane Matthew case, please use this file on the Github repository:

166 changes: 125 additions & 41 deletions docs/Users_Guide/matthewjetstream.rst
@@ -8,21 +8,24 @@ Running I-WRF On Jetstream2 with Hurricane Matthew Data
Overview
========

The following instructions can be used to run
the `I-WRF weather simulation program <https://i-wrf.org>`_
The following instructions can be used to run elements of
the `I-WRF weather simulation framework <https://i-wrf.org>`_
from the `National Center for Atmospheric Research (NCAR) <https://ncar.ucar.edu/>`_
and the `Cornell Center for Advanced Computing <https://cac.cornell.edu/>`_.
The steps below run the `Weather Research & Forecasting (WRF) <https://www.mmm.ucar.edu/models/wrf>`_ model
and the `METplus <https://dtcenter.org/community-code/metplus>`_ verification framework
with data from `Hurricane Matthew <https://en.wikipedia.org/wiki/Hurricane_Matthew>`_
on the `Jetstream2 cloud computing platform <https://jetstream-cloud.org/>`_.
This exercise provides an introduction to using cloud computing platforms,
running computationally complex simulations and using containerized applications.
running computationally complex simulations and analyses, and using containerized applications.

Simulations like I-WRF often require greater computing resources
Simulations like WRF often require greater computing resources
than you may have on your personal computer,
but a cloud computing platform can provide the needed computational power.
Jetstream2 is a national cyberinfrastructure resource that is easy to use
and is available to researchers and educators.
This exercise runs the I-WRF program as a Docker "container",
which simplifies the set-up work needed to run the simulation.
This exercise runs the I-WRF programs as Docker "containers",
which simplifies the set-up work needed to run the simulation and verification.

It is recommended that you follow the instructions in each section in the order presented
to avoid encountering issues during the process.
@@ -76,7 +79,7 @@ Create a Cloud Instance and Log In
==================================

After you have logged in to Jetstream2 and added your allocation to your account,
you are ready to create the cloud instance where you will run the I-WRF simulation.
you are ready to create the cloud instance where you will run the simulation and verification.
If you are not familiar with the cloud computing terms "image" and "instance",
it is recommended that you `read about them <https://cvw.cac.cornell.edu/jetstream/intro/imagesandinstances>`__
before proceeding.
@@ -123,10 +126,10 @@ In either case you will need to know the location and name of the private SSH ke
the IP address of your instance (found in the Exosphere web dashboard)
and the default username on your instance, which is "exouser".

Once you are logged in to the web shell you can proceed to the
Once you are logged in to the instance you can proceed to the
"Install Software and Download Data" section below.
You will know that your login has been successful when the prompt has the form ``exouser@instance-name:~$``,
which indicates your username, the instance name, and your current working directory, followed by "$"
which indicates your username, the instance name, and your current working directory, followed by "$".

Managing a Jetstream2 Instance
------------------------------
@@ -153,31 +156,47 @@ so Shelving as soon as you are done becomes even more important!
Install Software and Download Data
==================================

With your instance created and running and you logged in to it through a Web Shell,
you can now install the necessary software and download the data to run the simulation.
With your instance created and running, and having logged in to it through SSH,
you can now install the necessary software and download the data to run the simulation and verification.
You will only need to perform these steps once,
as they essentially change the contents of the instance's disk
and those changes will remain even after the instance is shelved and unshelved.

The following sections instruct you to issue numerous Linux commands in your web shell.
The following sections instruct you to issue numerous Linux commands in your shell.
If you are not familiar with Linux, you may want to refer to
`An Introduction to Linux <https://cvw.cac.cornell.edu/Linux>`_ when working through these steps.
The commands in each section can be copied using the button in the upper right corner
and then pasted into your web shell by right-clicking.
and then pasted into your shell by right-clicking.

If your web shell ever becomes unresponsive or disconnected from the instance,
If your shell ever becomes unresponsive or disconnected from the instance,
you can recover from that situation by rebooting the instance.
In the Exosphere dashboard page for your instance, in the Actions menu, select "Reboot".
The process takes several minutes, after which the instance status will return to "Ready".

Install Docker and Get the I-WRF Image
--------------------------------------
We will be using some environment variables throughout this exercise to
make sure that we use the same resource names and file paths wherever they are used.
Copy and paste the definitions below into your shell to define the variables before proceeding::

WRF_IMAGE=ncar/iwrf:latest
METPLUS_IMAGE=dtcenter/metplus-dev:develop
WORKING_DIR=/home/exouser
WRF_DIR=${WORKING_DIR}/wrf/20161006_00
METPLUS_DIR=${WORKING_DIR}/metplus
WRF_CONFIG_DIR=${WORKING_DIR}/i-wrf/use_cases/Hurricane_Matthew/WRF
METPLUS_CONFIG_DIR=${WORKING_DIR}/i-wrf/use_cases/Hurricane_Matthew/METplus
OBS_DATA_VOL=data-matthew-input-obs

Any time you open a new shell on your instance, you will need to perform this action
to redefine the variables before executing the commands that follow.
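
As an optional convenience (not part of the original instructions), the definitions can be stored in a small file and "sourced" in each new shell instead of being re-pasted; the file name ``~/iwrf_env.sh`` below is arbitrary:

```shell
# Optional convenience: save the tutorial's variable definitions in a file,
# then "source" it in each new shell instead of re-pasting them.
# The file name iwrf_env.sh is illustrative, not part of the tutorial.
cat > ~/iwrf_env.sh <<'EOF'
export WRF_IMAGE=ncar/iwrf:latest
export METPLUS_IMAGE=dtcenter/metplus-dev:develop
export WORKING_DIR=/home/exouser
export WRF_DIR=${WORKING_DIR}/wrf/20161006_00
export METPLUS_DIR=${WORKING_DIR}/metplus
export WRF_CONFIG_DIR=${WORKING_DIR}/i-wrf/use_cases/Hurricane_Matthew/WRF
export METPLUS_CONFIG_DIR=${WORKING_DIR}/i-wrf/use_cases/Hurricane_Matthew/METplus
export OBS_DATA_VOL=data-matthew-input-obs
EOF

# In any later shell session:
source ~/iwrf_env.sh
echo "WRF run folder: ${WRF_DIR}"
```

Because the variables are exported in dependency order, ``${WORKING_DIR}`` is already defined when the later lines expand it at source time.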

Install Docker
--------------

As mentioned above, the I-WRF simulation application is provided as a Docker image that will run as a
As mentioned above, the WRF and METplus software are provided as Docker images that will run as a
`"container" <https://docs.docker.com/guides/docker-concepts/the-basics/what-is-a-container/>`_
on your cloud instance.
To run a Docker container, you must first install the Docker Engine on your instance.
You can then "pull" (download) the I-WRF image that will be run as a container.
You can then "pull" (download) the WRF and METplus images that will be run as containers.

The `instructions for installing Docker Engine on Ubuntu <https://docs.docker.com/engine/install/ubuntu/>`_
are very thorough and make a good reference, but we only need to perform a subset of those steps.
@@ -192,68 +211,104 @@ When the installation is complete, you can verify that the Docker command line t

docker --version

Next, you must start the Docker daemon, which runs in the background and processes commands::
The Docker daemon should start automatically, but it sometimes runs into issues.
First, check to see if the daemon started successfully::

sudo service docker start
sudo systemctl --no-pager status docker

If that command appeared to succeed, you can confirm its status with this command::
If you see a message saying the daemon failed to start because a "Start request repeated too quickly",
wait a few minutes and issue this command to try again to start it::

sudo systemctl --no-pager status docker
sudo systemctl start docker

If the command seems to succeed, confirm that the daemon is running using the status command above.
Repeat these efforts as necessary until it is started.
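
The check-and-retry procedure above can be sketched as a small helper that polls a command until it succeeds. The ``wait_for`` function name and its defaults are illustrative, not part of the tutorial:

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for "<command>" [attempts] [delay_seconds]
wait_for() {
    cmd="$1"
    attempts="${2:-10}"
    delay="${3:-30}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the command quietly; a zero exit status means success.
        if $cmd > /dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Example: wait until the Docker daemon answers (may take a few minutes):
#   wait_for "sudo docker info" 10 30 && echo "Docker daemon is running"
```

This is just a convenience; issuing the status and start commands by hand, as described above, works equally well.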

Once all of that is in order, you must pull the latest version of the I-WRF image onto your instance::
Get the WRF and METplus Docker Images and the Observed Weather Data
-------------------------------------------------------------------

docker pull ncar/iwrf
Once Docker is running, you must pull the correct versions of the WRF and METplus images onto your instance::

docker pull ${WRF_IMAGE}
docker pull ${METPLUS_IMAGE}

METplus is run to perform verification of the results of the WRF simulation using
observations gathered during Hurricane Matthew.
We download that data by pulling a Docker image that holds it,
creating a container from that image,
and then referencing that container's data when we run the METplus Docker container.
The commands to pull the image and create the container are::

docker pull ncar/iwrf:${OBS_DATA_VOL}.docker
docker create --name ${OBS_DATA_VOL} ncar/iwrf:${OBS_DATA_VOL}.docker

Get the Geographic Data
-----------------------

To run I-WRF on the Hurricane Matthew data set, you need a copy of the
To run WRF on the Hurricane Matthew data set, you need a copy of the
geographic data representing the terrain in the area of the simulation.
These commands download an archive file containing that data,
uncompress the archive into a folder named "WPS_GEOG", and delete the archive file.
uncompress the archive into a folder named "WPS_GEOG" in your home directory, and delete the archive file.
They take several minutes to complete::

wget https://www2.mmm.ucar.edu/wrf/src/wps_files/geog_high_res_mandatory.tar.gz
tar -xzf geog_high_res_mandatory.tar.gz
rm geog_high_res_mandatory.tar.gz
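
The download-extract-delete pattern above can also be wrapped in a small function if you expect to fetch other archives the same way; the ``fetch_and_unpack`` name is illustrative, not part of the tutorial:

```shell
# Sketch of the download/extract/cleanup pattern used above.
# fetch_and_unpack is an illustrative name, not an official tool.
fetch_and_unpack() {
    url="$1"
    archive=$(basename "$url")
    # Download (following redirects), extract, then remove the archive.
    curl --location --output "$archive" "$url" || return 1
    tar -xzf "$archive" || return 1
    rm "$archive"
}

# Usage (same effect as the three commands above):
#   fetch_and_unpack https://www2.mmm.ucar.edu/wrf/src/wps_files/geog_high_res_mandatory.tar.gz
```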

Create the Run Folder
---------------------
Create the WRF and METplus Run Folders
--------------------------------------

The simulation is performed using a script that must first be downloaded.
The script expects to run in a folder where it can download data files and create result files.
The instructions in this exercise create that folder in the user's home directory and name it "matthew".
The instructions in this exercise create a folder (named "wrf") under the user's home directory,
and a sub-folder within "wrf" to hold the output of this simulation.
The subfolder is named "20161006_00", which is the beginning date and time of the simulation.
The simulation script is called "run.sh".
The following commands create the empty folder and download the script into it,
then change its permissions so it can be run::
Similarly, a run folder named "metplus" must be created for the METplus process to use.
The following commands create the empty folders and download the script
and change its permissions so it can be run::

mkdir -p ${WRF_DIR}
curl --location https://bit.ly/3xzm9z6 > ${WRF_DIR}/run.sh
Review comment (Contributor): I recommend committing this run script to the i-wrf GitHub repository so it could be copied from there instead of downloading it. This would allow us to make any necessary updates to the script more easily and keep everything that is needed contained in the repo.

Review comment (Contributor): Another benefit of committing it to GitHub is you can set the file permissions of the file in GitHub so you wouldn't need to open execute permissions by hand.

Review comment (Collaborator, Author): I like this idea. I hadn't been familiar with the contents of that repo, but see the structure now, and since it is already being pulled in this document, it would simplify things a lot. What do you think about also including the script to install Docker in that repo? Both would need to have names that accurately and fully define their scope.

Review comment (Contributor): I hesitate to put a script in the GitHub repo that is very specific to a single environment. This script wouldn't be used by users running on another machine besides Jetstream2, so I think it would be misleading to include it in the repo. We could poll the rest of the team to see what they think.

It looks like the script is relatively simple, so the commands could be included in the instructions instead of downloading and running a script from a short URL. This may appear more transparent to users.

I am going to be writing the "Getting Started" chapter of the I-WRF User's Guide. I was planning on including basic information on how to obtain the tools needed to run, e.g. Docker or Apptainer. I was going to include information relating to specific environments that we want to support, e.g. run module load apptainer on Casper/Derecho to load Apptainer. Your instructions to install Docker on Jetstream2 could be included in the "Getting Started" chapter since they are not specific to the Hurricane Matthew use case, and a link from your instructions could point to the Getting Started section relating to Jetstream2. What do you think? If this sounds good, I can migrate your installation instructions to the Getting Started page when I create it.
chmod 775 ${WRF_DIR}/run.sh
mkdir -p ${METPLUS_DIR}

Download Configuration Files
----------------------------

Both WRF and METplus require some configuration files to direct their behavior,
and those are downloaded from the I-WRF GitHub repository.
Some of those configuration files must also be copied into run folders.
These commands perform the necessary operations::

mkdir matthew
curl --location https://bit.ly/3KoBtRK > matthew/run.sh
chmod 775 matthew/run.sh
git clone https://github.com/NCAR/i-wrf ${WORKING_DIR}/i-wrf
cp ${WRF_CONFIG_DIR}/vars_io.txt ${WRF_DIR}
curl --location https://bit.ly/4eKpb47 > ${WRF_DIR}/namelist.input.template
Review comment (Contributor): Isn't this namelist file in the i-wrf GitHub repository? Should it be obtained from there instead of downloading it?

Review comment (Collaborator, Author): Unfortunately, the file I am using is not quite the same as the one from GitHub. When using the GitHub version, I got parsing errors for the lines in the @time_control section that start with "io_form_". In the GitHub version, those lines all end with "= 2,", but that trailing comma causes problems in the test case, so I had to create a version with it removed.

It's only a guess, but it may be because the run script I got from Bennett replaces some of the earlier @time_control settings with new ones. The GitHub versions of lines like "start_year" have two values in them (i.e. "2016, 2019"). The overwritten versions only have one value (i.e. "2016,"). That change on 11 lines seems to be the only difference.

I don't know how else that namelist file is being used or if removing the trailing commas would break that other usage. Is the second column of date/time values actually needed in another usage? Maybe we should plan to have more than one of these configuration files so that demos with different needs can each store their files in GitHub.

Actually, a fair bit of the run script is spent tweaking the namelist file contents, and it might be better to just save the tweaked version in GitHub and simplify the run script. I think that we should probably discuss these questions at a meeting before I make these changes.

Review comment (Contributor): We could also discuss this with the group, but in my opinion we should include the namelist file that is used for the Hurricane Matthew use case in the GitHub repo and skip the logic in the run script that replaces values in it. Not only are there errors resulting from this, but it hides the details from the user, who may review the namelist file we provide and become confused as to why the actual namelist file used to run WRF differs. We should provide a correct version of all of the configuration files that do not require any modification to run.


Run I-WRF
=========
Run WRF
=======

With everything in place, you are now ready to run the Docker container that will perform the simulation.
The downloaded script runs inside the container, prints lots of status information,
and creates output files in the run folder you created.
Execute this command to run the simulation in your web shell::
Execute this command to run the simulation in your shell::

time docker run --shm-size 14G -it -v ~/:/home/wrfuser/terrestrial_data -v ~/matthew:/tmp/hurricane_matthew ncar/iwrf:latest /tmp/hurricane_matthew/run.sh
docker run --shm-size 14G -it \
-v ${WORKING_DIR}:/home/wrfuser/terrestrial_data \
-v ${WRF_DIR}:/tmp/hurricane_matthew \
${WRF_IMAGE} /tmp/hurricane_matthew/run.sh

The command has numerous arguments and options, which do the following:

* ``docker run`` creates the container if needed and then runs it.
* ``--shm-size 14G -it`` tells the command how much shared memory to use, and to run interactively in the shell.
* The ``-v`` options map folders in your cloud instance to paths within the container.
* ``${WRF_IMAGE}`` is the Docker image to use when creating the container.
* ``/tmp/hurricane_matthew/run.sh`` is the location within the container of the script that it runs.

The simulation initially prints lots of information while initializing things, then settles in to the computation.
The provided configuration simulates 12 hours of weather and takes under three minutes to finish on an m3.quad Jetstream2 instance.
The provided configuration simulates 48 hours of weather and takes about 12 minutes to finish on an m3.quad Jetstream2 instance.
Once completed, you can view the end of any of the output files to confirm that it succeeded::

tail matthew/rsl.out.0000
tail ${WRF_DIR}/rsl.out.0000

The output should look something like this::

@@ -268,3 +323,32 @@ The output should look something like this::
Timing for Writing wrfout_d01_2016-10-06_12:00:00 for domain 1: 0.32534 elapsed seconds
d01 2016-10-06_12:00:00 wrf: SUCCESS COMPLETE WRF

Run METplus
===========

After the WRF simulation has finished, you can run the METplus verification to compare the simulated results
to the actual weather observations during the hurricane.
The verification takes about five minutes to complete.
We use command line options to tell the METplus container several things, including where the observed data is located,
where the METplus configuration can be found, where the WRF output data is located, and where it should create its output files::

docker run --rm -it \
--volumes-from ${OBS_DATA_VOL} \
-v ${METPLUS_CONFIG_DIR}:/config \
-v ${WORKING_DIR}/wrf:/data/input/wrf \
-v ${METPLUS_DIR}:/data/output ${METPLUS_IMAGE} \
/metplus/METplus/ush/run_metplus.py /config/PointStat_matthew.conf

Progress information is displayed while the verification is performed.
**WARNING** log messages are expected because observations files are not available for every valid time and METplus is
configured to allow some missing inputs. An **ERROR** log message indicates that something went wrong.
METplus first converts the observation data files to a format that the MET tools can read using the MADIS2NC wrapper.
Point-Stat is run to generate statistics comparing METAR observations to surface-level model fields and
RAOB observations to "upper air" fields.
METplus will print its completion status when the processing finishes.

The results of the METplus verification can be found in ``${WORKING_DIR}/metplus/point_stat``.
These files contain tabular output that can be viewed in a text editor. Turn off word wrapping for better viewing.
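
As one way to inspect the output without word wrapping, a small helper can align the whitespace-separated columns of a file; the ``preview_stat`` name and the example file name are illustrative, not part of METplus:

```shell
# Illustrative helper (not an official METplus tool): align the
# whitespace-separated columns of a .stat file and show the first few rows.
preview_stat() {
    file="$1"
    rows="${2:-5}"
    head -n "$rows" "$file" | column -t
}

# Example (file name is hypothetical; use ls to find the real names):
#   ls ${WORKING_DIR}/metplus/point_stat
#   preview_stat ${WORKING_DIR}/metplus/point_stat/point_stat_example.stat 10
```

Alternatively, ``less -S`` opens a file without line wrapping; press "q" to quit.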
Refer to the MET User's Guide for more information about the
`Point-Stat output <https://met.readthedocs.io/en/latest/Users_Guide/point-stat.html#point-stat-output>`_.
In the near future, this exercise will be extended to include instructions to visualize the results.