Handle Missing Radius of Last Closed Isobar in SCHISM-PaHM #29

Closed
FariborzDaneshvar-NOAA opened this issue Aug 10, 2023 · 25 comments

@FariborzDaneshvar-NOAA

FariborzDaneshvar-NOAA commented Aug 10, 2023

Use /lustre/scripts/schism.sbatch to run SCHISM for the non-perturbed (original) faked BEST track that was generated by the ondemand-storm-workflow

Directory: /lustre/hurricanes/florence_2018_Fariborz_OFCL_10_v2/setup/ensemble.dir/runs/original/

@FariborzDaneshvar-NOAA
Author

@SorooshMani-NOAA It failed with this message after the model parameters in the slurm-*.out file:

---------- MODEL PARAMETERS ----------
 

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 36 PID 11838 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 37 PID 11839 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 38 PID 11840 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 39 PID 11841 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 40 PID 11842 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 41 PID 11843 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 42 PID 11844 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 43 PID 11845 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 44 PID 11846 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 45 PID 11847 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 46 PID 11848 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 47 PID 11850 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 48 PID 11851 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 49 PID 11853 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 50 PID 11854 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 51 PID 11856 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 52 PID 11858 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 53 PID 11859 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 54 PID 11860 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 55 PID 11861 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 56 PID 11862 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 57 PID 11863 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 58 PID 11864 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 59 PID 11865 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 60 PID 11866 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 61 PID 11867 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 62 PID 11868 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 63 PID 11869 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 64 PID 11870 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 65 PID 11871 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 66 PID 11872 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 67 PID 11873 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 68 PID 11874 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 69 PID 11875 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 70 PID 11876 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 71 PID 11877 RUNNING AT sorooshmani-nhccolab2-00004-1-0002
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

@SorooshMani-NOAA
Contributor

Let me try to run it on my side and see what I can find out.

@SorooshMani-NOAA
Contributor

I tried running SCHISM using the same binaries as @FariborzDaneshvar-NOAA and the same setup (symlinked all files), and it went through successfully. Maybe the failure had something to do with a cloud instance failure on the platform. In any case, I have the output and I still get zero winds:

>>> import xarray as xr
>>> ds = xr.open_dataset('outputs/out2d_1.nc')
>>> ds.windSpeedX.max()
<xarray.DataArray 'windSpeedX' ()>
array(0., dtype=float32)
>>> ds.windSpeedY.max()
<xarray.DataArray 'windSpeedY' ()>
array(0., dtype=float32)

@pvelissariou1 we will provide the Sandy track (generated using the same method) so that you can test. In the meantime, do you have any suggestions on how we can debug this? I used to compile and run SCHISM on PW without issues using this script, and the binaries I ran in this case were generated using the same old script (no container, etc.)

Do you suggest that I debug it? If so, what flags should I use to debug and where to put breakpoints? We're compiling using the following modules on PW:

  • cmake
  • intel/2021.3.0
  • impi/2021.3.0
  • hdf5/1.10.6
  • netcdf/4.7.0

and using the commit ddf15649 from SCHISM:


* 784d20a0 Update overview.md
* 55ffa74b Update visualization.md
* 9e67694f Update visualization.md
* ddf15649 Fixed bugs in WWM/vegetation. 
* 3fdeb8fb fixed an issue with PaHM output; tested
* 299309b2 Add PaHM model type from param.nml
* 90a67ff3 Changed nws<0 to nws=-1 (more precise)

@pvelissariou1
Collaborator

@FariborzDaneshvar-NOAA and @SorooshMani-NOAA I agree, the error messages might be related to cloud instance issues as Soroosh replied. The modules:
cmake
intel/2021.3.0
impi/2021.3.0
hdf5/1.10.6
netcdf/4.7.0
should be fine. I suggest using the latest 2022* versions for intel and impi, or even the 2023* versions.
The full log files would be more useful for debugging.

@pvelissariou1
Collaborator

Let me get the same SCHISM commit and check it out. @SorooshMani-NOAA debugging a SCHISM run that completes without errors will be a trial-and-error process; let me check the particular commit you are using and I'll let you know what I find.

@pvelissariou1
Collaborator

@FariborzDaneshvar-NOAA @SorooshMani-NOAA I guess you use nws=-1 in the namelist file and you define USE_PAHM=ON when compiling SCHISM.

@SorooshMani-NOAA
Contributor

Yes, we compiled with USE_PAHM=TRUE and in param.nml we have nws=-1

@pvelissariou1
Collaborator

I checked the PaHM-related code in SCHISM in detail in (a) CoastalApp/SCHISM/schim (commit bb616ded), (b) schism-dev/schism (commit ddf15649) and (c) schism-dev/schism master (commit 9517c51e), and found no major differences in the PaHM implementation. So, in principle, the "fake" track files should work as they do with PaHM and with PaHM+ADCIRC in CoastalApp. I am setting up explicit tests in CoastalApp-testsuite using (a) SCHISM standalone (PaHM activated) and (b) PaHM+SCHISM coupled to check things out.

@FariborzDaneshvar-NOAA
Author

Thanks for the update!

@pvelissariou1
Collaborator

@FariborzDaneshvar-NOAA , @SorooshMani-NOAA , @saeed-moghimi-noaa

Background information:

I ran SCHISM (standalone) using commit ddf1564 from https://github.com/schism-dev/schism for both the BEST and OFCL tracks for hurricane Sandy and the Shinnecock Inlet case.
During the SCHISM compilation I used the following flags:
-DOLDIO=ON -DUSE_WW3=ON -DUSE_PAHM=ON -DPREC_EVAP=OFF
For the simulation, I set nws=-1 in the param.nml file.
I used the tracks for Sandy as supplied from Fariborz:

Sandy_hurricane-track_BEST.dat
Sandy_hurricane-track_OFCL.dat
Sandy_original_BEST.22
Sandy_original_OFCL.22

The simulation results for the shinnecock.sch case are on Hera, in the folders matching:

/scratch2/STI/coastal/noscrub/shared/Takis/check_OFCLs/CoastalApp-testsuite/test_schism/ike_shinnecock.sch/run/outputs-CoastalApp-schism-ddf15649_Sandy_*model10

The data are written in the schout_000001_1.nc file contained in each folder.

The BEST simulation results contain non-zero wind speeds, as expected. The OFCL simulation results contain only zero wind speeds, as we have already discussed.

IMPORTANT: Keep in mind that if the storm path (the eye in particular) is not near or inside the computational domain, PaHM produces zero winds (as expected). Zero winds are also produced if there are no data in the track file.

Issue resolution:

In SCHISM, the GAHM model has been modified slightly to use the radius of the last closed isobar (RRP) to reduce the amount of calculations in the domain (similar to the Holland model) by eliminating the nodal points outside RRP (I disagree with this approach but this is for future discussion with the SCHISM developers).

In our case, all the RRPs in the OFCL track files are set to zero, hence the problem with the OFCL files. My suggestion for a temporary workaround is to replace RRP with max(radius1, radius2, radius3, radius4) of the 34-kt isotach. We might also modify SCHISM to use RRP (if found), otherwise the max R34 found above, otherwise a default value.
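The temporary workaround above (fill a zero RRP from the largest 34-kt quadrant radius, otherwise fall back to a default) could be applied to the track records before they are written out. The following is only a minimal sketch: the function name `fill_missing_rrp`, the dict keys and the `default_nm` parameter are hypothetical and are not taken from stormevents, PaHM or SCHISM.

```python
from collections import defaultdict


def fill_missing_rrp(records, default_nm=None):
    """Return copies of the track records with zero/missing RRP filled in.

    Each record is a dict with hypothetical keys (assumed for this sketch):
      "time"    - valid time of the row (any hashable value)
      "isotach" - wind threshold of the row in knots (34, 50 or 64)
      "radii"   - the four quadrant wind radii in nautical miles
      "rrp"     - radius of the last closed isobar in nautical miles
    """
    # Largest 34-kt quadrant radius available at each valid time.
    max_r34 = defaultdict(float)
    for rec in records:
        if rec["isotach"] == 34:
            candidates = [max_r34[rec["time"]]] + [r or 0 for r in rec["radii"]]
            max_r34[rec["time"]] = max(candidates)

    filled = []
    for rec in records:
        rec = dict(rec)  # do not mutate the caller's data
        if not rec.get("rrp"):  # 0 and None both count as missing
            fallback = max_r34.get(rec["time"], 0.0)
            # Use max R34 when it is positive, otherwise the default value.
            rec["rrp"] = fallback if fallback > 0 else default_nm
        filled.append(rec)
    return filled
```

Rows that already carry a positive RRP are left unchanged, so BEST tracks would pass through as they are. If neither RRP nor R34 is available and no default is given, the field ends up as `None`, which the caller has to handle.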

@SorooshMani-NOAA
Contributor

Hi @pvelissariou1 thanks for testing. I can modify my script and later the stormevents code to update the RRP field for now.

@pvelissariou1
Collaborator

pvelissariou1 commented Aug 14, 2023 via email

@SorooshMani-NOAA
Contributor

As we discussed, in the short term we will use a fork of SCHISM with updated PaHM code that ignores RRP, and later we'll decide how to address this during normalization in stormevents as well: oceanmodeling/StormEvents#84 (comment)

I'll also rename this ticket to reflect the main issue we're discussing here

@SorooshMani-NOAA SorooshMani-NOAA changed the title from "Run SCHISM for hurricane florence" to "Handle Missing Radius of Last Closed Isobar in SCHISM-PaHM" on Aug 14, 2023
@FariborzDaneshvar-NOAA
Author

@SorooshMani-NOAA thanks for updating the image. Looks like it worked, and the runs for OFCL tracks now simulate the storm as well.
Here are maximum horizontal wind speed plots of both OFCL and BEST tracks of florence 2018.

[image: maximum horizontal wind speed, florence 2018, OFCL and BEST tracks]

Here is also the maximum horizontal wind speed for the OFCL track of sandy 2012.
[image: maximum horizontal wind speed, sandy 2012, OFCL track]

The directories of the new runs on the NHC_COLAB_2 cluster are:

  • florence, BEST track: /lustre/hurricanes/florence_2018_Fariborz_BEST_v4_5//setup/ensemble.dir/runs/original/outputs_with_wind/
  • florence, OFCL track: /lustre/hurricanes/florence_2018_Fariborz_OFCL_v3_5//setup/ensemble.dir/runs/original/outputs_with_wind/
  • sandy, OFCL track: /lustre/hurricanes/sandy_2012_Fariborz_OFCL_v2_5//setup/ensemble.dir/runs/original/outputs_with_wind/

With this fix, the linked issue posted on the ondemand-storm-workflow repository will be resolved!

@pvelissariou1
Collaborator

@SorooshMani-NOAA , @FariborzDaneshvar-NOAA , @saeed-moghimi-noaa I came up with a solution that seems to work pretty well. This solution will be implemented in PaHM and in SCHISM/PaHM, and most likely I'll push it to ADCIRC as well. @SorooshMani-NOAA , you might want to implement this solution on your side as well. See the image below:

[image: outer_radius1]

@SorooshMani-NOAA
Contributor

@pvelissariou1, in a separate ticket related to NHC collaboration I brought up:

In PaHM Takis uses the RRP field (radius of the last closed isobar) to set all the wind field values to zero outside the contour. This logic breaks down for forecast where there's no such data available for some/all entries. Is it OK to instead use the 34 knot wind radius to set wind field to zero instead in these case?

as you've asked me to. @WPringle suggested:

@SorooshMani-NOAA It may not be necessary to set to zero anywhere. Just keep it as is as it is reducing exponentially.

I wanted to follow up with you to see what you think.

@pvelissariou1
Collaborator

@FariborzDaneshvar-NOAA @SorooshMani-NOAA Let's keep the setup we have right now in SCHISM/PaHM for RRP (GAHM model), where the particular piece of code is commented out and RRP is not used. Very soon I will implement the solution we have for RRP. The SCHISM developers added the RRP code in GAHM to reduce the computational load by excluding the nodal locations where the winds are zero or very close to zero. As @WPringle pointed out, the fields eventually decay to zero at locations outside the RRP, although they have no physical meaning there. The SCHISM developers would still like to keep the RRP code for the reason described above.

@FariborzDaneshvar-NOAA
Author

@pvelissariou1 @SorooshMani-NOAA should we close this ticket?

@pvelissariou1
Collaborator

@FariborzDaneshvar-NOAA @SorooshMani-NOAA Please, let's close it at the end of next week.

@FariborzDaneshvar-NOAA
Author

@pvelissariou1 can you please update this ticket and let me know if I can close it? thanks

@pvelissariou1
Collaborator

After the coupled simulations with PaHM are complete and I have evaluated the PaHM results, I'll push the updates upstream to PaHM and to SCHISM. There is nothing else to add at this moment. This ticket needs to be updated when the changes to SCHISM/PaHM have been accepted, so let's keep it open for 2-3 weeks.

@pvelissariou1
Collaborator

@FariborzDaneshvar-NOAA , @SorooshMani-NOAA I have updated PaHM to include the RRP resolution for both the Holland and GAHM models. On some rare occasions all R34 and RRP radii are missing from the track file, and in these cases PaHM reverts to using all the nodal points when performing its interpolations. Later today (01/08/2024), I will push the PaHM updates to SCHISM; I'll let you know. Before I submit a PR, please consider testing the changes to SCHISM by cloning the "cmmb" branch.
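The fallback to all nodal points described above can be illustrated roughly as follows. This is a Python sketch, not the PaHM Fortran code: the function names, the (lon, lat) node layout, the great-circle distance in nautical miles and the Earth radius constant are all assumptions made for the example, and the sketch does not try to reproduce how PaHM derives the outer radius.

```python
import math

EARTH_RADIUS_NM = 3440.065  # assumed mean Earth radius in nautical miles


def great_circle_nm(lon1, lat1, lon2, lat2):
    """Haversine distance between two points in nautical miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def active_nodes(nodes, eye, outer_radius_nm):
    """Indices of the nodes at which the wind field would be evaluated.

    nodes           - list of (lon, lat) tuples in degrees
    eye             - (lon, lat) of the storm center in degrees
    outer_radius_nm - RRP or its substitute; None or 0 means "not available"
    """
    if not outer_radius_nm:
        # No usable radius in the track file: fall back to every node.
        return list(range(len(nodes)))
    return [
        i for i, (lon, lat) in enumerate(nodes)
        if great_circle_nm(lon, lat, eye[0], eye[1]) <= outer_radius_nm
    ]
```

With a valid radius only the nodes near the eye are kept, which is where the computational saving comes from. Without one, nothing is excluded and the wind profile simply decays with distance from the eye.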

@pvelissariou1
Collaborator

... that is: git clone https://github.com/schism-dev/schism.git -b cmmb. The cmmb branch has been merged with "master" so it should be the latest SCHISM commit.

@janahaddad
Collaborator

@pvelissariou1 @FariborzDaneshvar-NOAA seems like we can close this?

@pvelissariou1
Collaborator

Yes, it is done on my side, and Fariborz and Soroosh have tested the PaHM updates. Will reopen if needed.

4 participants