Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HU, HT, KMU to get grid #54

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 51 additions & 0 deletions pop_tools/grid.py
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,9 @@ def get_grid(grid_name, scrip=False):
assert kmt_flat.max() <= len(z_t), 'Max KMT > length z_t'
KMT = kmt_flat.reshape(grid_attrs['lateral_dims']).astype(np.int32)

# derive KMU
KMU = _generate_KMU(KMT)

# read REGION_MASK
region_mask_fname = INPUTDATA.fetch(grid_attrs['region_mask_fname'], downloader=downloader)
region_mask_flat = np.fromfile(region_mask_fname, dtype='>i4', count=-1)
Expand All @@ -140,6 +143,15 @@ def get_grid(grid_name, scrip=False):
), f'unexpected dims in region_mask file: {grid_attrs["region_mask_fname"]}'
REGION_MASK = region_mask_flat.reshape(grid_attrs['lateral_dims']).astype(np.int32)

# derive depth of columns of ocean T-points and U-points
KMT_reidx = KMT - 1
KMT_reidx[KMT_reidx == -1] = 0
Comment on lines +148 to +149
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deals with the issues @klindsay28 mentioned about IndexErrors. Since the index from KMT is not zero-based.

Question: Would one anticipate the HT/HU output for land to be np.nan or 0? Currently it's 0.

HT = z_w[KMT_reidx]

KMU_reidx = KMU - 1
KMU_reidx[KMU_reidx == -1] = 0
HU = z_w[KMU_reidx]

# output dataset
dso = xr.Dataset()
if scrip:
Expand Down Expand Up @@ -267,6 +279,35 @@ def get_grid(grid_name, scrip=False):
},
)

dso['KMU'] = xr.DataArray(
KMU,
dims=('nlat', 'nlon'),
attrs={
'long_name': 'k Index of Deepest Grid Cell on U Grid',
'coordinates': 'ULONG ULAT',
},
)

dso['HT'] = xr.DataArray(
HT,
dims=('nlat', 'nlon'),
attrs={
'units': 'cm',
'long_name': 'depth of ocean column on T grid',
'coordinates': 'TLONG TLAT',
},
)

dso['HU'] = xr.DataArray(
HU,
dims=('nlat', 'nlon'),
attrs={
'units': 'cm',
'long_name': 'depth of ocean column on U grid',
'coordinates': 'ULONG ULAT',
},
)

dso['REGION_MASK'] = xr.DataArray(
REGION_MASK,
dims=('nlat', 'nlon'),
Expand Down Expand Up @@ -320,6 +361,16 @@ def get_grid(grid_name, scrip=False):
return dso


@jit(nopython=True, parallel=True)
def _generate_KMU(KMT):
"""Computes KMU from KMT."""
KMU = np.zeros_like(KMT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about initializing KMU outside of the function so as to avoid having to initialize it for every function call? This change would involve passing KMU as an input, and updating it inside the function. We could then change the function signature to

def _generate_KMU(KMT, KMU)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will only be called once upon calling get_grid() so I figure it's fine to initialize in there. I went ahead and switched it though. I'm returning the modified KMU since that's generally how I write this kind of thing.

But per your pass-by-reference model, do I need to return it at all? Or will it automatically be modified? I'm still blown away by this. I thought modifications inside functions were just to the private variable, not the global variable.

for i in prange(KMT.shape[0]):
for j in prange(KMT.shape[1]):
KMU[i, j] = min(KMT[i, j], KMT[i - 1, j], KMT[i, j - 1], KMT[i - 1, j - 1])
Copy link
Contributor

@andersy005 andersy005 May 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to use np.min() here. Python's builtin min() function doesn't appear on the list of array operations that can be parallelized by numba

See: https://numba.pydata.org/numba-doc/latest/user/parallel.html#supported-operations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking more about this, np.min() is probably not needed since you are operating on scalars in min(KMT[i, j], KMT[i - 1, j], KMT[i, j - 1], KMT[i - 1, j - 1])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So keep it as is? My thinking is the loops are parallelized with prange and numba such that min is just comparing 4 scalars, but in parallel at the (i, j) level.

return KMU


@jit(nopython=True, parallel=True)
def _compute_TLAT_TLONG(ULAT, ULONG, TLAT, TLONG, nlat, nlon):
"""Compute TLAT and TLONG from ULAT, ULONG"""
Expand Down
10 changes: 10 additions & 0 deletions tests/test_grid.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
import os

import pytest
import xarray as xr

import pop_tools
Expand All @@ -26,3 +27,12 @@ def test_get_grid_scrip():
ds_test = pop_tools.get_grid('POP_gx3v7', scrip=True)
ds_ref = xr.open_dataset(DATASETS.fetch('POP_gx3v7.nc'))
assert ds_compare(ds_test, ds_ref, assertion='allclose', rtol=1e-14, atol=1e-14)


@pytest.mark.parametrize('grid', pop_tools.grid_defs.keys())
def test_HT_HU_KMU_in_grid(grid):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably unnecessary, but I like to add comprehensive testing for any PR. I know this is at odds with the fact that other variables aren't checked. Let me know if we should retain this.

print(grid)
ds = pop_tools.get_grid(grid)
assert 'HT' in ds, 'Missing variable HT'
assert 'HU' in ds, 'Missing variable HU'
assert 'KMU' in ds, 'Missing variable KMU'