Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cell Tree, Index translation, and Bilinear Interpolation #78

Merged
merged 12 commits into from
Feb 9, 2016

Conversation

jay-hennen
Copy link
Contributor

The Cell Tree locate_faces() allows the user to pass in an array of points and receive an array of x,y indices corresponding to the cells on the nodes grid that contain these points. This function has the same signature and purpose as the locate_faces() function in pyugrid.

Some important caveats:

  1. This function returns x,y indices for the psi/nodes/corners/interior grid only, the grid bounded by node_lon, node_lat.
  2. Extra memory is required to store a copy of the nodes and store a faces array for use in cell_tree

Index translation allows the user to pass in an array of points, x,y indices, and a translation type, and be returned an array of x,y indices where the points are located in the destination grid. This allows you to translate the node indices provided by the Cell Tree into the correct face indices. There is also a translate_index in the pysgrid.utils that is intended to provide fully generic translations (any->any, rather than only node->any). However it is incomplete at this time.

lons = sgrid.center_lon
lats = sgrid.center_lat
translation = 'psi2rho' #These names are not final
face_ind = sgrid.translate_index(points, ind, lons, lats, translation)

The bilinear interpolation takes advantage of both of the above to provide a single function to interpolate a variable on any of the 4 grids to an array of points.

var = sgrid.temp[0, 0]
temps = sgrid.interpolate_var_to_points(points, var )

This gives you the interpolated temperatures at those points. If you want to interpolate multiple variables at the same time on the same grid, you can ask for the interpolation alphas instead, and avoid repeat computations.

alphas = sgrid.interpolation_alphas(points, sgrid.center_lon, sgrid.center_lat)
temps = sgrid.interpolate_var_to_points(points, sgrid.temp[0,0], alphas)
salinity = sgrid.interpolate_var_to_points(points, sgrid.salt[0,0], alphas)

Be aware though that the alphas are only 'good' on the grid they are calculated for.
u_vels = sgrid.interpolate_var_to_points(points, sgrid.u[0,0], alphas)
will give you incorrect answers

Near-term goals:
Rename variables away from ROMS and to SGrid standard
Add demonstration notebook to repo
Implement optional pre-computation and storage of interpolation transform information to avoid recomputation
Implement tests
Fix and rename pysgrid.utils.translate_index.

Added: Interpolation, Cell Tree, variable data lookup
Added a more efficient 1 function interpolation 
Added translation into the sgrid object itself 
points_in_polys can now work on polygons defined as points or arrays of
x/y values 
Added simplistic caching behavior to variables to avoid repeat lookups
Added edge1/2_lon/lat to load_grid() 
Allowed passing in interpolation alphas to avoid repetition
@ocefpaf
Copy link
Member

ocefpaf commented Feb 4, 2016

Thanks @jay-hennen. I will start a review and hopefully we can get this functionally integrated into pysgrid quickly.

PS: I would appreciate smaller PRs in bite size review chunks. (I am as lazy as my variables evaluations 😉)

@jay-hennen
Copy link
Contributor Author

Sorry, it is a lot of stuff. In my mind the most important change design/API-wise is in SGridVariable since it affects the role it might play. The SGrid stuff is mostly fancy new functionality

I'll answer any questions/concerns you have!

'''
Function to get the node values of a given face index.
Emulates the self.grid.nodes[self.grid.nodes.faces[index]] paradigm of unstructured grids
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jay-hennen we are not building docs right now, but to save us some trouble in the future can you wrap the code parts of the docstrings with ``. For example:

Emulates the `self.grid.nodes[self.grid.nodes.faces[index]]` paradigm

Can you also put a a full stop in the last phrase and standardize change the triple single quotes to triple double quotes (to be consistent with the rest of the code).

@ocefpaf
Copy link
Member

ocefpaf commented Feb 5, 2016

@jay-hennen I made a few minor comments about style and coding standards. I will take a second look at this once you are done and the tests a passing.

Most of it looks good BTW. Hopefully I can persuade @kwilcox and @ayan-usgs to take a look too 😉

@jay-hennen
Copy link
Contributor Author

I added periods to most lines and fixed some docstrings. Apologies for the code moving. That's the auto-pep8 script doing it's thing. I don't like it either.

The goal of interpolate_var_to_points is to be able to pick arbitrary points on the grid, and interpolate a variable to those points, whether or not that variable is on the node, face, or edge grids.

points = (list of points on the grid)
time = depth = 0
temperature = sgrid.temp[time,depth]
temperature_at_points = sgrid.interpolate_var_to_points(points, temperature)

@@ -252,6 +268,297 @@ def _save_common_components(self, nc_file):
dataset_grid_var.axes = ' '.join(axes)
return grid_vars

def get_grid(self, grid='node'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some overarching concerns about this. As @hrajagers spoke to in issue #76, the SGRID conventions where drafted to:

The sgrid conventions have been drafted to be
able to get rid of the multiple separate grid definitions for centre,
corner, u, v "points"

From reading through the thread of that issue, it doesn't seem the discussion regarding whether there's a single grid, or separate grids was resolved in #76 (or in another issue as suggested by @ocefpaf).

@ayan-usgs
Copy link
Contributor

There's a hefty number of new methods and functions here. I would feel more comfortable if there where also some tests for the new lines of code. I think there would to wonderful instances of utility here: (1) it's unclear how some code would be used, so the tests would serve as working examples, and (2) it'd help us catch regressions in the future.

That said, I don't think tests are not a reason to not merge.

I would like greater clarity on the topic of a single grid vs separate grids though before merging. @jay-hennen, @hrajagers, and @rsignell-usgs?

@ayan-usgs
Copy link
Contributor

Speaking of tests, I need to look at why the Travic CI runs are failing. I'll create an issue of that and take a poke at it.

@jay-hennen
Copy link
Contributor Author

I think it's fine to think of it as separate grids internally and then present the unified grid to the outside world. The user doesn't need to know anything about the index translations and point-in-cell searching going on behind the scenes which necessitate a separate-grid view.

Regarding tests, those are coming, along with an ipython notebook to demonstrate some of these functions.

@jay-hennen
Copy link
Contributor Author

There's now an ipython notebook to demo these new features.

@ocefpaf
Copy link
Member

ocefpaf commented Feb 8, 2016

I think it's fine to think of it as separate grids internally and then present the unified grid to the outside world. The user doesn't need to know anything about the index translations and point-in-cell searching going on behind the scenes which necessitate a separate-grid view.

To be honest I still need to read again the SGRID standards and pysgrid code to be able to comment properly on your PR. However, internal consistency with the SGRID standards is important as pysgrid goal is to be a base library for any SGRID compliant data reader and/or high level python object.

With that said we need strive to be consistent with the language (and way of thinking) stated in the SGRID standards document.

Sure, if we find that the SGRID standards are problematic or limiting, we can always start a discussion about the standards and update the document.

@ocefpaf
Copy link
Member

ocefpaf commented Feb 8, 2016

There's now an ipython notebook to demo these new features.

Cool! Thanks @jay-hennen.

Can you put a simple profiling, like %%time on the top of the inerpolation cells? I am curious how cell tree fairs against kdtree, etc.

@jay-hennen
Copy link
Contributor Author

Apologies, I just pushed a new version of the notebook and some changes to clean up the API for some of the functions.

I put %%time into two of the cells, where salinity and temperature are interpolated. It should demonstrate the speedup I am talking about (at least once the data is already loaded)

@jay-hennen
Copy link
Contributor Author

@ayan-usgs

The get_grid function you're pointing out is not meant to be a part of the final API (which is why I also didn't bring any attention to it). It's there because this is the active development branch and I needed something like it when I want to print the grid out in pygnome.


Actually, this should probably be handled at a different level. SGrid/UGrid should just provide the raw  edge data.

Internally, I see it as 4 grids. There's 4 different `_lon` and `_lat` arrays, which are related, but not as rigidly as one might think [(as I noted here)](https://github.com/sgrid/pysgrid/issues/76#issuecomment-178841214) Whether the user sees it that way is another matter. When they want to interpolate a variable, they just know it's on the cell edge or center or node. The notion of 4 grids isn't apparent.

@ayan-usgs
Copy link
Contributor

The get_grid function you're pointing out is not meant to be a part of the final API (which is why I also didn't bring any attention to it). It's there because this is the active development branch and I needed something like it when I want to print the grid out in pygnome.

Thanks for the clarification. If it's a method not meant for the final API, would it be appropriate to designate as _get_grid?

Internally, I see it as 4 grids. There's 4 different _lon and _lat arrays, which are related, but not as rigidly as one might think (as I noted here) Whether the user sees it that way is another matter. When they want to interpolate a variable, they just know it's on the cell edge or center or node. The notion of 4 grids isn't apparent.

I'll echo @ocefpaf in saying that we should be internally consistent with the sgrid conventions at https://publicwiki.deltares.nl/pages/viewpage.action?pageId=108953628 and that if the standard is insufficient we should modify accordingly. I'll merge as is pending further discussion on this topic.

ayan-usgs added a commit that referenced this pull request Feb 9, 2016
Cell Tree, Index translation, and Bilinear Interpolation
@ayan-usgs ayan-usgs merged commit 36ef2b5 into sgrid:master Feb 9, 2016
@ayan-usgs
Copy link
Contributor

@jay-hennen, @ocefpaf, @kwilcox, @rsignell-usgs, @hrajagers

I've created an issue at sgrid/sgrid#5 to talk about grids further. It seems more appropriate to have this discussion there.

@ocefpaf
Copy link
Member

ocefpaf commented Feb 9, 2016

@jay-hennen I had the impression that this was not ready to be merged. I expected at least a rebase to see if the tests are passing.

@jay-hennen
Copy link
Contributor Author

@ocefpaf I think it was in an okay place to merge, if that is what's happening. Not the best, but definitely not the worst either.

@ocefpaf
Copy link
Member

ocefpaf commented Feb 9, 2016

@jay-hennen I spoke with @ayan-usgs we he agrees that we need at least some testing passing. We have to options:

  • revert the merge, rebase, keep working until you give us a green light for a next round of review.
  • create a new branch of current master and start addressing the remaining of the PR there.

Which one are you most comfortable with?

@rsignell-usgs
Copy link
Contributor

@jay-hennen, I would say it definitely is not okay to merge a PR when the tests are not passing, unless we have consensus from the group.

@ayan-usgs
Copy link
Contributor

To be fair, I jumped the gun on the merge.

@ayan-usgs
Copy link
Contributor

Those of us with merge rights just need to communicate better.

@jay-hennen
Copy link
Contributor Author

I'm deferring to the group. Currently only one of the pytests in the library fails (ImportError: No module named mock). If you want to revert the merge go ahead, I'll just keep working. There is the list of things I mentioned in the first post still to work on

However, leaving it merged will get these features out there so they can be used.

@ocefpaf
Copy link
Member

ocefpaf commented Feb 9, 2016

However, leaving it merged will get these features out there so they can be used.

Don't worry, people testing from master are OK testing from a branch. Average users will get the functionality once we make a new release and package it.

If you want to revert the merge go ahead, I'll just keep working.

If you are OK with that let's do it then. It is probably the best option to keep all your changes in the same context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants