Grid.build_face_edges_connectivity #205

philipc2 · 2023-01-15T05:22:31Z

Implement Grid.build_face_edges_connectivity

The text was updated successfully, but these errors were encountered:

philipc2 · 2023-01-15T05:27:00Z

@hongyuchen1030 Have you had a look at the xugrid implementation for this (and other) associated connectivity functions? I'm currently taking some time to review #191 and familiarize myself with xugrid since much of the initial discussion about this package was prior to me joining the team last summer.

I'll also ping @Huite for any feedback and to start up a similar discussion to #130 (comment)

Relevant References

#19 (comment)

https://deltares.github.io/xugrid/examples/connectivity.html

https://github.com/Deltares/xugrid/blob/main/xugrid/ugrid/connectivity.py#L231

hongyuchen1030 · 2023-01-16T01:51:39Z

Thank you for your information. I will take a look at it.

hongyuchen1030 · 2023-01-17T22:32:24Z

After studying the XUGRID, I have some questions:

How did it deal with the dummy nodes from the face_node_connectivity?
After reading through the close_polygons

I noticed that the face_node_connectivity cannot have the same Fill_Value as the def close_polygons(face_node_connectivity: IntArray, fill_value: int) used. But according to the UXARRAY documentation, the input
face_node_connectivity will have Fill_values. The current data read-in will make the FillValue=nan instead of -1 #189

For example
face_node_connectivity = [[0, 1, 2, 3], [1, 4, 3, 5], [2, 4, 3 -1]] # The original face_node_connectivity, the last face has the Fill_Value as -1
If the user called close_polygons(face_node_connectivity, fill_value=-1)Then after line 178(https://github.com/Deltares/xugrid/blob/9741f5bdc47a8f500ba1e562d54fa20b01dd1b9c/xugrid/ugrid/connectivity.py#L173) here, it will become
closed = [[0, 1, 2, 3, -1], [1, 4, 3, 5, -1], [2, 4, 3 -1, -1]] # Fill the last indice as -1
And after line181:
closed = [[0, 1, 2, 3, 0], [1, 4, 3, 5, 0], [2, 4, 3 2, 2]] # It will replace every -1 with the first_node
Which apparently is not what we want for the last face. Since it also replace its Fill_Value with the first_node
The np.full() function cannot work properly when encountering nan values
So in close_polygons function
line 177 closed = np.full((n, m + 1), fill_value, dtype=IntDType),
If our face_node_connectivity contains nan, it will be converted into -9223372036854775808 by the np.full function, causing problems for the following implementations

@Huite Thank you very much for your codes and it helps a lot with my python vectorizations.

However, about the questions above, I would like to know if I missed something about your implementations of these marginal cases.

@philipc2 And thank you for your help again, after a careful walk-through of the XUGRID codes, I think my implementation ways might be more suitable with the XARRAY. But I definitely will make it more python styles by referring to the usages in the XUGRID.

Huite · 2023-01-19T09:41:38Z

Hi @hongyuchen1030

Do you have specific questions?

I'll just make a couple of remarks here, also from briefly reading through #189.

The face_node_connectivity is fundamentally an indexing array. As you cannot index with floating point arrays, you'd regularly have to convert to integer arrays, possibly dropping the fill values. Unfortunately, given a _FillValue, xarray will convert it to floats, with NaNs as the fill value. In my setup, I separate the UGRID variables from the other variables and stuff it into a separate topology object. I do the conversion to integer there: https://github.com/Deltares/xugrid/blob/8168ece26465b565382b85a4a07b151d5d0a9845/xugrid/ugrid/ugridbase.py#L295
(In my case, I expected the user to really not see the connectivity arrays.)
I've chosen for a default value of -1. -1 is a valid indexer in Python, and I've been bitten a few times by it. Of course a fill value is just that, a fill value. As long as it's not in the range of the valid indexes (0 .. n_face), it's not a problem. Probably the best choice is something that will error: e.g. np.iinfo(np.int32).min or np.iinfo(np.int32).max (or the 64 bit version). This could be easily changed by introducing a DEFAULT_FILL_VALUE constant.
Note that a negative value is not a valid indexer in e.g. Fortran, so except for the fill_value, the face_node_connectivity array should never contain negative numbers.
I don't think you understand the purpose of the close_polygons function. The only reason I use this, is so that I can do the vectorized numpy operations afterwards. Because I sort row by row, then call call np.unique afterwards, the duplicates that are introduced here are removed anyway, but it also ensures an edge is created between the first and the last node.
Note that duplicating vertices works in many cases: I also use this approach to compute the face area. (Because a triangle with two duplicated vertices will have an area of 0): https://github.com/Deltares/xugrid/blob/main/xugrid/ugrid/connectivity.py#L311

You also ask about numba in #202. Numba primarily works with numpy arrays. It understands (reimplements) many numpy functions and methods. As long as your functions consume and produce numpy arrays you can easily mix vectorized (numpy) functions and numba compiled functions.

I've chosen to use mostly vectorized numpy (and scipy sparse) function. This results in a significantly shorter code, but it's not necessarily much more readable. But if you're dealing with permutations of permutations, it's not trivial to keep in mind anyway
The biggest upside of numba is that you don't allocate a lot of temporary arrays like you have to with numpy (e.g. creating an index array). The downside is that you run into compilation overhead (although you can cache with numba) and you have to write a lot of loops (more lines, more opportunity to make mistakes). However, you're mostly sorting and getting unique values (which can be easily done from sorted arrays: see https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP). I'm quite confident the numpy sorting functions are performant.

I see you're using a Python set. Python sets should be pretty fast (we'd expect pretty optimized C code in the background), but numpy.unique should find unique values faster. Interestingly though, with just ad hoc some benchmarking, it looks like the numpy_indexed unique is two times faster than just numpy.unique..., which is in turn two times faster than a set:

import numpy as np
import numpy_indexed as npi

nodes = np.arange(2_000_000)
a = np.random.choice(nodes, 500_000)
b = np.random.choice(nodes, 500_000)
edges = np.column_stack((a, b)) 

%timeit set(tuple(row) for row in edges)
%timeit np.unique(edges, axis=0)
%timeit npi.unique(edges, axis=0)

# 631 ms ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# 261 ms ± 3.43 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# 116 ms ± 903 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Note that you can return the index and the inverse from the unique calls as well at essentially no cost, which can be rather useful!

hongyuchen1030 · 2023-01-19T21:00:32Z

Hi @Huite

Thank you so much for your explanation, that helps a lot.

I think my understanding of the close_polygons function is the same as yours. Just that since the face_node_connectivity in our project will contain _Fill_Value_, some of your implementation might not be compatible with ours.

And thank you for the information about np. unique, I will think about how to better utilize the numpy function in my implementation.

philipc2 added the new feature New feature or request label Jan 15, 2023

philipc2 linked a pull request Jan 15, 2023 that will close this issue

DRAFT: Add Grid.build_face_edges_connectivity #191

Merged

philipc2 assigned hongyuchen1030 Jan 15, 2023

rajeeja closed this as completed in #191 Apr 20, 2023

philipc2 mentioned this issue Jun 13, 2023

Implement Grid.build_node_face_connectivity #130

Closed

philipc2 added this to the Grid Topology milestone Aug 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Grid.build_face_edges_connectivity #205

Grid.build_face_edges_connectivity #205

philipc2 commented Jan 15, 2023

philipc2 commented Jan 15, 2023 •

edited

Loading

hongyuchen1030 commented Jan 16, 2023

hongyuchen1030 commented Jan 17, 2023 •

edited

Loading

Huite commented Jan 19, 2023 •

edited

Loading

hongyuchen1030 commented Jan 19, 2023

Grid.build_face_edges_connectivity #205

Grid.build_face_edges_connectivity #205

Comments

philipc2 commented Jan 15, 2023

philipc2 commented Jan 15, 2023 • edited Loading

hongyuchen1030 commented Jan 16, 2023

hongyuchen1030 commented Jan 17, 2023 • edited Loading

Huite commented Jan 19, 2023 • edited Loading

hongyuchen1030 commented Jan 19, 2023

philipc2 commented Jan 15, 2023 •

edited

Loading

hongyuchen1030 commented Jan 17, 2023 •

edited

Loading

Huite commented Jan 19, 2023 •

edited

Loading