Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to storing CRS WKT in AreaDefinitions instead of the CRS object #264

Merged
merged 11 commits into from
May 4, 2020

Conversation

djhoese
Copy link
Member

@djhoese djhoese commented Apr 2, 2020

This PR is an effort to resolve the issues discussed in:

pyproj4/pyproj#589
pytroll/satpy#1114

The summary of these issues is that the pyproj CRS object is not thread-safe. If you access one from multiple threads you can get strange corrupt strings and bytes. I've found one case in all of satpy, pyresample, and trollimage (I think) where we explicitly pass a CRS or Proj object to a dask function (map_blocks, blockwise, delayed) and that's what this PR fixes (call to invproj with a CRS object). There is an additional usage in the EWA resampling in Satpy where an AreaDefinition is passed to a delayed function. I would like to be able to confidently pass AreaDefinitions between threads so that issue is also addressed here.

The main switch is that AreaDefinition objects will now store crs_wkt (the WellKnownText string of a CRS object) which can be used to reconstruct the CRS object. Since it is a string it is perfectly thread-safe. This has the unintended consequence that if you pass a PROJ dict to an AreaDefinition and then do area_def.proj_dict you will get the result of dict -> CRS -> WKT -> CRS -> dict. Due to the simplifications that pyproj and PROJ do this can result in completely different PROJ dicts being produced which can be confusing and can require rewriting a lot of tests which I also tried to do here. Some examples of the simplications:

  1. Convert a/b parameters to the equivalent ellps.
  2. Convert a/b parameters to the equivalent a + rf parameters.
  3. Drop unused parameters.

However, I'm also finding a few bugs in pyproj/PROJ which I've brought up here: pyproj4/pyproj#592. I consider this PR a WIP until those can get resolved and the tests in pyresample fixed.

@djhoese djhoese added the bug label Apr 2, 2020
@djhoese djhoese requested review from mraspaud and pnuu April 2, 2020 21:41
@djhoese djhoese self-assigned this Apr 2, 2020
@djhoese
Copy link
Member Author

djhoese commented Apr 2, 2020

Oh, also, I've added a workaround for this issue to my own processing to see if it helps like I think it should. If the processing still fails then that could mean this isn't the solution we need.

pyresample/geometry.py Outdated Show resolved Hide resolved
@djhoese
Copy link
Member Author

djhoese commented Apr 16, 2020

On slack @pnuu asked about what version of WKT is used internally. Pyproj defaults to WKT2 but I asked pyproj4/pyproj#604 to get some more details. I was concerned that there may be a problem where pyproj wants to create WKT2 but the installed PROJ version doesn't support it. This is not an issue since the CRS object is only available in pyproj 2.0+ and pyproj 2.0+ requires PROJ 6 which is providing the WKT2.

@coveralls
Copy link

coveralls commented Apr 19, 2020

Coverage Status

Coverage decreased (-0.02%) to 91.907% when pulling b6423c0 on djhoese:bugfix-crs-threadsafe into 3d6d38d on pytroll:master.

@codecov
Copy link

codecov bot commented Apr 19, 2020

Codecov Report

Merging #264 into master will decrease coverage by 0.02%.
The diff coverage is 91.95%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #264      +/-   ##
==========================================
- Coverage   91.94%   91.92%   -0.03%     
==========================================
  Files          42       42              
  Lines        8098     8171      +73     
==========================================
+ Hits         7446     7511      +65     
- Misses        652      660       +8     
Impacted Files Coverage Δ
pyresample/test/utils.py 71.27% <76.47%> (+1.14%) ⬆️
pyresample/utils/_proj4.py 86.95% <87.50%> (-1.57%) ⬇️
pyresample/test/test_geometry.py 97.75% <94.73%> (-0.15%) ⬇️
pyresample/geometry.py 82.46% <100.00%> (+0.05%) ⬆️
pyresample/test/test_grid.py 100.00% <100.00%> (ø)
pyresample/test/test_utils.py 97.37% <100.00%> (+0.13%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3d6d38d...f3f2765. Read the comment docs.

pyresample/geometry.py Outdated Show resolved Hide resolved
@pnuu
Copy link
Member

pnuu commented Apr 20, 2020

Maybe it would be good to also test the usage of WKT as projection definition in AreaDefinition.

@djhoese
Copy link
Member Author

djhoese commented Apr 20, 2020

@pnuu I updated this with a test for different inputs to AreaDefinition (proj dict, proj string, WKT2, WKT1). The only surprise was that WKT1 has some different naming for unknown parameters so a CRS from WKT2 does not equal a CRS created from WKT1. I've commented out that equality check.

Copy link
Member

@sfinkens sfinkens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

'lon_0': '8.00',
'proj': 'stere'
}
crs = CRS(CRS.from_dict(proj_dict).to_wkt())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: What is the benefit on the extra CRS(crs.to_wkt()) layer here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed otherwise the CRS objects are not strictly equal later on. The conversion is done inside the AreaDefinition so when I do area_def.crs == crs they are the same. If I remember correctly if I don't do it then there is some information gained or lost in the conversion to WKT that makes the CRS objects not equal.

Copy link
Member Author

@djhoese djhoese Apr 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked and the WKT of the two CRS objects is the same, but something in the CRS based on the PROJ dict is not equivalent.

Edit: My guess is that when it is converted to WKT it gets a/b converted to a/rf and PROJ is not considering these equivalent or isn't willing to make an assumption on floating point equality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy! Thanks for sorting out all these nasty details!

Comment on lines +2152 to +2154
geos_area.proj_dict['a'] = 1000.0
geos_area.proj_dict['b'] = 1000.0
geos_area.proj_dict['h'] = np.sqrt(2) * 1000.0 - 1000.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@mraspaud mraspaud merged commit 066da3b into pytroll:master May 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants