Common type wrappers for geometry/crs formats #7
Comments
A big +1 from my side for moving GeoFormatTypes to JuliaGeo. One candidate to use the interface would be Shapefiles.jl as discussed here. So your plan would be the the |
Thanks for starting the discussion. I agree it would be nice to improve our integration regarding coordinate reference systems. And I guess we would not want that to depend on Proj4.jl for the case where we just want to carry it but not reproject. Although for converting between different

Why is this issue both about geometry and crs formats? Isn't geometry already addressed as part of GeoInterface? Different geometry formats of course sometimes have different crs formats, but I'd consider them separately.

@meggart regarding the

Some more quick comments. Perhaps we can generalize the EPSG to SRID, which holds both an authority, such as EPSG, as well as an identifier. I see GeoFormatTypes also has GeoJSON and KML. The latest specifications for both define these formats always to be EPSG:4326, so I think we can leave them out. I'm not so familiar with GML.

So SRIDs are nice and should be used when possible, but of course they can only be used if the projection you use is defined in a database. So we still need generic formats like PROJ string and WKT. Using PROJ strings is now discouraged by PROJ itself (https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems), although we probably need some support there to be able to handle Shapefiles, for instance. For WKT it is important to track which version we mean, see https://proj.org/development/reference/cpp/cpp_general.html?#_CPPv44WKT2. Ideally I think we'd want to have a struct for WKT2, such that we can serialize it to different representations such as https://proj.org/usage/projjson.html. |
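One possible reading of the SRID suggestion above, as a minimal sketch: the authority becomes a type parameter and the identifier a field, so EPSG is one authority among several. The names SRID, authority, identifier, label and isepsg are assumptions made for illustration; nothing here is an API of GeoFormatTypes.jl or something this thread settled on.

```julia
# Hypothetical type: the authority is a type parameter, the identifier a field.
struct SRID{Authority}
    code::Int
end

# Convenience constructor taking the authority as a runtime value.
SRID(authority::Symbol, code::Integer) = SRID{authority}(code)

# EPSG becomes an alias for one authority rather than its own struct.
const EPSG = SRID{:EPSG}

authority(::SRID{A}) where {A} = A
identifier(srid::SRID) = srid.code

# Methods can target every authority or just one of them.
label(srid::SRID) = string(authority(srid), ":", identifier(srid))
isepsg(::SRID{:EPSG}) = true
isepsg(::SRID) = false

wgs84 = EPSG(4326)
other = SRID(:ESRI, 102100)
println(label(wgs84), " ", isepsg(wgs84), " ", isepsg(other))
```

Putting the authority in the type keeps it available to method dispatch; a plain `authority::Symbol` field would be simpler but could only be branched on at run time.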
Yes @visr I should have mentioned there are two issues here: geometry formats and CRS formats. It would be nice if they were separate things entirely but unfortunately they aren't... That's one reason I changed the package name. Well known text and GML both contain geometries and CRS, which can be in the same string. We could use the type system to distinguish CRS-only, CRS/Geometry, and Geometry-only formats, but that's something that will need to be worked through with implementation problems. The fact that KML is always 4326 could also be encoded in this package so no one else has to specify that. So

The other reason for including geometry-only formats is that they also suffer from the problem of method permutations. GDAL/ArchGDAL does conversion between geometries in different formats and each conversion has its own named method each way. That means the calling code has to know all the format names and methods, which is what I'm trying to avoid. GeoInterface.jl defines the spatial types but not the formats they come in, so it doesn't really help here.

@meggart I'll move GeoFormatTypes.jl here if someone wants to make me an owner of JuliaGeo. Your use case and questions about WKT in the other thread were similar to mine. If we wrap the types the conversions can at least be easy and automated, and eventually we can define an all-Julia, type-system-driven format for internal use, and some day Julia-based parsers. And yes, GeoInterface.jl should return a wrapped GeoFormat instead of a Dict for CRS. |
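A rough sketch of how the type system could separate CRS-only, geometry-only and mixed formats, as described above. The abstract type names and the helper functions maycontaincrs and crs are assumptions made for illustration, and the KML rule simply encodes the EPSG:4326 claim made earlier in this thread.

```julia
# Hypothetical hierarchy; the names are illustrative only.
abstract type GeoFormat end
abstract type CRSFormat <: GeoFormat end        # CRS only
abstract type GeometryFormat <: GeoFormat end   # geometry only
abstract type MixedFormat <: GeoFormat end      # may carry a CRS, a geometry, or both

struct EPSGcode <: CRSFormat
    val::Int
end

struct KML <: GeometryFormat
    val::String
end

struct GML <: MixedFormat
    val::String
end

# What a wrapped value may contain follows from its place in the hierarchy.
maycontaincrs(::GeoFormat) = false
maycontaincrs(::Union{CRSFormat,MixedFormat}) = true

# KML is taken to be fixed to EPSG:4326, so the wrapper can answer without parsing.
crs(::KML) = EPSGcode(4326)
crs(x::CRSFormat) = x

println(crs(KML("<kml/>")), " ", maycontaincrs(GML("<gml/>")))
```

With this layout a consuming package can ask for the CRS of a KML value without knowing anything about KML itself.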
@visr Those are good points about well known text. I was thinking we could define

Can you also explain a little more what you mean by having both an authority and an identifier for SRID? How would we use method dispatch on this general type?

I would also like more details on why we don't need wrappers for string geometry formats. They can definitely simplify some of the methods in ArchGDAL. |
We debated how to handle SRIDs and CRS a lot at my previous job (without coming to a definitive answer mind you).
You may find the following documentation somewhat helpful for terminology: https://github.com/JuliaGeo/Geodesy.jl#coordinate-reference-systems-and-spatial-reference-identifiers (disclaimer: I wrote this, but I am not a geodesist) |
I mean how do we implement that generalisation in the context of dispatch and a type hierarchy. Having I was imagining other authorities that matter would get their own types, so you could do |
Ah I misunderstood. Might it make sense to have |
They are different concepts and I believe we should treat them separately in our APIs. I see a CRS as a property of any geospatial data (vector/raster/mesh/...) that may be present, at different levels. Since you often want to store the CRS with your data, they are encoded with that data in different forms. But instead of having a CRS type for each format, I'd rather just have the basic CRS types such as WKT, SRID, PROJ string. So Shapefile.jl would as you suggest return a PROJ string type for
Agree, good suggestion. Though to me this would be most natural in a potential KML package rather than in this package.
Something like that could work. Haven't looked much into the actual WKT format or the differences between versions, so it's hard to say much at this point.
I can see how it would simplify this in the case of GDAL and its API that just has strings going in and out, so you currently need to select the right method for the format. I wouldn't say we don't need it, or that we cannot improve the interoperability here. So if I understand correctly, this concerns specifically geometries serialized to strings in different formats, and not the parsed versions, which already would be a
Yeah indeed, I was thinking exactly that. We could always add a constructor |
Yeah mostly everything here concerns handling strings or numbers representing some kind of geospatial data that is otherwise unidentifiable or requires parsing to do so, and for whatever reason isn't parsed immediately (e.g. not required for current use case or just lazy loading). With one caveat that if something is parsed then put in a
This is really the problem. Some intermediate package, or just higher-level methods in the receiving package, will still need to know that it's a GeoJSON string and send it to a specific method if it wants to convert it. And what if it was KML or WellKnownText? We will need lists of custom methods like in ArchGDAL (where they are actually needed to interface with the C++ methods) and that requirement will propagate through the ecosystem. If the strings are wrapped, the method permutations are only necessary at the point of conversion - say in |
At some point, I'd like to have an interface for projections and reference systems in GeoInterface, and I'm not entirely convinced it needs to be outside of GeoInterface.jl just based on yeesian/ArchGDAL.jl#95 (comment). Nonetheless,
So I'm okay with GeoInterface being slower in the pace of development (w.r.t. GeoFormatTypes), and for GeoFormatTypes.jl to move to JuliaGeo and for packages to start adopting it (given the endorsement in #7 (comment)).
Sorry I've been slow to the discussion. I also agree that I prefer to operate from understanding concrete use-cases. From what I understood in yeesian/ArchGDAL.jl#95, it allows us to have functions like transform(sourceproj::WellKnownText, targetproj::EPSGcode, val::Geometry) rather than having the type in the function names like what ArchGDAL currently does:
|
@yeesian, to reply to some of your points:
As far as I can tell the datums etc in Geodesy.jl are mostly orthogonal to this package?
These wrappers can't be trait-based as the objects are mostly just And yes being able to write that GeoFormatTypes is pretty much finished for now: I'm keen to register it as it's one of the last things blocking GeoData.jl from having coherent load/save between NetCDF/tiff/grd/hdf5 files, which will be great. I would like it to be part of JuliaGeo, but I'll need to work on it further so I need to be an owner here to want to transfer it here. My pull request at DimensionalArrayTraits is sitting dormant because I can't merge it even though @meggart reviewed it, so I'm hesitant to base my infrastructure on packages here. |
I have made you an owner of JuliaGeo, sorry I took so long to get around to it -- I like how you continue to push through friction where you encounter it, and escalate it if you feel things are not moving for you. Friction does exist in JuliaGeo, not always without reason, and should not be taken as a passive-aggressive display of people's intentions. (Volunteers work on different schedules, and with varying levels of commitment, etc. There have been multi-year initiatives and discussions that did not pan out, but they often provide learning points for the next round of packages and initiatives.) |
Thanks! And no problem at all. I have a bunch of other projects I barely respond to at all due to priorities, and people here are comparatively pretty responsive to ideas. These current changes are just central to my paid work and other research moving forward right now, so there is a tension between wanting to work collectively and integrate (which is always better in the long run), and getting things done now. Apologies if that ever appears pushy or rushed. |
Well said! Collaborating with mixed priorities is always difficult. I skimmed through GeoFormatTypes quickly, it looks useful. A few thoughts which may or may not be helpful:
Unfortunately, not really! Datum is inescapable (at least for high accuracy ~decimeter level measurements). When you measure a point using a GNSS receiver, what are you really measuring with respect to? It's not the satellite constellation! The satellites are just a mechanism for transferring the reference frame defined by an ensemble of ground stations to the handset position. That is, the satellite positions themselves are defined relative to this ensemble of ground stations. If you choose different ground stations, you have a different datum. WGS84 is the datum defined by a set of ground stations maintained by the US Department of Defense (e.g., see here: https://www.ga.gov.au/scientific-topics/positioning-navigation/wgs84 ). The ITRF datums (and regional refinements like ETRS) are defined by a different set of ground stations.

Of course, these are just some thoughts from a quick read and may not be applicable to you right now. So feel free to do with them what you will. |
By orthogonal I mean the datum is a property of the data, and this package simply labels the format of the data and knows very little about what is contained in it. It's just explicitly categorising format standards to make it easy to write combinatorial method dispatch, it doesn't actually do anything :) I think the main conflict is that I included the method
Absolutely. Same goes for GML as you say - that's how I imagined it would work. This kind of organised type piracy seems not that uncommon in But I think having the I kind of assumed someone would pull me up on it too, so Edit: the acronym thing is a larger question here too. I used KML because I don't think people know or care that the K stands for Keyhole, and they know what kml is. It also follows the name of XML so people know roughly what it is by similarity with a widely known standard. But WKT isn't universally known to be Well Known Text so I avoided the acronym. But it's all somewhat arbitrary. |
Yeah, I don't mean to suggest that this form of type piracy is bad. It's a very reasonable way of splitting code into a lightweight core where the detail can be implemented elsewhere. All I mean is that implementing some
Agreed. The larger goal of the "avoid acronyms" thing is to improve clarity and reduce unnecessary jargon so it's definitely subjective. But people will only learn that the company Keyhole existed via kml, not the other way around. At that stage, kml has become the proper name for this thing. (Side note, I've always thought "well known text" was a really terrible name for a format. The expansion says virtually nothing about the format or the domain in which it's used.) |
Dealing with formats for geometry and crs data has been annoying me for a while. We end up with a lot of importWKT/toEPSG methods that mean that the calling method/package needs to know about all the formats instead of using generic approaches. We also need to somehow track what format a string is holding when we are passing it between packages.

Using the type system can resolve this pretty neatly by wrapping a format once when it is loaded, avoiding any future validation or tracking. We can then leverage multiple dispatch to handle everything after that.
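To make the wrapping idea concrete, here is a minimal sketch. The type names follow the ones used in this issue, but the definitions, the load_prj loader and the formatname function are assumptions for illustration and are not taken from GeoFormatTypes.jl. The WKT string is a placeholder, not a valid CRS definition.

```julia
# Hypothetical wrapper types; definitions are illustrative only.
abstract type GeoFormat end

struct WellKnownText <: GeoFormat
    val::String
end

struct ProjString <: GeoFormat
    val::String
end

struct EPSGcode <: GeoFormat
    val::Int
end

# Wrap once, at the point where the format is actually known (e.g. on file load).
load_prj(text::AbstractString) = WellKnownText(String(text))

# Downstream code dispatches on the wrapper type instead of inspecting the string.
formatname(::WellKnownText) = "WKT"
formatname(::ProjString) = "PROJ string"
formatname(::EPSGcode) = "EPSG code"

crs = load_prj("GEOGCS[\"placeholder\"]")
println(formatname(crs))
```

After the value is wrapped, no later function needs to guess or re-validate what kind of string it was handed.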
In GeoFormatTypes.jl I'm defining WellKnownText, ProjString, EPSGcode and other wrappers that classify geometries and metadata; see yeesian/ArchGDAL.jl#97 for an example use case.

Once formats are wrapped we can define pretty powerful generic tools in a few lines, like this reproject method that transforms coordinates between projections in any format, such as WellKnownText -> EPSGcode(4326), without the calling method/package having to know about any of the formats.

Any thoughts? It would be good to move GeoFormatTypes.jl to JuliaGeo and have it adopted in the ecosystem.
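As a toy sketch of what such a generic reproject could look like (this is not the ArchGDAL implementation linked above): each format gets one normalisation method, and reproject itself only sees the abstract wrapper type. The to_epsg and transform_point functions and the lookup table are assumptions for illustration, and only the identity and the lon/lat to spherical Web Mercator case are handled.

```julia
abstract type GeoFormat end

struct ProjString <: GeoFormat
    val::String
end

struct EPSGcode <: GeoFormat
    val::Int
end

# Toy lookup standing in for a real CRS parser such as PROJ.
const KNOWN_PROJ_STRINGS = Dict("+proj=longlat +datum=WGS84 +no_defs" => 4326)

# One normalisation method per format, rather than one method per pair of formats.
to_epsg(crs::EPSGcode) = crs.val
to_epsg(crs::ProjString) = KNOWN_PROJ_STRINGS[crs.val]

const EARTH_RADIUS = 6378137.0  # metres, WGS84 semi-major axis

# Only the identity and lon/lat degrees (4326) -> spherical Web Mercator (3857) are handled.
function transform_point((x, y), source::Int, target::Int)
    source == target && return (x, y)
    if source == 4326 && target == 3857
        return (EARTH_RADIUS * deg2rad(x),
                EARTH_RADIUS * log(tan(pi / 4 + deg2rad(y) / 2)))
    end
    throw(ArgumentError("no toy transformation from EPSG:$source to EPSG:$target"))
end

# Callers pass wrapped formats and never need to name them.
function reproject(points, source::GeoFormat, target::GeoFormat)
    s, t = to_epsg(source), to_epsg(target)
    return [transform_point(p, s, t) for p in points]
end

pts = [(0.0, 0.0), (180.0, 0.0)]
merc = reproject(pts, ProjString("+proj=longlat +datum=WGS84 +no_defs"), EPSGcode(3857))
println(merc)
```

Supporting another format in this sketch means adding one to_epsg method; reproject and its callers stay unchanged.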