EDR API and the collections endpoint. #38
Comments
I suggest going back to look at the |
Can you expand on what you mean a little and/or give some pointers to more info? |
A single SOS can serve observations with a variety of sensors, observed properties, features of interest, time periods. |
Josh Lieberman was responsible ... In the context of the query, it seemed to serve a similar purpose to 'layer' in WMS, and 'collection' in WFS. |
Right -- got it. I think I see a future (extension of EDR or maybe Features -- potentially just a best practice?) that would create (sampling) feature collections in the ObservationOfferings paradigm. That could use the collections end point and the items in the collection would be non-simple features. Using At this juncture, EDR is really more focused on providing query interfaces to data-cubes with a nice side benefit that the sampling geometries can be saved and/or pre-defined. It just so happens that the API for those saved sampling geometries is a simple core spec for real-world sampling so we are kind of tagging that use case on here with an eye on the future. |
Reserving collections to handle (possibly ad-hoc) collections of sampled geometries makes sense to me. But a short notation to
In many cases there are arbitrarily many available locations. From data cubes, you may sample from any location with any accuracy. |
The idea with /locations and possibly using /collections/{id}/items is not to require all possible sampling features. It's a convenience to allow exposure of them however an API wants to. I could see it used for pre-defined locations, monitoring stations, caching / saving user submitted geometries, etc. |
Some definitions from API Features: Using substitution: a Feature Collection is a set of features from a collection of data. We can argue that the terms "set" and "collection" are synonymous. What then is a collection? How about: "A body of resources that belong or are used together. An aggregate, set, or group of related resources." (API-Common Part 2: Collections) This definition is derived from Webster's definitions for collection, set, and aggregate. The items in the collection are untyped. If they are data, then we have a dataset (a collection of data). You can also have a collection of styles (is a style data?), or processes, or anything else you may want to collect. "/collections" is where you go to find out about the collections available from this API. Note that "/collections" returns metadata. It does not specify or assume a type for the items in the collections. Nor does it require that the collections all be of the same type. |
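As a rough illustration of that last point, here is a minimal Python sketch of a "/collections" response whose entries describe different kinds of resources. The field names (`id`, `title`, `itemType`) and the sample entries are assumptions made for this example, not quoted from API-Common.

```python
# Hypothetical /collections metadata: each entry only describes a collection.
collections_response = {
    "collections": [
        {"id": "stations", "title": "Monitoring stations", "itemType": "feature"},
        {"id": "map-styles", "title": "Cartographic styles", "itemType": "style"},
        {"id": "forecast", "title": "Forecast grid"},  # no item type declared
    ]
}


def describe(response):
    """Return (id, item type) pairs; a missing type is reported as 'unspecified'."""
    return [
        (c["id"], c.get("itemType", "unspecified"))
        for c in response["collections"]
    ]


summary = describe(collections_response)
```

Nothing in the listing forces the three entries to share a type, which is the behaviour described above.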
What's your point? Are you arguing that EDR should use the collections endpoint as a cataloging mechanism for EDR datasets or do you support the proposal that I put forward -- that EDR should not take on the cataloging use case but, rather, focus efforts on representation of a single EDR dataset? |
The definition of Collection is not a new issue. From an OGC 1999 document: "Much fundamental work on Feature Collection is needed. What are the fundamental classes and |
@dblodgett-usgs Neither. Just trying to lay out some core concepts. What is a collection? What is a dataset? How do they relate to each other? Without a common understanding of these terms, we will continue to talk past each other. |
@chris-little 99-110 is about Feature Collections. But not all resources exposed through OGC APIs are Features. So not all collections are Feature Collections. I think we will make better progress, and have more coherent discussions, if we acknowledge that Collection and Feature Collection are two separate concepts. |
@dblodgett-usgs Why not do both? If you are exposing one dataset then branch it directly off of the landing page. If you have more than one, then use the /collections construct. |
Thanks, Chuck. I agree that we need common definitions. When we are talking about URL path semantics, definitions are nuanced beyond dictionary definitions by the social and web-engineering context. What we are talking about here is the " re:
Fair question. I think doing both adds a lot of complexity and re-invents a wheel that is being established elsewhere, e.g. OGC-API Records, various EO-related cataloging efforts, etc. I think from a best practice point of view, the argument is that the EDR API should have a single responsibility: providing access to an EDR Dataset. Providing cataloging for EDR Datasets should be the responsibility of another OGC-API standard (records?). |
@dblodgett-usgs But there is a requirement to group EDR 'datasets'. A typical production dataset of an NWP or Climate model, whether meteorological or oceanographic, will have multiple vertical coordinates. We have agreed that these be split apart to make EDR datasets, each of which has one consistent (4D) CRS. There is a case that these be grouped for EDR purposes, because they are closely tied: they are the same production dataset. This may be too detailed for some catalogues or not efficient enough, or too messy to implement in a production environment. The workflow we have been using is: A user wishing to compare several forecasts from different providers, or different times, or different experiments, needs to interrogate several collections in turn. Hence a collection of collections, or group of groups, is needed. I propose that we leave this functionality in the current version of the EDR API to minimise the dependencies on other OGC APIs. When we have cross-consulted the other API SWGs and they have demonstrated that their implementations work and have the functionality that we require, we can then simplify the EDR API and take the collection/group function out. |
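To make the grouping requirement concrete, here is a minimal sketch in Python. The dataset identifiers, the `production` and `vertical` keys, and the helper name are all hypothetical; it only shows several single-CRS EDR datasets being gathered under the production dataset they were split from.

```python
from collections import defaultdict

# Hypothetical EDR datasets, each with exactly one vertical coordinate.
edr_datasets = [
    {"id": "model-a-pressure", "production": "model-a", "vertical": "pressure"},
    {"id": "model-a-height", "production": "model-a", "vertical": "height"},
    {"id": "model-b-depth", "production": "model-b", "vertical": "depth"},
]


def group_by_production(datasets):
    """Map each production dataset name to the ids of its EDR datasets."""
    groups = defaultdict(list)
    for d in datasets:
        groups[d["production"]].append(d["id"])
    return dict(groups)


groups = group_by_production(edr_datasets)
```

A client comparing forecasts would then walk one group at a time rather than discovering each split dataset separately.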
@chris-little I'm all for having a grouping / nested group capability. However, given that what we are grouping are not feature collections, I don't think they should be grouped under a Let me dive into the argument a little bit... sorry this is kinda long. I'm thinking of use cases like displaying the list of collections an OGC-API instance that conforms to Features-core provides. (example) If you have typed collections, that list has to be nuanced by some formal type -- more complicated than media-type. We don't have that ability now and adding it is complexity we don't need. The other option is pre-fetching and introspection of hypermedia -- something we also know people just don't do. The alternative here is hard coding things specific to API instances -- a pure anti-pattern when it comes to reusability. I appreciate that the debate on this is still in full swing and can agree to leaving the |
Again, I urge you to look back at the |
This issue is about reserving the |
Indeed. Collections should make sense, by being homogeneous on at least one dimension. That's what is currently being proposed for the 'ObservationCollection' class in the O&M revision - https://github.com/opengeospatial/om-swg |
@dblodgett-usgs Perhaps we need to pull into our standard the update to API-Common, Part 1: Core, then look at what is now in API-Common, Part 2: Collections? |
Yes -- I think we should look at how we are going to incorporate Part 2, collections. The question remains, will we use Part 2 and limit our use of collections to feature collections? e.g.
Where an EDR-API has a catalog of EDR Resource Collections and introduces a collection or option 2:
Where we use the literal |
As discussed on https://github.com/opengeospatial/Environmental-Data-Retrieval-API/wiki/2020-05-21, there is a third option -- or perhaps a refinement of 2. The EDR query pattern ends up in the
Need to follow up with API Common and others. |
Based on the outcome of opengeospatial/ogcapi-common#140 (opengeospatial/ogcapi-common#140 (comment) is especially relevant) keeping the current approach to collections is probably going to be best. Given that
We need to take on the complexity of listing available spatial data resources at the collection level. These slides are being used to document the complete logic for this such that the community can move forward using this scheme with a common understanding of why. Above all else, we are agreeing that a collection is: "A geospatial data resource that may be available as one or more sub-resource distributions that conform to one or more OGC API standards." and that: Any OGC API that uses the /collection path should define their resource as a representation of a collection of geospatial data. So in the context of EDR: An EDR resource is a collection of spatiotemporal data that can be sampled using OGC-API Environmental Data Retrieval query patterns. |
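A minimal sketch of what that definition could mean for a client, in Python. The collection records, the `conformsTo` key, and the short standard labels are assumptions for illustration only; a real API would advertise this through its own metadata and links.

```python
# Hypothetical collection metadata: each collection lists the OGC API
# standards that its sub-resource distributions conform to.
collections = [
    {"id": "river-gauges", "conformsTo": ["features"]},
    {"id": "air-temperature", "conformsTo": ["edr"]},
    {"id": "precipitation", "conformsTo": ["edr", "features"]},
]


def supporting(collections, standard):
    """Return ids of collections with a distribution for the given standard."""
    return [c["id"] for c in collections if standard in c["conformsTo"]]


edr_collections = supporting(collections, "edr")
```

Here one collection is reachable through both kinds of distribution, which is the "one or more OGC API standards" case in the definition.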
On https://github.com/opengeospatial/Environmental-Data-Retrieval-API/wiki/2020-06-04 we agreed that this issue can be closed and we can move forward with:
As long as we implement some changes implied in: opengeospatial/ogcapi-common#140 (comment). #71 and #72 are follow-up issues that need discussion. We need to add the definition of an EDR resource I gave above to the spec, which I will do with a commit that closes this issue. |
(updated to reflect latest estimate of terminology and move proposal to top on 4-9-20)
Proposal: EDR should not use /collections/
EDR should focus on query patterns for distribution of a dataset such as are organized in THREDDS catalogs.
EDR dataset metadata would be included in the landing page or a clearly linked metadata search API that provides cataloging over many EDR endpoints.
/position, /area, /location would be allowed off the root of an EDR compliant API.
/collections would be reserved for use with API-Features feature collections -- which may be useful to an API that also implements the EDR query patterns but would not be required. See #28 for more on this front.
/groups as explored in opengeospatial/EDR-API-Sprint#14 would work the same only they would be aggregations of EDR datasets rather than EDR collections.
Summary of current situation
At present, we have planned to overload the collections endpoint such that:
Would return a list of available EDR collections with links and collection metadata. The premise is that EDR will have a unique collection metadata json-schema.
Would return collection metadata for the selected collection.
would not be required because an EDR collection may not be accessible as a collection of items.
With one or more allowed queryTypes would be the pattern to access EDR functions.
Consideration 1: collections is already in API Features as feature collections.
There have been some strong arguments to keep the typing of OGC API resources simple, e.g. what you get back when you hit a collection endpoint shouldn't change depending on which OGC API you are hitting. Given that collections is already defined in API-Features, by overloading the endpoint we are going against this. A client that thinks it knows what a collection is will hit an EDR API and get something much more than feature collection metadata.
Consideration 2: collections is a convenient data-user-centric wrapper around useful stuff.
@cmheazel has a great discussion in the API-Common wiki. I'll leave a couple of quotes from the wiki here.
...
Consideration 3: collections is used as an analogue for dataset.
In this issue-ending comment, @heidivanparys lays out the case for typing collection with care (e.g. feature collection is a type of collection) and this well-reasoned comment that dataset ≠ (feature) collection.
Additionally, in API-Features Core, an API instance is limited to one and only one dataset which can have one or more collections. I recently poked around this issue in #33 which leads me to want to push for EDR to focus on describing EDR "datasets" and not EDR Collections.
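For comparison, the two path layouts discussed in this issue can be sketched in Python. The function names and the `example` collection id are assumptions for this illustration. The query names come from the proposal; the nested shape of the overloaded layout is implied by the summary of the current situation but is not spelled out in the text here, so treat it as a guess.

```python
QUERY_TYPES = ("position", "area", "location")  # named in the proposal


def root_path(query_type):
    """Proposed layout: query patterns sit directly off the API root."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type}")
    return f"/{query_type}"


def collection_path(collection_id, query_type):
    """Overloaded layout (assumed shape): query type nested under a collection."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type}")
    return f"/collections/{collection_id}/{query_type}"


proposed = [root_path(q) for q in QUERY_TYPES]
current = [collection_path("example", q) for q in QUERY_TYPES]
```

Under the proposed layout an API serves exactly one EDR dataset, so no collection id appears in the path.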