How to represent dataset source? #210
Comments
Isn't that similar to what we are discussing in #179? Sounds like this could be solved by the
It is a very simple case of provenance. But while specifying provenance in general is very hard and not always needed, specifying the data source (especially in the case of EE, which is mostly a mirror) is very easy and always necessary.
In #179 it is proposed to just link them using a link with rel type derived_from until there is a more concrete standard to describe provenance. I think these issues are closely related and should be discussed together. I would imagine a newly created standard to describe provenance would also include something to link to the source? Other than that, I think most parts of my first comment in #225 also apply here...
When I first started thinking about this I did think there'd be a 'derived_from' as well as a 'copy_of' (or something like that). The first represents that there was processing done, the second just that it's stored in a different location but points back to where it came from. As mentioned in #225, I'd see 'copy_of' as including the 'metadata processing': unzipping, putting into a COG, etc. I'm hesitant to add two more core link relationships, since derived_from seems like a stretch with few implementations and we already have that one in. I could see an extra attribute on derived_from that indicates that it is just a copy, not actual processing. Or we just make a new link type, but 'incubate' it for a bit. But I'd say if GEE uses it, and we also have Sentinel & Landsat in AWS link back to their 'source', then it'd be pretty easy to bring into the core.
Recently we added the rel type via, which links back to the original metadata. The rel type derived_from links to the STAC Item for the source data. Having that should be enough, I think.
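To make the two relation types above concrete, here is a minimal sketch in Python that builds a links array and prints it as JSON. It assumes that via is acceptable for pointing at the upstream MOD13Q1 directory from the original question, which the thread does not confirm (via was described as linking to the original metadata). The make_link helper and the example.com Item URL are hypothetical placeholders, not part of any spec or library.

```python
import json


def make_link(rel, href):
    """Hypothetical helper: build one STAC-style link object."""
    # Strip stray whitespace so the href is a clean URL.
    return {"rel": rel, "href": href.strip()}


links = [
    # "via": points back to the original source. Using it for the upstream
    # file directory (rather than a metadata document) is an assumption.
    make_link("via", "https://e4ftl01.cr.usgs.gov/MOLT/MOD13Q1.006/"),
    # "derived_from": points to the STAC Item of the source data.
    # Placeholder URL for illustration only.
    make_link("derived_from", "https://example.com/stac/source-item.json"),
]

# Serialize the fragment the way it would appear in a catalog entry.
print(json.dumps({"links": links}, indent=2))
```

For a pure mirror with no processing, the via link alone may be sufficient; derived_from is included only to show how the two would sit side by side.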
For hosted datasets (e.g., the MODIS MOD13Q1 dataset in the EE catalog) I want to point at the original file source. Should I use links for that? If yes, what do I put as 'rel' instead of XXX?
"links": [
  { "rel": "XXX", "href": "https://e4ftl01.cr.usgs.gov/MOLT/MOD13Q1.006/" }
]
If not, should I create a separate Host section?