From a155475b40a350dd6110c0f8859eb49f06292600 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 5 May 2020 12:36:46 +0200 Subject: [PATCH 1/3] Collection-level assets #778 #779 --- CHANGELOG.md | 1 + extensions/README.md | 35 +++---- extensions/collection-assets/README.md | 38 ++++++++ .../examples/example-esm.json | 97 +++++++++++++++++++ .../collection-assets/json-schema/schema.json | 22 +++++ item-spec/json-schema/item.json | 15 +-- 6 files changed, 185 insertions(+), 23 deletions(-) create mode 100644 extensions/collection-assets/README.md create mode 100644 extensions/collection-assets/examples/example-esm.json create mode 100644 extensions/collection-assets/json-schema/schema.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 62a313392..d237f47fd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0. - Several new sections to 'best practices' document. - Added the ability to define Item properties under Assets (item-spec/item-spec.md) - Add `proj:shape` and `proj:transform` to the projections extension. +- Collection-level assets extension ### Changed - Moved item recommendations to best practices, and added a bit more in item spec about 'search' diff --git a/extensions/README.md b/extensions/README.md index c54befaea..12694803d 100644 --- a/extensions/README.md +++ b/extensions/README.md @@ -44,23 +44,24 @@ stable for over a year and are used in twenty or more implementations. An extension can add new fields to STAC entities (content extension), or can add new endpoints or behavior to the API (API extension). Below is a list of content extensions, while API extensions are published in the [STAC API repository](https://github.com/radiantearth/stac-api-spec/tree/master/extensions/). 
-| Extension Title | Identifier | Field Name Prefix | Scope | Maturity | Description | -| ---------------------------------------------- | ---------------- | ------------------- | ------------------------- | ---------- | ---------------------------------- | -| [Checksum](checksum/README.md) | checksum | checksum | Item, Catalog, Collection | *Proposal* | Provides a way to specify file checksums for assets and links in Items, Catalogs and Collections. | -| [Commons](commons/README.md) | commons | - | Item, Collection | *Proposal* | Provides a way to specify data fields in a collection that are common across the STAC Items in that collection, so that each does not need to repeat all the same information. | -| [Data Cube](datacube/README.md) | datacube | cube | Item, Collection | *Proposal* | Data Cube related metadata, especially to describe their dimensions. | -| [Electro-Optical](eo/README.md) | eo | eo | Item | *Pilot* | Covers electro-optical data that represents a snapshot of the earth for a single date and time. It could consist of multiple spectral bands, for example visible bands, infrared bands, red edge bands and panchromatic bands. The extension provides common fields like bands, cloud cover, gsd and more. | -| [Item Asset Definition](item-assets/README.md) | item-assets | - | Collection | *Proposal* | Provides a way to specify details about what assets may be found in Items belonging to a collection. | -| [Label](label/README.md) | label | label | Item | *Proposal* | Items that relate labeled AOIs with source imagery | -| [Point Cloud](pointcloud/README.md) | pointcloud | pc | Item | *Proposal* | Provides a way to describe point cloud datasets. The point clouds can come from either active or passive sensors, and data is frequently acquired using tools such as LiDAR or coincidence-matched imagery. | -| [Projection](projection/README.md) | projection | proj | Item | *Proposal* | Provides a way to describe items whose assets are in a geospatial projection. 
| -| [SAR](sar/README.md) | sar | sar | Item | *Proposal* | Covers synthetic-aperture radar data that represents a snapshot of the earth for a single date and time. | -| [Satellite](sat/README.md) | sat | sat | Item | *Proposal* | Satellite related metadata for data collected from satellites. | -| [Scientific](scientific/README.md) | scientific | sci | Item, Collection | *Proposal* | Scientific metadata is considered to be data that indicate from which publication data originates and how the data itself should be cited or referenced. | -| [Single File STAC](single-file-stac/README.md) | single-file-stac | - | ItemCollection | *Proposal* | An extension to provide a set of Collections and Items as a single file catalog. | -| [Tiled Assets](tiled-assets/README.md) | tiled-assets | tiles | Item, Catalog, Collection | *Proposal* | Allows to specify numerous assets using asset templates via tile matrices and dimensions. | -| [Versioning Indicators](version/README.md) | version | - | Item, Collection | *Proposal* | Provides fields and link relation types to provide a version and indicate deprecation. | -| [View Geometry](view/README.md) | view | view | Item | *Proposal* | View Geometry adds metadata related to angles of sensors and other radiance angles that affect the view of resulting data | +| Extension Title | Identifier | Field Name Prefix | Scope | Maturity | Description | +| ------------------------------------------------ | ----------------- | ------------------- | ------------------------- | ---------- | ----------- | +| [Checksum](checksum/README.md) | checksum | checksum | Item, Catalog, Collection | *Proposal* | Provides a way to specify file checksums for assets and links in Items, Catalogs and Collections. | +| [Collection Assets](collection-assets/README.md) | collection-assets | - | Collection | *Proposal* | Provides a way to specify assets available on the collection-level. 
| +| [Commons](commons/README.md) | commons | - | Item, Collection | *Proposal* | Provides a way to specify data fields in a collection that are common across the STAC Items in that collection, so that each does not need to repeat all the same information. | +| [Data Cube](datacube/README.md) | datacube | cube | Item, Collection | *Proposal* | Data Cube related metadata, especially to describe their dimensions. | +| [Electro-Optical](eo/README.md) | eo | eo | Item | *Pilot* | Covers electro-optical data that represents a snapshot of the earth for a single date and time. It could consist of multiple spectral bands, for example visible bands, infrared bands, red edge bands and panchromatic bands. The extension provides common fields like bands, cloud cover, gsd and more. | +| [Item Asset Definition](item-assets/README.md) | item-assets | - | Collection | *Proposal* | Provides a way to specify details about what assets may be found in Items belonging to a collection. | +| [Label](label/README.md) | label | label | Item | *Proposal* | Items that relate labeled AOIs with source imagery | +| [Point Cloud](pointcloud/README.md) | pointcloud | pc | Item | *Proposal* | Provides a way to describe point cloud datasets. The point clouds can come from either active or passive sensors, and data is frequently acquired using tools such as LiDAR or coincidence-matched imagery. | +| [Projection](projection/README.md) | projection | proj | Item | *Proposal* | Provides a way to describe items whose assets are in a geospatial projection. | +| [SAR](sar/README.md) | sar | sar | Item | *Proposal* | Covers synthetic-aperture radar data that represents a snapshot of the earth for a single date and time. | +| [Satellite](sat/README.md) | sat | sat | Item | *Proposal* | Satellite related metadata for data collected from satellites. 
| +| [Scientific](scientific/README.md) | scientific | sci | Item, Collection | *Proposal* | Scientific metadata is considered to be data that indicate from which publication data originates and how the data itself should be cited or referenced. | +| [Single File STAC](single-file-stac/README.md) | single-file-stac | - | ItemCollection | *Proposal* | An extension to provide a set of Collections and Items as a single file catalog. | +| [Tiled Assets](tiled-assets/README.md) | tiled-assets | tiles | Item, Catalog, Collection | *Proposal* | Allows to specify numerous assets using asset templates via tile matrices and dimensions. | +| [Versioning Indicators](version/README.md) | version | - | Item, Collection | *Proposal* | Provides fields and link relation types to provide a version and indicate deprecation. | +| [View Geometry](view/README.md) | view | view | Item | *Proposal* | View Geometry adds metadata related to angles of sensors and other radiance angles that affect the view of resulting data | ## Third-party / vendor extensions diff --git a/extensions/collection-assets/README.md b/extensions/collection-assets/README.md new file mode 100644 index 000000000..c773ca2e4 --- /dev/null +++ b/extensions/collection-assets/README.md @@ -0,0 +1,38 @@ +# Collection Assets Extension Specification + +- **Title: Collection Assets** +- **Identifier: collection-assets** +- **Field Name Prefix: -** +- **Scope: Collection** +- **Extension [Maturity Classification](../README.md#extension-maturity): Proposal** + +A Collection extension to provide a way to specify assets available on the collection-level. + +- [Example](examples/example-esm.json) +- [JSON Schema](json-schema/schema.json) + +This extension introduces a single new field, `assets` at the top level of a collection. +An Asset Object defined at the Collection level is the same as the [Asset Object in Items](../../item-spec/item-spec.md#asset-object). 
+ +Collection-level assets MUST NOT list any files also available in items. +If possible, item-level assets are always the preferable way to expose assets. +To list what assets are available in items see the [Item Assets Definition Extension](../item-assets/README.md). + +Collection-level assets can be useful in some scenarios, for example: +1. Exposing additional data that applies collection-wide and that you don't want to repeat in each Item. This can be collection-level metadata or a thumbnail for visualization purposes. +2. Individual items can't be properly distinguished for some data structures, e.g. [Zarr](https://zarr.readthedocs.io/), which is not contained in single files. +3. Exposing assets for "[Standalone Collections](https://github.com/radiantearth/stac-spec/blob/master/collection-spec/collection-spec.md#standalone-collections)". + +## Collection fields + +| Field Name | Type | Description | +| ---------- | ---------------------------------------------------------------------- | ----------- | +| assets | Map<string, [Asset Object](../../item-spec/item-spec.md#asset-object)> | **REQUIRED.** Dictionary of asset objects that can be downloaded, each with a unique key. | + +**assets**: In general, the keys don't have any meaning and are considered to be non-descriptive unique identifiers. +Providers may assign any meaning to the keys for their respective use cases, but must not expect that clients understand them. +To better communicate the purpose of an asset, use the `roles` field in the [Asset Object](../../item-spec/item-spec.md#asset-object). + +## Implementations + +- The [ESM collection spec](https://github.com/NCAR/esm-collection-spec) uses this extension to expose Zarr archives. 
diff --git a/extensions/collection-assets/examples/example-esm.json b/extensions/collection-assets/examples/example-esm.json new file mode 100644 index 000000000..18d398655 --- /dev/null +++ b/extensions/collection-assets/examples/example-esm.json @@ -0,0 +1,97 @@ +{ + "stac_version": "0.9.0", + "stac_extensions": [ + "collection-assets", + "https://github.com/NCAR/esm-collection-spec/tree/v0.2.0/schema.json" + ], + "id": "pangeo-cmip6", + "title": "Google CMIP6", + "description": "This is an ESM collection for CMIP6 Zarr data residing in Pangeo's Google Storage.", + "extent": { + "spatial": { + "bbox": [[-180, -90, 180, 90]] + }, + "temporal": { + "interval": [["1850-01-15T12:00:00Z", "2014-12-15T12:00:00Z"]] + } + }, + "providers": [ + { + "name": "World Climate Research Programme", + "roles": ["producer", "licensor"], + "url": "https://www.wcrp-climate.org/wgcm-cmip/wgcm-cmip6" + }, + { + "name": "The Pangeo Project", + "roles": ["processor"], + "url": "https://console.cloud.google.com/pangeo.io" + }, + { + "name": "Google", + "roles": ["host"], + "url": "https://console.cloud.google.com/marketplace/details/noaa-public/cmip6" + } + ], + "license": "proprietary", + "links": [ + { + "href": "https://pcmdi.llnl.gov/CMIP6/TermsOfUse/TermsOfUse6-1.html", + "type": "text/html", + "rel": "license", + "title": "CMIP6: Terms of Use" + } + ], + "assets": { + "thumbnail": { + "href": "logo.png", + "title": "A preview image for visualization.", + "type": "image/png", + "roles": ["thumbnail"] + }, + "catalog": { + "href": "sample-pangeo-cmip6-zarr-stores.csv", + "title": "Catalog", + "description": "Path to the CSV file with the catalog contents.", + "type": "text/csv", + "roles": ["esm-catalog"], + "esm:column_name": "path" + }, + "activity_id": { + "href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_activity_id.json", + "type": "application/json", + "roles": ["esm-vocabulary"], + "esm:column_name": "activity_id" + }, + "source_id": { + "href": 
"https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_source_id.json", + "type": "application/json", + "roles": ["esm-vocabulary"], + "esm:column_name": "source_id" + }, + "institution_id": { + "href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_institution_id.json", + "type": "application/json", + "roles": ["esm-vocabulary"], + "esm:column_name": "institution_id" + }, + "experiment_id": { + "href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_experiment_id.json", + "type": "application/json", + "roles": ["esm-vocabulary"], + "esm:column_name": "experiment_id" + }, + "table_id": { + "href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_table_id.json", + "type": "application/json", + "roles": ["esm-vocabulary"], + "esm:column_name": "table_id" + }, + "grid_label": { + "href": "https://raw.githubusercontent.com/WCRP-CMIP/CMIP6_CVs/master/CMIP6_grid_label.json", + "type": "application/json", + "roles": ["esm-vocabulary"], + "esm:column_name": "grid_label" + } + }, + "esm:attributes": ["activity_id", "source_id", "institution_id", "experiment_id", "member_id", "table_id", "variable_id", "grid_label"] +} \ No newline at end of file diff --git a/extensions/collection-assets/json-schema/schema.json b/extensions/collection-assets/json-schema/schema.json new file mode 100644 index 000000000..741fbd76b --- /dev/null +++ b/extensions/collection-assets/json-schema/schema.json @@ -0,0 +1,22 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "schema.json#", + "title": "Collection Assets Extension Specification", + "description": "STAC Collection-level assets Extension to a STAC Collection", + "allOf": [ + { + "$ref": "../../../collection-spec/json-schema/collection.json" + }, + { + "type": "object", + "required": [ + "assets" + ], + "properties": { + "assets": { + "$ref": "../../../item-spec/json-schema/item.json#/definitions/assets" + } + } + } + ] +} \ No newline at end of 
file diff --git a/item-spec/json-schema/item.json b/item-spec/json-schema/item.json index 57d05557c..f54547a19 100644 --- a/item-spec/json-schema/item.json +++ b/item-spec/json-schema/item.json @@ -108,12 +108,7 @@ } }, "assets": { - "title": "Asset links", - "description": "Links to assets", - "type": "object", - "additionalProperties": { - "$ref": "#/definitions/asset" - } + "$ref": "#/definitions/assets" }, "properties": { "allOf": [ @@ -170,6 +165,14 @@ } } }, + "assets": { + "title": "Asset links", + "description": "Links to assets", + "type": "object", + "additionalProperties": { + "$ref": "#/definitions/asset" + } + }, "asset": { "type": "object", "required": [ From 0366d84a5406e7d30b177cea7349411fcf5c8d61 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 7 May 2020 21:28:07 +0200 Subject: [PATCH 2/3] Add collection-assets to collection schema --- collection-spec/json-schema/collection.json | 1 + 1 file changed, 1 insertion(+) diff --git a/collection-spec/json-schema/collection.json b/collection-spec/json-schema/collection.json index 37f445086..d4a609729 100644 --- a/collection-spec/json-schema/collection.json +++ b/collection-spec/json-schema/collection.json @@ -37,6 +37,7 @@ "type": "string", "enum": [ "asset", + "collection-assets", "commons", "checksum", "datacube", From d7a63bfcccdd559a862e87099669a21bf990a0cc Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 27 May 2020 08:17:04 +0200 Subject: [PATCH 3/3] Update extensions/collection-assets/README.md --- extensions/collection-assets/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/extensions/collection-assets/README.md b/extensions/collection-assets/README.md index c773ca2e4..13e579e93 100644 --- a/extensions/collection-assets/README.md +++ b/extensions/collection-assets/README.md @@ -14,7 +14,7 @@ A Collection extension to provide a way to specify assets available on the colle This extension introduces a single new field, `assets` at the top level of a 
collection. An Asset Object defined at the Collection level is the same as the [Asset Object in Items](../../item-spec/item-spec.md#asset-object). -Collection-level assets MUST NOT list any files also available in items. +Collection-level assets SHOULD NOT list any files also available in items. If possible, item-level assets are always the preferable way to expose assets. To list what assets are available in items see the [Item Assets Definition Extension](../item-assets/README.md).
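
Reviewer note: the rules this series adds can be spot-checked without a full JSON Schema validator. The following is a minimal sketch, not part of the patch and not an official STAC tool. The function name `check_collection_assets` is hypothetical, and the check that every asset carries a string `href` is an assumption taken from the Asset Object definition in the item spec, since the `required` list of `#/definitions/asset` is not visible in these hunks.

```python
from typing import Any, Dict, List


def check_collection_assets(collection: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper mirroring the extension schema in this patch:
    `assets` is required and must be an object of asset objects.
    """
    problems: List[str] = []

    # The example in the patch declares the extension by its identifier.
    if "collection-assets" not in collection.get("stac_extensions", []):
        problems.append("'collection-assets' is not listed in stac_extensions")

    assets = collection.get("assets")
    if not isinstance(assets, dict):
        problems.append("'assets' is required and must be an object")
        return problems

    for key, asset in assets.items():
        if not isinstance(asset, dict):
            problems.append(f"asset '{key}' must be an object")
        # Assumption: the Asset Object requires a string 'href'.
        elif not isinstance(asset.get("href"), str):
            problems.append(f"asset '{key}' is missing a string 'href'")

    return problems


# Trimmed-down version of examples/example-esm.json from this patch.
example = {
    "stac_version": "0.9.0",
    "stac_extensions": ["collection-assets"],
    "id": "pangeo-cmip6",
    "assets": {
        "thumbnail": {
            "href": "logo.png",
            "type": "image/png",
            "roles": ["thumbnail"],
        },
    },
}

print(check_collection_assets(example))
```

The sketch deliberately ignores the SHOULD NOT rule about files that are also listed in items, because checking it would require crawling the Items of the collection.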