Test schema org #82

Merged
merged 7 commits into from
Dec 11, 2018

Conversation

jyucsiro
Contributor

First cut.

See #80

@jyucsiro jyucsiro requested review from adamml and marqh November 19, 2018 10:54
@jyucsiro jyucsiro self-assigned this Nov 19, 2018
@jyucsiro
Contributor Author

jyucsiro commented Nov 20, 2018

This allows netCDF metadata on the web to be tested, e.g. with a URL passed in like so:

$ python nc2rdf.py -o json-ld --schema-org https://www.ngdc.noaa.gov/thredds/dodsC/arctic/Polar-APP-X_v01r01_Nhem_1400_d20160801_c20160803.nc

result:

{
    "@context": "http://schema.org/",
    "description": "The Extended AVHRR Polar Pathfinder (APP-x) version-2 Thematic Climate Data Record (CDR) includes surface temperature, surface albedo, surface and the Top Of the Atmosphere (TOA) shortwave and longwave radiative fluxes, cloud properties (amount, phase, particle size, optical depth,top pressure and temperature, surface and TOA radiative effect), and ice thickness and age. The APP-x CDR has twice daily data at local solar times of 14 and 04(02) for the Arctic(Antarctic) at a spatial resolution of 25 km for both poles.",
    "http://schema.org/identifier": "Polar-APP-X_v01r01_Nhem_1400_d20160801_c20160803.nc",
    "http://schema.org/license": "No restrictions on access or use",
    "id": "_:N1ffe69b6dd6c4ceb895f7c79d3c17b2d",
    "keywords": "EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC RADIATION > SOLAR RADIATION, EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC RADIATION > LONGWAVE RADIATION, EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC RADIATION > RADIATIVE FLUX, EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD MICROPHYSICS > CLOUD DROPLET CONCENTRATION(SIZE), EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD MICROPHYSICS > CLOUD OPTICAL DEPTH(THICKNESS), EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD MICROPHYSICS > CLOUD DROPLET PHASE, EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD PROPERTIES > CLOUD TOP PRESSURE, EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD PROPERTIES > CLOUD TOP TEMPERATURE, EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD PROPERTIES > CLOUD TYPE, EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD PROPERTIES > CLOUD FRACTION, EARTH SCIENCE > ATMOSPHERE > CLOUDS > CLOUD RADIATIVE TRANSFER > CLOUD RADIATIVE FORCING, EARTH SCIENCE > CLIMATE INDICATORS > CRYOSPHERIC INDICATORS > ICE DEPTH(THICKNESS), EARTH SCIENCE > CRYOSPHERE > SEA ICE > ICE DEPTH(THICKNESS),EARTH SCIENCE > CRYOSPHERE > SNOW and ICE > ALBEDO, EARTH SCIENCE > CRYOSPHERE > SNOW and ICE > SNOW and ICE TEMPERATURE, EARTH SCIENCE > LAND SURFACE > LAND TEMPERATURE, EARTH SCIENCE > LAND SURFACE > LAND ALBEDO, EARTH SCIENCE > OCEANS > SEA ICE > ICE DEPTH(THICKNESS),EARTH SCIENCE > TERRESTRIAL HYDROSPHERE > SNOW and ICE > ALBEDO, EARTH SCIENCE > TERRESTRIAL HYDROSPHERE > SNOW and ICE > ICE DEPTH(THICKNESS), ",
    "name": "Extended AVHRR Polar Pathfinder Fundamental Climate Data Record (APPx CDR)",
    "type": "Dataset"
}

@adamml
Contributor

adamml commented Nov 20, 2018

@jyucsiro I think you should (strictly) prefix the "type" and "id" keys with @ symbols, so that they become "@type" and "@id". Without the prefix they parse OK, but the @-prefixed forms are the keywords JSON-LD reserves for these definitions.

On the identifier and license keys, the "http://schema.org/" prefix is not required.

As we discussed on the call, it would be nice to split those GCMD keywords into an array...
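A minimal sketch of the keyword split suggested above, assuming that no individual GCMD keyword contains a comma. The helper name `split_keywords` is hypothetical and is not taken from nc2rdf. It tolerates the missing spaces after some commas and the trailing ", " seen in the example output.

```python
def split_keywords(raw):
    """Split a comma-separated keyword string into a list of trimmed entries."""
    # Assumption: commas only ever separate keywords, never appear inside one.
    # Empty fragments (e.g. from a trailing ", ") are dropped.
    return [part.strip() for part in raw.split(",") if part.strip()]


# Shortened sample taken from the "keywords" value in the output above.
raw = (
    "EARTH SCIENCE > CRYOSPHERE > SEA ICE > ICE DEPTH(THICKNESS),"
    "EARTH SCIENCE > CRYOSPHERE > SNOW and ICE > ALBEDO, "
    "EARTH SCIENCE > LAND SURFACE > LAND ALBEDO, "
)
keywords = split_keywords(raw)
```

The resulting list could then be emitted as a JSON array under "keywords" instead of a single long string.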

@@ -670,8 +670,10 @@ def load(afilepath):
        loader = netCDF4.Dataset
    else:
        raise ValueError('filepath suffix not supported: {}'.format(afilepath))
    if not os.path.exists(afilepath):
        raise IOError('{} not found'.format(afilepath))
    #Disable this check for now to allow URL input
Member

do both the hdf and netcdf APIs support loading from URI?

if so, we should just let them do that

is that what you are doing in this case? does it already work?

Contributor Author

I haven't checked with HDF, but it seems to work with the netCDF APIs.

Yes, that's what I'm doing in this case - both local and from URLs (mainly netCDF files served through thredds).
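One possible alternative to disabling the existence check entirely is to skip it only for URL input. This is a rough sketch, not what the PR implements: `is_remote` and `check_exists` are hypothetical helper names, and it assumes that only http(s) URLs (such as THREDDS OPeNDAP endpoints) need to bypass the local check.

```python
import os
from urllib.parse import urlparse


def is_remote(path):
    """Return True when path looks like an http(s) URL rather than a local file."""
    return urlparse(path).scheme in ("http", "https")


def check_exists(path):
    """Raise IOError for a missing local file; leave URLs to the loader."""
    if not is_remote(path) and not os.path.exists(path):
        raise IOError('{} not found'.format(path))
```

With this shape, a missing local file still fails early with a clear message, while a URL is passed straight through to the netCDF loader, which reports its own error if the resource cannot be opened.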

@jyucsiro
Contributor Author

jyucsiro commented Nov 20, 2018

> @jyucsiro I think you should (strictly) prefix the "type" and "id" keys with @ symbols, so that they become "@type" and "@id". Without the prefix they parse OK, but the @-prefixed forms are the keywords JSON-LD reserves for these definitions.

Thanks @adamml. I'm using the rdflib-json-ld serializer (https://github.com/RDFLib/rdflib-jsonld). For some reason, for the example above, it doesn't add "type" and "id" with those symbols. Tinkering around https://json-ld.org/playground/, it looks like the output is a 'compacted' version, where "type" and "id" are implicitly equivalent to "@type" and "@id"?

I tried getting the json-ld serializer to output those '@' symbols but haven't had much success.
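If the serializer cannot be persuaded to emit the keyword forms, one workaround might be to rename the keys after serialization. The sketch below is not a feature of rdflib-jsonld; `use_keywords` and `ALIASES` are hypothetical names. It assumes the context aliases "id" and "type" to "@id" and "@type", and that no ordinary property in the document is literally named "id" or "type".

```python
import json

# Assumed aliases: compacted key -> JSON-LD keyword.
ALIASES = {"id": "@id", "type": "@type"}


def use_keywords(node):
    """Recursively rename aliased keys to their JSON-LD keyword forms."""
    if isinstance(node, dict):
        # The @context value is left untouched so its own term
        # definitions are not rewritten.
        return {
            ALIASES.get(key, key): value if key == "@context" else use_keywords(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [use_keywords(item) for item in node]
    return node


doc = json.loads(
    '{"@context": "http://schema.org/", "id": "_:b0", '
    '"type": "Dataset", "name": "APPx CDR"}'
)
fixed = use_keywords(doc)
```

Here `fixed` carries "@id" and "@type" in place of "id" and "type", with all other keys and values unchanged.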

> On the identifier and license keys, the "http://schema.org/" prefix is not required.

Can you clarify? I'm not sure where to find this.
I see what you mean - it was from another example on #80

The issue will be then which namespace the identifier/license properties will come from. This link (https://developers.google.com/search/docs/data-types/dataset) lists license and identifier...

> As we discussed on the call, it would be nice to split those GCMD keywords into an array...

Yes, that would be good.

Should the split happen when the metadata is parsed into the native BALD graph, or when converting to schema.org?

I've implemented this in nc2rdf for now. @marqh what do you think?

@adamml
Contributor

adamml commented Nov 21, 2018

@jyucsiro

> The issue will be then which namespace the identifier/license properties will come from. This link (https://developers.google.com/search/docs/data-types/dataset) lists license and identifier...

Because you have {"@context": "http://schema.org", ...}, writing the identifier/license properties without a prefix already says they come from the schema.org namespace.
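The point above can be illustrated with a deliberately simplified sketch. It is not a JSON-LD processor, and it assumes that the context maps bare terms into a single vocabulary base (here taken to be "http://schema.org/"). `expand_key` and `VOCAB` are hypothetical names.

```python
VOCAB = "http://schema.org/"  # assumed vocabulary base supplied by the context


def expand_key(key, vocab=VOCAB):
    """Expand a bare term against the vocabulary; leave keywords and IRIs alone."""
    if key.startswith("@") or ":" in key:
        return key
    return vocab + key


short = {"identifier": "x.nc", "license": "No restrictions on access or use"}
full = {
    "http://schema.org/identifier": "x.nc",
    "http://schema.org/license": "No restrictions on access or use",
}

# Under the stated assumption, both spellings name the same properties.
expanded = {expand_key(key): value for key, value in short.items()}
```

Under that assumption, `expanded` and `full` are equal, which is why the explicit "http://schema.org/" prefix on those keys adds nothing.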

@marqh
Member

marqh commented Dec 11, 2018

thank you for all the input, this looks good to go

@marqh marqh merged commit c6b8cb1 into binary-array-ld:master Dec 11, 2018

3 participants