Values of hasFeatureType use wrong place type URIs (extra /plone in URI path) #518

Closed
rybesh opened this issue Sep 23, 2024 · 8 comments

rybesh commented Sep 23, 2024

In the RDF datasets, values of pleiades:hasFeatureType have URIs with an extra /plone at the start of the path, like:

https://pleiades.stoa.org//plone/vocabularies/place-types/villa

They should be like:

https://pleiades.stoa.org/vocabularies/place-types/villa

This currently breaks SPARQL joins between places and their type concepts.
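
Until the export is corrected, a consumer of the dataset could normalize the affected URIs on their side. The following is only a sketch of such a workaround, not the project's fix: the helper name `normalize_place_type_uri` is hypothetical, and it assumes the only defects are the doubled slash and the leading `/plone` path segment shown above.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_place_type_uri(uri: str) -> str:
    """Strip the stray leading '/plone' segment from a Pleiades URI.

    Hypothetical helper: assumes the defect is exactly a doubled slash
    followed by a 'plone' path segment, as in the examples in this issue.
    """
    parts = urlsplit(uri)
    # Collapse any run of leading slashes in the path to a single slash.
    path = "/" + parts.path.lstrip("/")
    # Drop the '/plone' prefix only when it is a whole path segment.
    if path.startswith("/plone/"):
        path = path[len("/plone"):]
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


bad = "https://pleiades.stoa.org//plone/vocabularies/place-types/villa"
print(normalize_place_type_uri(bad))
```

URIs that are already correct pass through unchanged, so the helper can be applied to every `hasFeatureType` value before joining.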

@paregorios
Member

The problem currently manifests only in the bulk download in TTL format, which is generated weekly on Sundays and available at https://atlantides.org/downloads/pleiades/rdf/:

ggrep plone *.ttl | more
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/port>,
errata.ttl:        <https://pleiades.stoa.org//plone/vocabularies/place-types/settlement>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/unknown>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/road>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/road>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/villa>;

In evaluating any fix, we should check for these problems not only in the bulk TTL format, but also in both of the individual place-level RDF serializations where the problem is not currently found:

@jessesnyder

@paregorios I regenerated the full archive on staging the way it's generated "in the wild" (via cron), and I think it's looking good. I have not tested the per-Place RDF download, but I didn't find any code paths shared with the bulk export, so I think that will be OK (but please test!)

pleiades-latest.tar.gz

@paregorios
Member

@jessesnyder can you double-check that the gzipped tar file you attached above is the result of the latest staging run with modified code? I'm still seeing /plone/ in it:

grep plone *.ttl | more
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/province>;
errata.ttl:    pleiades:hasFeatureType <https://pleiades.stoa.org//plone/vocabularies/place-types/port>,

and so on across all .ttl files containing hasFeatureType
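
For a repeatable version of the grep check above, something like the following could summarize affected files after each export run. It is a rough sketch under assumptions: the function name `count_bad_type_uris` is made up for illustration, the archive is assumed to be already extracted into a directory of `.ttl` files, and it does a plain text scan rather than parsing the Turtle.

```python
import re
from pathlib import Path

# Matches place-type URIs that still carry the stray 'plone' segment,
# with either one or two slashes after the host name.
BAD_URI = re.compile(
    r"<https://pleiades\.stoa\.org/+plone/vocabularies/place-types/[^>]+>"
)


def count_bad_type_uris(directory):
    """Return {file name: number of bad URIs} for .ttl files in a directory.

    Files without any bad URI are left out of the result, so an empty
    dict means the export looks clean with respect to this issue.
    """
    counts = {}
    for ttl in sorted(Path(directory).glob("*.ttl")):
        text = ttl.read_text(encoding="utf-8")
        found = len(BAD_URI.findall(text))
        if found:
            counts[ttl.name] = found
    return counts
```

Calling `count_bad_type_uris(".")` in the directory where the archive was unpacked should return an empty dict once the fix is in place.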

@jessesnyder

@paregorios The archive does indeed not contain the files I expected it to contain. Not sure what happened. Running the cron task again now.

@jessesnyder

OK @paregorios, take 2:
pleiades-20241009.tar.gz

@paregorios
Member

@jessesnyder that's got it! And the individual serializations are still fine too. Please merge, wrap, ship, refrigerate, and deploy to production.

@jessesnyder

Deployed 🚀

@paregorios
Member

This is fixed in the weekly export of October 13. Closing as resolved.
