Extra slash in Pleiades URIs in RDF data dumps #525

Open
rybesh opened this issue Nov 11, 2024 · 3 comments

rybesh commented Nov 11, 2024

In the most recent version of the Pleiades RDF data dumps, there is an extra slash in the Pleiades URIs after pleiades.stoa.org: URIs begin with https://pleiades.stoa.org// rather than https://pleiades.stoa.org/.

The extra slash was not present in earlier dumps (I checked April 2024 via the Wayback Machine). The insertion of the extra slash is consistent, so unlike the problem addressed in #518, this problem does not break the structure of the RDF graph. But since earlier dumps did not have the extra slash, this effectively changes all the Pleiades URIs, making comparison across dumps difficult.
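For anyone who needs to compare an affected dump against an earlier one in the meantime, a consumer-side workaround might look like the following. This is only a minimal sketch: the function name `normalize_pleiades_uris` is hypothetical, and it assumes the extra slash is the only difference you want to neutralize (it does nothing about other changes between dumps).

```python
import re

# Matches the Pleiades host followed by two or more slashes.
# Assumption: only URIs on pleiades.stoa.org need repair.
EXTRA_SLASH = re.compile(r"(https?://pleiades\.stoa\.org)/{2,}")


def normalize_pleiades_uris(text):
    """Collapse the run of slashes after the Pleiades host to a single slash."""
    return EXTRA_SLASH.sub(r"\1/", text)


if __name__ == "__main__":
    line = "<https://pleiades.stoa.org//places/1001137%23this> a <http://geovocab.org/spatial#Feature>;"
    print(normalize_pleiades_uris(line))
```

Running each line of the newer dump through this function before diffing should make the URIs line up with those in older dumps, as long as nothing else about them changed.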

@paregorios
Member

Some more explanation:

  • RDF/TTL serializations for individual Pleiades places do not exhibit the reported problem and must be maintained in the current state. So, for example, if you download https://pleiades.stoa.org/places/140569/turtle, all the components of the URI path are separated by single slashes, just like we want.
  • The complete RDF/TTL export dump, triggered by cron once a week on Sundays, exhibits the problem reported above (see further, below).

Steps to verify the problem:

  1. Download https://atlantides.org/downloads/pleiades/rdf/pleiades-latest.tar.gz
  2. Decompress and extract the file.
  3. cd pleiades-latest
  4. grep 'pleiades.stoa.org//' places-*.ttl
  5. Note the matching lines (excerpt):
places-1.ttl:<https://pleiades.stoa.org//places/1001137%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001137> .
places-1.ttl:<https://pleiades.stoa.org//places/1001887%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    spatial:C <https://pleiades.stoa.org//places/511185%23this>,
places-1.ttl:        <https://pleiades.stoa.org//places/511190%23this>,
places-1.ttl:        <https://pleiades.stoa.org//places/511347%23this>,
places-1.ttl:        <https://pleiades.stoa.org//places/511414%23this>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001887> .
places-1.ttl:<https://pleiades.stoa.org//places/1001888%23this> owl:sameAs <http://pleiades.stoa.org/places/991370%23this> .
places-1.ttl:<https://pleiades.stoa.org//places/1001889%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001889> .
places-1.ttl:<https://pleiades.stoa.org//places/1001892%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001892> .
places-1.ttl:<https://pleiades.stoa.org//places/1001893%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001893> .
places-1.ttl:<https://pleiades.stoa.org//places/1001894%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001894> .
places-1.ttl:<https://pleiades.stoa.org//places/1001895%23this> owl:sameAs <http://pleiades.stoa.org/places/991374%23this> .
places-1.ttl:<https://pleiades.stoa.org//places/1001896%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001896> .
places-1.ttl:<https://pleiades.stoa.org//places/1001897%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001897> .
places-1.ttl:<https://pleiades.stoa.org//places/1001898%23this> a <http://geovocab.org/spatial#Feature>;
places-1.ttl:    spatial:C <https://pleiades.stoa.org//places/536101%23this>,
places-1.ttl:        <https://pleiades.stoa.org//places/536129%23this>;
places-1.ttl:    foaf:isPrimaryTopicOf <https://pleiades.stoa.org//places/1001898> .
places-1.ttl:<https://pleiades.stoa.org//places/1001899%23this> a <http://geovocab.org/spatial#Feature>;
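The grep check in the steps above could also be scripted, for example to run automatically after each export. The following is a rough sketch only: the function name `find_double_slash_lines` is hypothetical, and it assumes the extracted dump is a directory of UTF-8 `.ttl` files. Unlike the grep pattern, the regular expression escapes the dots and requires an http or https scheme.

```python
import re
from pathlib import Path

# The Pleiades host followed directly by two or more slashes.
DOUBLE_SLASH = re.compile(r"https?://pleiades\.stoa\.org//+")


def find_double_slash_lines(directory, pattern="*.ttl"):
    """Yield (file name, line number, line) for each line with the malformed prefix."""
    for path in sorted(Path(directory).glob(pattern)):
        with path.open(encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if DOUBLE_SLASH.search(line):
                    yield path.name, number, line.rstrip("\n")


if __name__ == "__main__":
    # Assumes the archive was extracted into ./pleiades-latest
    hits = list(find_double_slash_lines("pleiades-latest"))
    for name, number, line in hits[:20]:
        print(f"{name}:{number}:{line}")
    print(f"{len(hits)} matching line(s)")
```

Because the default pattern is `*.ttl`, the same call covers the places files as well as any other Turtle files in the directory. A count of zero would indicate the export is clean.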

The // after the domain name has also made it into URIs in the place-types.ttl and time-periods.ttl files.

grep 'pleiades.stoa.org//' place-types.ttl yields results like:

<https://pleiades.stoa.org//vocabularies/place-types/abbey> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/abbey>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/abbey-church> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/abbey-church>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/acropolis> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/acropolis>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/agora> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/agora>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/amphitheatre> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/amphitheatre>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/anchorage> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/anchorage>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/aqueduct> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/aqueduct>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/arch> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/arch>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/archaeological-site> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/place-types/archaeological-site>,
    skos:inScheme <https://pleiades.stoa.org//vocabularies/place-types>;
<https://pleiades.stoa.org//vocabularies/place-types/archipelago> a <http://www.w3.org/2004/02/skos/core#Concept>;

and grep 'pleiades.stoa.org//' time-periods.ttl yields results like:

<https://pleiades.stoa.org//vocabularies/time-periods/1200-bc-middle-east> a <http://www.w3.org/2004/02/skos/core#Concept>;
        <http://pleiades.stoa.org//vocabularies/time-periods/1200-bc-middle-east>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;
<https://pleiades.stoa.org//vocabularies/time-periods/13th-century-ad-eastern-mediterranean> a <http://www.w3.org/2004/02/skos/core#Concept>;
        <http://pleiades.stoa.org//vocabularies/time-periods/13th-century-ad-eastern-mediterranean>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;
<https://pleiades.stoa.org//vocabularies/time-periods/1500-ad-middle-east> a <http://www.w3.org/2004/02/skos/core#Concept>;
        <http://pleiades.stoa.org//vocabularies/time-periods/1500-ad-middle-east>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;
<https://pleiades.stoa.org//vocabularies/time-periods/1st-millennium-bce> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/time-periods/1st-millennium-bce>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;
<https://pleiades.stoa.org//vocabularies/time-periods/1st-millennium-ce> a <http://www.w3.org/2004/02/skos/core#Concept>;
    owl:sameAs <http://pleiades.stoa.org//vocabularies/time-periods/1st-millennium-ce>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;
<https://pleiades.stoa.org//vocabularies/time-periods/2nd-millenium-bce> a <http://www.w3.org/2004/02/skos/core#Concept>;
        <http://pleiades.stoa.org//vocabularies/time-periods/2nd-millenium-bce>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;
<https://pleiades.stoa.org//vocabularies/time-periods/2nd-millennium-bc-egypt> a <http://www.w3.org/2004/02/skos/core#Concept>;
        <http://pleiades.stoa.org//vocabularies/time-periods/2nd-millennium-bc-egypt>;
    skos:inScheme <https://pleiades.stoa.org//vocabularies/time-periods>;

Desired outcome:

Ensure that // never appears between the domain name (netloc) and the first element of the path in any URI emitted in the RDF/TTL export dump.
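This thread does not say what caused the extra slash. One common way it arises is concatenating a site root that ends in "/" with a path that starts with "/". Purely as an illustration of the desired outcome under that assumption, a defensive join and a matching check might look like this (both `join_site_uri` and `has_single_slash_after_netloc` are hypothetical names, not part of the Pleiades codebase):

```python
from urllib.parse import urlsplit


def join_site_uri(base, path):
    """Join a site root and a path with exactly one slash between them."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def has_single_slash_after_netloc(uri):
    """Return True unless the URI path begins with two or more slashes."""
    return not urlsplit(uri).path.startswith("//")


if __name__ == "__main__":
    uri = join_site_uri("https://pleiades.stoa.org/", "/places/140569")
    print(uri, has_single_slash_after_netloc(uri))
```

The check relies on `urlsplit` separating the netloc from the path, so a doubled slash shows up as a path starting with `//`. It could serve as a test assertion over every URI the export emits.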

@paregorios
Member

@jessesnyder please merge and deploy to production. Would be great if you could kick off a manual run once it's deployed so we don't have to wait until Sunday to see the results. Thanks.

jessesnyder added a commit to isawnyu/pleiades3-buildout that referenced this issue Dec 5, 2024
@jessesnyder

Merged and deployed, and export run! 🚀
