Using GBFS within a Linked Data/RDF publishing strategy #394

Closed
pietercolpaert opened this issue Dec 4, 2021 · 11 comments

@pietercolpaert

pietercolpaert commented Dec 4, 2021

Who am I

I’m a professor at Ghent University in Belgium, researching how to publish knowledge at Web scale. Previous work by my research team includes Linked Connections, a lightweight interface for public transit route planning; helping the European Railway Agency publish a dataset on railway infrastructure; and helping the Flemish government publish its base registries, such as its address database. Today we’re working on the Flemish Sensor Data Space, in which we have a use case on bike sharing.

Motivating user stories

  1. As a data publisher, I want to use the GBFS terminology to annotate my website about my bike sharing initiative (e.g., with RDFa or together with schema.org)
  2. As the Flemish government, I want to align our vocabularies with GBFS and link towards the terms in the authoritative specification
  3. As a data consumer working on smart city infrastructure, I want to import GBFS data in my city’s NGSI-LD context broker

Solution

Convert the terms you define in the JSON schema into an RDFS vocabulary. This can be done using a one-to-one mapping (I’m willing to open a pull request for this if desired).

What should be the base URL on which all terms will be dereferenceable?

I’d propose https://w3id.org/gbfs#. This way, for example, the term num_docks_available would get the URI https://w3id.org/gbfs#num_docks_available. We can open a pull request at https://github.com/perma-id/w3id.org to add a redirect from w3id.org/gbfs to, for example, a GitHub Pages site on this repository that serves the RDF file. This way machines will be able to look up the authoritative definitions.
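
To illustrate the idea, here is a minimal Python sketch of such a one-to-one mapping. It is an assumption-laden illustration, not an existing tool: the helper names `term_uri` and `schema_to_rdfs` are hypothetical, the schema fragment and its description are made up for the example, and the base URI is the one proposed above, which is not registered yet.

```python
import json

# Base URI proposed in this issue; assumed here, not yet registered at w3id.org.
GBFS_BASE = "https://w3id.org/gbfs#"


def term_uri(name: str, base: str = GBFS_BASE) -> str:
    """Build the URI of a term by appending its name to the base URI."""
    return f"{base}{name}"


def schema_to_rdfs(schema: dict, base: str = GBFS_BASE) -> str:
    """Emit one rdf:Property per top-level JSON Schema property, as Turtle."""
    lines = [
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for name, spec in schema.get("properties", {}).items():
        # json.dumps gives a quoted, escaped string that is also a valid
        # Turtle literal for ordinary text (good enough for a sketch).
        label = json.dumps(name, ensure_ascii=False)
        comment = json.dumps(spec.get("description", ""), ensure_ascii=False)
        lines.append(f"<{term_uri(name, base)}> a rdf:Property ;")
        lines.append(f"    rdfs:label {label} ;")
        lines.append(f"    rdfs:comment {comment} .")
    return "\n".join(lines)


# Illustrative schema fragment; the description text is a placeholder.
example_schema = {
    "properties": {
        "num_docks_available": {
            "type": "integer",
            "description": "Placeholder description for illustration.",
        }
    }
}

if __name__ == "__main__":
    print(schema_to_rdfs(example_schema))
```

Only flat, top-level properties are handled here; nested objects and enums would need more than a one-to-one mapping.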

Is your potential solution a breaking change?

  • Certainly not
@pietercolpaert
Author

It is probably a good idea to wait until this breaking change has passed: #354

@isabelle-dr
Contributor

isabelle-dr commented Jan 5, 2022

Hello @pietercolpaert, I'm a Product Manager at MobilityData, working on our tools and initiatives to increase data quality. 👋
Thanks for opening up this discussion.

I have very limited experience with linked data, RDF, and context information. I think this is a great opportunity: there are discussions in GTFS around versioning and URL schemes that mention linked data.

I have a few questions to get a better understanding of what this proposal would imply:

As a data publisher, I want to use the GBFS terminology to annotate my website about my bike-sharing initiative (e.g., with RDFa or together with schema.org)

  • Why? In order to increase the discoverability? What are the motivations? Do you have an example of this in another area?

  • If we were to build an RDF schema vocabulary, could it replace the JSON Schema, or would they be complementary?

  • Did you consider JSON-LD?

  • Do you foresee any disadvantages or risks? e.g. higher complexity for consumers, or higher barrier to entry for producers

  • What could be other advantages of using linked data?

    • How exactly could it help with the machine readability of GBFS?
    • What would be the impact on versioning and on discoverability for different versions (currently covered by gbfs_versions.json)

@pietercolpaert
Author

Hi @isabelle-dr thanks for getting back to me: much appreciated!

  • Why would you annotate a web page with GBFS semantics? In order to increase the discoverability? What are the motivations? Do you have an example of this in another area?

Discoverability and interoperability are certainly big motivations:

These two examples give an idea of the motivation behind Linked Data, which I like to summarize as drastically lowering the cost of integrating a dataset in a different domain.

  • If we were to build an RDF schema vocabulary, could it replace the JSON Schema, or would they be complementary?

I was (and still am) proposing a complementary approach where we try to generate an RDFS vocabulary and SHACL schema based on the JSON Schema files. However, we already know from experimenting with it together with @andreipopi that additional configuration is going to be necessary, as there’s no full one-to-one mapping between these.
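
As a rough sketch of what the automatic part could look like, the Python below derives a SHACL node shape from the scalar properties of a JSON Schema object. Everything in it is an assumption made for illustration: the function name `schema_to_shacl`, the shape name, the `gbfs:` prefix bound to the proposed base URI, and the choice to map JSON Schema `number` to `xsd:decimal`. Arrays, nested objects and enums are skipped on purpose, since those are the cases that need the additional configuration.

```python
# JSON Schema scalar types mapped to XSD datatypes (an assumed mapping).
XSD_TYPES = {
    "integer": "xsd:integer",
    "number": "xsd:decimal",
    "string": "xsd:string",
    "boolean": "xsd:boolean",
}


def schema_to_shacl(schema: dict, shape_name: str,
                    base: str = "https://w3id.org/gbfs#") -> str:
    """Emit a SHACL NodeShape (Turtle) for the scalar properties of a schema."""
    required = set(schema.get("required", []))
    out = [
        "@prefix sh: <http://www.w3.org/ns/shacl#> .",
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
        f"@prefix gbfs: <{base}> .",
        "",
        f"gbfs:{shape_name} a sh:NodeShape ;",
    ]
    props = []
    for name, spec in schema.get("properties", {}).items():
        json_type = spec.get("type")
        datatype = XSD_TYPES.get(json_type) if isinstance(json_type, str) else None
        if datatype is None:
            continue  # arrays, objects, enums: no one-to-one mapping
        parts = [f"sh:path gbfs:{name}", f"sh:datatype {datatype}"]
        if name in required:
            parts.append("sh:minCount 1")  # JSON Schema "required"
        parts.append("sh:maxCount 1")      # a JSON key holds a single value
        props.append("    sh:property [ " + " ; ".join(parts) + " ]")
    out.append(" ;\n".join(props) + " .")
    return "\n".join(out)


# Illustrative fragment only, not the real station_status schema.
example_schema = {
    "properties": {
        "station_id": {"type": "string"},
        "num_bikes_available": {"type": "integer"},
        "vehicle_types_available": {"type": "array"},
    },
    "required": ["station_id"],
}

if __name__ == "__main__":
    print(schema_to_shacl(example_schema, "StationStatusShape"))
```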

For completeness (this is not what I propose, as it would require changing your entire process as it is today and would broaden the scope of the GBFS schemas): the other way around would be possible in a more automated way. @ioggstream is working on RDF to JSON Schema: https://twitter.com/ioggstream/status/1473708713525534722

  • Did you consider JSON-LD?

JSON-LD is one of the formats in which Linked Data can be serialized. What I propose above would be a requirement before being able to use JSON-LD.
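
To make the dependency concrete: a JSON-LD `@context` maps plain JSON keys to term URIs, so those URIs have to exist first. The Python sketch below builds such a context by hand. The helper names `build_context` and `annotate`, the `gbfs` prefix, and the station URI under example.org are all hypothetical; only the base URI comes from the proposal above.

```python
import json

GBFS = "https://w3id.org/gbfs#"  # proposed base URI, assumed here


def build_context(terms, base: str = GBFS) -> dict:
    """Map each plain JSON key to a compact IRI under the gbfs prefix."""
    context = {"gbfs": base}
    for term in terms:
        context[term] = f"gbfs:{term}"
    return context


def annotate(record: dict, subject_uri: str, base: str = GBFS) -> dict:
    """Wrap a plain record as a JSON-LD document with @context and @id."""
    return {
        "@context": build_context(record.keys(), base),
        "@id": subject_uri,
        **record,
    }


if __name__ == "__main__":
    # Hypothetical station identifier; a publisher would use its own URIs.
    doc = annotate(
        {"num_bikes_available": 3, "num_docks_available": 7},
        "https://example.org/stations/42",
    )
    print(json.dumps(doc, indent=2))
```

The record keys stay exactly as they are in the GBFS feed; only the `@context` and `@id` entries are added.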

  • Do you foresee any disadvantages or risks? e.g. higher complexity for consumers, or higher barrier to entry for producers

The disadvantage is that you’re going to have to do a little bit more. We’re going to document the extra configuration file that would be needed to describe how the JSON schema can be translated into RDFS and SHACL. Things I’m already thinking about:

Per JSON schema we’ll need:

  • a base web address or namespace (URI) to start building the web addresses. This could be for example https://w3id.org/gbfs#, so that num_bikes_available becomes https://w3id.org/gbfs#num_bikes_available
  • a type to give to the entity if the JSON schema describes an object
  • how to map enums to codelists
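
The three items above could be captured in a small per-schema configuration object. The Python sketch below is purely hypothetical: the class `SchemaMapping`, its field names, and the codelist URI are invented for illustration and do not reflect the configuration format that is still to be documented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SchemaMapping:
    """Hypothetical per-schema configuration for a JSON Schema to RDF mapping."""

    base_uri: str                      # namespace used to build term URIs
    entity_type: Optional[str] = None  # class name if the schema describes an object
    # property name -> base URI of the codelist holding its enum values
    codelists: Dict[str, str] = field(default_factory=dict)

    def term_uri(self, name: str) -> str:
        return f"{self.base_uri}{name}"

    def class_uri(self) -> Optional[str]:
        if self.entity_type is None:
            return None
        return self.term_uri(self.entity_type)

    def enum_value_uri(self, prop: str, value: str) -> str:
        codelist = self.codelists.get(prop)
        if codelist is None:
            raise KeyError(f"no codelist configured for {prop!r}")
        return f"{codelist}{value}"


# Example values are assumptions, not agreed identifiers.
mapping = SchemaMapping(
    base_uri="https://w3id.org/gbfs#",
    entity_type="Station",
    codelists={"form_factor": "https://w3id.org/gbfs/form_factor#"},
)

if __name__ == "__main__":
    print(mapping.class_uri())
    print(mapping.enum_value_uri("form_factor", "bicycle"))
```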

  • What could be other advantages of using linked data?

  • You can use the GBFS data model in more than just JSON. You’ll be able to use it in HTML annotations, RDF/XML, text/turtle, CSV on the Web, ...
  • You can describe similarities to other domain models
  • You will convince data publishers to also adopt a good identifier strategy for their own bike stations
  • ...

  • How exactly could it help with the machine readability of GBFS?

Next to the JSON Schema tooling, RDF tooling will also be able to look up definitions and validate a file in any RDF serialization against the SHACL shape. I don’t see this as the biggest advantage.

  • What would be the impact on versioning and on discoverability for different versions (currently covered by gbfs_versions.json)

We can also include the major version number of GBFS in the web address of the term. Otherwise I don’t expect any impact.
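
One possible way to do this, sketched in Python under assumptions: the URI pattern `https://w3id.org/gbfs/v<major>#<term>` and the function name `versioned_term_uri` are hypothetical and were not agreed on in this thread.

```python
def versioned_term_uri(term: str, gbfs_version: str,
                       root: str = "https://w3id.org/gbfs") -> str:
    """Build a term URI that carries only the major GBFS version."""
    # Accept "2.3", "v2.3" or "3.0-RC": keep the part before the first dot.
    major = gbfs_version.strip().lstrip("v").split(".")[0]
    if not major.isdigit():
        raise ValueError(f"cannot read a major version from {gbfs_version!r}")
    return f"{root}/v{major}#{term}"


if __name__ == "__main__":
    print(versioned_term_uri("num_bikes_available", "2.3"))
```

With this pattern, minor releases would keep the same term URIs, and only a breaking major release would introduce new ones.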

@stale

stale bot commented May 25, 2022

This discussion has been automatically marked as stale because it has not had recent activity. It will be closed in 60 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 25, 2022
@pietercolpaert
Author

We are still working on a PR as a side-project. Not stale, give us a bit more time :)

@stale stale bot removed the stale label May 26, 2022
@stale

stale bot commented Sep 24, 2022

This discussion has been automatically marked as stale because it has not had recent activity. It will be closed in 60 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 24, 2022
@pietercolpaert
Author

Still working on it. We are:

  1. Creating a spec for adding tags to JSON schemas that allow a processor to translate the JSON schema to RDFS and SHACL
  2. Prototyping the actual processor
  3. Creating a GitHub Action that we could submit here as a pull request to automatically generate the Linked Data specs inside this repository, and starting from there to have more discussions

We will share the link to the spec, the processor codebase, and the GitHub Action applied to the GBFS JSON schemas after validating it internally.

@mobilitydataio
Contributor

This discussion has been automatically marked as stale because it has not had recent activity. It will be closed in 30 days if no further activity occurs. Thank you for your contributions.

@pietercolpaert
Author

pietercolpaert commented Jan 16, 2024

You can find the code of our experiments here: https://github.com/jiaoxlong/json-schema-ld/tree/main

At this moment, we find the generated RDF vocabulary and SHACL shape not good enough. The idea, however, remains interesting to pursue.

@richfab
Contributor

richfab commented Jan 16, 2024

Hi @pietercolpaert,
I am a Product Manager for shared mobility at MobilityData.
Thank you very much for sharing your work on Linked Data for GBFS. The topic of interoperability is very interesting and important to us.
As per the governance, this issue will be closed in 30 days if there is no additional re-engagement.
Have a great day!
Fabien

@richfab
Contributor

richfab commented Feb 16, 2024

This discussion has been closed due to inactivity. Discussions can always be reopened after they have been closed.

@richfab richfab closed this as completed Feb 16, 2024