JSON-LD export/import #917
So we'd like to move forward on this, and we'd be happy to sponsor the work if someone is interested. The primary goal would be to generate JSON-LD for items via the Zotero API (via the API's support for export translators) and import that back into Zotero clients. One thing that's not clear to me is the state of the various bibliographic ontologies: bib.schema.org, BIBO, Zotero RDF… There's a JS library that can generate JSON-LD from RDF, so we might be able to piggyback off the existing RDF translators, but as far as I know there hasn't really been much uptake of BIBO, so I don't know what makes the most sense. I'd add that, while we obviously want a round-trip to be close to lossless, I don't think it's a fixed requirement, meaning that I don't think we need to use API JSON (which we don't really want to be an exchange format) or Zotero RDF (same). Thoughts? /cc @rmzelle @aurimasv @fcheslack others?
That's wonderful. Very much agree this should be a priority. This has been on my list of things to do for quite a while (and learning more about JSON-LD is going to be quite useful for me), so I'd be interested, but I do have a regular day job now, so I'm happy to leave this to someone with more time. I always imagined writing this from scratch rather than drawing on RDF, in particular because once we specify the RDF to JSON-LD mappings, I'm not sure how much time is saved. I agree that BIBO hasn't really caught on at all, so I wouldn't feel great about using that. Zotero RDF doesn't even have a written specification. If we want a new, rich RDF format, that should almost certainly be BIBFRAME, but that's very heavily oriented towards linked data, which doesn't jibe well with Zotero's current data model. Those are my not yet fully coherent thoughts.
Nice to hear the support on this. Which vocabulary to use might also depend on our goal: do we want the most widespread vocabulary or the most detailed one? I guess that currently schema.org is the way to go if we are looking for mainstream (and I think we are). To answer which structured web data format is the most used, there is http://webdatacommons.org/ . They told me that they will try to extract JSON-LD in their next run, and then there might also be scientific analysis of that. We should consider the bib extension, which adds some special cases and fields, for example for theses. On the other hand, BIBFRAME is very detailed and is discussed as a successor of the MARC format in the library world ("MARC must die"). I have written a very preliminary import translator: https://github.com/UB-Mannheim/zotkat/blob/master/BIBFRAME.js . However, I think BIBFRAME2 is out now. BIBFRAME also aims for a large scope, but I read that more as meaning that, besides libraries, museums and other cultural heritage institutions might use it.
Yes, for export, I think we're looking for mainstream. Nothing stopping us from importing more esoteric ontologies like BIBFRAME (we have a MARC translator, after all), but I don't think that's what we need to export (we don't export MARC). So schema.org seems right for that. What's not clear to me is the level of lossiness we have to be comfortable with if we don't augment schema.org ontologies (the way we do in Zotero RDF).
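To make the lossiness question concrete, here is a minimal sketch of a plain schema.org export that reports which fields it could not carry over. Everything in it is an assumption for illustration: the input only loosely resembles a Zotero item, and `TYPE_MAP`, `FIELD_MAP`, and `to_schema_org` are hypothetical names, not an existing Zotero mapping or API.

```python
import json

# Hypothetical mapping tables for illustration; not Zotero's actual mapping.
TYPE_MAP = {
    "journalArticle": "ScholarlyArticle",
    "book": "Book",
    "thesis": "Thesis",
}
FIELD_MAP = {"title": "name", "date": "datePublished", "url": "url"}


def to_schema_org(item):
    """Return (jsonld_doc, lost_fields) for a Zotero-like item dict."""
    doc = {
        "@context": "https://schema.org",
        # Unknown item types fall back to the generic CreativeWork.
        "@type": TYPE_MAP.get(item.get("itemType"), "CreativeWork"),
    }
    lost = []
    for key, value in item.items():
        if key == "itemType":
            continue
        if key == "creators":
            authors = []
            for creator in value:
                if creator.get("creatorType") == "author":
                    authors.append({
                        "@type": "Person",
                        "givenName": creator.get("firstName", ""),
                        "familyName": creator.get("lastName", ""),
                    })
                else:
                    # This sketch has no target for other creator roles.
                    lost.append("creators:" + str(creator.get("creatorType")))
            if authors:
                doc["author"] = authors
        elif key == "DOI":
            # Assumption: express the DOI as a resolvable link via sameAs.
            doc["sameAs"] = "https://doi.org/" + value
        elif key in FIELD_MAP:
            doc[FIELD_MAP[key]] = value
        else:
            lost.append(key)
    return doc, sorted(lost)


example = {
    "itemType": "journalArticle",
    "title": "An Example Article",
    "date": "2016-01-15",
    "DOI": "10.1234/example",
    "creators": [
        {"creatorType": "author", "firstName": "Ada", "lastName": "Lovelace"},
        {"creatorType": "translator", "firstName": "Jo", "lastName": "Doe"},
    ],
    "callNumber": "QA76",
    "extra": "some note",
}
doc, lost = to_schema_org(example)
print(json.dumps(doc, indent=2))
print("Not carried over:", lost)
```

The `lost` list is the part that matters for this discussion: whatever ends up there would either be dropped on export or need an augmenting vocabulary, as Zotero RDF does today.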
After looking at our existing code, I think I have a bit clearer picture of how this needs to work. Apologies if I'm stating the obvious. If our JSON-LD support were only ever going to work with schema.org, a totally separate translator would make sense, but I don't think that's the case here. We may or may not decide to export only schema.org in JSON-LD, but for import it wouldn't make sense to throw away data in all the other ontologies that we already know about. (This functionality will also be a cornerstone of custom type support.) So I think we're looking at 1) a clean JSON-LD translator with
The plan with 1) and 2) looks good to me, and I think we might also need to expand the RDF.js translator during 2). As for the expressiveness of schema.org etc.: OCLC is also using schema.org, and you can look at examples there, e.g. https://www.worldcat.org/oclc/920898066#microdatabox (at the very bottom there is a collapsed section with microdata). This is in Turtle notation and not JSON-LD.
Yes, exactly. We'd add schema.org support to RDF.js.
That sounds great (and yes, you're correct on Zotero RDF import). So, if I understand the import part right, that would be
Yeah, or just break out HTML/JSON-LD parsing into separate functions within Embedded Metadata.js. (Not sure if JSON-LD parsing would have a use outside of that translator.) Embedded Metadata detectWeb would need to return 'multiple' if more than one result is found per page; right now it's only one per page. Not sure if there are pages with different kinds of embedded metadata for the same item. If so, the translator could do some quick checks to make sure it's not finding redundant results.
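As a rough illustration of the detection logic described above, the following sketch pulls JSON-LD out of `<script type="application/ld+json">` elements, drops redundant results, and reports 'multiple' when more than one item remains. It is written in Python with only the standard library, not as translator code. The names `JSONLDScriptParser`, `extract_items`, and `detect` are hypothetical, the 'single' return value is a placeholder for a real item type, and deduplicating on `@id` or `url` is an assumed heuristic rather than anything the translator does today.

```python
import json
from html.parser import HTMLParser


class JSONLDScriptParser(HTMLParser):
    """Collect the raw text of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def extract_items(html):
    """Return the distinct JSON-LD nodes found in an HTML string."""
    parser = JSONLDScriptParser()
    parser.feed(html)
    parser.close()
    items, seen = [], set()
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the page
        if isinstance(data, dict):
            nodes = data.get("@graph", [data])
            if not isinstance(nodes, list):
                nodes = [nodes]
        elif isinstance(data, list):
            nodes = data
        else:
            continue
        for node in nodes:
            if not isinstance(node, dict):
                continue
            # Assumed heuristic: same @id or url means the same item.
            key = node.get("@id") or node.get("url") or json.dumps(node, sort_keys=True)
            if key not in seen:
                seen.add(key)
                items.append(node)
    return items


def detect(html):
    """Mimic the detection result: None, 'single', or 'multiple'."""
    items = extract_items(html)
    if not items:
        return None
    return "multiple" if len(items) > 1 else "single"


page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Book", "url": "http://example.org/a"}
</script>
<script type="application/ld+json">
[{"@type": "Book", "url": "http://example.org/a"},
 {"@type": "Article", "url": "http://example.org/b"}]
</script>
</head><body></body></html>
"""
print(detect(page), len(extract_items(page)))
```

A check across different kinds of embedded metadata (for example JSON-LD versus meta tags describing the same item) would need a comparison on normalized fields such as DOI or title, which this sketch does not attempt.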
OK, I'd like to take this if there's no one else. Learning more about the relevant vocabularies will be useful for me anyway. I should be able to have a viable version within a month. @dstillman -- do you want to contact me by e-mail about how sponsorship for this should look?
I just found a nicely written article about how to use JSON-LD: http://blog.codeship.com/json-ld-building-meaningful-data-apis/ . They are using a JavaScript library, https://github.com/digitalbazaar/jsonld.js , for that (BSD 3-clause license). I guess it is worth considering for this issue.
Will there be a feature to export and import JSON-LD in Zotero Standalone? It would be great, because then it could be used to "sync" between Tropy and Zotero.
and then also include it in the Embedded Metadata translator. Suggested by Faolan here:
https://forums.zotero.org/discussion/50327/is-unapi-deadbecause-their-website-sure-seems-to-be/#Item_8
Seems like a great idea.