Find a sensible way to handle many new URI/literals #2119
Problem
Peel will introduce many new URIs in its metadata. This is different from the Controlled Vocabularies used before, where there was a limited set of permitted responses that we drew from when submitting items. The part that is similar is the need to convert between a URI and a humanized string. I think that having an intermediate symbol is desirable in URI translation so that special cases in our code remain readable. I.e.
What are we doing?
We have a number of config/controlled_vocabularies which are used to set up the app-wide
Each of the URIs in the controlled vocabularies has a symbol which matches the same symbol in our locales to give a humanized literal
jupiter/app/helpers/application_helper.rb Lines 4 to 13 in 523f6a2
We use this to
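To make the URI, symbol, and humanized-string relationship concrete, here is a minimal plain-Ruby sketch. Every name in it (`VOCABULARY`, `LABELS`, the helper methods, and the example.org URIs) is hypothetical and chosen for illustration; it is not Jupiter's actual helper code.

```ruby
# Hypothetical sketch: all names and URIs below are illustrative only.

# URI => symbol, as a controlled vocabulary file might define it.
VOCABULARY = {
  "http://example.org/vocab/language/english" => :english,
  "http://example.org/vocab/language/french"  => :french
}.freeze

# symbol => humanized label, standing in for a locale file.
LABELS = {
  english: "English",
  french:  "French"
}.freeze

# Reverse index so symbol => URI is also a single hash lookup.
URI_FOR_SYMBOL = VOCABULARY.invert.freeze

# Translate a URI to its intermediate symbol; fail loudly on unknown URIs.
def symbol_for(uri)
  VOCABULARY.fetch(uri) { raise ArgumentError, "unknown URI: #{uri}" }
end

# Translate a symbol back to its URI.
def uri_for(symbol)
  URI_FOR_SYMBOL.fetch(symbol) { raise ArgumentError, "unknown symbol: #{symbol}" }
end

# Go from URI to humanized label by way of the symbol.
def humanize_uri(uri)
  LABELS.fetch(symbol_for(uri))
end

puts humanize_uri("http://example.org/vocab/language/french")
```

The symbol in the middle is what lets application code refer to a special case as `:french` rather than comparing against a full URI string.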
What have others done?
Rails i18n
This guide discusses i18n in Rails.
The public API is
The default backend is Simple. There's discussion in the guide about using the Chain backend. It's useful when you want to use standard translations with a Simple backend but store custom application translations in a database or other backends. The same author who worked on the i18n gem has an i18n-active_record gem that implements an ActiveRecord backend. This backend stores translations in a Translation table with
Other references
URI Service
Found a Ruby gem, URI Service, from Columbia University Libraries, which is a database-backed and Solr-cached lookup/creation service for URIs.
This gem also allows you to
Linked Data concepts
Resource Caching
Label Everything
|
Solution
Generate an ActiveRecord Model called
The API for this model:
Use the ActiveRecord backend for i18n to store the translations/labels for each URI/code. Have
Other considerations
|
This was an exercise to see if we could get a label from a URI programmatically. We might do this once on ingest and then store in our database(s).
From Getting data from the Semantic Web (Ruby) and Queryable:query |
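The "look it up once on ingest, then store it" idea could be sketched roughly as follows. `LabelCache` and `slow_lookup` are hypothetical names, and the in-memory hash is only a stand-in for a database table; the expensive lookup is simulated rather than performed against a real graph.

```ruby
# Hypothetical sketch: LabelCache and slow_lookup are illustrative names.
class LabelCache
  # resolver: any callable that takes a URI string and returns a label or nil.
  def initialize(resolver)
    @resolver = resolver
    @labels = {} # stands in for a database table
  end

  # Called at ingest time: resolve once, then keep the result.
  def ingest(uri)
    return @labels[uri] if @labels.key?(uri)

    @labels[uri] = @resolver.call(uri)
  end

  # Called at display time: never triggers the expensive lookup.
  # Falls back to the raw URI when no label was stored.
  def label_for(uri)
    @labels[uri] || uri
  end
end

# Simulated expensive lookup that counts how often it is invoked.
calls = 0
slow_lookup = lambda do |uri|
  calls += 1
  { "http://purl.org/dc/terms/title" => "Title" }[uri]
end

cache = LabelCache.new(slow_lookup)
cache.ingest("http://purl.org/dc/terms/title")
cache.ingest("http://purl.org/dc/terms/title") # second ingest hits the stored value

puts cache.label_for("http://purl.org/dc/terms/title")
puts "resolver calls: #{calls}"
```

Keeping the resolver as an injected callable means the source of labels (graph query, authority service, or hand-entered values) can change without touching display code.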
Pretty sure there's a much cheaper way to do that, because the RDF::Graph calls are very expensive/slow. This was a huge issue in ActiveTriples/ActiveFedora. Give me a bit to check |
Hmmm, so you can do something like:
irb(main):016:0> uri = "http://purl.org/dc/terms/title"
=> "http://purl.org/dc/terms/title"
irb(main):017:0> RDF::Vocabulary.find_term(uri)
=> #<RDF::Vocabulary::Term:0x3fcd9556c1a4 ID:http://purl.org/dc/terms/title>
irb(main):018:0> RDF::Vocabulary.find(uri)
=> RDF::StrictVocabulary(http://purl.org/dc/terms/)
irb(main):019:0> RDF::Vocabulary.find_term(uri).label.to_s
=> "Title"
generally, which is good both because it bypasses all the in-memory graph stuff and because in your example we have to know the vocab the URI comes from to form the query, which is a bunch of extra work to have to do if we can't just have a vocab-agnostic conversion. It's still not terribly efficient because it does this in just about the dumbest way possible: it simply looks through every vocab it knows about sequentially, trying to find the corresponding URI. But the Graph query is worse in that I think it's built on top of this, so it's several more layers of inefficiency on top of this one.
Looking up the specific URI "http://id.loc.gov/authorities/names/n79007225" doesn't seem to work, however. Could be a bug in RDF, could be something else at play there.
Long story short, Not sure there's any value in the URI Service gem either beyond what we could more easily do ourselves without having to worry that it's abandoned. |
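As a toy model of the cost difference described above, the sketch below contrasts scanning every known vocabulary in turn with a hash index built once. `VOCABS` and both helper methods are hypothetical and say nothing about how RDF.rb stores or searches its terms internally.

```ruby
# Toy model only: VOCABS is hypothetical data, not RDF.rb's internals.
VOCABS = {
  "http://purl.org/dc/terms/"  => { "title" => "Title", "creator" => "Creator" },
  "http://xmlns.com/foaf/0.1/" => { "name" => "name" }
}.freeze

# Sequential scan: try each vocabulary in turn until one knows the term.
# Cost grows with the number of vocabularies.
def find_label_by_scan(uri)
  VOCABS.each do |base, terms|
    next unless uri.start_with?(base)

    label = terms[uri.delete_prefix(base)]
    return label if label
  end
  nil
end

# Index built once up front: full URI => label, a single hash lookup afterwards.
LABEL_INDEX = VOCABS.each_with_object({}) do |(base, terms), index|
  terms.each { |name, label| index[base + name] = label }
end.freeze

def find_label_by_index(uri)
  LABEL_INDEX[uri]
end

uri = "http://purl.org/dc/terms/title"
puts find_label_by_scan(uri)
puts find_label_by_index(uri)
```

Neither approach helps with URIs that are in no known vocabulary at all, which is the situation with object URIs such as the id.loc.gov one: both methods simply return `nil`.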
There could be something else at play there. The example you gave was the URI corresponding to a predicate. The example I'm trying to solve is the object as a URI. I tried some of the other URIs in our config/controlled_vocabularies and none of these worked either. This is an example that didn't work with RDF::Graph, so this method isn't bulletproof either.
I have no intention of using RDF::Graph in the application code for any just-in-time type of lookup. The problem I was trying to solve with that snippet was looking up a reasonable label for the URIs in the Folk Fest data. I did this by hand for [config/controlled_vocabularies/digitization_*] in my FolkFest Modelling PR by visiting the URI in my browser to find a reasonable
💯 agree. The URI Service gem will not be useful for us. This was just part of my scan of the state of the world. |
Hmmm yeah good point, wasn't thinking in terms of Predicate vs Object. I almost want to say we should look at more of a Questioning Authority approach for populating the DB, but I'm still concerned about some of the labels maybe not being what we'd want to present to an end user (perhaps needlessly worried, but Hydra was so bad for presenting really poor UX due to that kind of reliance on mechanical mapping) |
It would be nice to have something where we could just call
[edit] Vocabulary agnostic is, I guess, not really what I meant. Maybe graph agnostic? Most nodes in a linked data graph will terminate with a prefLabel. |
Tell me more about Questioning Authority approach? The blog post about the i18n ActiveRecord backend includes a basic admin interface for adding/editing translations. Would that help with fixing bad mappings? |
I was actually thinking less in terms of "it might not be
Lemme mull the Questioning Authority vs graph query approach some more; I may or may not come up with some thoughts |
First we can create a graph from a spreadsheet like 'FolkFestTriples - v.1' downloaded as a csv called 'follkfest.csv' in the current path.
Then we can find the URIs and their labels
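A minimal sketch of these two steps using only Ruby's standard `csv` library and plain arrays and hashes, rather than an RDF graph. The column names (`subject`, `predicate`, `object`), the example.org URIs, and the inline sample data are all assumptions for illustration; the real spreadsheet layout may differ.

```ruby
require "csv"

# Assumed layout (hypothetical): a header row of subject,predicate,object.
# Inline sample data keeps the sketch self-contained; for a real file one
# could use CSV.read(path, headers: true) instead of CSV.parse.
SAMPLE = <<~DATA
  subject,predicate,object
  http://example.org/item/1,http://purl.org/dc/terms/title,Folk Fest poster
  http://example.org/item/1,http://purl.org/dc/terms/subject,http://example.org/subject/folk-music
  http://example.org/subject/folk-music,http://www.w3.org/2004/02/skos/core#prefLabel,Folk music
DATA

PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Step 1: turn each CSV row into a [subject, predicate, object] triple.
triples = CSV.parse(SAMPLE, headers: true).map do |row|
  [row["subject"], row["predicate"], row["object"]]
end

# Step 2: index prefLabel statements by subject, then collect every
# object that looks like a URI and report its label if one exists.
labels = triples
         .select { |_, pred, _| pred == PREF_LABEL }
         .to_h { |subj, _, obj| [subj, obj] }
object_uris = triples.map { |_, _, obj| obj }.grep(%r{\Ahttps?://}).uniq

object_uris.each do |uri|
  puts "#{uri} => #{labels.fetch(uri, '(no label found)')}"
end
```

This only finds labels that are stated in the spreadsheet itself; URIs whose labels live in an external vocabulary would still need a separate lookup.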
|
I think this is solved by @mbarnett's ControlledVocabulary API |
Let's add a TODO to consider more robust ways of doing this (lookup table in DB or some kind of authority query), because I could see this growing quite large.
Originally posted by @mbarnett in #2089 (comment)
Peel will introduce many new URIs in its metadata. This is different from the Controlled Vocabularies used before, where there was a limited set of permitted responses that we drew from when submitting items. The part that is similar is the need to convert between a URI and a humanized string.