Fedora 3 Content Type Example: Journal Article
This tutorial is only known to work with Hydra-Head 4 and Fedora 3. However, it is a useful introduction to all the pieces involved in content modification, including tests. Do not take this word-for-word or expect that examples will work. If you want to take over maintenance of this page, please do so.
This document assumes you have read How to Get Started.
- Understand how Hydra fits into and uses Rails MVC (Model, View, Controller) structures
- Define an ActiveFedora Model for JournalArticles
- Define Controller & Views for Creating, Editing, Viewing and Deleting Journal Articles
- Customize how JournalArticles appear in Blacklight search results
In an MVC framework, the Model defines the attributes and behaviors of your various objects, allowing you to persist those objects and retrieve them. By default, Rails Models use ActiveRecord to persist and retrieve objects using SQL databases. With Hydra, we use ActiveFedora to connect with Fedora and Solr instead of a SQL database.
Controllers handle requests from clients (i.e. HTTP requests from a web browser), loading the necessary information and rendering the appropriate responses (i.e. HTML pages returned to the browser). They use your Models to load the information and they use your Views to render the response. In this way, Controllers are like connectors or coordinators: they coordinate the flow of activity in your application when it receives requests.
In this tutorial, we are creating a new JournalArticle content type. This will allow us to create Journal Articles in a Fedora Repository, collect custom metadata for them, index them in Solr, and display that custom metadata in the user interface.
In order to describe our Fedora objects, we can use whatever metadata schemas suit our needs. Some metadata schemas currently being used in Hydra Heads include MODS, Dublin Core, EAD, PBCore, EAC-CPF and VRA. This list continues to grow as people set up Hydra Heads to deal with their own specialized content.
The JournalArticle content type will use MODS (handily available in hydra-head) to track descriptive metadata about Articles. Some of the metadata is common to many types of content:
- title
- author (first name, last name, role)
- abstract
Other metadata fields are more specifically relevant to journal articles, but they still fit into the MODS schema:
- journal title
- publication date
- journal volume
- journal issue
- start page
- end page
In addition to the MODS metadata, JournalArticle objects will use Hydra Rights Metadata to track information about licenses, rights, and which people/groups should be able to discover, view, and/or edit each Journal Article.
The first thing to do when adding a new content type is to create the ActiveFedora Model. This model is a Ruby class that uses ActiveFedora to tell the application the structure of your content and its metadata.
The model we create can be used in any application with ActiveFedora and OM (Opinionated Metadata), not just in a Hydra Head. For example, ActiveFedora models can be used in batch scripts, command line utilities, and robots that perform automated actions on your fedora objects based on information and behaviors stored in their ActiveFedora models.
These tests describe how our JournalArticle objects will behave once the Model is fully defined. We will have to do a few things before these tests pass, but it’s important to define your goals before you start coding.
# spec/models/journal_article_spec.rb
require 'spec_helper'

describe JournalArticle do

  before(:each) do
    # This gives you a test article object that can be used in any of the tests
    @article = JournalArticle.new
  end

  it "should have the specified datastreams" do
    # Check for descMetadata datastream with MODS in it
    @article.datastreams.keys.should include("descMetadata")
    @article.descMetadata.should be_kind_of JournalArticleModsDatastream
    # Check for rightsMetadata datastream
    @article.datastreams.keys.should include("rightsMetadata")
    @article.rightsMetadata.should be_kind_of Hydra::Datastream::RightsMetadata
  end

  it "should have the attributes of a journal article and support update_attributes" do
    attributes_hash = {
      "title" => "All the Awesome you can Handle",
      "abstract" => "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
      "journal_title" => "The Journal of Cool",
      "publication_date" => "1967-11-01",
      "journal_volume" => "3",
      "journal_issue" => "2",
      "start_page" => "25",
      "end_page" => "30",
    }

    @article.update_attributes( attributes_hash )

    # These attributes have been marked "unique" in the call to delegate, which causes the results to be singular
    @article.title.should == attributes_hash["title"]
    @article.abstract.should == attributes_hash["abstract"]

    # These attributes have not been marked "unique" in the call to the delegate, which causes the results to be arrays
    @article.journal_title.should == [attributes_hash["journal_title"]]
    @article.publication_date.should == [attributes_hash["publication_date"]]
    @article.journal_volume.should == [attributes_hash["journal_volume"]]
    @article.journal_issue.should == [attributes_hash["journal_issue"]]
    @article.start_page.should == [attributes_hash["start_page"]]
    @article.end_page.should == [attributes_hash["end_page"]]
  end

end
The test code should execute but the tests should fail because we haven’t written the code yet.
On the command line, run
rake spec
You should get an error complaining that
spec/models/journal_article_spec.rb:3: uninitialized constant JournalArticle (NameError)
This is because we haven’t defined the Model yet. Now let’s define the JournalArticle Model, but first we have to define the datastream that will contain its MODS XML descriptive metadata.
If you get an error complaining that
journal_article_spec.rb:2:in `require': no such file to load -- …/spec/spec_helper (LoadError)
that means you need to run
rails g rspec:install
A Fedora object is made up of any number of datastreams. Datastreams can have content of any type and each datastream is identified by a datastream id or dsid. The ActiveFedora model tells us which datastreams to expect or create in an object and tells us what kind of content is expected inside each datastream.
For our JournalArticle model, we’re particularly interested in datastreams with XML content because a JournalArticle object is basically a container for XML metadata. The actual content of the Article (PDF, text, or whatever else) will be stored in a separate Fedora object, a primitive, with the RDF isPartOf relationship connecting the JournalArticle (primarily metadata) to its content (any number of primitives with files in them). For more information about datastreams, primitives, and where the actual content of an object lives, see the Reference links at the end of this tutorial.
Our MODS xml will go into a datastream with the datastream id of descMetadata. Technically, we could give it any name we want but the Hydra community has come up with some conventions to make things simpler. One of these conventions is to always put descriptive metadata in a datastream called descMetadata.
As we said above, we want to create MODS metadata that keeps track of title, author (first name/last name/role), publication date, abstract, journal title, journal volume, journal issue, start page and end page. In order to do this we will use ActiveFedora to define a special type of Ruby object that uses OM to read and modify XML.
Example of the MODS XML we will be creating:
<mods xmlns="http://www.loc.gov/mods/v3" version="3.0"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-0.xsd">
  <titleInfo>
    <title>ARTICLE TITLE</title> <!-- title -->
  </titleInfo>
  <name type="personal">
    <namePart type="family">FAMILY NAME</namePart> <!-- author last name -->
    <namePart type="given">GIVEN NAMES</namePart> <!-- author first name -->
    <role>
      <roleTerm authority="marcrelator" type="text">Creator</roleTerm> <!-- author role -->
    </role>
  </name>
  <abstract>ABSTRACT</abstract> <!-- abstract -->
  <relatedItem type="host">
    <titleInfo>
      <title>TITLE OF HOST JOURNAL</title> <!-- journal title -->
    </titleInfo>
    <part>
      <detail type="volume">
        <number>2</number> <!-- journal volume -->
      </detail>
      <detail type="level">
        <number>2</number> <!-- journal issue -->
      </detail>
      <extent unit="pages">
        <start>195</start> <!-- start page -->
        <end>230</end> <!-- end page -->
      </extent>
      <date>FEB. 2007</date> <!-- publication date -->
    </part>
  </relatedItem>
</mods>
The constraints of a metadata schema sometimes force us to put information into structures that don’t map directly to the vocabulary that we use when talking about that information. The “title” has ended up in a spot that you might call the “mods titleInfo title” and the “start page” has ended up in a spot that you might call the
“mods host-relatedItem part pages-extent start” or, written as an XPath,
//mods/relatedItem[@type='host']/part/extent[@unit='pages']/start
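To make the XPath above concrete, here is a minimal sketch that reads the start page and the title out of a trimmed copy of the MODS example. It uses REXML, which ships with Ruby, purely as a stand-in so the snippet runs on its own; the tutorial itself uses OM on top of Nokogiri. The names MODS_XML, MODS_NS and mods_text are made up for this illustration. Because MODS declares a default namespace, each step of the path carries a prefix that is mapped to that namespace.

```ruby
require 'rexml/document'

# Trimmed-down copy of the MODS example above (title and host journal pages only).
MODS_XML = <<~XML
  <mods xmlns="http://www.loc.gov/mods/v3">
    <titleInfo><title>ARTICLE TITLE</title></titleInfo>
    <relatedItem type="host">
      <titleInfo><title>TITLE OF HOST JOURNAL</title></titleInfo>
      <part>
        <extent unit="pages">
          <start>195</start>
          <end>230</end>
        </extent>
      </part>
    </relatedItem>
  </mods>
XML

# Prefix-to-namespace mapping used when evaluating XPath (hypothetical constant name).
MODS_NS = { 'mods' => 'http://www.loc.gov/mods/v3' }.freeze

# Hypothetical helper: return the text of the first node matching the XPath, or nil.
def mods_text(doc, xpath)
  node = REXML::XPath.first(doc, xpath, MODS_NS)
  node && node.text
end

doc = REXML::Document.new(MODS_XML)

title      = mods_text(doc, '/mods:mods/mods:titleInfo/mods:title')
start_page = mods_text(doc, "//mods:relatedItem[@type='host']/mods:part/mods:extent[@unit='pages']/mods:start")

puts "title: #{title}"
puts "start page: #{start_page}"
```

Writing paths like this by hand for every field is the tedium that an OM terminology is meant to hide.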
This test is a bit more fine-grained than you really need, but it lets you see how you can access the information in an XML Datastream with OM.
# spec/datastreams/journal_article_mods_datastream_spec.rb
require 'spec_helper'

describe JournalArticleModsDatastream do

  before(:each) do
    @mods = fixture("article_mods_sample.xml")
    @ds = JournalArticleModsDatastream.from_xml(@mods)
  end

  it "should expose bibliographic info for journal articles with explicit terms and simple proxies" do
    @ds.mods.title_info.main_title.should == ["SAMPLE ARTICLE TITLE"]
    @ds.title.should == ["SAMPLE ARTICLE TITLE"]
    @ds.abstract.should == ["THIS IS AN ABSTRACT"]
    @ds.journal.title_info.main_title.should == ["SAMPLE HOST JOURNAL TITLE"]
    @ds.journal_title.should == ["SAMPLE HOST JOURNAL TITLE"]
    @ds.journal.issue.date.should == ["FEB. 2007"]
    @ds.publication_date.should == ["FEB. 2007"]
    @ds.journal.issue.volume.number.should == ["2"]
    @ds.journal_volume.should == ["2"]
    @ds.journal.issue.level.number.should == ["18"]
    @ds.journal_issue.should == ["18"]
    @ds.journal.issue.pages.start.should == ["195"]
    @ds.start_page.should == ["195"]
    @ds.journal.issue.pages.end.should == ["230"]
    @ds.end_page.should == ["230"]
  end

  it "should expose nested/hierarchical metadata" do
    @ds.author.first_name.should == ["George","Abraham"]
    @ds.author.last_name.should == ["Washington", "Lincoln"]
    @ds.author.role.text.should == ["Creator", "Contributor"]
    @ds.author(0).first_name.should == ["George"]
    @ds.author(0).last_name.should == ["Washington"]
    @ds.author(0).role.text.should == ["Creator"]
  end

end
You need to add this simple method to your spec helper so that the tests will be able to load xml files from the spec/fixtures directory.
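# spec/spec_helper.rb
...
# RSpec 2 (what `rails g rspec:install` sets up) uses RSpec.configure;
# Spec::Runner.configure is the older RSpec 1 form.
RSpec.configure do |config|
  ...
  # Open a file from spec/fixtures so tests can load sample XML
  def fixture(file)
    File.new(File.join(File.dirname(__FILE__), 'fixtures', file))
  end
end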
Now download the sample MODS xml from https://raw.githubusercontent.com/scande3/hydra-tutorial-application/master/spec/fixtures/article_mods_sample.xml
and save it as spec/fixtures/article_mods_sample.xml
Now run the test to see it fail before we write the code to make it pass. We know that the journal_article_spec.rb tests are still failing, so instead of using rake spec to run all the tests, we will just run this one spec file from the command line:
rspec spec/datastreams/journal_article_mods_datastream_spec.rb
You should see an error that includes a message like this:
hydra-tutorial-app/spec/datastreams/journal_article_mods_datastream_spec.rb:3: uninitialized constant JournalArticleModsDatastream (NameError)
Now we will define the JournalArticleModsDatastream class so the test will pass.
Here’s how we define the datastream class for the descMetadata. Notice that we use set_terminology which defines its OM Terminology.
Create a new file in app/models/datastreams called journal_article_mods_datastream.rb and put this into it (NOTE: you could also save this file as lib/journal_article_mods_datastream.rb and get the same results):
# app/models/datastreams/journal_article_mods_datastream.rb
# a Fedora Datastream object containing Mods XML for the descMetadata
# datastream in the Journal Article hydra content type, defined using
# ActiveFedora and OM.
require 'hydra-mods'

class JournalArticleModsDatastream < ActiveFedora::NokogiriDatastream

  # OM (Opinionated Metadata) terminology mapping for the mods xml
  set_terminology do |t|
    t.root(:path=>"mods", :xmlns=>"http://www.loc.gov/mods/v3", :schema=>"http://www.loc.gov/standards/mods/v3/mods-3-2.xsd")
    t.title_info(:path=>"titleInfo") {
      t.main_title(:index_as=>[:facetable], :path=>"title", :label=>"title")
    }
    t.author(:path=>"name", :attributes=>{:type=>"personal"}) {
      t.first_name(:path=>"namePart", :attributes=>{:type=>"given"})
      t.last_name(:path=>"namePart", :attributes=>{:type=>"family"})
      t.role {
        t.text(:path=>"roleTerm", :attributes=>{:type=>"text"})
      }
    }
    t.abstract
    t.journal(:path=>'relatedItem', :attributes=>{:type=>"host"}) {
      t.title_info(:ref=>[:title_info])
      t.issue(:path=>"part") {
        t.volume(:path=>"detail", :attributes=>{:type=>"volume"}) {
          t.number
        }
        t.level(:path=>"detail", :attributes=>{:type=>"level"}) {
          t.number
        }
        t.pages(:path=>"extent", :attributes=>{:unit=>"pages"}) {
          t.start
          t.end
        }
        t.date
      }
    }

    # these proxy declarations allow you to use more familiar term/field names that hide the details of the XML structure
    t.title(:proxy=>[:mods, :title_info, :main_title])
    t.journal_title(:proxy=>[:journal, :title_info, :main_title])
    t.journal_volume(:proxy=>[:journal, :issue, :volume, :number])
    t.journal_issue(:proxy=>[:journal, :issue, :level, :number])
    t.start_page(:proxy=>[:journal, :issue, :pages, :start])
    t.end_page(:proxy=>[:journal, :issue, :pages, :end])
    t.publication_date(:proxy=>[:journal, :issue, :date])
  end # set_terminology

  # This defines what the default xml should look like when you create empty MODS datastreams.
  # We are reusing the ModsArticle xml_template that Hydra provides, but you can make this method return any xml you desire.
  # See the API docs for more info. http://rubydoc.info/github/projecthydra/om/OM/XML/Container/ClassMethods#xml_template-instance_method
  def self.xml_template
    return Hydra::ModsArticle.xml_template
  end

end # class
Save that file and run the tests again. They should pass. If they do not, add a line to config/application.rb to ensure that your new app/models/datastreams/journal_article_mods_datastream.rb is autoloaded.
class Application < Rails::Application
  ...
  config.autoload_paths += Dir[Rails.root.join('app', 'models', '{**}')]
  ...
end
Good to go. You might get some extra information output onto the console while the tests run.
rspec spec/datastreams/journal_article_mods_datastream_spec.rb
..

Finished in 7.26 seconds
2 examples, 0 failures
The key here is “2 examples, 0 failures”.
Try deleting lines from the datastream definition or changing values in the fixture xml then re-run the tests to see what it looks like when the tests fail.
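One way to picture what the proxy declarations in the terminology do: each short name stands for a path through the nested terms. The sketch below imitates that with a plain nested Hash and Hash#dig. It is not how OM is implemented (OM resolves terms to XPath queries against the XML); METADATA, PROXIES and lookup are hypothetical names, and the values are the ones the datastream spec expects from the fixture.

```ruby
# Nested values shaped like the journal portion of the terminology
# (journal -> issue -> pages -> start). Leaves are arrays because a
# term can match more than one node in the XML.
METADATA = {
  journal: {
    title_info: { main_title: ['SAMPLE HOST JOURNAL TITLE'] },
    issue: {
      volume: { number: ['2'] },
      level:  { number: ['18'] },
      pages:  { start: ['195'], end: ['230'] },
      date:   ['FEB. 2007']
    }
  }
}.freeze

# Short names mapped to the full path of terms, mirroring the :proxy options.
PROXIES = {
  journal_title:    [:journal, :title_info, :main_title],
  journal_volume:   [:journal, :issue, :volume, :number],
  journal_issue:    [:journal, :issue, :level, :number],
  start_page:       [:journal, :issue, :pages, :start],
  end_page:         [:journal, :issue, :pages, :end],
  publication_date: [:journal, :issue, :date]
}.freeze

# Resolve a short name by following its path; unknown paths give an empty array.
def lookup(term)
  path = PROXIES.fetch(term) { [term] }
  METADATA.dig(*path) || []
end

puts lookup(:start_page).inspect
puts lookup(:journal_issue).inspect
```

The point of the indirection is the same as in the real terminology: callers ask for start_page and never need to know where in the structure the value lives.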
The hydra-head plugin provides a class definition for the rightsMetadata datastream, so you won’t have to define the OM Terminology yourself. The definition is in the hydra-head plugin code in lib/fedora_migrate/rights_metadata.rb.
Here’s an example of what rightsMetadata XML looks like:
<rightsMetadata xmlns="http://hydra-collab.stanford.edu/schemas/rightsMetadata/v1">
  <copyright>
    <human>(c)2009 The Hydra Project</human>
    <human type="someSpecialisedType">Blah Blah</human>
    <human type="aDifferentType">More blah</human>
    <machine><a rel="license" href="http://creativecommons.org/licenses/publicdomain/"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/publicdomain/88x31.png" /></a><br />This work is in the <a rel="license" href="http://creativecommons.org/licenses/publicdomain/">Public Domain</a>.</machine>
  </copyright>
  <access type="discover">
    <human></human>
    <machine>
      <policy>hydra-policy:4502</policy>
      <group>public</group>
    </machine>
  </access>
  <access type="read">
    <human></human>
    <machine>
      <group>public</group>
    </machine>
  </access>
  <access type="edit">
    <human></human>
    <machine>
      <person>researcher1</person>
      <group>archivist</group>
    </machine>
  </access>
  <access type="etc">
    <!-- etc -->
  </access>
  <use>
    <human>You are free to re-distribute this object, but you cannot change it or sell it. </human>
    <machine><a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/us/"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-nc-nd/3.0/us/88x31.png" /></a><br />This <span xmlns:dc="http://purl.org/dc/elements/1.1/" href="http://purl.org/dc/dcmitype/Sound" rel="dc:type">work</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/us/">Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License</a>.</machine>
  </use>
</rightsMetadata>
The Hydra::Datastream::RightsMetadata datastream definition is provided by hydra-head, so we don’t need to implement it or write tests for it.
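To get a feel for what the machine-readable access blocks in the example XML express, here is a small sketch that copies the discover, read and edit rules into a Hash by hand and checks a user against them. This is only an illustration of the data, not Hydra's enforcement logic: ACCESS and allowed? are hypothetical names, and the sketch deliberately ignores policies and any inheritance between access levels.

```ruby
# Persons and groups per access level, transcribed from the example XML above.
ACCESS = {
  'discover' => { groups: ['public'],    persons: [] },
  'read'     => { groups: ['public'],    persons: [] },
  'edit'     => { groups: ['archivist'], persons: ['researcher1'] }
}.freeze

# True when the user is named directly or belongs to a listed group.
def allowed?(level, user:, groups: [])
  rule = ACCESS[level]
  return false unless rule

  rule[:persons].include?(user) || !(rule[:groups] & groups).empty?
end

puts allowed?('read', user: 'visitor', groups: ['public'])
puts allowed?('edit', user: 'visitor', groups: ['public'])
puts allowed?('edit', user: 'researcher1')
```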
There are two special datastreams that Fedora creates for you — the DC datastream and the RELS-EXT datastream. We don’t really use the DC datastream. It contains simple Dublin Core metadata that mainly exists for Fedora’s internal use. The RELS-EXT datastream contains RDF representing the relationships between the objects in a Fedora repository. ActiveFedora and Hydra both use these RDF relationships in a number of ways. For more information about how to work with RDF relationships in ActiveFedora, see the ActiveFedora documentation links at the end of this tutorial.
Now we’re ready to assemble the JournalArticle model and make its tests pass. First, rerun the tests to see them fail:
rake spec
Create a file in app/models called journal_article.rb and put these lines into it:
# app/models/journal_article.rb
# a Fedora object for the Journal Article hydra content type
class JournalArticle < ActiveFedora::Base
  include Hydra::ModelMethods

  has_metadata :name => "descMetadata", :type => JournalArticleModsDatastream
  has_metadata :name => "rightsMetadata", :type => Hydra::Datastream::RightsMetadata

end
Rerun the tests in journal_article_spec.rb and you will see actual failures (probably in red) instead of the error message about JournalArticle being undefined.
rspec ./spec/models/journal_article_spec.rb
.F

Failures:

  1) JournalArticle should have the attributes of a journal article and support update_attributes
     Failure/Error: @article.update_attributes( attributes_hash )
     ActiveFedora::UnknownAttributeError:
       unknown attribute: end_page
     # ./spec/models/journal_article_spec.rb:31

Finished in 7.51 seconds
2 examples, 1 failure

Failed examples:

rspec ./spec/models/journal_article_spec.rb:19 # JournalArticle should have the attributes of a journal article and support update_attributes
“2 examples, 1 failure” means that the first test is now passing. You have defined the datastreams correctly using has_metadata. Now we need to make the JournalArticle model delegate the attributes we want to the descMetadata datastream. To do that, add these lines into the JournalArticle class definition:
# The delegate method allows you to set up attributes on the model that are stored in datastreams
# When you set :unique=>"true", searches will return a single value instead of an array.
delegate :title, :to=>"descMetadata", :unique=>"true"
delegate :abstract, :to=>"descMetadata", :unique=>"true"

delegate :start_page, :to=>"descMetadata"
delegate :end_page, :to=>"descMetadata"
delegate :publication_date, :to=>"descMetadata"
delegate :journal_title, :to=>"descMetadata"
delegate :journal_volume, :to=>"descMetadata"
delegate :journal_issue, :to=>"descMetadata"
Now rerun the tests. They should all pass.
rspec ./spec/models/journal_article_spec.rb
..

Finished in 8.09 seconds
2 examples, 0 failures

rake spec
....*

Pending:
  User add some examples to (or delete) /Users/matt/Develop/projects/hydra-tutorial-app/spec/models/user_spec.rb
    # Not Yet Implemented
    # ./spec/models/user_spec.rb:4

Finished in 8.12 seconds
5 examples, 0 failures, 1 pending
The test that’s marked “pending” was generated when you ran the blacklight generator. You can either add some assertions to it or delete it in order to make your test suite “green” (everything passing, nothing pending).
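The difference between the delegated attributes marked unique and the ones that are not can be imitated in plain Ruby. The sketch below is not ActiveFedora's delegate; FakeDatastream, SimpleDelegation, delegate_term and SketchArticle are all hypothetical names. It only shows the idea the model spec relies on: the datastream always stores arrays, and a unique attribute hands back the first value while the others hand back the whole array.

```ruby
# Stand-in for the descMetadata datastream: every term holds an array of values.
class FakeDatastream
  def initialize
    @values = Hash.new { |hash, key| hash[key] = [] }
  end

  def get(term)
    @values[term]
  end

  def set(term, value)
    @values[term] = Array(value)
  end
end

# Hypothetical mini version of "delegate": defines a reader and a writer
# on the model that forward to the datastream.
module SimpleDelegation
  def delegate_term(name, unique: false)
    define_method(name) do
      values = desc_metadata.get(name)
      unique ? values.first : values
    end
    define_method("#{name}=") { |value| desc_metadata.set(name, value) }
  end
end

class SketchArticle
  extend SimpleDelegation

  delegate_term :title, unique: true
  delegate_term :journal_title

  def desc_metadata
    @desc_metadata ||= FakeDatastream.new
  end
end

article = SketchArticle.new
article.title = 'All the Awesome you can Handle'
article.journal_title = 'The Journal of Cool'

puts article.title.inspect
puts article.journal_title.inspect
```

This is why the model spec compares title against a plain string but journal_title against a one-element array.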
Now that you’ve defined the Model, you need to define the Controller & Views for Creating, Retrieving, Updating and Deleting (CRUD) Journal Articles.
# spec/controllers/journal_articles_controller_spec.rb
require 'spec_helper'

describe JournalArticlesController do
  describe "creating" do
    it "should render the create page" do
      get :new
      assigns[:journal_article].should be_kind_of JournalArticle
      # rspec-rails matcher for checking which template was rendered
      response.should render_template("new")
    end
  end
end
… no controller …
# config/routes.rb
...
resources :journal_articles
...

# app/controllers/journal_articles_controller.rb
class JournalArticlesController < ApplicationController
  def new
    @journal_article = JournalArticle.new
  end
end
You will need to have run:
rails g cucumber:install
Given I am logged in as [email protected]
And I am on the home page
When I click "Add a Journal Article"
Then I should see ...
When I fill in ...
And I click "Create this Journal Article"
Then I should see "Journal Article 'My title' created."
First, we need to add a link to the Hydra Head that lets you create Journal Articles. To do this, you need to override the _add_asset_links view partial. Here’s the cucumber test for what you want:
hydra-head puts a list of “add asset” links into the user_util_links section of the page. This list is defined in app/views/_add_asset_links.html.erb. By default, this list includes links for adding Images, MODS Assets and Generic Content. We want it to have just one link: create an Article. To override the list, create a file at app/views/_add_asset_links.html.erb in your application and put this into it:
<div id="select-item-box">
  <a class="add-new-asset" href="#">Add a New Asset</a>
  <ul id="select-item-list">
    <li>
      <%= link_to_create_asset 'Add an Article', 'journal_article' %>
    </li>
  </ul>
</div>

# spec/controllers/journal_articles_controller_spec.rb
describe JournalArticlesController do
  describe "creating" do
    it "should render the create page" do
      ...
    end
    it "should support create requests" do
      post :create, :journal_article=>{"title"=>"My title"}
      ja = assigns[:journal_article]
      ja.title.should == "My title"
    end
  end
end
Restart the app, load up http://localhost:3000/journal_articles/new and try it out.
Given I am on the edit page for "hydra:fixture_journal_article"
And I fill in "Title" with "The History of Hopscotch"
When I click "Save"
Then I should see "The History of Hopscotch has been updated."
# spec/controllers/journal_articles_controller_spec.rb
describe JournalArticlesController do
  describe "creating" do
    ...
  end
  describe "editing" do
    it "should support edit requests" do
      get :edit, :id=>"hydra:fixture_journal_article"
      assigns[:journal_article].should be_kind_of JournalArticle
      assigns[:journal_article].pid.should == "hydra:fixture_journal_article"
    end
    it "should support updating objects" do
      # The update action looks the object up by params[:id], so the request needs an :id
      put :update, :id=>"hydra:fixture_journal_article", :journal_article=>{"title"=>"My Newest Title"}
      ja = assigns[:journal_article]
      ja.title.should == "My Newest Title"
    end
  end
end
# app/controllers/journal_articles_controller.rb
class JournalArticlesController < ApplicationController
  def new
    ...
  end

  def edit
    @journal_article = JournalArticle.find(params[:id])
  end

  def update
    @journal_article = JournalArticle.find(params[:id])
    @journal_article.update_attributes(params[:journal_article])
    # Send the browser back to the edit page, using the route helper
    # generated by "resources :journal_articles"
    redirect_to edit_journal_article_path(@journal_article)
  end
end
<%# app/views/journal_articles/_edit.html.erb %>
<%= form_for @journal_article do |f| %>
  <%= f.label :title %>
  <%= f.text_field :title %>
  <%= f.submit "Save" %>
<% end %>
The show method definition is almost identical to the edit method. With time, the two methods will accumulate different logic within your application, but for now they both have one basic role: load the requested object and pass it into the requested view template.
# app/controllers/journal_articles_controller.rb
class JournalArticlesController < ApplicationController
  def new
    ...
  end

  def show
    @journal_article = JournalArticle.find(params[:id])
  end

  def edit
    @journal_article = JournalArticle.find(params[:id])
  end

  def update
    ...
  end
end
<%# app/views/journal_articles/show.html.erb %>
<h1><%= @journal_article.title %></h1>
<dl>
  <dt>Title</dt>
  <dd>
    <%= @journal_article.title %>
  </dd>
  <dt>Journal</dt>
  <dd>
    <%= @journal_article.journal_title %>
  </dd>
</dl>
title_t, journal_title_t
active_fedora_model_s and Blacklight.config[:display_type]
<%# app/views/journal_articles/_index.html.erb %>
<dl>
  <dt>Title</dt>
  <dd>
    <%= document["title_t"] %>
  </dd>
  <dt>Journal</dt>
  <dd>
    <%= document["journal_title_t"] %>
  </dd>
</dl>
See Hydra Modeling Conventions
See Reference for more links.