-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Working version of Aip api #1441
Conversation
Simple tests checking if access is managed correctly. AIP V1 API follows the same pundit policies as the Items controller for its show view.
Update the acts_as_rdfable gem Add entry points for aip item aip aip Add File attachment policies
Moved the logic from the items controller to the base class AIPBaseController so we can serve the information required by PMPY from the different objects we may serve from jupiter. This commit has problems with obtaining the rdf annotations from files. ActsAsRdfable.add_annotation_bindings! is not modifying the class to include the rdf_annotations method, still need to figure out why that is happening.
As part of the WIP strings were hardcoded in the AIP API controller to define the namespaces that are included for each metadata file. These harcoded strings have now been changed to use the strings from the rdf-vocab gem and terms.yml file. The harcoded strings represented namespaced that need to be repalced after further research is done by the metadata team.
As part of the WIP the AIP API controller included hardcoded strings for ontology namespaces that were treated as placeholders. These strings have now been removed instead the namespaces from the rdf-vocab gem and terms.yml file. These last namespaces that were still being represented by hardcoded strings will need to be replaced to reflect the changes moving away from fedora. The new ontologies will be provided by the metadata team.
The AIP API will be the communication layer between Jupiter phase 2 an the PushmiPullyu client which creates the AIP file for long term archival. AIP was built using the information provided from fedora, as we move away from fedora and onto PostgreSQL we need to provide an alternative. This API provides the information required for the AIP file in a simpler manner avoiding fedora's overhead.
Generated by 🚫 Danger |
A migration from previous work was still on local branch after it was added to main development branch. Everything else should be OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks solid overall. Concerns largely revolve around efficiency of continuing to have PMPY download files when it could just operate off of the SAN, tests violating encapsulation when making assumptions about ActiveStorage internals, and tests assuming the AIP API isn't private.
|
||
graph = get_n3_graph(url) | ||
assert_response :success | ||
assert_equal true, graph.graph? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here (and everywhere else that graph.graph?
is being used as a test) this is probably a bit too cursory? It's probably worth testing that the graph actually contains what you expect, because any trivial graph will pass this test eg)
RDF::Graph.new.graph?
=> true
We probably shouldn't pass the test if some random bug leads us to successfully doing no work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they are simple tests for now. As I wrote in the comment above:
# Basic tests checking if response has n3 serialization.
# This should be changed to verify the response content is correct.
We will be sure of the response content after some discussions with the metadata team. I need to make this clearer.
From @mbarnett
If this is coming soon we might as well wait to make those tests useful before merging this. If we're expecting it's going to take a while before we have an idea of what graph contents to expect we need to create a ticket so that this work doesn't get lost
@lagoan
This is currently in the works, will add a ticket so that it does not get lost
Add model route constraint A route constraint will bounce a request for an ubnknown model faster than catchin it on the AIP API controller. Change AIP API to be private The AIP API will be closed off from the public and will only be used by admins for maintanence and the PMPY client. Rename model and asset references to entity There was a discrepancy in the naming conventions for parameters, methods, and variables throughout the AIP API that referenced objects in the system. References to these objects will be named entity in order to avoid naming issues with rails commonly used words and OOP definitions. Improve way helper module is loaded Move module from concerns directory to a support directory and handle it as a helper. Require it only when needed. Add AIP API file path entry point The PMPY client will not download files directly through the API, instead it will fetch the file path and access the file directly on the system. This implementation provides the path for a single fileset, this can be changed to provide the paths for all files to lower calls made to server Add json schema validator to entity tests Remove previous place holder test and now properly check for json response format for Item and Thesis entities. Add tests checking if files on entity actually exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks solid overall. Concerns largely revolve around efficiency of continuing to have PMPY download files when it could just operate off of the SAN, tests violating encapsulation when making assumptions about ActiveStorage internals, and tests assuming the AIP API isn't private.
Addressing your comments @mbarnett :
- The AIP API now returns the paths for the entity's files
- Tests do not violate encapsulation from ActiveStorage
- AIP API is now private
config/routes.rb
Outdated
|
||
# AIP API v1 | ||
|
||
entity_contraint_regex = /(?:#{Item.name.underscore.pluralize}|#{Thesis.name.underscore.pluralize})/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From @mbarnett :
let's encapsulate this into a proper constraint class. The logic is only going to get more complicated over time as new models are added, and it's better to keep the routes file simple & declarative.
The files will be moved directly on the server. By adding the file uuid we know the destination path directly with no need of extra calls or processing.
Logic will get more complicated as new entities are introduced. This class helps keep the routes.rb file simpler.
# AIP API v1 | ||
|
||
namespace :aip, | ||
constraints: EntityConstraint.new, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work with the constraint 👍
:original_file | ||
] | ||
|
||
skip_before_action :verify_authenticity_token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain a bit about this? This is for CSRF tokens verification correct? Is there a reason we want to skip this check?
This check shouldn't run on any of these controller actions anyways since they are all get
requests. GETs should be safe and idempotent. You can see the underlying code here:
And you will notice it will never get ran, so we shouldn't need to skip this check manually here since its already skipped
I think? Unless I am missing something 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct on this. I just made the change.
As we are using the AIP API for getting information there is no need to skip the verification for CSRF tokens. This API has not been designed to modify Jupiter so this should not be a requirement.
Change from instantiation to fixtures when using entities on AIP API tests. As we upgrade to Rails 6 all of the tests will need to use fixtures so we can parallelize our test suite.
Using fixtures continued to be a problem under certain conditions. Move back to instantiating classes on setup since this will not be a problem upgrading to rails 6.
Fixtures continued to be a problem failing on some tests based on the seed used. Tests flip too much and it is something we need to address. For now we can instantiate entities on the setup method as needed. We can look back on this problem once we move away from minitest-hooks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one tiny note on ways we can document the issue with fixtures, and a bit of a pitfall with opening files in Ruby.
} | ||
end | ||
|
||
def create_entity( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest that we should add a TODO: mentioning that this can be removed when we settle on a consistent way of defining test data and explaining the issues you ran into when trying to replace this with fixtures, so that someone doesn't try to clean this up and just run right into the issues you've already discovered.
test/support/aip_helper.rb
Outdated
Sidekiq::Testing.inline! do | ||
entity.add_and_ingest_files( | ||
files.map do |file| | ||
File.open(file, 'r') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a non-obvious Ruby anti-pattern: you're actually leaking file handles (the OS never sees them as closed until the process ends).
The only guaranteed-safe way to use File.open
is with a block:
files.map do |file|
File.open(file, 'r') do |file_handle|
entity.add_and_ingest_files([file_handle])
end
end
file_handle
is guaranteed to be closed after the block even in the presence of exceptions or underlying OS interrupts hitting the process.
(I know this is less efficient in that you're ingesting in two separate calls rather than one, which is an issue with the ingest API, but it's easier to stick to the safe pattern)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not know this. Will make the change right now.
This commit includes the following changes: - Fix problem with adding files to an entity that was leaking file handles. - Document a possible change in the future for AIP API entity tests with how the entities are instantieted instead of using fixtures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to cross-link it here, the ticket capturing the remaining work on the n3 graph content tests is over at: #1448
# with the order the tests were run. The errors included: | ||
# - No items on the database | ||
# - Items not being found on database by id | ||
# - Items defined to have files did not contain them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR includes the AIP API in working condition.
Changes will need to be made after some discussions with the metadata team regarding the predicates we use for each API file used to create the AIP.
Feedback is welcome from everyone.