added test for DISABLED digest type to FoxML digest Validation #190
Conversation
Would be good to add an actual test to the tool to ensure we don't break this in the future.
Agreed on the test. This PR also now kicks the can down the road to fail at the validation stage in fcrepo-storage-ocfl src/main/java/org/fcrepo/storage/ocfl/validation/ValidationUtil.java, so we need to consider how to handle this situation before merging this.
At a quick glance this looks like it will work. I updated a managed datastream to set the checksum type to |
That seems different from what @Surfrdan was experiencing, where he could complete the migration but the reindexing would fail.
I'm building a tiny test batch with the offending object so I can re-test this quickly and identify the exact issue again.
I've updated the Jira ticket with the full stack trace and attached the offending FoxML.
@Surfrdan : What is the status on this ticket? Re your comment above: Were you going to add an integration test to test the behavior?
@dbernstein Would it be sufficient to modify the input FoxML from one of the existing tests to represent an EXTERNAL datastream with a DISABLED digest and an unreachable URL? Something like https://github.com/fcrepo-exts/migration-utils/blob/main/src/test/resources/legacyFS/objects/2015/0430/16/01/example_1#L141 for example.
I've decided to build my own test with the NLW data which failed in the first place. I'm just going to get clearance to use the data before I commit it to the repo though.
Let me fix those up. I can't run checkstyle from behind a corporate proxy unfortunately. Sorry.
… parse the datastream dir
@Surfrdan This looks mostly good, I just have one question below from some behavior that I believe changed from the initial PR. Also, just wanted to note that the migrator will generate sha512 sums for all ocfl objects, so we don't need to worry about anything missing from the Fedora 6/ocfl side of things.
@@ -468,6 +474,11 @@ private String extractInlineXml() throws XMLStreamException {

private void validateInlineXml() {
    if (isInlineXml && contentDigest != null && StringUtils.isNotBlank(contentDigest.getDigest())) {

        if (StringUtils.equals(contentDigest.getType(), "DISABLED")) {
            throw new RuntimeException("DISABLED digest. Skipping digest validation");
I'm wondering if we want to throw an exception here or not. As I understood the problem initially, it seemed like we wanted to continue migration of the pids with disabled digests because it was a feature of Fedora 3. If so, we could just log this instead.
I went the exception route to assist with building an integration test case but if you think logging would be sufficient I can revert that change.
I think logging and then returning to skip the checksum validation is acceptable. For the integration test we can switch to checking whether the migration was successful. It looks like the InlineXmlIT does something similar, but for this case we can check if the resources exist in the ocfl object instead of computing checksums.
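For illustration, the log-and-return approach suggested above could look roughly like the sketch below. It is a self-contained stand-in, not the migration tool's actual code: the class `DigestValidationSketch`, the `Result` enum, and the `validate` signature are all hypothetical, and the real `validateInlineXml()` works on the tool's own `contentDigest` object rather than plain strings.

```java
import java.util.logging.Logger;

/** Hypothetical stand-in for the tool's digest validation; all names here are illustrative. */
final class DigestValidationSketch {
    private static final Logger LOGGER =
            Logger.getLogger(DigestValidationSketch.class.getName());

    /** Outcome of a validation attempt. */
    enum Result { SKIPPED, VALID }

    /**
     * Compares the declared digest with the computed one, but logs and returns
     * early (instead of throwing) when the declared digest type is DISABLED.
     */
    static Result validate(final String digestType,
                           final String expectedDigest,
                           final String actualDigest) {
        // Nothing declared, so there is nothing to validate.
        if (expectedDigest == null || expectedDigest.trim().isEmpty()) {
            return Result.SKIPPED;
        }
        // DISABLED is treated as "no checksum recorded": log and carry on.
        if ("DISABLED".equals(digestType)) {
            LOGGER.info("DISABLED digest. Skipping digest validation");
            return Result.SKIPPED;
        }
        // Any other type must match the computed digest.
        if (!expectedDigest.equalsIgnoreCase(actualDigest)) {
            throw new IllegalStateException("Digest mismatch: expected "
                    + expectedDigest + " but computed " + actualDigest);
        }
        return Result.VALID;
    }
}
```

With this shape a DISABLED digest no longer aborts the run, so an integration test can assert on the migrated resources rather than on an exception.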
I've now made those changes as suggested @mikejritter
public void testMigrateObjectWithExternalDatastreamAndDisabledDigest() throws Exception {
    setup("inline-disabled-it");
    migrator.run();
    final var migratedAuditPath = "target/test/ocfl/inline-disabled-it/storage/8f8/e55/54c/" +
It would be good to use the OCFL classes to look for the audit resource here. fcrepo-storage-ocfl contains the OcflObjectSession which can be used to check, though it doesn't look like we have javadocs up for that module.
Thanks for pointing me in the right direction. I've updated the IT.
Bumps [woodstox-core](https://github.com/FasterXML/woodstox) from 6.2.3 to 6.4.0. - [Release notes](https://github.com/FasterXML/woodstox/releases) - [Commits](FasterXML/woodstox@woodstox-core-6.2.3...woodstox-core-6.4.0) --- updated-dependencies: - dependency-name: com.fasterxml.woodstox:woodstox-core dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@Surfrdan looks good now 👍
setup("inline-disabled-it");
migrator.run();
final var session = sessionFactory.newSession("info:fedora/llgc-id:1591190");
assertTrue(session.containsResource("info:fedora/llgc-id:1591190/AUDIT"));
Shouldn't we be testing for the existence of one of the datastreams with a DISABLED digest?
Fixed the ID to look for a DISABLED digest
👍
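To pick the right resource for that assertion, it can help to list which datastreams in the source FoxML actually declare a DISABLED digest. The sketch below is one possible way to do that with the JDK's StAX parser; it is not part of this PR. The class and method names are hypothetical, and it assumes the usual FoxML layout in which a `contentDigest` element carrying a `TYPE` attribute sits inside a `datastream` element carrying an `ID` attribute.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: lists datastream IDs whose contentDigest TYPE is DISABLED. */
final class DisabledDigestFinder {

    static List<String> findDisabledDatastreams(final String foxml) {
        final List<String> ids = new ArrayList<>();
        final XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTDs are not needed here, so turn them off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        try {
            final XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(foxml));
            String currentDatastream = null;
            try {
                while (reader.hasNext()) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    // Match on local names so the namespace prefix does not matter.
                    final String name = reader.getLocalName();
                    if ("datastream".equals(name)) {
                        currentDatastream = reader.getAttributeValue(null, "ID");
                    } else if ("contentDigest".equals(name)
                            && "DISABLED".equals(reader.getAttributeValue(null, "TYPE"))
                            && currentDatastream != null
                            && !ids.contains(currentDatastream)) {
                        ids.add(currentDatastream);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse FoxML", e);
        }
        return ids;
    }
}
```

Each returned ID could then be appended to the object's identifier and passed to `session.containsResource(...)`, following the pattern used for the AUDIT resource in the test above.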
JIRA Ticket: https://fedora-repository.atlassian.net/browse/FCREPO-3847
What does this Pull Request do?
Adds a test for the DISABLED digest type; when a datastream's digest type is DISABLED, execution continues without digest validation.
What's new?
To the best of my understanding, this is a valid use case that was not accounted for previously. A sanity check would be appreciated though.
How should this be tested?