Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added test for DISABLED digest type to FoxML digest Validation #190

Merged
merged 24 commits into from
Feb 16, 2023

Conversation

Surfrdan
Copy link
Contributor

The title of this pull-request should be a brief description of what the pull-request fixes/improves/changes. Ideally 50 characters or less.


JIRA Ticket: https://fedora-repository.atlassian.net/browse/FCREPO-3847

What does this Pull Request do?

Adds a test for the DISABLED digest type and continues execution without validation if true.

What's new?

To the best of my understanding, this is a valid use case that was not accounted for previously. A Sanity check would be appreciated though.

How should this be tested?

  • run migration utils against FoxML inline XML objects with and without digests

Additional Notes:

Any additional information that you think would be helpful when reviewing this PR.

@whikloj
Copy link
Contributor

whikloj commented Sep 20, 2022

Would be good to add an actual test to the tool to ensure we don't break this in the future.

@Surfrdan
Copy link
Contributor Author

Agreed on the test. This PR also now kicks the can down the road to fail at the validation stage in fcrepo-storage-ocfl src/main/java/org/fcrepo/storage/ocfl/validation/ValidationUtil.java so we need to consider how to handle this sitation before merging this

@mikejritter
Copy link
Contributor

At a quick glance this looks like it will work. I updated a managed datastream to set the checksum type to DISABLED and was able to run the migration, validation, and reindexing without a problem. I still want to check the migrator a little more to make sure it will be ok (it looks like if no digest is found on a datastream it's ignored in the ArchiveGroupHandler) and maybe run against external content as well. Note that it also appears the migrator will give all datastreams a sha512 digest so ocfl remains stable.

@whikloj
Copy link
Contributor

whikloj commented Oct 8, 2022

That seems different from what @Surfrdan was experiencing. Where he could complete the migration but the reindexing would fail.

@Surfrdan
Copy link
Contributor Author

I'm building a tiny test batch with the offending object so I can re-test this quickly and identify the exact issue again.

@Surfrdan
Copy link
Contributor Author

I've updated the Jira ticket with the full stack trace and attacheds the offending FoxML

@dbernstein
Copy link
Contributor

@Surfrdan : What is the status on this ticket? Re your comment above: Were you going to add an integration test to test the behavior?

Agreed on the test. This PR also now kicks the can down the road to fail at the validation stage in fcrepo-storage-ocfl src/main/java/org/fcrepo/storage/ocfl/validation/ValidationUtil.java so we need to consider how to handle this sitation before merging this

@Surfrdan
Copy link
Contributor Author

@dbernstein Would it be sufficient to modify the input FoxML from one of the existing tests to represent an EXTERNAL datastream with a DISABLED and unreachable URL ?

Something like https://github.com/fcrepo-exts/migration-utils/blob/main/src/test/resources/legacyFS/objects/2015/0430/16/01/example_1#L141 for example

@Surfrdan
Copy link
Contributor Author

I've decided to build by own test with the NLW data which failed in the first place. I'm just going to get clearance to use the data before I commit it to the repo though.

@Surfrdan
Copy link
Contributor Author

let me fix those up. I can't run checkstyle from behind a corporate proxy unfortunately. Sorry.

Copy link
Contributor

@mikejritter mikejritter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Surfrdan This looks mostly good, I just have one question below from some behavior that I believe changed from the initial PR. Also, just wanted to note that the migrator will generate sha512 sums for all ocfl objects, so we don't need to worry about anything missing from the Fedora 6/ocfl side of things.

@@ -468,6 +474,11 @@ private String extractInlineXml() throws XMLStreamException {

private void validateInlineXml() {
if (isInlineXml && contentDigest != null && StringUtils.isNotBlank(contentDigest.getDigest())) {

if (StringUtils.equals(contentDigest.getType(), "DISABLED")) {
throw new RuntimeException("DISABLED digest. Skipping digest validation");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we want to throw an exception here or not. As I understood the problem initially, it seemed like we wanted to continue migration of the pids with disabled digests because it was a feature of Fedora 3. If so, we could just log this instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went the exception route to assist with building an integration test case but if you think logging would be sufficient I can revert that change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think logging then return to skip the checksum validation is acceptable. For the integration test we can switch to check if the migration was successful. It looks like the InlineXmlIT does something similar, but for this case we can check if the resources exist in the ocfl object instead of computing checksums.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now made those changes as suggested @mikejritter

public void testMigrateObjectWithExternalDatastreamAndDisabledDigest() throws Exception {
setup("inline-disabled-it");
migrator.run();
final var migratedAuditPath = "target/test/ocfl/inline-disabled-it/storage/8f8/e55/54c/" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to use the OCFL classes to look for the audit resource here. fcrepo-storage-ocfl contains the OcflObjectSession which can be used to check, though it doesn't look like we have javadocs up for that module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing me in the right direction. I've uipdatedf the IT

Copy link
Contributor

@mikejritter mikejritter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Surfrdan looks good now 👍

setup("inline-disabled-it");
migrator.run();
final var session = sessionFactory.newSession("info:fedora/llgc-id:1591190");
assertTrue(session.containsResource("info:fedora/llgc-id:1591190/AUDIT"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be testing for the existence of one of the datastreams with a DISABLED digest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the ID to look for a DISABLED digest

Copy link
Contributor

@whikloj whikloj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@whikloj whikloj merged commit bad1e17 into fcrepo-exts:main Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants