-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
various fixes #125
base: master
Are you sure you want to change the base?
various fixes #125
Conversation
… not just payload manifest
…y issues in the future
@@ -6,9 +6,9 @@ | |||
public enum StandardSupportedAlgorithms implements SupportedAlgorithm{ | |||
MD5("MD5"), | |||
SHA1("SHA-1"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this "SHA-1"
now different than the others? Now the others don't have this dash anymore, but this one still has it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I will update to include sha1 as well as tests on very basic valid test bags
@jscancella Not sure if I can ask this here or if I should make a new issue for this. However, since you're touching the class I wanted to ask about in this PR, I'd try it here... I was wondering if there is any special reason for |
…. Also added tests for valid bags of each standard algorithm
@rvanheest The original reason was to convey that these classes were not designed to be overridden, That's not to say you would be in danger if you did, merely that you were on your own if you wanted to fork this and change the class to be one you can override. I also personally like to make everything as final as possible to decrease the attack surface should a vulnerability be found in the library or dependency library. But I no longer work at the Library of Congress so if the maintainers wish to make this public that is their choice. What are you doing that you need to check the hashes again later without using |
@jscancella I understand why you made these classes final. We defined a another kind of validness (called 'virtually valid'), which is basically the same as 'valid', but with some different rules for the fetch file. So it would be ideal if I could extend from Finally, sorry to hear that you left Library of Congress. It was a pleasure for my coworkers and me to collaborate and interact with you via GitHub. Best wishes in your new position. |
@rvanheest interesting, virtually valid makes me think you are building a bag and then checking it later, is this a correct assumption? If so my only comment would be that I tend to think of bagit as a method of transferring files correctly using (typically) a "sneakernet". Thanks for the kind remarks. As for leaving the Library of Congress, it was most unfortunate but as you can see I am still active on bagit and hope to remain so (even if it is on my own time). |
@jscancella, as I mentioned in another place to you, we use the concept of Bags, BagIt, etc. as one of our storage formats. Indeed we also use it for transferring datasets from our clients to us, but also use it for storage (after having done some cleanup and metadata enrichment). See our public documentation for a description of In the implementation of |
@rvanheest could you not do something like create a derivative bag that doesn't include the files listed in the fetch file in the manifest(s)? That way you can use all the tools normally and you don't need to define a "virtually" valid bag. For example, something like this:
|
… least one manifest when option is enabled
@jscancella Would you please share the compiled jar with me that I can use in my project as I am also having the Thread pool issue for the larger bag. |
@smntb I would wait until someone from the Library of Congress publishes an official jar. But if you really can't wait, you can clone my repo and use the instructions in the README to build it yourself |
@acdha thoughts? It looks like there are people who are interested in this being merged in and deployed to the main java repositories (maven central and jcenter) |
I don't have access to this repo. Perhaps @dbrunton knows who's picking up maintainership? |
Any update on this pull request? |
bump |
fixes #124 - by renaming to make it more clear that we verify all manifests
fixes #123 - by fixing the typo in the message formatting so the path gets added to the message
fixes #122 - fixed when checking if a file is in at least one manifest and file is hidden. It was not skipping if it was hidden and the
ignoreHidden
setting was setfixes #121 - changes the number of threads to be limited by the cpu count instead of unbounded
fixes #119 - by adding test to ensure we don't get warnings when there shouldn't be any