Bag Splitting post Bag-it 5.x #5

Open
ntallman opened this issue Nov 15, 2016 · 8 comments

Comments

@ntallman

LOC is removing the CLI tools from bagit-java in version 5, essentially breaking this script. Since bagit-python, bagit ruby (https://github.com/tipr/bagit), and Bagger all lack bag-splitting capability, the only way for institutions to split bags will be to write Java code, which is not feasible for all dev shops. Any thought to updating this script to make direct use of the Java library?

@nkrabben

I think this script is a pure-Python implementation of bag splitting. It uses bagit-python, which is itself pure Python, so it won't be affected by changes to bagit-java.

@ntallman
Author

ntallman commented Nov 15, 2016

But the documentation says the first command you have to run is the bagit-java CLI, which actually splits the bag, with the second Python script handling verification and the added tags/documentation?

Splitting a Bag

$ bag splitbagbysize <BAG> --maxbagsize 30
$ python bag-split.py split <BAG>

The first command above uses the official BagIt command-line utility
(bag) to split the original <BAG>, in this example using 30GB as the
per-bag limit. You can also use --maxbagsize values like .001 to indicate
1 MB (for example).
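
For anyone wondering what size-based splitting involves, here is a minimal sketch in plain Python. It is not how the bagit-java `splitbagbysize` command is implemented; it only illustrates the idea of grouping payload files so that each group stays under a per-bag limit. The function name `plan_split`, the first-fit-decreasing strategy, and the decimal-gigabyte interpretation of the limit (so that .001 means 1 MB, as in the README) are all assumptions made for illustration.

```python
from typing import Dict, List


def plan_split(sizes: Dict[str, int], max_gb: float) -> List[List[str]]:
    """Group payload paths so each group's total size stays within the limit.

    `sizes` maps a payload path to its size in bytes. `max_gb` mirrors the
    README's --maxbagsize value. Assumption: decimal gigabytes, so 0.001
    corresponds to 1,000,000 bytes.
    """
    limit = round(max_gb * 1000 ** 3)
    groups: List[List[str]] = []
    totals: List[int] = []

    # First-fit decreasing: place the largest files first, each into the
    # first group that still has room for it.
    for path, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        if size > limit:
            raise ValueError(f"{path} ({size} bytes) exceeds the per-bag limit")
        for i, total in enumerate(totals):
            if total + size <= limit:
                groups[i].append(path)
                totals[i] += size
                break
        else:
            # No existing group has room, so start a new one.
            groups.append([path])
            totals.append(size)
    return groups
```

Each returned group would then become the payload of one split bag. Writing the bags themselves (manifests, bag-info.txt, and so on) is left out of this sketch.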

The second command uses this tool to verify the split bags against the
original bag for integrity and completeness, as well as to create an
additional "metadata" bag among the split bags; the /data directory of
the metadata bag will contain the original bag's manifests and bag-info.txt file.
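
The completeness check described above can be sketched without any BagIt library. This is a hedged illustration, not the code in bag-split.py: it assumes each manifest line has the form `<checksum> <path>`, and the helper names `parse_manifest` and `compare_split` are hypothetical. It compares manifest entries only; it does not re-hash files on disk.

```python
from typing import Dict, Iterable, List


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse manifest text into {payload path: checksum}."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first run of whitespace so paths may contain spaces.
        checksum, path = line.split(None, 1)
        entries[path.strip()] = checksum.lower()
    return entries


def compare_split(
    original: Dict[str, str], splits: Iterable[Dict[str, str]]
) -> Dict[str, List[str]]:
    """Report how the split bags' manifests differ from the original's."""
    combined: Dict[str, str] = {}
    duplicated = set()
    for manifest in splits:
        for path, checksum in manifest.items():
            if path in combined:
                duplicated.add(path)  # same file listed in two split bags
            combined[path] = checksum

    return {
        "missing": sorted(set(original) - set(combined)),
        "extra": sorted(set(combined) - set(original)),
        "mismatched": sorted(
            p for p in original if p in combined and original[p] != combined[p]
        ),
        "duplicated": sorted(duplicated),
    }
```

A split would count as complete when all four lists come back empty.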

@nkrabben

nkrabben commented Nov 15, 2016

Oh, looks like I read the code wrong. In that case, it seems like an update is needed if this script is still in use.

Would you be interested in helping to add bag splitting to bagit-python? I'd like to bring the various bag libraries closer to parity on features that they support.

@ntallman
Author

I would absolutely love it if bag splitting was built into bagit-python! In fact, I've already commented on a GitHub Issue for just that. I'm using bagit-python for other parts of the bagging, would be great to not have to pull in another script.

@nkrabben

@ntallman
Author

+1! Thank you! Digital preservation practitioners of the world thank you too!

@Educopia

Thanks for taking a swing at updating, Nick!

@ntallman
Author

Any update? Bagit-python still doesn't have bag splitting.
