-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integration of further functionality #1
Comments
Good day, Michael. Thanks for sharing, this should be a useful functionality when one needs to migrate / verify such a database. I had a quick look at the code and, as I understand, it provides the following functionality:
This is a mixture of functionality, that in my view, would be better integrated into appropriate places in parts. If that is ok with you, here is what I would suggest:
What do you think? Is that ok with you? Kirill P.S. regarding ignoring non-zero version length I'm not so sure it is ok to ignore FileStorage format problems. Non-zero version length might represent either corruption of data.fs, or actually an old record with version information present. If it is corruption of data.fs, then we should be checking and repairing other format errors as well, e.g. broken transaction record headers, broken data record headers, inconsistencies in between those, etc. For the reference FileStorage/go does that automatically and also there is But if it is actually an old record with version information present - then just ignoring non-zero version length would lead to loading object data payload from wrong position (and thus getting "bad" data for the object). Why it is like this should be clear from the following link: So, I believe, if you actually hit those non-zero version lengths, it might be that many objects - that could be preserved - were turned to be Broken unnecessary. I, of course, might be missing something, but that what comes to mind after checking quickly. /cc @perrinjerome |
Hi @navytux I agree, that this functionality ideally should be split up and be integrated in the according tools. I just put all together in one file / program as there is quite some overlap for the tools in order to make the object iteration fault tolerant. I'm not familiar enough with ZODB, if the used iteration approach can be used for any kind of ZODB storage, so that a general About fixing objects: I was mainly dealing with broken zlib objects. So it was unrecoverable data garbage for me, when unpacking failed, and I wanted just anything marking an error. Providing a hook for what to do with broken objects would probably be a good idea. In the ZODB the versioning feature had never been used and was just corrupted. So you're right, it should be covered in a better way if that's a concern.
All good for me 👍 |
Hi @ml31415, thanks for feedback. I understand that such kind of tools are usually one-shot, and that, after database is migrated, the interest to enhance the tool is usually no longer there. Still if maybe you want to proceed with upstreaming a bit I could give you some (imho) useful direction:
If that is ok, we could try to proceed further step-by-step in pull-requests. But, once again, I will understand that if the migration is already done and the tool is no longer used, there is not so much interest to continue working on it, and that you already tried to share your work via posting the gist and this issue. Kirill |
Hi @navytux I'm not sure how any changes would make sense without having the fault-tolerant I wasn't planning on writing PRs myself right now, as you assumed correctly, I'm done with my week of grief, to get that stuff working at all, and I still got enough to do with the project I wrote the converter for. Might come back to that later, though. But as I said, welcome to take any code snippets you like. :) |
Michael, I see; thanks for feedback. |
@ml31415 thanks for sharing this script, we will also soon be working on converting some ZODBs to python 3 and if we encounter problems I'll remember that there is this code. @navytux I agree with everything you said and I also believe that these would be good additions to zodbtools that we can add if we find the need and/or the time. PS: it's totally of topic but since we are discussing about wrapper storages and storage migration to python3, did you try to write a wrapper storage that would include zodbupdate logic to do the python3 conversion on the fly ? The idea would be to migrate to python3 without much downtime. |
I recently had all sorts of issues converting a py2 db with 50GB of compressed grown data, including some unreadable pickles, compression errors and so on, into something readable for py3. In the end I had do patch quite some parts in
zodbupdate
and other tools in order to get the job done. The result is in the link below. I don't think it would be worth being published as a separate project, but it might be worth being integrated in yourzodbtools
. What do you think?https://gist.github.com/ml31415/b4d00251e76afe35358070564864287a
The text was updated successfully, but these errors were encountered: