Revisit the pickle jar procedure #10768
Comments
comment:1
While we're at it, why does the pickle jar need to be a |
comment:2
One major advantage of not having the tar file would be that the pickle jar could be updated using standard
|
comment:3
Related ticket: #11069
comment:4
Nicolas, just to make sure I understand you correctly, is your proposal the following:
I can see some merit in this proposal; however, I would save only the pickles which actually changed. Otherwise you will end up with lots of copies of the same pickle.
comment:5
Replying to @jdemeyer:
+1, definitely! Actually I did not suggest it earlier because I was also wondering whether this could possibly slow down |
comment:6
Hi Jeroen! Replying to @jdemeyer:
I am going to use the occasion to amend the proposal a bit :-)
Yes.
Yes. More precisely sage-4.7 would still contain the subset of the
More precisely: the release manager recreates a fresh pickle jar by running all the sage tests with SAGE_PICKLE_JAR set (as described in unpickle_all). And then removes from pickle_jar-$OLDVERSION those that did not change. An easy thing to script.
+1; this is a good refinement of the last point in the ticket description. The comments above should take care of this. Note that if the pickle_jar for 3.1 and 4.6.2 contain the same pickle X (version numbers just for the example), then I prefer to delete that of 3.1 and keep that of 4.6.2. Indeed, if X no longer unpickles with 4.7, then the relevant question is: "is it acceptable for 4.7 to fail to unpickle a pickle generated by 4.6.2?". Do you mind rephrasing the ticket description accordingly, and then making a quick call for comments on sage-devel? Thanks! Cheers,
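The "easy thing to script" step above, removing from the old jar the pickles that did not change, could look roughly like the following. This is only a sketch: the function name prune_unchanged is made up for illustration, and it assumes that "did not change" means "byte-identical file with the same name", which may be too strict if regenerating a pickle does not reproduce the same bytes.

```python
import filecmp
from pathlib import Path


def prune_unchanged(old_dir, new_dir):
    """Delete from old_dir every pickle that is byte-identical in new_dir.

    Hypothetical helper, not part of Sage. Returns the sorted list of
    file names that were removed from old_dir.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    removed = []
    # sorted() materializes the listing before any file is deleted.
    for old_file in sorted(old_dir.iterdir()):
        new_file = new_dir / old_file.name
        if not (old_file.is_file() and new_file.is_file()):
            continue
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time.
        if filecmp.cmp(old_file, new_file, shallow=False):
            old_file.unlink()
            removed.append(old_file.name)
    return removed
```

Pickles that exist only in the old jar, or whose content differs, are left where they are, so the older directory ends up holding only what the newer one does not duplicate.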
comment:7
Replying to @nthiery:
If we use |
comment:8
Replying to @nthiery:
Currently, the pickle jar contains 1174 files. Assuming each file takes 4kB of actual disk space, this would use a few megabytes. I don't think this is an issue.
This would depend very much on the operating system and file system...
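For what it's worth, the estimate above is easy to reproduce. The sketch below assumes, as in the comment, that every pickle is small enough to occupy a single 4 kB block; the function name and the 1000-byte file size are illustrative only, and the real figure depends on the file system.

```python
import math


def estimated_disk_usage(file_sizes, block_size=4096):
    """Bytes used on disk if every file is rounded up to whole blocks."""
    return sum(math.ceil(size / block_size) * block_size for size in file_sizes)


# 1174 small pickles, each assumed to fit in one 4 kB block.
total = estimated_disk_usage([1000] * 1174)
print(f"{total / 1e6:.1f} MB")  # prints "4.8 MB"
```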
comment:10
Hi Nicolas, I want to add to your proposal that the pickle_jar be properly documented. As far as I am aware, there is currently no documentation on what the pickle jar is for, how it should be used, and what to do when a pickle breaks with
for example. A non-trivial example for using
Secondly, I think that the procedure for adding new pickles to the jar needs to be streamlined. Again, I don't believe that it is described anywhere when or how this happens, but I do know that there are many "new" classes which are not represented in the pickle_jar, with the consequence that the pickle_jar is unable to check backward compatibility for these classes.
Andrew
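One case such documentation would presumably have to cover is a pickle that breaks because its class was renamed or moved. The following is a plain-Python sketch of one way to keep such a pickle loadable, by overriding Unpickler.find_class from the standard pickle module. It is not Sage's own mechanism; old_module, OldPoint, NewPoint, RENAMED and loads_compat are all hypothetical names chosen for the example.

```python
import io
import pickle


class NewPoint:
    """Current class; older pickles refer to it as old_module.OldPoint."""

    def __init__(self, x, y):
        self.x, self.y = x, y


# Hypothetical rename table: (old module, old class name) -> current class.
RENAMED = {("old_module", "OldPoint"): NewPoint}


class CompatUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Redirect references to renamed classes; defer to the default
        # lookup for everything else.
        if (module, name) in RENAMED:
            return RENAMED[(module, name)]
        return super().find_class(module, name)


def loads_compat(data):
    """Unpickle bytes, mapping old class names to their replacements."""
    return CompatUnpickler(io.BytesIO(data)).load()
```

An old pickle in the jar that still names old_module.OldPoint would then load as a NewPoint instance, which is the kind of backward-compatibility check the pickle jar is meant to exercise.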
comment:11
Do we really put all that into the git repo? The current (incredibly old) pickle jar is about 2MB uncompressed. A new one is likely considerably larger. There are on the order of 10 minor Sage releases every year. I don't know how often the pickles change, but it seems likely that this will generate on the order of 10MB/year that will be with us forever. The whole git repo is currently <100MB.
comment:12
Hi Volker! I don't have a good view of the orders of magnitude. Yet, with the proposed protocol, pickles that don't change don't get duplicated between versions, and I'd expect that only a few pickles get changed from one version to the next (especially if we emphasize pickling by construction rather than by internal data structure). A good experiment would be to regenerate a new pickle jar, and see how much we have added to it since last time! I don't have a strong opinion about whether the pickle jar should be maintained under git or not. If we can afford it, that makes things easier, as changes to the pickle jar can be done within the usual workflow. But if it's too big, it's too big. Cheers,
The current pickle jar mechanism has some drawbacks:
We never add new pickles to the pickle jar
We don't know how old pickles in the pickle jar are
We may be testing an old pickle, but not a recent one
Updating specific pickles is a bit tedious
Here is a new proposal:
Store the pickle jar not as a .tar.bz2 file but simply as files within the directory extcode/pickle_jar/$VERSION. This will likely increase the on-disk space needed for a Sage install, but will not have a big influence on Sage distributions, since we have an extcode spkg anyway (which is tarred and compressed).
Put the pickle jar under git control (this will now become possible).
$VERSION in the directory name refers to the Sage version used to create the pickle. Once a pickle has been made, it will remain in place in that directory, even in subsequent Sage versions (so sage-4.7.2 will contain pickle_jar/4.7, pickle_jar/4.7.1 and pickle_jar/4.7.2).
[...] pickle_jar/$NEWVERSION. The old pickle is kept where it was.
[...] git remove the old pickle.
CC: @sagetrac-sage-combinat @ohanar
Component: pickling
Issue created by migration from https://trac.sagemath.org/ticket/10768
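To make the proposed layout concrete, here is a rough sketch of a check that walks pickle_jar/$VERSION directories and tries to load every file, reporting failures per version. It deliberately uses plain pickle.loads from the standard library as a stand-in: Sage's real pickles are not guaranteed to be plain pickle streams, and the unpickle_all function mentioned in the discussion is not called here. The function name check_pickle_jar is hypothetical.

```python
import pickle
from pathlib import Path


def check_pickle_jar(jar_root):
    """Try to unpickle every file under jar_root/<version>/.

    Hypothetical helper. Returns a dict mapping each version directory
    name to the sorted list of file names that failed to load.
    """
    failures = {}
    for version_dir in sorted(Path(jar_root).iterdir()):
        if not version_dir.is_dir():
            continue
        for pickle_file in sorted(version_dir.iterdir()):
            try:
                # Stand-in loader; a real check would use Sage's own.
                pickle.loads(pickle_file.read_bytes())
            except Exception:
                failures.setdefault(version_dir.name, []).append(pickle_file.name)
    return failures
```

Because each pickle lives in the directory of the version that created it, a failure immediately tells you how old the broken pickle is, which addresses the "we don't know how old pickles in the pickle jar are" drawback.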