Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rewrite conway polynomial spkg and code in Sage library to not use ZODB #12205

Closed
williamstein opened this issue Dec 20, 2011 · 69 comments
Closed

Comments

@williamstein
Copy link
Contributor

This is necessary so we can dump the ZODB spkg from Sage.


Installation Instructions

Depends on #13896
Depends on #13963

CC: @kini

Component: packages: huge

Author: R. Andrew Ohana

Reviewer: François Bissey

Merged: sage-5.7.beta0

Issue created by migration from https://trac.sagemath.org/ticket/12205

@williamstein
Copy link
Contributor Author

comment:2

I did some work on this on the airplane, and better post a link before I forget about it or loose it: http://wstein.org/patches/conway_polynomials-0.3.spkg

Basically, the database is tiny so it can easily and very efficiently just be stored as a standard sobj. Using ZODB is silly.

@kiwifb
Copy link
Member

kiwifb commented Jan 28, 2012

comment:3

That's a good start. Now we need to put SPKG.txt and spkg-install in proper shape. I suppose there need to be more work sage side to use the sobj rather than zodb.

@ohanar
Copy link
Member

ohanar commented Jun 18, 2012

comment:4

William, do you have a work-in-progress patch for the sage library? I would like to finish this task.

Edit: Other than a description in the SPKG.txt, this spkg should final.

Edit2: Ok, there is now a small description, and it is rebased off of the spkg that I made for #13123.

@kiwifb
Copy link
Member

kiwifb commented Dec 29, 2012

comment:5

Looks like I missed your work on this ohanar getting rid of zodb would be good news all around. Do this need any rebasing?

@simon-king-jena
Copy link
Member

comment:6

What's the status? According to comment:4, conway_polynomials-0.4.spkg should be final. So, needs review?

What should be done to test whether this is really enough to get rid of zodb? I mean: How can one remove zodb from an existing Sage install?

@ohanar
Copy link
Member

ohanar commented Dec 29, 2012

comment:7

Well, a patch is also needed to the sage library, and unfortunately William never provided that. It probably wouldn't be that hard to reverse engineer what he was thinking from the SPKG, but I haven't really looked into it.

The only other database in sage that used zodb was the Cremona database, but I rewrote that a year and a half ago to use sqlite3, so this is the only thing that should really be using zope at all (there may be other things that stupidly depend on it, but they should be easy to fix to not use zope).

@kiwifb
Copy link
Member

kiwifb commented Dec 30, 2012

comment:8

I think all that needs to be changed are the files: databases/compressed_storage.py, databases/db.py and databases/conway.py. conway.py is where we will have to be careful to present the same interface to the rest of sage so has not to break anything.

@kiwifb
Copy link
Member

kiwifb commented Dec 30, 2012

comment:9

OK rewrite of conway.py belongs here and the other two files belong to #10353. I am looking at conway.py and see if it is in my abilities.

@kiwifb
Copy link
Member

kiwifb commented Dec 31, 2012

comment:10

I think it is easy but it will be question of getting the syntax right. In the mean time I found that the stein-watkins optional "huge" database also use zodb and will need a similar treatment.

@kiwifb
Copy link
Member

kiwifb commented Jan 2, 2013

comment:11

In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py. The issue is that the created sobj is a double dictionary and that the call to the ConwayPolynomials() class follow the same pattern as a double dictionary. You have calls of the kind:
ConwayPolynomials()[3][21]

Implementing the class as a dictionary allows sage to start

import os

from sage.structure.sage_object import load
from sage.misc.misc import SAGE_DATA

_CONWAYDATA = os.path.join(SAGE_DATA,'conway_polynomials')

class ConwayPolynomials(dict):
    def __init__(self,*arg,**kw):
        """
        Initialize the database.
        """
        super(ConwayPolynomials, self).__init__(*arg, **kw)

    def _init(self):
        if not os.path.exists(_CONWAYDATA):
            raise RuntimeError, "In order to initialize the database, the directory must exist."%_CONWAYDATA
        os.chdir(_CONWAYDATA)
        self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))

but unfortunately but I get key errors:

sage: sage.databases.conway.ConwayPolynomials()[3][21]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)

/home/fbissey/Work/Overlays/BlueFern/sci-mathematics/sage/<ipython console> in <module>()

KeyError: 3
sage: sage.databases.conway.ConwayPolynomials().keys()
[]

I was hoping that inheriting the dict class the access problem would be solved but I am obviously doing something wrong here. Loading the sobj directly from sage is totally usable, it is just when trying to put it in a class.

@gagern
Copy link
Mannequin

gagern mannequin commented Jan 2, 2013

comment:12

Replying to @kiwifb:

self = load(os.path.join(_CONWAYDATA, 'conway_polynomials'))

This only changes the meaning of the self variable in the local scope. To modify the self object, use self.update(load(…)) instead.

@kiwifb
Copy link
Member

kiwifb commented Jan 2, 2013

comment:13

Interesting point but here it will call update() from the dict class which I don't think is what you meant. I still get the same results after doing your suggested replacement.

@nbruin
Copy link
Contributor

nbruin commented Jan 2, 2013

comment:14

Replying to @kiwifb:

In the end I am hitting a brick wall here. Possibly due to my lack of python skills and my will to only have minimal changes in conway.py.

I don't think _init is a magic python method that automatically gets called. It may well be part of a protocol involved in sage.databases.db.Database. However, when you want to implement this as inheriting from dict, you'll have to call _init yourself in __init__, or better: just make _init part of it.

To make the dictionary read-only, you may want to override the dictionary modification methods too. Plus you probably want to amend pickling (interesting question: How should this pickle? Should it write its content to a file or should it pickle to "initialize ConwayPolynomials() for unpickling"?

@kiwifb
Copy link
Member

kiwifb commented Jan 2, 2013

comment:15

Thanks Nils. I actually though about that _init method as it was strange to have two. Merging the two didn't improve my problems however. I wouldn't know about the pickling. I'd like this to give me the correct behaviour of the class first. I'll retry merging inits as there may have been caching getting in the way.

@kiwifb
Copy link
Member

kiwifb commented Jan 3, 2013

comment:16

I found a way to properly initialise the database. It is a bit ugly but we may be able to move to the other problems: pickling and dict method override. I'll post a patch when I am on something else than that iPad.

@nbruin
Copy link
Contributor

nbruin commented Jan 3, 2013

comment:17

Actually, what you should do is just write the ConwayPolynomials class without thinking about how to initialize from an sobj at all, but just as either a normal class that has an attribute storing the info or inheriting from dict and ensure that pickling works. Then you just have to make sure that the .sobj is a pickled version of it with the appropriate content. (and you may store in the documentation somewhere a script that can produce this pickle from a human-readable file)

@ohanar
Copy link
Member

ohanar commented Jan 3, 2013

comment:18

Rather than inheriting from dict, I think it makes more sense to inherit from collections.Mapping since we only want read functionality. I uploaded a patch that does this, and uses the current dictionary format in the background (I don't see a good alternative to this storage method, but I'm open to suggestions).

@kiwifb
Copy link
Member

kiwifb commented Jan 3, 2013

comment:19

I think we have to stick with the dictionary format since the code in sage is littered with calls to ConwayPolynomials()[p][n] and changing the storage would imply changing an enormous amount of code. I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google. Your patch is longer than my current one but probably better behaved in the long run. It is very educative to me too.

@nbruin
Copy link
Contributor

nbruin commented Jan 3, 2013

comment:20

Looks great! Two questions:

  • How do we (re)produce the *.sobj file? I think we do want to be able to bootstap sage from human-readable formats? Just documenting how to obtain the appropriate *.sobj is probably sufficient.
  • Do you have a good reason to inherit from UniqueRepresentation? Doing so is definitely not free. It makes instantiation a much more expensive affair and there'll be a cache floating around. You should only do so if you need it and I don't think you do here (ConwayPolynomials is not a parent).

@kiwifb
Copy link
Member

kiwifb commented Jan 3, 2013

comment:21

I can answer the first question: It is generated in the spkg attached in this ticket from a python table in text format look it up. Technically it doesn't have to be a sobj the previous spkg was installing a bzipped version of the python table and then inserting it in the database. We could do something similar, not sure about the cost.

@simon-king-jena
Copy link
Member

comment:22

Replying to @kiwifb:

I think we have to stick with the dictionary format ...

As much as I understand, the "format" (in the sense of the "interface to retrieve data") does not change. That's exactly the point: If you wanted to inherit from dict, then you'd also have to have a __setitem__ method - but we don't want to set data in an interactive session. All what we want is that during initialisation, data are obtained from a file, and then the user can retrieve these data (which is granted by the __getitem__ method).

... since the code in sage is littered with calls to ConwayPolynomials()[p][n]

Would that be changed by the patch? Looking at the lines

    def __getitem__(self, key): 
        try: 
            return self._store[key] 
        except KeyError, err: 
            try: 
                if isinstance(key, (tuple, list)): 
                    if len(key) == 2: 
                        return self._store[key[0]][key[1]] 
            except KeyError: 
                pass 
            raise err 

it seems to me that ConwayPolynomials()[p][n] is still a possible syntax (that's return self._store[key]), but in addition to the old syntax we now also allow the syntax ConwayPolynomials()[p, n] (that's return self._store[key[0]][key[1]]).

and changing the storage would imply changing an enormous amount of code.

By "storage", I understand the format of the data stored in the .sobj file containing the data base. I don't think that would be a problem, as long as there is a function that is able to read old data and transforms it into the new format.

I trust that your python foo is way ahead of mine. I would never have produced that getitem and didn't find any examples to do that with google.

That's in dive into python or other Python introductions, or in the Python documentation.

@ohanar
Copy link
Member

ohanar commented Jan 3, 2013

comment:23

I noticed a a flaw with what I did -- ConwayPolynomials()[p] returns a mutable Python dict, so you can actually write to the database. I'll post a fix for this in a moment.

There is no good reason as far as I can tell to use anything but plain text in the SPKG considering how small the database is.

As for the second question, the only concern would be that the entire database is loaded into memory with each instance (granted this might not be a huge concern due to the small size of the database). The main reason I added the inheritance of UniqueRepresentation was because of a comment I seem to recall being told that Sage databases should be unique. It is simple enough to remove if we deem it not necessary.

@jdemeyer

This comment has been minimized.

@jdemeyer
Copy link

comment:48

Attachment: 12205_root.patch.gz

This needs to depend on $(SAGERUNTIME), not only the Sage library because we actually run Sage code. See new patch.

Moreover, this causes build problems for sagenb:

Processing dependencies for Twisted==12.1.0
Searching for zope.interface

Link to http://pypi.python.org/simple/zope.interface/ ***BLOCKED*** by --allow-hosts

Couldn't find index page for 'zope.interface' (maybe misspelled?)
Scanning index of all packages (this may take a while)

Link to http://pypi.python.org/simple/ ***BLOCKED*** by --allow-hosts

No local packages or download links found for zope.interface
error: Could not find suitable distribution for Requirement.parse('zope.interface')
Error installing Twisted-12.1.0.tar.bz2 !

@kiwifb
Copy link
Member

kiwifb commented Jan 15, 2013

comment:49

SAGERUNTIME? I hadn't seen that. I should have paid more attention to the originating ticket. OK I hadn't thought of twisted but the comment really is for #10353. I shall try to update the sagenb spkg to include zope.interface sometime this week.

@jdemeyer
Copy link

comment:50

Replying to @kiwifb:

SAGERUNTIME? I hadn't seen that.

Basically, $(INST)/$(SAGE) is purely the Sage library (i.e. the stuff in devel/sage). But actually starting Sage or running Sage code requires more, because things are imported from other packages, for example the notebook. That stuff is in $(SAGERUNTIME).

It used to be the case that $(INST)/$(SAGE) was what is essentially $(SAGERUNTIME) now. But that's silly because it adds extra unneeded build-time dependencies for $(INST)/$(SAGE), which needs a long time to build.

@kiwifb
Copy link
Member

kiwifb commented Jan 15, 2013

comment:51

Understood. But zope.interface shouldn't be a blocker for this ticket, only for #10353.

@jdemeyer
Copy link

comment:52

Replying to @kiwifb:

Understood. But zope.interface shouldn't be a blocker for this ticket, only for #10353.

Fair enough, I always confuse these two tickets.

@jdemeyer
Copy link

Changed dependencies from #13896 to #13896, #13963

@jdemeyer
Copy link

comment:54

Potential build failures due to #13963.

@jdemeyer
Copy link

comment:55

Could this have caused (note this is over GF(8)):

sage -t  "devel/sage-main/sage/rings/polynomial/polynomial_ring.py"
**********************************************************************
File "/release/merger/sage-5.7.beta0/devel/sage-main/sage/rings/polynomial/polynomial_ring.py", line 1641:
    sage: R.lagrange_polynomial([(a^2+a,a),(a,1),(a^2,a^2+a+1)], algorithm="neville", previous_row=p)[-1]
Expected:
    a^2*x^2 + a^2*x + a^2
Got:
    a^2*x^2 + a*x + a^2 + a
**********************************************************************

@kiwifb
Copy link
Member

kiwifb commented Jan 18, 2013

comment:56

Is it a new doctest? I see this on unreleased 5.7.beta0. I agressively patched sage-on-gentoo (5.5) to get rid of zodb and we haven't seen that anywhere. I'll admit the move to new gap is giving me trouble but it should be unrelated - or should it?

@jdemeyer
Copy link

comment:57

Replying to @kiwifb:

Is it a new doctest?

No, it's not. On first sight, it doesn't seem to use anything besides basic polynomial arithmetic. But if basic polynomial arithmetic were broken, we would see a lot more doctests failing.

I'm not at all saying that this ticket has anything to do with it, I merge a bunch of tickets and when there's a failure I have to guess which ticket caused it.

@jdemeyer
Copy link

Merged: sage-5.7.beta0

@kcrisman
Copy link
Member

comment:59

Has anybody noticed the very very long string of primes you get when installing this spkg yet? This is with beta1 + #6495, but I assume it is part of this spkg. See the log.

@kiwifb
Copy link
Member

kiwifb commented Jan 30, 2013

comment:60

It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.

@kcrisman
Copy link
Member

comment:61

It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.

Don't worry, it wasn't directed at anyone in particular! Presumably it's something left over from debugging. Shall I assume this is to be considered non-optimal and a ticket should be opened?

@kini
Copy link
Contributor

kini commented Jan 30, 2013

comment:62

fbissey: sage-on-gentoo is like a third level of ticket review after the official reviewer and the release manager! :D

@kcrisman
Copy link
Member

kcrisman commented Feb 7, 2013

comment:63

It is part of the spkg. Don't ask me why ohanar put a "print p " in there. I don't mind the spam but I had a complaint about it in sage-on-gentoo and removed it there.

Don't worry, it wasn't directed at anyone in particular! Presumably it's something left over from debugging. Shall I assume this is to be considered non-optimal and a ticket should be opened?

So... should I open a ticket? It does seem bizarre, and apparently easily fixed? But if it was necessary, I don't want to.

@kiwifb
Copy link
Member

kiwifb commented Feb 7, 2013

comment:64

That's very easy to fix but opening a ticket is pretty much the only way it will ever get merged.

@kcrisman
Copy link
Member

kcrisman commented Feb 7, 2013

comment:65

That's very easy to fix but opening a ticket is pretty much the only way it will ever get merged.

#14075

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants