
Try to handle changes to type definitions #43

Merged (11 commits) on Sep 12, 2015

Conversation

Changaco (Member)

Fixes #26 (partially, because we can't detect all schema changes).

assert one.biz == 'x'
assert not hasattr(one, 'bar')

@mark.xfail
Collaborator

Why are we using an xfail here instead of a raises?

Member Author

Because ideally it shouldn't raise, but it's a problem we can't fix.

Collaborator

But xfail will silently swallow all exceptions, no? Seems like we should test for the specific exception we're expecting.

Member Author

I've added the appropriate raises parameter. From pytest's documentation:

If you want to be more specific as to why the test is failing, you can specify a single exception, or a list of exceptions, in the raises argument. Then the test will be reported as a regular failure if it fails with an exception not mentioned in raises.
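In other words, the marker in question would look something like this (a minimal sketch; the test body and reason string are illustrative, not the PR's actual test):

```python
from pytest import mark

# With raises=ValueError, the test is reported as xfail only when it
# fails with a ValueError; any other exception is reported as a regular
# failure instead of being silently swallowed.
@mark.xfail(raises=ValueError, reason="stale cached type definition")
def test_stale_type_definition():
    raise ValueError("illustrative failure")
```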

class EmptyModel(Model): pass
self.db.register_model(EmptyModel, 'grok')
# Add a new column then drop the original one
self.db.run("ALTER TABLE grok ADD COLUMN biz text NOT NULL DEFAULT 'x'")
Collaborator

Isn't biz the same type as bar? Why is this test "different_type"?

Collaborator

Sorry, this is grok here, not foo.

@chadwhitacre (Collaborator)

IRC

@Changaco (Member Author)

Changaco commented Mar 5, 2015

Re IRC: a silent failure is when something bad happens but the problem is ignored. That's not the case here: the ValueError is only masked if it can be remedied; if not, we re-raise it.
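The pattern being described can be sketched like this (hypothetical names and a toy cache; this is not the PR's actual code):

```python
class CasterSketch:
    """Toy model of the mask-or-reraise pattern: a ValueError from a
    stale cached type definition is masked only when the definition can
    be refreshed; otherwise it propagates, so nothing fails silently."""

    def __init__(self, catalog=None):
        self.types = ["text", "text"]   # stale cache: expects two columns
        self.catalog = catalog          # fresh definition, if available

    def _parse(self, tokens):
        if len(tokens) != len(self.types):
            raise ValueError("token count does not match cached types")
        return tokens

    def parse(self, tokens):
        try:
            return self._parse(tokens)
        except ValueError:
            if self.catalog is None:
                raise  # can't remedy: re-raise rather than swallow
            self.types = self.catalog  # refresh the cached definition
            return self._parse(tokens)
```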

tokens = self.tokenize(s)
if len(tokens) != len(self.atttypes):
# The number of columns has changed, re-fetch the type info
self.__dict__.update(self._from_db(self.name, curs).__dict__)
Collaborator

Where did we get this line from, and why do we not call it in the normal course of operation? Why are we doing something in this case that we don't otherwise do? Don't we need to configure the class when it's first instantiated? Why am I not seeing that in our code? Is it in psycopg2?

Collaborator

I see, we're calling _from_db, which is a constructor classmethod, and then we're overwriting all possible attributes on ourself with the corresponding attributes from the new instance.
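The idiom reduces to a runnable toy (names are hypothetical; the registry stands in for the database catalog):

```python
class TypeInfoSketch:
    # Toy stand-in for the caster class: _from_db is a constructor
    # classmethod, and refresh() overwrites every attribute on self with
    # the corresponding attribute of a freshly built instance -- the
    # __dict__.update idiom discussed above.
    registry = {"grok": ("biz",)}  # hypothetical stand-in for the catalog

    def __init__(self, name, attnames):
        self.name = name
        self.attnames = attnames

    @classmethod
    def _from_db(cls, name):
        return cls(name, cls.registry[name])

    def refresh(self):
        self.__dict__.update(self._from_db(self.name).__dict__)
```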

Collaborator

Clarified in 0f422eb.

@chadwhitacre (Collaborator)

I'm partway through a commit to update the documentation.

@chadwhitacre (Collaborator)

And I still haven't satisfied myself on the ways that our parse override could fail.

@chadwhitacre (Collaborator)

I'm dusting off my postgres.py dev env. I've got envs rebuilt for Python 2 and 3. Hitting an import error though ...

@chadwhitacre (Collaborator)

Imports cleaned up in #45.

@chadwhitacre (Collaborator)

I've got docs again locally.

@chadwhitacre (Collaborator)

I'm unraveling what the parse method is about to understand exactly what we're doing with it. I want to understand it and in particular how our implementation of it can fail, so that I can document it properly.

In the CompositeCaster constructor, parse is used as the adapter function, aka, typecaster, for new_type. It's this typecaster that is then passed to register_type underneath register_composite.

@chadwhitacre (Collaborator)

register_type ends up adding the type adapter to a dictionary that is ... somehow referenced when hydrating result sets.
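A miniature of that dispatch, as I understand it (plain Python, with psycopg2's C internals stubbed out; the real dictionary lives in the extension module):

```python
# Toy model of psycopg2's typecasting dispatch: register_type stores
# {oid: caster} in a dictionary like string_types, and the fetch* path
# looks the caster up by the column's type OID before applying it to
# the raw string coming off the wire.
string_types = {}

def register_type(oid, caster):
    string_types[oid] = caster

def cast(oid, raw):
    caster = string_types.get(oid, lambda s: s)  # unknown OID: passthrough
    return caster(raw)

register_type(23, int)  # 23 is the OID of int4
```

So `cast(23, "42")` gives the integer `42`, while a value for an unregistered OID like 25 (text) falls through unchanged.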

@chadwhitacre (Collaborator)

>>> psycopg2._ext.string_types.keys()
[3904, 1028, 3905, 1005, 16, 17, 1042, 1043, 20, 21, 3909, 23, 25, 26, 3802, 1182, 1183, 1184, 1185, 1186, 1187, 1700, 3911, 114, 1082, 1083, 700, 701, 704, 705, 3906, 3907, 3908, 1270, 3910, 199, 3912, 3913, 1231, 3926, 3927, 1114, 1115, 3807, 1000, 1001, 1002, 1003, 18, 1006, 1007, 1009, 1266, 19, 1013, 1014, 1015, 1016, 1017, 1021, 1022]
>>> psycopg2._ext.binary_types.keys()
[]
>>>

@chadwhitacre (Collaborator)

I'm trying to find where the adapter function parse is called.

@chadwhitacre (Collaborator)

Looks like curs_get_cast dereferences the string_types dict (I'm not sure where binary_types gets looked at) and returns to psyco_curs_cast, which is where the caster is actually called.

@chadwhitacre (Collaborator)

psyco_curs_cast ends up at Cursor.cast, but I'm not seeing where that's actually called outside of CompositeCaster.parse! Unless what we're getting is the result of curs_get_cast in pqpath? Looks likely, that's where DB-API 2.0's fetch seems to be implemented. Actually, fetch* is implemented in cursor_type.c, and pq_fetch is called under there. Okay!

@chadwhitacre (Collaborator)

So here's the call chain:

  • one/all (us)
  • fetch* (psycopg2)
  • pq_fetch
  • curs_get_cast
  • parse

@chadwhitacre (Collaborator)

Yeah, binary casts appear to be broken:

FIXME: what the hell am I trying to do here? This just can't work..

:-)

@chadwhitacre (Collaborator)

psycopg/psycopg2#297 💃

@chadwhitacre (Collaborator)

Hmmm ... a little more complicated. Long story short, parse is called inside typecast_cast.

@chadwhitacre (Collaborator)

Alright. I'm back into parse, understanding how it works.

@chadwhitacre (Collaborator)

We have a list of tokens and a list of types. We also have a list of field names.

@chadwhitacre (Collaborator)

Here are the basic ways in which a type can change underneath us:

  • a field can be removed
  • a field can be added
  • a field can change type

An arbitrary number of fields can be simultaneously changed in one of these three ways.
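Those three cases can be told apart by diffing the old and new attribute lists (a hedged sketch; the function name and data shapes are illustrative, not the library's API):

```python
def diff_attrs(old, new):
    # old/new: lists of (field_name, type_name) pairs describing the
    # composite type before and after the change.
    old_d, new_d = dict(old), dict(new)
    removed = [k for k in old_d if k not in new_d]
    added = [k for k in new_d if k not in old_d]
    retyped = [k for k in old_d if k in new_d and old_d[k] != new_d[k]]
    return removed, added, retyped
```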

@chadwhitacre (Collaborator)

Changes to the type definition affect both reads and writes.

@chadwhitacre (Collaborator)

@Changaco Can you remind me why we need to support this? Why don't we just always restart the web app when the schema changes? Convenience?

@chadwhitacre (Collaborator)

What if we have an object with an update method and we pass data to it that gets stored in the wrong field because of a schema change between when the object was created and when we updated it? Is that possible?
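A toy illustration of that hazard (entirely hypothetical; nothing here is postgres.py's API): if an UPDATE were built positionally from a cached column list, a reordering schema change could route values to the wrong columns.

```python
cached_columns = ["amount", "currency"]   # captured at object creation
actual_columns = ["currency", "amount"]   # after a hypothetical reorder

def build_update(columns, values):
    # Map values to columns positionally, as a naive cache-based
    # update might.
    return dict(zip(columns, values))

intended = build_update(cached_columns, [10, "EUR"])  # what we meant
landed = build_update(actual_columns, [10, "EUR"])    # what would happen
```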

@chadwhitacre (Collaborator)

More tomorrow ...

@chadwhitacre (Collaborator)

I mean, now that I revisit this, it strikes me as really weird to expect the Python layer to seamlessly handle an underlying schema change.

@Changaco (Member Author)

It's not weird to expect postgres.py to keep working when the schema changes, because it's not even supposed to know what the schema is.

@Changaco (Member Author)

I've added a commit to simplify our parse method: it occurred to me that we don't need to duplicate the code of psycopg's parse method, as was suggested in #26 (comment).

@chadwhitacre (Collaborator)

It's not weird to expect postgres.py to keep working when the schema changes, because it's not even supposed to know what the schema is.

Postgres.py itself isn't supposed to know what the schema is, but:

  • Model is intended to be subclassed, and the subclass probably knows what the schema is, especially if it implements update methods.
  • Instances of Model subclasses will have their attributes consumed by other code, which will expect a certain schema.

Any schema update is expected to result in other code changes, in the Model subclass as well as in consumers of subclass instances. If there is no code change, then what was the point of making the schema change? If there is a code change, then of course we need to restart the app.

I hate to say it, but I think we should close this PR. :-(

@chadwhitacre (Collaborator)

... though a documentation update could be in order. I can open a new PR for that.

@Changaco (Member Author)

Firstly, this PR is useful. It will reduce the downtime of the apps that use postgres.py, because it will no longer be necessary to take them down every time a backwards-compatible change to the DB schema is made, like dropping an unused column, or adding a new column with a default.

Secondly, what this PR (partially) fixes is a bug; I don't think it needs any other justification: bugs should be fixed whenever possible.

@chadwhitacre (Collaborator)

Secondly, what this PR (partially) fixes is a bug

Is #26 a bug? The fact that we can only partially fix it seems to be due to the fundamental constraints of a networked database, suggesting that, while it's certainly a constraint, it's not a bug. Can you explain why you see #26 strictly as a bug?

Very early on (I just spent some time looking for a relevant commit, but didn't find one) I worked around something like #26 on Gittip (at the time) with a three-step deploy process:

  • deploy code that supports both the old and new schemas
  • deploy the new schema
  • deploy code that drops support for the old schema

I believe, given Python and Postgres, that's the only possible way to avoid downtime when we want to make a non-backwards-compatible schema change to the database.

It will reduce the downtime of the apps that use postgres.py, because it will no longer be necessary to take them down every time a backwards-compatible change to the DB schema is made

I can see the value in that. I still need to satisfy myself that supporting this use case doesn't introduce too powerful a footgun.

For the record, I'm being more conservative with this PR than I would for one on Gratipay or Aspen, because postgres.py has already achieved the status of a thoroughly-documented software product in a way that Gratipay and Aspen haven't yet. I want to change postgres.py deliberately and carefully.

@Changaco (Member Author)

Is #26 a bug? The fact that we can only partially fix it seems to be due to the fundamental constraints of a networked database, suggesting that, while it's certainly a constraint, it's not a bug. Can you explain why you see #26 strictly as a bug?

#26 is a caching bug: psycopg caches type definitions but doesn't refresh them when they become stale. Does it seem normal to you for a caching system to fail instead of refetching the data from upstream?

The fact that we can't fully fix this is not a "fundamental constraint of a networked database"; it's just that PostgreSQL wasn't designed to be used the way postgres.py uses it. It might be possible to modify PostgreSQL to make the problem disappear entirely, but even if it is, I don't want to wait for it.

I believe, given Python and Postgres, that's the only possible way to avoid downtime when we want to make a non-backwards-compatible schema change to the database.

You can't avoid downtime without this PR unless the changes only add or drop tables but don't modify any.

@chadwhitacre (Collaborator)

Alright, @Changaco, I'm merging this without further review, because a) I'm sure you'll want this for Liberapay, and b) I'm sure you'll quickly fix any bugs we're not seeing with it right now. :-)

chadwhitacre added a commit that referenced this pull request Sep 12, 2015
Try to handle changes to type definitions
@chadwhitacre chadwhitacre merged commit efc33d7 into master Sep 12, 2015
@chadwhitacre chadwhitacre deleted the fix-race-condition branch September 12, 2015 13:53
@Changaco (Member Author)

Thanks, better late than never. :-)

@Changaco (Member Author)

FTR I hit a corner case that this PR couldn't fix while deploying Liberapay schema changes today. It was a type change of columns from bool to int.

Successfully merging this pull request may close these issues.

eliminate race condition when updating types
2 participants