-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add interface to abstract query result persistence #4147
Conversation
q.skip_updated_at = True | ||
db.session.add(q) | ||
query_ids = [q.id for q in queries] | ||
logging.info("Updated %s queries with result (%s).", len(query_ids), query_hash) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We moved this to a separate class method on the Query
model. Eventually wasn't needed for this refactor, but it's a good separation regardless so we kept it here.
@@ -256,7 +274,7 @@ def to_dict(self): | |||
'id': self.id, | |||
'query_hash': self.query_hash, | |||
'query': self.query_text, | |||
'data': json_loads(self.data), | |||
'data': self.data, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now deserialize it in a single location instead of having every user call json_loads
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good in general, but they may be some minor optimizations still possible (see comments).
class DBPersistence(object): | ||
@property | ||
def data(self): | ||
if self._data is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this also check if self._deserialized_data
has already been set and return it if yes (and also unset it in the property setter below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In what case _data
will be None
and _deserialized_data
has some value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically a race condition when _data
is set to None
via the data property setter below when it was previously not None
. If data
should just be used as a read-only property I would suggest to remove the property setter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. It feels like this complicates the code for no good reason, as this class has a very defined use 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Read your comment again and realized that actually making _deserialized_data
in sync with the setter isn't that complex. Implemented in 16fbc5e + added tests.
What type of PR is this? (check all applicable)
Description
This is a step towards supporting different backends (S3, filesystem, etc) for storing query results (the actual data). In this PR we're abstracting the persistence from the model. The default implementation keeps storing it in the DB as it was until now.
We DO NOT recommend creating extensions based on this, as the interface will probably change. Current interface has limitations like preventing streaming data, which we would like to eventually support.
Related Tickets & Documents
#78