👌 IMPROVE: Allow for storing Decimal
#4964
@greschd do you remember: when generating the hash, do you do it after a store and load (and on the reloaded data) or before?
We don't explicitly go through the store / load loop before hashing -- the expectation is that the hash function is implemented such that it's insensitive to changes that may occur.
Thanks @dev-zero, I see changes to the django backend and migrations, but no accompanying change to the sqlalchemy backend. Is this also required? And yes, obviously, if you could fix the tests; I'll change this to a draft PR whilst you finalise it, so let me know when it's ready 👍
@chrisjsewell no, changes to sqlalchemy are not required: for some reason we've been using My question is how we should do hashing for |
Hmm, so you are saying that currently, when a Decimal is stored to the database, it is stored and thus returned as an int/float? I guessed the desired behaviour would be to properly serialize/deserialize it as a Decimal, but perhaps this is not possible or easy to achieve.
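To make the question concrete, here is a minimal sketch of what happens to a Decimal once it has been written out as a plain JSON number and read back. It uses only the standard library and is not AiiDA's storage code; the `roundtrip` helper and the sample values are made up for illustration.

```python
import json
from decimal import Decimal


def roundtrip(value: Decimal):
    """Write the Decimal's string form as a JSON number, then parse it back."""
    payload = '{"value": %s}' % value  # e.g. {"value": 1.10}
    return json.loads(payload)["value"]


with_fraction = Decimal("1.10")
whole = Decimal("3")
many_digits = Decimal("0.1234567890123456789012345")

# The Decimal type is gone after the round trip.
print(type(roundtrip(with_fraction)).__name__)  # float
print(type(roundtrip(whole)).__name__)          # int

# Digits beyond float precision are silently lost.
print(Decimal(repr(roundtrip(many_digits))) == many_digits)  # False
```

The stdlib parser does accept `json.loads(payload, parse_float=Decimal)`, but that turns every float in the document into a Decimal, so it does not tell stored Decimals apart from stored floats.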
Codecov Report
@@            Coverage Diff             @@
##           develop    #4964     +/-   ##
===========================================
+ Coverage    80.23%   80.24%   +0.01%
===========================================
  Files          515      515
  Lines        36746    36758      +12
===========================================
+ Hits         29478    29491      +13
+ Misses        7268     7267       -1
Yes. But that also applies to the Numpy datatypes: they too will be deserialized as Python floats/ints.
Well, yes. But the only way this would work is by introducing a schema. This can either be integrated in the json-data stored in the db column, or you keep it external. The latter might be easier: a |
Yeah, it just feels a bit "off" to me (and unexpected for users) that we should be allowing data types to be serialized that we essentially do not know how to deserialize correctly. I would think it would be more explicit if you convert to a serializable data type before e.g. setting an attribute. Proper serialization would be nice, but I would be wary of the performance impact this would have on (de)serialization, and thus on interacting with the database.
If the DB serialize / deserialize changes the type to In general though, I'd be wary to allow decimals if they silently "degrade" (especially in the inexact case of |
While SQLA has been using simplejson for some time (via `aiida.common.json.dumps` in `aiida.backends.utils`), the `JSONField` from Django was using the native `json` module from Python (Django did use simplejson at some point). The difference becomes visible as soon as `decimal.Decimal` is allowed: simplejson can serialize it natively to JSON, while the builtin `json` module cannot.
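The stdlib half of that statement is easy to check. The sketch below shows the builtin `json` module rejecting a Decimal, and one way to work around it with a `default` hook. The `decimal_default` function and the sample data are hypothetical and only illustrate the behaviour; they are not the change made in this PR.

```python
import json
from decimal import Decimal

data = {"energy": Decimal("1.5")}

# The builtin encoder has no rule for Decimal and raises TypeError.
try:
    json.dumps(data)
    stdlib_ok = True
except TypeError:
    stdlib_ok = False

print(stdlib_ok)  # False


def decimal_default(obj):
    """Hypothetical fallback hook: emit a Decimal as a float (lossy)."""
    if isinstance(obj, Decimal):
        return float(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps(data, default=decimal_default)
print(encoded)  # {"energy": 1.5}
```

Going through `float` is lossy for values with more digits than a float can hold, which is the "silent degradation" concern raised in the discussion.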
@greschd wrt hashing: I am currently calculating hashes for either a float or an int, depending on the Decimal's string representation (i.e. depending on what type it gets when deserialized). So, it will be consistent.
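A rough sketch of that idea, assuming the hash is taken over the int or float that the Decimal's string form would parse to. Both `normalize_decimal` and `hash_number` are hypothetical helpers written for this illustration; AiiDA's actual hashing code differs.

```python
import hashlib
import json
from decimal import Decimal


def normalize_decimal(value: Decimal):
    """Return the int or float a JSON parser would produce for str(value)."""
    if not value.is_finite():
        # NaN and Infinity are not valid JSON numbers; left out of this sketch.
        raise ValueError("non-finite Decimal is not supported")
    text = str(value)
    # A fraction or an exponent in the literal makes JSON parsers return a float.
    if any(char in text for char in ".eE"):
        return float(text)
    return int(text)


def hash_number(number) -> str:
    """Hypothetical stand-in hash: digest of the type name and repr."""
    return hashlib.sha256(f"{type(number).__name__}:{number!r}".encode()).hexdigest()


def hash_decimal(value: Decimal) -> str:
    """Hash a Decimal the same way as the value it deserializes to."""
    return hash_number(normalize_decimal(value))


# The hash before storing matches the hash of the reloaded value.
for dec in (Decimal("1.10"), Decimal("3"), Decimal("1E+2")):
    reloaded = json.loads(str(dec))
    print(hash_decimal(dec) == hash_number(reloaded))  # True
```

With this approach the hash does not depend on whether it is computed before storing or on the reloaded data, at the cost of `Decimal("1.10")` and the float `1.1` hashing identically.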
All of the Python JSON libraries claim the performance throne for themselves, it seems, with https://github.com/ijl/orjson or https://pypi.org/project/ujson/ being specifically performance tuned. I've looked a bit more into this, and I think the proper way to go here would be to give the Anyway, this is good to go and would already simplify my life quite a bit.