-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: JSON serializers #22029
fix: JSON serializers #22029
Conversation
@@ -581,47 +588,60 @@ def base_json_conv( # pylint: disable=inconsistent-return-statements | |||
except Exception: # pylint: disable=broad-except | |||
return "[bytes]" | |||
|
|||
raise TypeError(f"Unserializable object {obj} of type {type(obj)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously this would return None
however there was no way to distinguish between the conversion of np.array(None)
which returns None
and an unserializable object.
|
||
def json_iso_dttm_ser(obj: Any, pessimistic: bool = False) -> str: | ||
def json_iso_dttm_ser(obj: Any, pessimistic: bool = False) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type was wrong. Per the base_json_conv
method there's a slew of viable return types.
|
||
|
||
def pessimistic_json_iso_dttm_ser(obj: Any) -> str: | ||
def pessimistic_json_iso_dttm_ser(obj: Any) -> Any: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type was wrong. Per the base_json_conv
method there's a slew of viable return types.
else: | ||
return obj.isoformat() | ||
|
||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic has been flipped. Rather than trying the base conversion—which now throws—first and then the more specific temporal types, the logic now tries to convert the more specific and then falls back to the more generic. This ensures there's less error handling.
else: | ||
raise TypeError("Unserializable object {} of type {}".format(obj, type(obj))) | ||
return obj | ||
return datetime_to_epoch(obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above.
assert isinstance(base_json_conv(uuid.uuid4()), str) is True | ||
assert isinstance(base_json_conv(time()), str) is True | ||
assert isinstance(base_json_conv(timedelta(0)), str) is True | ||
assert isinstance(base_json_conv(np.bool_(1)), bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is True
is unnecessary.
assert isinstance(base_json_conv(np.bool_(1)), bool) | ||
assert isinstance(base_json_conv(np.int64(1)), int) | ||
assert isinstance(base_json_conv(np.array([1, 2, 3])), list) | ||
assert base_json_conv(np.array(None)) is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new test.
assert isinstance(base_json_conv(bytes()), str) | ||
assert base_json_conv(bytes("", encoding="utf-16")) == ["bytes"] | ||
|
||
with pytest.raises(TypeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new test.
@@ -105,19 +111,26 @@ def test_json_iso_dttm_ser(self): | |||
assert json_iso_dttm_ser(dttm) == dttm.isoformat() | |||
assert json_iso_dttm_ser(dt) == dt.isoformat() | |||
assert json_iso_dttm_ser(t) == t.isoformat() | |||
assert json_int_dttm_ser(np.int64(1)) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-temporal case was never tested.
@@ -94,9 +94,15 @@ def test_json_int_dttm_ser(self): | |||
assert json_int_dttm_ser(datetime(1970, 1, 1)) == 0 | |||
assert json_int_dttm_ser(date(1970, 1, 1)) == 0 | |||
assert json_int_dttm_ser(dttm + timedelta(milliseconds=1)) == (ts + 1) | |||
assert json_int_dttm_ser(np.int64(1)) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-temporal case was never tested.
return datetime_to_epoch(obj) | ||
|
||
if isinstance(obj, date): | ||
return (obj - EPOCH.date()).total_seconds() * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can use obj.timestamp()
now since we are on python 3.3+.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktmud this is a date
object and not a datetime
object.
6acb9e1
to
3516a32
Compare
3516a32
to
28cccf8
Compare
Codecov Report
@@ Coverage Diff @@
## master #22029 +/- ##
===========================================
- Coverage 66.96% 55.51% -11.45%
===========================================
Files 1807 1813 +6
Lines 69222 69426 +204
Branches 7403 7448 +45
===========================================
- Hits 46352 38541 -7811
- Misses 20965 28965 +8000
- Partials 1905 1920 +15
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
(cherry picked from commit 6bbf4f8)
(cherry picked from commit 6bbf4f8)
assert isinstance(base_json_conv(uuid.uuid4()), str) is True | ||
assert isinstance(base_json_conv(time()), str) is True | ||
assert isinstance(base_json_conv(timedelta(0)), str) is True | ||
assert isinstance(base_json_conv(np.bool_(1)), bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: when adding stuff to old tests like this I nowadays refactor them by 1) converting them to functional pytest
tests 2) to use pytest.mark.parameterize
to DRY the repeated logic.
SUMMARY
I'm not exactly sure why this issue didn't surface prior to #21936 when
np.vectorize
was being leveraged, but the JSON serialization failed when annp.array
was undefined, i.e.,np.array(None)
, i.e., in SQL Lab one would see the following,if the column was an array but the array was null.
The issue was that the
base_json_conv
treated a return value ofNone
as representing an object which wasn't able to be converted however some converted types can have a return value ofNone
, i.e.,np.array(None).tolist()
isNone
.The fix is to make
base_json_conv
raise aTypeError
if the type is unknown and to reorganize the logic for the JSON serializers. See the inline comments for more details.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Added unit tests.
ADDITIONAL INFORMATION