-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add warning when user tries to access struct series fields with __getitem__
#1082
Changes from 9 commits
f8c4e61
4d8e670
10a1c3f
f5bd3af
6bfe254
c2f7fb6
13da3b2
e0c8afc
54a7cd7
2e2dbe8
ad2895e
39ca130
da8439f
978bd01
3c42de7
536a81e
b1b04be
a7243ce
456fea0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Copyright 2024 Google LLC | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. |
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,83 @@ | ||||||||||
# Copyright 2024 Google LLC | ||||||||||
# | ||||||||||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||||||||||
# you may not use this file except in compliance with the License. | ||||||||||
# You may obtain a copy of the License at | ||||||||||
# | ||||||||||
# http://www.apache.org/licenses/LICENSE-2.0 | ||||||||||
# | ||||||||||
# Unless required by applicable law or agreed to in writing, software | ||||||||||
# distributed under the License is distributed on an "AS IS" BASIS, | ||||||||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||||||||||
# See the License for the specific language governing permissions and | ||||||||||
# limitations under the License. | ||||||||||
|
||||||||||
import pyarrow as pa | ||||||||||
import pytest | ||||||||||
|
||||||||||
import bigframes.pandas as bpd | ||||||||||
|
||||||||||
|
||||||||||
@pytest.mark.parametrize( | ||||||||||
("key", "should_warn"), | ||||||||||
[ | ||||||||||
pytest.param(0, False, id="non_string_key_should_not_warn"), | ||||||||||
pytest.param("a", True, id="string_key_should_warn"), | ||||||||||
], | ||||||||||
) | ||||||||||
def test_non_string_indexed_series_struct_accessor_warning(session, key, should_warn): | ||||||||||
s = bpd.Series( | ||||||||||
[ | ||||||||||
{"project": "pandas", "version": 1}, | ||||||||||
], | ||||||||||
dtype=bpd.ArrowDtype( | ||||||||||
pa.struct([("project", pa.string()), ("version", pa.int64())]) | ||||||||||
), | ||||||||||
session=session, | ||||||||||
) | ||||||||||
|
||||||||||
if should_warn: | ||||||||||
with pytest.warns(UserWarning, match=r"Series\.struct\.field\(.+\)"): | ||||||||||
s[key] | ||||||||||
else: | ||||||||||
s[key] | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we have a check that a warning is not raised in this case? I would prefer two separate tests to make this easier to follow. See go/tott/661 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I separated the tests into two parts: 1 for warning and 1 for no warning. Hopefully this makes it easier to read and understand. |
||||||||||
|
||||||||||
|
||||||||||
@pytest.mark.parametrize( | ||||||||||
"key", | ||||||||||
[ | ||||||||||
pytest.param(0, id="non_string_key"), | ||||||||||
pytest.param("a", id="string_key"), | ||||||||||
], | ||||||||||
) | ||||||||||
def test_string_indexed_series_struct_accessor_no_warning(session, key): | ||||||||||
s = bpd.Series( | ||||||||||
[ | ||||||||||
{"project": "pandas", "version": 1}, | ||||||||||
], | ||||||||||
dtype=bpd.ArrowDtype( | ||||||||||
pa.struct([("project", pa.string()), ("version", pa.int64())]) | ||||||||||
), | ||||||||||
index=["p1"], | ||||||||||
session=session, | ||||||||||
) | ||||||||||
|
||||||||||
s[key] | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please make this test fail if a warning is raised. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||||||||||
|
||||||||||
|
||||||||||
@pytest.mark.parametrize( | ||||||||||
"key", | ||||||||||
[ | ||||||||||
pytest.param(0, id="non_string_key"), | ||||||||||
pytest.param("a", id="string_key"), | ||||||||||
], | ||||||||||
) | ||||||||||
def test_string_indexed_series_non_struct_accessor_no_warning(session, key): | ||||||||||
s = bpd.Series( | ||||||||||
[1], | ||||||||||
dtype=bpd.Int64Dtype, | ||||||||||
index=["a"], | ||||||||||
session=session, | ||||||||||
) | ||||||||||
|
||||||||||
s[key] | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please make this test fail if a warning is raised. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks! Maybe worth a comment about the filterwarnings above, as it'n not 100% obvious to me. Alternatively, could we use the solution suggested here, instead?
Suggested change
Better yet, have a custom category so we know precisely what warning to fail on. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah this approach is simpler. Defined a dedicated warning class instead. I also set a stack level as suggested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do a custom category (defined in bigframes.exceptions) here to make it a bit easier to filter out if needed for some reason?
Also, I recommend setting stacklevel so that the error shows up in user code, not as appearing deep in BigFrames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have provided a stack level such that the warning call site is now at
__getitem__