refactor: complete dtype rules for expression tree transformations #376
After learning more details offline, overall LGTM!
Left a few questions. LGTM overall.
3 <NA>
4 <NA>
dtype: Float64
dtype: Int64
Good to see that this change fixes this issue!
bigframes/session/__init__.py
Outdated
    inline_df = dataframe.DataFrame(
        blocks.Block.from_local(pandas_dataframe, self)
    )
except ValueError:  # Thrown by ibis for some unhandled tyeps
nit: types.
done
@@ -146,6 +205,24 @@ def skips_nulls(self):
    def handles_ties(self):
        return True

    def output_type(self, *input_types: dtypes.ExpressionType):
        if isinstance(self.bins, int) and (self.labels is False):
            return dtypes.INT_DTYPE
Did you have a chance to test it? Is it FLOAT_DTYPE?
Yes, the test suite automatically validates all these derivations now.
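One way such automatic validation could work, sketched under assumptions: each op declares an output type, and a test compares that declaration with the type of the value actually computed on sample data. The names `Op` and `check_declared_type` are hypothetical stand-ins for illustration, not bigframes APIs, and plain Python types stand in for dtypes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for an aggregation op (not a bigframes class)."""

    name: str
    # Computes the aggregate on sample data.
    apply: Callable[[Sequence], object]
    # Declares the result type given the input element type.
    output_type: Callable[[type], type]


def check_declared_type(op: Op, sample: Sequence) -> bool:
    """Return True when the declared output type matches the computed one."""
    declared = op.output_type(type(sample[0]))
    actual = type(op.apply(sample))
    return declared is actual


# Mean of integers is declared, and computed, as float.
mean_op = Op("mean", lambda xs: sum(xs) / len(xs), lambda t: float)

# A deliberately wrong declaration: claims the mean keeps the input type.
bad_mean_op = Op("mean", lambda xs: sum(xs) / len(xs), lambda t: t)
```

Running `check_declared_type` over every op and a few representative inputs would flag a wrong derivation, such as `bad_mean_op` on integer input, without anyone having to test each rule by hand.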
bigframes/operations/aggregations.py
Outdated
@dataclasses.dataclass(frozen=True)
class MeanOp(UnaryAggregateOp):
    name: ClassVar[str] = "mean"

    def output_type(self, *input_types: dtypes.ExpressionType):
        if pd.api.types.is_bool_dtype(input_types[0]):
nit:
if pd.api.types.is_bool_dtype(input_types[0]) or pd.api.types.is_integer_dtype(input_types[0]):
    return dtypes.FLOAT_DTYPE
done
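The rule suggested in the nit can be sketched in isolation. This is a minimal illustration, not the merged code: `DType` and `mean_output_type` are hypothetical names, and the fallthrough branch (other inputs keep their own dtype) is an assumption, since the diff above does not show it.

```python
from enum import Enum


class DType(Enum):
    """Hypothetical stand-ins for nullable dtypes (not bigframes constants)."""

    BOOL = "boolean"
    INT = "Int64"
    FLOAT = "Float64"


def mean_output_type(input_type: DType) -> DType:
    """Result dtype of a mean over a column of `input_type`."""
    if input_type in (DType.BOOL, DType.INT):
        # Averaging booleans or integers can yield a fraction, so widen to float.
        return DType.FLOAT
    # Assumption: any other supported input keeps its own dtype.
    return input_type
```

Folding the integer case into the same branch as the boolean case keeps the widening rule in one place.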
bigframes/operations/aggregations.py
Outdated
@@ -87,16 +112,38 @@ class ApproxQuartilesOp(UnaryAggregateOp):
    def name(self):
        return f"{self.quartile*25}%"

    def output_type(self, *input_types: dtypes.ExpressionType):
        if pd.api.types.is_bool_dtype(input_types[0]):
same as below
    if not isinstance(x, ibis_types.DecimalValue):
        return False
    # Should be exactly 76 for bignumeric
    return x.precision > 70
Can you update it to == 76, since you handle the numeric casting at line 1102?
I care about the pre-cast type information, and I don't trust ibis to have the precision/scale numbers right.
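To illustrate the trade-off being discussed, here is a small sketch of a threshold check versus an exact match. `DecimalInfo` and `looks_like_bignumeric` are hypothetical stand-ins, not ibis or bigframes APIs; the precision and scale values are the ones BigQuery documents for NUMERIC (38, 9) and BIGNUMERIC (76, 38).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecimalInfo:
    """Hypothetical stand-in for the precision/scale reported for a decimal value."""

    precision: int
    scale: int


def looks_like_bignumeric(info: DecimalInfo, threshold: int = 70) -> bool:
    """Classify by threshold so a slightly misreported precision still matches."""
    return info.precision > threshold


numeric = DecimalInfo(precision=38, scale=9)
bignumeric = DecimalInfo(precision=76, scale=38)
# Assumed example of an upstream library reporting precision off by one.
misreported = DecimalInfo(precision=77, scale=38)
```

A strict `== 76` test would reject `misreported`, while the threshold still separates it cleanly from `numeric`, which is the robustness the author is arguing for.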