Possible bug in df.describe #10467

Closed
1 task done
velicanu opened this issue Nov 10, 2024 · 1 comment · Fixed by #10470
Labels
bug Incorrect behavior inside of ibis
Milestone

Comments

@velicanu

What happened?

df.describe() raises a RelationError when two columns have decimal types with different precision (here decimal(38, 2) and decimal(20, 2)). Here's the file you can reproduce this with: bug.parquet

import ibis
df = ibis.read_parquet("bug.parquet")
df.describe()

What version of ibis are you using?

9.5.0

What backend(s) are you using, if any?

DuckDB

Relevant log output

$ python
Python 3.12.1 (main, Feb 27 2024, 08:10:07) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ibis
>>> df = ibis.read_parquet("bug.parquet")
>>> df.describe()
Traceback (most recent call last):
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/expr/operations/relations.py", line 339, in __init__
    missing_from_left = right.schema - left.schema
                        ~~~~~~~~~~~~~^~~~~~~~~~~~~
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/common/collections.py", line 247, in __sub__
    common_keys = self._check_conflict(other)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/common/collections.py", line 212, in _check_conflict
    raise ConflictingValuesError(conflicts)
ibis.common.exceptions.ConflictingValuesError: Conflicting values for keys:
  `mean`: decimal(38, 2) != decimal(20, 2)
  `std`: decimal(38, 2) != decimal(20, 2)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/expr/types/relations.py", line 2968, in describe
    t = ibis.union(*aggs)
        ^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/expr/api.py", line 1951, in union
    return table.union(*rest, distinct=distinct) if rest else table
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/expr/types/relations.py", line 1674, in union
    return self._assemble_set_op(ops.Union, table, *rest, distinct=distinct)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/expr/types/relations.py", line 1599, in _assemble_set_op
    node = opcls(left, right, distinct=distinct)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/common/bases.py", line 72, in __call__
    return cls.__create__(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/common/grounds.py", line 120, in __create__
    return super().__create__(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".direnv/python-3.12/lib/python3.12/site-packages/ibis/expr/operations/relations.py", line 342, in __init__
    raise RelationError(err_msg + "\n" + str(e)) from e
ibis.common.exceptions.RelationError: Table schemas must be equal for set operations.
Conflicting values for keys:
  `mean`: decimal(38, 2) != decimal(20, 2)
  `std`: decimal(38, 2) != decimal(20, 2)
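
The traceback suggests that describe() unions one aggregate table per column, and that the union is rejected because the `mean` and `std` columns carry different decimal types. The plain-Python sketch below only illustrates that kind of conflict check on two dict-based schemas; `find_conflicts`, `agg_a`, and `agg_b` are hypothetical names, and this is not ibis's actual implementation.

```python
def find_conflicts(left, right):
    """Return the keys present in both mappings whose values differ."""
    return {
        key: (left[key], right[key])
        for key in left.keys() & right.keys()
        if left[key] != right[key]
    }


# Hypothetical per-column aggregate schemas, modelled on the error message above.
agg_a = {"name": "string", "mean": "decimal(38, 2)", "std": "decimal(38, 2)"}
agg_b = {"name": "string", "mean": "decimal(20, 2)", "std": "decimal(20, 2)"}

conflicts = find_conflicts(agg_a, agg_b)
for key in sorted(conflicts):
    left_type, right_type = conflicts[key]
    print(f"`{key}`: {left_type} != {right_type}")
```

Under these assumptions, only `mean` and `std` are reported, because `name` has the same type on both sides.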

Code of Conduct

  • I agree to follow this project's Code of Conduct
@velicanu velicanu added the bug Incorrect behavior inside of ibis label Nov 10, 2024
@cpcloud
Member

cpcloud commented Nov 11, 2024

This can be reproduced without a file using memtable:

In [12]: t = ibis.memtable({'a': [1, 2, 3], 'b': [4, 5, 6]}).cast({'a': 'decimal(38,2)', 'b': 'decimal(20,2)'})

In [13]: t
Out[13]:
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ a              ┃ b              ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ decimal(38, 2) │ decimal(20, 2) │
├────────────────┼────────────────┤
│           1.00 │           4.00 │
│           2.00 │           5.00 │
│           3.00 │           6.00 │
└────────────────┴────────────────┘

In [14]: t.describe()

Projects
Archived in project
2 participants