bug: RelationError when relating two scalar values from different relations #7616

Closed
1 task done
NickCrews opened this issue Nov 25, 2023 · 5 comments
Labels
bug Incorrect behavior inside of ibis

Comments

@NickCrews
Contributor

What happened?

I feel like I already filed a bug for this, but I couldn't find it with some searching? Apologies if so. Thanks for the help!

import ibis
ibis.options.interactive = True

t = ibis.examples.penguins.fetch()
t.distinct().count() == t.count()

results in RelationError: Selection expressions don't fully originate from dependencies of the table expression.

t.distinct().count() and t.count() are both scalar values, so they should be comparable even though they come from different relations. If we were comparing columns, e.g. t.distinct().my_col == t.my_col, then the error would make sense, because how would you "line up" the two columns?
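To illustrate why the scalar comparison is well defined, here is a minimal sketch in plain SQL via Python's built-in sqlite3 module rather than ibis. The table contents are made up for illustration and are not the real penguins data. Each side of the comparison is a scalar subquery that yields a single value, so no row alignment between the two relations is needed.

```python
import sqlite3

# Hypothetical stand-in for the penguins table; one row is duplicated on purpose.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (species TEXT, year INTEGER)")
con.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Adelie", 2007), ("Adelie", 2007), ("Gentoo", 2008)],
)

# Each parenthesised SELECT is a scalar subquery: it returns exactly one value,
# so the two sides can be compared without lining up any rows.
query = """
SELECT
    (SELECT COUNT(*) FROM (SELECT DISTINCT * FROM t))
    = (SELECT COUNT(*) FROM t)
"""
(no_duplicates,) = con.execute(query).fetchone()

# SQLite returns 1 for true and 0 for false.
print(bool(no_duplicates))
```

Because the sample table contains a duplicated row, the distinct count and the total count differ here.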

What version of ibis are you using?

7.1.0

What backend(s) are you using, if any?

No response

Relevant log output

No response

@cpcloud
Member

cpcloud commented Dec 7, 2023

Do you have a general use case for this? It would be helpful to understand the context in which you're trying to do this.

For the query you're showing here you should be able to write

t.nunique() == t.count()

@NickCrews
Contributor Author

NickCrews commented Dec 8, 2023

A few things I have needed to do:

  • t1.count() == t2.count() (where, unlike in the original example, t1 and t2 are totally unrelated, e.g. from different parquet files)
  • ibis.greatest(t1.ids.max(), t2.ids.max())
  • t2.mutate(ids=t1.ids.max() + ibis.row_number() + 1) (this is relating scalar from t1 to a vector in t2, so a little different, but related)
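For reference, the three use cases above correspond to ordinary scalar subqueries in SQL. The following is a rough sketch using Python's built-in sqlite3 module, not ibis; the tables t1 and t2 and their values are invented for illustration. Note that SQL's ROW_NUMBER() starts at 1, so the third query adds it directly to the maximum to continue the id sequence.

```python
import sqlite3

# Two unrelated tables with made-up ids (hypothetical data).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (ids INTEGER)")
con.execute("CREATE TABLE t2 (ids INTEGER)")
con.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (5,)])
con.executemany("INSERT INTO t2 VALUES (?)", [(10,), (20,)])

# 1. Compare the row counts of two unrelated tables.
(same_count,) = con.execute(
    "SELECT (SELECT COUNT(*) FROM t1) = (SELECT COUNT(*) FROM t2)"
).fetchone()

# 2. Greatest of the two maxima. With two or more arguments,
#    SQLite's MAX() acts as a scalar "greatest" function.
(largest_id,) = con.execute(
    "SELECT MAX((SELECT MAX(ids) FROM t1), (SELECT MAX(ids) FROM t2))"
).fetchone()

# 3. Relate a scalar from t1 to every row of t2: assign new ids to t2
#    that continue after t1's current maximum.
new_ids = [
    row[0]
    for row in con.execute(
        """
        SELECT (SELECT MAX(ids) FROM t1)
               + ROW_NUMBER() OVER (ORDER BY ids) AS new_id
        FROM t2
        ORDER BY new_id
        """
    )
]

print(bool(same_count), largest_id, new_ids)
```

The third query needs window-function support, which SQLite has provided since version 3.25.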

@ncclementi
Contributor

Just to report: on main, this now raises a different error, specifically a RecursionError.

    293 has_unbound = False
    294 node_types = (ops.UnboundTable, ops.DatabaseTable, ops.SQLQueryResult)
--> 295 for table in self.op().find(node_types):
    296     if isinstance(table, ops.UnboundTable):
    297         has_unbound = True

RecursionError: maximum recursion depth exceeded

That being said, I'm not sure whether this is the expected bug or whether something else is going on here.

@cpcloud
Member

cpcloud commented Apr 4, 2024

The infinite recursion is definitely unintended 😅

@ncclementi
Contributor

This also works as expected now when using as_scalar().

In [13]: t.distinct().count().as_scalar() == t.count().as_scalar()
Out[13]: 
┌──────┐
│ True │
└──────┘

xref: #10124 (comment)
