bug: CompileError: Multiple, unrelated CTEs found #7350

Closed
1 task done
NickCrews opened this issue Oct 13, 2023 · 9 comments · Fixed by #8119
Labels
bug (Incorrect behavior inside of ibis), chained joins (The bane of Ibis's existence)

Comments

@NickCrews
Contributor

NickCrews commented Oct 13, 2023

What happened?

EDIT: see below; I found a much simpler repro.

Ok, sorry I'm not sure what the exact cause of this is. Thanks for the help!

Basically, I

  1. load some data
  2. perform some cleaning/transformations to get a featured table
  3. self join featured on condition A to get ja.
  4. self join featured on condition B to get jb.
  5. ibis.union(ja, jb)

If I do NOT .cache featured before the joins, then I get the error during the union. If I DO .cache featured before the joins, then there is no error. Regardless of whether I .cache(), I can always .execute() ja and jb before the union and it does not error.

Exact repro steps:

  1. run git clone https://github.com/nickcrews/mismo.git and cd into that repo
  2. git checkout 3105eaf7ef84ad5b629d7ce1177a05dc83469b68
  3. Make a venv and install frozen deps. Install PDM if you don't have it. Then run pdm install.
  4. open /docs/examples/patent_deduplication.ipynb
  5. Run until you reach the cell containing rules.block(featured, featured)
  6. If you comment out the .cache() call, you get the error. Try again with the .cache() and no error.

Let me know how else I can be helpful with debugging this.

What version of ibis are you using?

7.0.0

What backend(s) are you using, if any?

duckdb

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the bug Incorrect behavior inside of ibis label Oct 13, 2023
@cpcloud
Member

cpcloud commented Oct 17, 2023

Thanks for opening the issue.

Would you mind coming up with a reproducer that can be run with only ibis installed and that doesn't require use of a library that depends on ibis?

@NickCrews
Contributor Author

Yeah, makes sense. To minimize the changes I need to make: which of these are you concerned with?

  1. The other libs in the env interfering? The code at hand only uses ibis, so I think a plain pip install ibis should be enough, but I can double-check. I can try to reduce this if so.
  2. Having to sift through the utils that load and featurize the data? This might be harder for me to work around than...
  3. Having to sift through the join/union logic? I think this is the easiest one for me to simplify.

Thanks!

@cpcloud
Member

cpcloud commented Oct 23, 2023

Concerned with all of those 😄

How about trying to reproduce without loading and featurizing and trying only steps 3, 4 and 5 in the issue description?

@NickCrews
Contributor Author

yes I will do that!

@NickCrews
Contributor Author

NickCrews commented Oct 23, 2023

OK so this is actually quite simple 🤣

import ibis
from ibis import _  # deferred expression used in the mutate calls below

t = ibis.examples.penguins.fetch()
# No error if you remove the .head() call
t = t.head(100)

# The mutate is needed only so both schemas match and the union works
sub1 = t.inner_join(t.view(), "island").mutate(island_right=_.island)
sub2 = t.inner_join(t.view(), "sex").mutate(sex_right=_.sex)
u = ibis.union(sub1, sub2)
u

@cpcloud
Member

cpcloud commented Oct 23, 2023

Thanks @NickCrews! I can reproduce.

This may not make the 7.1 cut, but it'll definitely be in the release following 7.1.

@cpcloud
Member

cpcloud commented Dec 7, 2023

I'll check #7580 for this and add a test there if it's fixed.

@cpcloud cpcloud linked a pull request Dec 7, 2023 that will close this issue
@cpcloud
Member

cpcloud commented Dec 7, 2023

Turns out this is also fixed by #7580.

cpcloud added a commit to kszucs/ibis that referenced this issue Dec 7, 2023
@cpcloud cpcloud added the chained joins The bane of Ibis's existence label Dec 7, 2023
@lostmygithubaccount lostmygithubaccount moved this from backlog to cooking in Ibis planning and roadmap Dec 11, 2023
cpcloud added a commit to kszucs/ibis that referenced this issue Dec 14, 2023
cpcloud added a commit to kszucs/ibis that referenced this issue Dec 14, 2023
cpcloud added a commit to kszucs/ibis that referenced this issue Dec 14, 2023
@cpcloud cpcloud removed their assignment Dec 19, 2023
@lostmygithubaccount lostmygithubaccount moved this from cooking to backlog in Ibis planning and roadmap Jan 26, 2024
@cpcloud cpcloud linked a pull request Jan 27, 2024 that will close this issue
@gforsyth
Member

gforsyth commented Feb 9, 2024

Closing this out as fixed and tested on the-epic-split

@gforsyth gforsyth closed this as completed Feb 9, 2024
@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Feb 9, 2024

3 participants