bug: .cache() gives incorrect result when column order changes #9063

Closed
1 task done
NickCrews opened this issue Apr 26, 2024 · 1 comment · Fixed by #9081
Labels
bug Incorrect behavior inside of ibis
Milestone

Comments

@NickCrews
Contributor

What happened?

import ibis
ibis.options.interactive = True

t = ibis.memtable({"x": [1, 2, 3], "y": [4, 5, 6]})
t = t.cache()
t.select("x", "y").cache(), t.select("y", "x").cache()

shows

(┏━━━━━━━┳━━━━━━━┓
 ┃ x     ┃ y     ┃
 ┡━━━━━━━╇━━━━━━━┩
 │ int64 │ int64 │
 ├───────┼───────┤
 │     1 │     4 │
 │     2 │     5 │
 │     3 │     6 │
 └───────┴───────┘,
 ┏━━━━━━━┳━━━━━━━┓
 ┃ x     ┃ y     ┃
 ┡━━━━━━━╇━━━━━━━┩
 │ int64 │ int64 │
 ├───────┼───────┤
 │     1 │     4 │
 │     2 │     5 │
 │     3 │     6 │
 └───────┴───────┘)

It looks as though the first .cache() populates the cache with column order [x, y], and the second call then retrieves that same entry instead of a table with column order [y, x].

What version of ibis are you using?

main

What backend(s) are you using, if any?

probably NA?

Relevant log output

No response

@NickCrews NickCrews added the bug Incorrect behavior inside of ibis label Apr 26, 2024
@cpcloud
Member

cpcloud commented Apr 26, 2024

Probably related to this equality:

In [7]: s1 = ibis.schema(dict(a="int", b="string"))

In [8]: s2 = ibis.schema(dict(b="string", a="int"))

In [9]: s1 == s2
Out[9]: True
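
To illustrate how an order-insensitive equality like this could lead to the symptom in the report, here is a minimal sketch that does not use ibis at all. `LooseSchema`, `cache`, and `cached` are hypothetical stand-ins rather than ibis internals, and the sketch assumes the cache finds entries by hash and equality.

```python
class LooseSchema:
    """Hypothetical schema whose hash and equality ignore field order."""

    def __init__(self, fields):
        self.fields = dict(fields)

    def __hash__(self):
        # frozenset discards ordering, so reordered schemas hash the same.
        return hash(frozenset(self.fields.items()))

    def __eq__(self, other):
        # Plain dict equality in Python also ignores insertion order.
        return isinstance(other, LooseSchema) and self.fields == other.fields


cache = {}  # hypothetical cache keyed by schema


def cached(schema):
    # Store the column order on first sight; afterwards return whatever
    # entry compares equal to `schema`, even if its order differs.
    return cache.setdefault(schema, list(schema.fields))


first = cached(LooseSchema({"x": "int64", "y": "int64"}))
second = cached(LooseSchema({"y": "int64", "x": "int64"}))
print(first)   # ['x', 'y']
print(second)  # ['x', 'y'], the reordered request hits the first entry
```

Because both keys hash and compare equal, the dictionary holds a single entry and the second lookup never stores the [y, x] ordering.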

@cpcloud cpcloud changed the title bug: .cache() gives stale result when column order changes bug: .cache() gives incorrect result when column order changes Apr 27, 2024
@cpcloud cpcloud added this to the 9.0 milestone Apr 27, 2024
@gforsyth gforsyth moved this from backlog to review in Ibis planning and roadmap Apr 29, 2024
cpcloud added a commit that referenced this issue Apr 30, 2024
Continuation of #9068 by adding
`FrozenOrderedDict` which calculates its hash from `tuple(self.items())`
rather than `frozenset(self.items())` and also checks for item order
during equality checks.

Closes #9063.

---------

Co-authored-by: Phillip Cloud
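
The idea described in the commit message can be sketched roughly as follows. This is not the ibis implementation: `OrderedFrozenDict` is a hypothetical name, and the sketch only assumes what the message states, namely that the hash comes from a tuple of the items and that equality also checks item order.

```python
from collections.abc import Mapping


class OrderedFrozenDict(Mapping):
    """Hypothetical immutable mapping whose hash and equality respect order."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)
        # Hash the items as a tuple so that ordering contributes to the hash.
        self._hash = hash(tuple(self._data.items()))

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, OrderedFrozenDict):
            return NotImplemented
        # Compare the item sequences, not just the contents, so order matters.
        return tuple(self._data.items()) == tuple(other._data.items())


a = OrderedFrozenDict(x="int64", y="int64")
b = OrderedFrozenDict(y="int64", x="int64")
c = OrderedFrozenDict(x="int64", y="int64")

print(a == b)          # False: same items, different order
print(a == c)          # True: same items, same order
print(len({a, b, c}))  # 2: a and c collapse, b stays distinct
```

With keys like these, a cache lookup for columns [y, x] would no longer match an entry stored for [x, y].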
@github-project-automation github-project-automation bot moved this from review to done in Ibis planning and roadmap Apr 30, 2024