Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add workarounds for duckdb MAP bugs #8632

Closed
1 task done
NickCrews opened this issue Mar 12, 2024 · 5 comments · Fixed by #8985
Closed
1 task done

feat: add workarounds for duckdb MAP bugs #8632

NickCrews opened this issue Mar 12, 2024 · 5 comments · Fixed by #8985
Labels
duckdb The DuckDB backend feature Features or general enhancements
Milestone

Comments

@NickCrews
Copy link
Contributor

Is your feature request related to a problem?

see
duckdb/duckdb#11115
and
duckdb/duckdb#11116

This might be out of scope, and we should just wait for duckdb to fix these upstream, but we could add some workarounds for these bugs in ibis. I am running into these and am having to implement my own little wrappers, eg

def _map_values(m: ir.MapValue) -> ir.ArrayValue:
    """workaround for https://github.com/duckdb/duckdb/issues/11116"""
    normal = m.values()
    null = ibis.literal(None, type=dt.Array(value_type=m.type().value_type))
    return m.isnull().ifelse(null, normal)

feel free to close this out if you don't want to take this on (I don't blame you if you don't 😉 )

Describe the solution you'd like

I'm not sure where in the stack these fixes would need to go, at compile time, or just when creating expressions.

What version of ibis are you running?

main

What backend(s) are you using, if any?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the feature Features or general enhancements label Mar 12, 2024
@cpcloud
Copy link
Member

cpcloud commented Mar 12, 2024

Thanks! Looks like you've gone down a map 🐰 🕳️

I think it's reasonable to add these workarounds for now.

@cpcloud cpcloud added the duckdb The DuckDB backend label Mar 12, 2024
@NickCrews
Copy link
Contributor Author

yeah seriously talk about a 🐰 🕳️ . I was convinced for hours it was bugs in my code wheeeeee

here is what I currently use. I have unit tests for functions that use these, but I don't have unit tests for these as individuals.

def map_keys(m: ir.MapValue) -> ir.ArrayValue:
    """workaround for https://github.com/duckdb/duckdb/issues/11116"""
    normal = m.keys()
    null = ibis.literal(None, type=dt.Array(value_type=m.type().key_type))
    return m.isnull().ifelse(null, normal)


def map_values(m: ir.MapValue) -> ir.ArrayValue:
    """workaround for https://github.com/duckdb/duckdb/issues/11116"""
    normal = m.values()
    null = ibis.literal(None, type=dt.Array(value_type=m.type().value_type))
    return m.isnull().ifelse(null, normal)


def map_(keys: ir.ArrayValue, values: ir.ArrayValue) -> ir.MapValue:
    """workaround for https://github.com/duckdb/duckdb/issues/11115"""
    either_null = keys.isnull() | values.isnull()
    regular = ibis.map(keys, values)
    null = ibis.literal(
        None,
        type=dt.Map(
            key_type=keys.type().value_type, value_type=values.type().value_type
        ),
    )
    return either_null.ifelse(null, regular)

@gforsyth
Copy link
Member

Yeah, this would be good to get in for 9.0.

@NickCrews do you have the bandwidth to PR this in? We probably do want unit tests for them, and probably also some xfailed tests that will start to XPASS when the upstream issues are fixed.

NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 13, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 13, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 13, 2024
@NickCrews
Copy link
Contributor Author

In that PR I addressed duckdb/duckdb#11115, which

  • is more common
  • leads to incorrect results in the NULL NULL case (!)

Once that is in then I can tackle duckdb/duckdb#11116, which I think

  • is much more rare
  • usually leads to errors instead of incorrect reuslts.

NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 14, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 15, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
@lostmygithubaccount lostmygithubaccount moved this from backlog to cooking in Ibis planning and roadmap Mar 15, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 19, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 19, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 19, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 19, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 21, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 21, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 21, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 21, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 21, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
cpcloud pushed a commit to NickCrews/ibis that referenced this issue Apr 17, 2024
partially addresses ibis-project#8632

postgres started failing these new test too,
so had to adjust that visit_Map()
@cpcloud
Copy link
Member

cpcloud commented Apr 17, 2024

I'll work on getting the rest of the map family of functions behaving consistently here.

@cpcloud cpcloud added this to the 9.0 milestone Apr 17, 2024
gforsyth pushed a commit that referenced this issue Apr 17, 2024
Closes #8632.

Note that of the backends that support nullable maps, DuckDB is the only
one whose `NULL` handling semantics differ.

Other backends that support nullable maps and are implemented in Ibis
and tested in CI:

- Postgres (only maps of string -> string via the postgres `hstore`
extension)
- PySpark
- Snowflake
- Trino

BREAKING CHANGE: Calling the `get` or `contains` method on `NULL` map
values now returns `NULL`. Use `coalesce(map.get(...), default)` or
`coalesce(map.contains(), False)` to get the previous behavior.
@github-project-automation github-project-automation bot moved this from cooking to done in Ibis planning and roadmap Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duckdb The DuckDB backend feature Features or general enhancements
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants