Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected result when using IN with TIME values #13813

Closed
2 tasks done
suyZhong opened this issue Sep 8, 2024 · 6 comments · Fixed by #13820
Closed
2 tasks done

Unexpected result when using IN with TIME values #13813

suyZhong opened this issue Sep 8, 2024 · 6 comments · Fixed by #13820

Comments

@suyZhong
Copy link

suyZhong commented Sep 8, 2024

What happens?

Consider the following test cases. The second query should return true since the value after casting is 12:34:56 and the IN expression should be evaluated to true. Besides, if the false is expected, its negation query should not be evaluated as false.

To Reproduce

DROP TABLE IF EXISTS t1;
CREATE   TABLE  t1(c0 TIME WITH TIME ZONE);
INSERT INTO t1(c0) VALUES ('12:34:56');

SELECT t1.c0 FROM t1; -- 12:34:56+08

SELECT (CAST(t1.c0 AS TIME) IN ('12:34:56')) FROM t1; -- false (unexpected)
SELECT NOT (CAST(t1.c0 AS TIME) IN ('12:34:56')) FROM t1; -- false

OS:

ubuntu 22.04

DuckDB Version:

v1.0.1-dev5313 64bacde

DuckDB Client:

CLI

Hardware:

No response

Full Name:

Suyang Zhong

Affiliation:

NUS

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a nightly build

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@szarnyasg
Copy link
Collaborator

Hi, thanks for reporting this! Interestingly, it doesn't reproduce on macOS:

v1.0.1-dev5313 64bacde85e
D DROP TABLE IF EXISTS t1;
D CREATE   TABLE  t1(c0 TIME WITH TIME ZONE);
D INSERT INTO t1(c0) VALUES ('12:34:56');
D
D SELECT t1.c0 FROM t1; -- 12:34:56+08
┌─────────────────────┐
│         c0          │
│ time with time zone │
├─────────────────────┤
│ 12:34:56+00         │
└─────────────────────┘
D
D SELECT (CAST(t1.c0 AS TIME) IN ('12:34:56')) FROM t1; -- false (unexpected)
┌───────────────────────────────────────┐
│ (CAST(t1.c0 AS TIME) IN ('12:34:56')) │
│                boolean                │
├───────────────────────────────────────┤
│ true                                  │
└───────────────────────────────────────┘
D SELECT NOT (CAST(t1.c0 AS TIME) IN ('12:34:56')) FROM t1; -- false
┌───────────────────────────────────────────┐
│ (CAST(t1.c0 AS TIME) NOT IN ('12:34:56')) │
│                  boolean                  │
├───────────────────────────────────────────┤
│ false                                     │
└───────────────────────────────────────────┘

I will test it on Ubuntu in a moment.

@suyZhong
Copy link
Author

suyZhong commented Sep 8, 2024

Have just tested on my M2 Mac by install through this link. Looks like it's an old version (perhaps we should also update the mac install link), but I could also reproduce it.

./duckdb
v1.0.1-dev4917 45787e5f9f
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D DROP TABLE IF EXISTS t1;
D CREATE   TABLE  t1(c0 TIME WITH TIME ZONE);
D INSERT INTO t1(c0) VALUES ('12:34:56');
D
D SELECT t1.c0 FROM t1; -- 12:34:56+08
┌─────────────────────┐
│         c0          │
│ time with time zone │
├─────────────────────┤
│ 12:34:56+08         │
└─────────────────────┘
D
D SELECT (CAST(t1.c0 AS TIME) IN ('12:34:56')) FROM t1; -- false (unexpected)
┌───────────────────────────────────────┐
│ (CAST(t1.c0 AS TIME) IN ('12:34:56')) │
│                boolean                │
├───────────────────────────────────────┤
│ false                                 │
└───────────────────────────────────────┘
D SELECT NOT (CAST(t1.c0 AS TIME) IN ('12:34:56')) FROM t1; -- false
┌───────────────────────────────────────────┐
│ (CAST(t1.c0 AS TIME) NOT IN ('12:34:56')) │
│                  boolean                  │
├───────────────────────────────────────────┤
│ false                                     │
└───────────────────────────────────────────┘
D

So I guess it's about timezone. When I change the timezone to UTC, there's not issue:

SET TimeZone = 'UTC';

I guess you can try this:

SET TimeZone='Asia/Singapore';

@szarnyasg
Copy link
Collaborator

szarnyasg commented Sep 8, 2024

It's an icu issue! I also tried to set the timezone which failed due to icu not being available in my (default) build. I rebuilt DuckDB with a few core extension and now it reproduces the problem.

CORE_EXTENSIONS='autocomplete;icu;parquet;json' GEN=ninja make

Thanks for helping with the troubleshooting!

@hawkfish
Copy link
Contributor

hawkfish commented Sep 8, 2024

Hi @suyZhong - Thanks for the report!

The problem is really that TIMETZ/TIME WITH TIME ZONE is not very well tested right now because almost no one uses it, partly because it was not the standards committee's finest design. I'd recommend avoiding it if you can. You are in a time zone that doesn't use Daylight Savings Time, so it might be less useless for you, but even so, sticking with full timestamps is probably more robust.

@hawkfish
Copy link
Contributor

hawkfish commented Sep 8, 2024

FWIW, TIMESTAMP, TIMESTAMPTZ and TIMETZ take 64 bits/value, so switching wouldn't even increase your storage footprint. It might even improve it as the offsets in TIMETZ are stored in the low 24 bits, which means that even if they are all the same (e.g., +08) any compression will likely be less efficient.

@suyZhong
Copy link
Author

suyZhong commented Sep 9, 2024

@hawkfish Thanks for your insights! I'm actually developing a testing tool that automatically integrates new features, and thus may cover some corner features.

Mytherin added a commit that referenced this issue Sep 10, 2024
hawkfish added a commit to hawkfish/duckdb that referenced this issue Sep 16, 2024
Push down right side predicates that do not reference the inequality column.
This appears to be what other systems do, but the optimiser doesn't know this,
so we have to require it for testing.

fixes: duckdb#13813
fixes: duckdblabs/duckdb-internal#3048
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants