Timestamp time zone handling appears broken in ORC format for 0.167-t.0.2 #583

Open
Downchuck opened this issue Jun 19, 2017 · 6 comments
@Downchuck

Downchuck commented Jun 19, 2017

The Teradata Presto release appears to deserialize the timestamp column incorrectly when reading ORC files that carry time zone information. (The ORC files were written by Hive 1.2.1.)

Note that the same query returns correct results in standard Presto and in Hive. The table was built in Hive by converting the first column, a string, into a timestamp column; the local time zone of Hive and Presto is America/Los_Angeles.

Teradata Presto 0.167-t.0.2:
        time         |           tm            |                    _col2
---------------------+-------------------------+---------------------------------------------
 01-30-2016-00:11:02 | 2016-01-30 08:11:02.000 | 2016-01-30 07:11:02.000 America/Los_Angeles
 01-30-2016-00:39:28 | 2016-01-30 08:39:28.000 | 2016-01-30 07:39:28.000 America/Los_Angeles


Mini cluster (0.17x)
        time         |           tm            |                    _col2
---------------------+-------------------------+---------------------------------------------
 01-30-2016-00:11:02 | 2016-01-30 00:11:02.000 | 2016-01-30 00:11:02.000 America/Los_Angeles
 01-30-2016-00:39:28 | 2016-01-30 00:39:28.000 | 2016-01-30 00:39:28.000 America/Los_Angeles

presto:mesoads> select time,tm, tm at time zone 'America/Los_Angeles' from orc_table 
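
One possible reading of the Teradata output, offered as an assumption rather than a confirmed diagnosis: the `tm` values shown there match the UTC rendering of the local wall-clock time stored in the string column. The sketch below only checks that arithmetic. It assumes the string column uses the layout MM-dd-yyyy-HH:mm:ss and that America/Los_Angeles is at UTC-8 on these January dates; the helper names are hypothetical. It does not explain the 07:xx values in `_col2`.

```python
from datetime import datetime, timedelta, timezone

# Assumption: America/Los_Angeles observes standard time (UTC-8) on 2016-01-30.
PST = timezone(timedelta(hours=-8), "PST")


def parse_local(raw: str) -> datetime:
    """Parse the string column (assumed MM-dd-yyyy-HH:mm:ss) as local wall-clock time."""
    return datetime.strptime(raw, "%m-%d-%Y-%H:%M:%S").replace(tzinfo=PST)


def utc_wall_clock(raw: str) -> str:
    """Render the same instant as a UTC wall-clock string."""
    return parse_local(raw).astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


# The two sample rows from the report.
for raw in ("01-30-2016-00:11:02", "01-30-2016-00:39:28"):
    print(raw, "->", utc_wall_clock(raw))
```

Both results line up with the `tm` column of the 0.167-t.0.2 output (08:11:02 and 08:39:28), which is consistent with the value being shown in UTC instead of the local zone.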

@fiedukow

Thanks for the bug report. We will look into it.
In the meantime, a possible workaround is to set the session property:
set session legacy_timestamp = true;
or the server property:
deprecated.legacy-timestamp = true
This will most likely restore the original prestodb/presto behaviour for date/time types, which has problems of its own, though.

@fiedukow fiedukow assigned fiedukow and unassigned fiedukow Jun 22, 2017
@Downchuck
Author

@fiedukow Confirmed, legacy_timestamp works as a workaround.

@fiedukow fiedukow self-assigned this Jul 3, 2017
@zz22394

zz22394 commented Sep 1, 2017

@fiedukow Is this issue related to "TIMESTAMP behaviour does not match sql standard" #7122?

@cawallin

cawallin commented Sep 1, 2017

As far as I know, it's only tangentially related: this ORC bug exists both for the existing timestamp implementation and the implementation in prestodb#7122.

@Downchuck
Author

It seems we can't use any time zone other than UTC: both the Hive connector and Presto need to be set to UTC to avoid ORC read issues for data written in UTC.

@zz22394

zz22394 commented Sep 7, 2017

@cawallin Thanks. After reading prestodb#7122, prestodb#7480, and "DATE TIME SUPPORT IN PRESTO", I think I get your point.
