Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trino BigQuery Connector: CAST causing SELECT to return NULL values and inter-changing column positions #8183

Closed
frg96 opened this issue Jun 3, 2021 · 2 comments · Fixed by #8201
Assignees
Labels
bug Something isn't working correctness

Comments

@frg96
Copy link

frg96 commented Jun 3, 2021

I am using Starburst Trino 356-e (LTS) on a single node cluster (same node running coordinator and worker).

I created a table from Trino CLI:

trino:frg96_dataset> CREATE TABLE IF NOT EXISTS
                  -> test (
                  ->     colvarchar VARCHAR,
                  ->     colsmallint SMALLINT,
                  ->     colint INTEGER,
                  ->     colbigint BIGINT,
                  ->     coldouble DOUBLE,
                  ->     coldecimal DECIMAL(10,3),
                  ->     coldate DATE,
                  ->     coltimestamp TIMESTAMP(3),
                  ->     colvarchar2 VARCHAR,
                  ->     colvarchar3 VARCHAR
                  -> );
CREATE TABLE

This created a bigquery table as shown below:

Describe table from Trino:

trino:frg96_dataset> describe test;
    Column    |     Type      | Extra | Comment
--------------+---------------+-------+---------
 colvarchar   | varchar       |       |
 colsmallint  | bigint        |       |
 colint       | bigint        |       |
 colbigint    | bigint        |       |
 coldouble    | double        |       |
 coldecimal   | decimal(38,9) |       |
 coldate      | date          |       |
 coltimestamp | timestamp(3)  |       |
 colvarchar2  | varchar       |       |
 colvarchar3  | varchar       |       |
(10 rows)

Query 20210603_042500_00006_59a9y, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
1.23 [10 rows, 730B] [8 rows/s, 592B/s]

Inserted 5 rows into test via BigQuery Cloud Console:

INSERT INTO `gcp-project-id.frg96_dataset.test`
VALUES(
    'Hello world',
    12345,
    1234567,
    123456789,
    12345.678,
    NUMERIC '12345.678',
    DATE '2014-09-17',
    DATETIME '2014-09-17 18:35:54',
    'HELLO WORLD 2',
    'Hello WORLD 3'
);

When I run a select on all columns, The output is correct as shown:
trino:frg96_dataset> SELECT
                  ->     colvarchar,
                  ->     colsmallint,
                  ->     colint,
                  ->     colbigint,
                  ->     coldouble,
                  ->     colvarchar,
                  ->     coldecimal,
                  ->     coldate,
                  ->     coltimestamp,
                  ->     colvarchar2,
                  ->     colvarchar3
                  -> FROM test;
 colvarchar  | colsmallint | colint  | colbigint | coldouble | colvarchar  |   coldecimal    |  coldate   |      coltimestamp       |  colvarchar2  |  colvarchar3
-------------+-------------+---------+-----------+-----------+-------------+-----------------+------------+-------------------------+---------------+---------------
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2 | Hello WORLD 3
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2 | Hello WORLD 3
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2 | Hello WORLD 3
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2 | Hello WORLD 3
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2 | Hello WORLD 3
(5 rows)

Query 20210603_042330_00005_59a9y, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
4.32 [5 rows, 545B] [1 rows/s, 126B/s]




But When I use CAST for the SMALLINT and INT columns, I get the following as shown below:

trino:frg96_dataset>
                  -> SELECT
                  ->     colvarchar,
                  ->     CAST(colsmallint AS SMALLINT) AS colsmallint,
                  ->     CAST(colint AS INTEGER) AS colint,
                  ->     colbigint,
                  ->     coldouble,
                  ->     colvarchar,
                  ->     coldecimal,
                  ->     coldate,
                  ->     coltimestamp,
                  ->     colvarchar2,
                  ->     colvarchar3
                  -> FROM test;
 colvarchar | colsmallint | colint  | colbigint | coldouble | colvarchar | coldecimal |  coldate   | coltimestamp |  colvarchar2  |     colvarchar3
------------+-------------+---------+-----------+-----------+------------+------------+------------+--------------+---------------+---------------------
 NULL       |        NULL | 1234567 | 123456789 |      NULL | NULL       |       NULL | 2014-09-17 | NULL         | HELLO WORLD 2 | 2014-09-17T18:35:54
 NULL       |        NULL | 1234567 | 123456789 |      NULL | NULL       |       NULL | 2014-09-17 | NULL         | HELLO WORLD 2 | 2014-09-17T18:35:54
 NULL       |        NULL | 1234567 | 123456789 |      NULL | NULL       |       NULL | 2014-09-17 | NULL         | HELLO WORLD 2 | 2014-09-17T18:35:54
 NULL       |        NULL | 1234567 | 123456789 |      NULL | NULL       |       NULL | 2014-09-17 | NULL         | HELLO WORLD 2 | 2014-09-17T18:35:54
 NULL       |        NULL | 1234567 | 123456789 |      NULL | NULL       |       NULL | 2014-09-17 | NULL         | HELLO WORLD 2 | 2014-09-17T18:35:54
(5 rows)

Query 20210603_042802_00008_59a9y, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
2.15 [5 rows, 545B] [2 rows/s, 253B/s]

Certain columns return NULL and certain column positions are interchanged.


Weirdly, this happens only with this combination of columns. When I remove the last column colvarchar3 from SELECT, it works fine:

trino:frg96_dataset> SELECT
                  ->     colvarchar,
                  ->     CAST(colsmallint AS SMALLINT) AS colsmallint,
                  ->     CAST(colint AS INTEGER) AS colint,
                  ->     colbigint,
                  ->     coldouble,
                  ->     colvarchar,
                  ->     coldecimal,
                  ->     coldate,
                  ->     coltimestamp,
                  ->     colvarchar2
                  -> FROM test;
 colvarchar  | colsmallint | colint  | colbigint | coldouble | colvarchar  |   coldecimal    |  coldate   |      coltimestamp       |  colvarchar2
-------------+-------------+---------+-----------+-----------+-------------+-----------------+------------+-------------------------+---------------
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2
 Hello world |       12345 | 1234567 | 123456789 | 12345.678 | Hello world | 12345.678000000 | 2014-09-17 | 2014-09-17 18:35:54.000 | HELLO WORLD 2
(5 rows)

Query 20210603_044921_00011_59a9y, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
2.22 [5 rows, 470B] [2 rows/s, 212B/s]

Any idea(s) what could be causing this issue?
@ebyhr ebyhr added the bug Something isn't working label Jun 3, 2021
@ebyhr
Copy link
Member

ebyhr commented Jun 3, 2021

I confirmed this issue happens in upstream. It comes from BigQueryResultPageSource.getNextPage()

@frg96
Copy link
Author

frg96 commented Jun 7, 2021

Glad to hear that. Thanks @ebyhr !

v-jizhang added a commit to v-jizhang/presto that referenced this issue Apr 27, 2022
highker pushed a commit to prestodb/presto that referenced this issue Apr 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working correctness
Development

Successfully merging a pull request may close this issue.

2 participants