Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow case-insensitive fieldname matching for struct coercion in hive connector #13423

Merged
merged 1 commit into from
Oct 6, 2022

Conversation

leetcode-1533
Copy link
Contributor

@leetcode-1533 leetcode-1533 commented Jul 29, 2022

Description

Is this change a fix, improvement, new feature, refactoring, or other?

Fix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Hive Connector

How would you describe this change to a non-technical end user or system administrator?

  1. It's safe to do so since Hive is case insensitive

Related issues, pull requests, and links

#5575
#7350

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Section
* Allow case-insensitive fieldname matching for struct coercion in hive connector. ({issue}`13423`)

@cla-bot cla-bot bot added the cla-signed label Jul 29, 2022
@leetcode-1533 leetcode-1533 marked this pull request as ready for review July 29, 2022 23:00
@findepi
Copy link
Member

findepi commented Aug 2, 2022

@leetcode-1533
Copy link
Contributor Author

@ebyhr can you help? thanks!

@phd3 phd3 self-requested a review August 22, 2022 21:39
@phd3
Copy link
Member

phd3 commented Aug 29, 2022

w.r.t. #5575 (comment) this is the behavior on the current master head in case we change ordering of foo/FOO. On hdp3 3.1.0, it seems broken and ambiguous. cc @electrum @martint

@phd3
Copy link
Member

phd3 commented Aug 29, 2022

As evident from the tests in this PR, the behavior is different across versions/formats. Given that hive's documentation mentions about the case-insensitivity (and we already support this between table and files), IMO it'd be okay to move forward with the change in this PR.

@phd3 phd3 requested review from electrum and martint August 29, 2022 14:48
@leetcode-1533 leetcode-1533 force-pushed the FieldNameCase branch 2 times, most recently from 4bee8dc to 617d54a Compare September 22, 2022 22:54
@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: yluan.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot cla-bot bot removed the cla-signed label Sep 23, 2022
@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: yluan.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

1 similar comment
@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: yluan.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@leetcode-1533
Copy link
Contributor Author

leetcode-1533 commented Sep 23, 2022

All test passed, I used config:

tests | 2022-09-23 07:49:48 INFO: SUCCESS / io.trino.tests.product.hive.TestHiveCoercion.testHiveCoercionTextFile (Groups: hive_coercion, jdbc) took 6.2 seconds
tests | 2022-09-23 07:49:49 INFO:
tests | 2022-09-23 07:49:49 INFO: Completed 6 tests
tests | 2022-09-23 07:49:49 INFO: 6 SUCCEEDED / 0 FAILED / 0 SKIPPED
tests | 2022-09-23 07:49:49 INFO: Tests execution took 1 minutes and 11 seconds
tests |
tests | ===============================================
tests | tempto-tests
tests | Total tests run: 6, Failures: 0, Skips: 0
tests | ===============================================

[yluan@yluan-ld1 trino]$ testing/bin/ptl test run --environment singlenode --config config-hdp3 --debug -- -t io.trino.tests.product.hive.TestHiveCoercion;

@leetcode-1533
Copy link
Contributor Author

@phd3 can you review and merge in this change?

@leetcode-1533
Copy link
Contributor Author

@leetcode-1533
Copy link
Contributor Author

s | java.lang.AssertionError: row_to_row field is not equal expected: collections to be equal (ignoring order).
tests | Actual:
tests | {"keep":"as is","ti2si":-1,"si2int":100,"int2bi":2323,"bi2vc":"12345","LOWER2UPPERCASE":2}
tests | {"keep":null,"ti2si":1,"si2int":-100,"int2bi":-2323,"bi2vc":"-12345","LOWER2UPPERCASE":2}
tests | Expected:
tests | {"keep":"as is","ti2si":-1,"si2int":100,"int2bi":2323,"bi2vc":"12345","LOWER2UPPERCASE":null}
tests | {"keep":null,"ti2si":1,"si2int":-100,"int2bi":-2323,"bi2vc":"-12345","LOWER2UPPERCASE":null}
tests | at io.airlift.testing.Assertions.fail(Assertions.java:323)
tests | at io.airlift.testing.Assertions.assertEqualsIgnoreOrder(Assertions.java:312)
tests | at io.trino.tests.product.hive.TestHiveCoercion.assertQueryResults(TestHiveCoercion.java:675)
tests | at io.trino.tests.product.hive.TestHiveCoercion.doTestHiveCoercion(TestHiveCoercion.java:334)
tests | at io.trino.tests.product.hive.TestHiveCoercion.testHiveCoercionOrc(TestHiveCoercion.java:229)
tests | at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
tests | at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
tests | at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
tests | at java.base/java.lang.reflect.Method.invoke(Method.java:568)
tests | at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:104)
tests | at org.testng.internal.Invoker.invokeMethod(Invoker.java:645)
tests | at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:851)
tests | at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1177)
tests | at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:129)
tests | at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:112)
tests | at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
tests | at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
tests | at java.base/java.lang.Thread.run(Thread.java:833)

@leetcode-1533
Copy link
Contributor Author

This is what I saw locally:

tests | 2022-09-23 06:59:29 INFO: FlakyTestRetryAnalyzer not enabled: CONTINUOUS_INTEGRATION environment is not detected or system property 'io.trino.testng.services.FlakyTestRetryAnalyzer.enabled' is not set to 'true' (actual: )
tests | 2022-09-23 06:59:29 INFO: FAILURE / io.trino.tests.product.hive.TestHiveCoercion.testHiveCoercionOrc (Groups: hive_coercion, jdbc) took 10.8 seconds
tests | 2022-09-23 06:59:29 SEVERE: Failure cause:
tests | java.lang.AssertionError: row_to_row field is not equal expected: collections to be equal (ignoring order).
tests | Actual:
tests | {"keep":"as is","ti2si":-1,"si2int":100,"int2bi":2323,"bi2vc":"12345","LOWER2UPPERCASE":null}
tests | {"keep":null,"ti2si":1,"si2int":-100,"int2bi":-2323,"bi2vc":"-12345","LOWER2UPPERCASE":null}
tests | Expected:
tests | {"keep":"as is","ti2si":-1,"si2int":100,"int2bi":2323,"bi2vc":"12345","LOWER2UPPERCASE":2}
tests | {"keep":null,"ti2si":1,"si2int":-100,"int2bi":-2323,"bi2vc":"-12345","LOWER2UPPERCASE":2}

@leetcode-1533
Copy link
Contributor Author

Most of the product test changes are testing hive's behavior, I can also remove those part.

@cla-bot
Copy link

cla-bot bot commented Sep 23, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: yluan.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@cla-bot cla-bot bot removed the cla-signed label Sep 23, 2022
@cla-bot
Copy link

cla-bot bot commented Oct 1, 2022

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: yluan.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@phd3
Copy link
Member

phd3 commented Oct 3, 2022

looks good, can you please squash the commits?

@phd3
Copy link
Member

phd3 commented Oct 3, 2022

Confirmed with @martint on Slack that the change looks good to him.

@leetcode-1533
Copy link
Contributor Author

finished squashing.

@phd3 phd3 merged commit c84fe06 into trinodb:master Oct 6, 2022
@phd3
Copy link
Member

phd3 commented Oct 6, 2022

Merged, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants