[Bug report] check-status.sh run failed in gravitino-ci-hive container #2379

Closed
danhuawang opened this issue Feb 28, 2024 · 9 comments · Fixed by #2523

@danhuawang
Contributor

Version

main branch

Describe what's wrong

root@ip-172-31-44-48:/# /tmp/check-status.sh
++ hdfs dfsadmin -report
++ awk '{print $3}'
++ grep 'Live datanodes'
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0

+ hdfs_ready='(1):'
+ [[ (1): == \(\1\)\: ]]
+ echo 'HDFS is ready'
HDFS is ready
++ hive -e 'show databases;'
+ hive_ready='OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.9.jar!/hive-log4j2.properties Async: true
Loading class `com.mysql.jdbc.Driver'\''. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'\''. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient'

Error message and/or stacktrace

N/A

How to reproduce

Deploy the latest gravitino-ci-hive image on ECS.
The container status is reported as unhealthy.

Additional context

No response
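
For readers following the trace, the two checks it shows can be sketched roughly as below. This is a reconstruction from the xtrace output only, not the actual contents of /tmp/check-status.sh; the function names `hdfs_is_ready` and `hive_is_ready` are hypothetical, and the real script may capture output or retry differently.

```shell
#!/usr/bin/env bash
# Sketch of the two checks, reconstructed from the xtrace output.
# Function names are hypothetical; the real script may differ.

# Succeeds when the third field of the "Live datanodes" line is "(1):".
hdfs_is_ready() {
  local report="$1"
  local live
  live=$(printf '%s\n' "$report" | grep 'Live datanodes' | awk '{print $3}')
  [[ "$live" == "(1):" ]]
}

# Succeeds unless the Hive CLI output contains the word FAILED.
hive_is_ready() {
  local output="$1"
  [[ "$output" != *FAILED* ]]
}

# Example (inside the container):
#   hdfs_is_ready "$(hdfs dfsadmin -report)" && echo 'HDFS is ready'
#   hive_is_ready "$(hive -e 'show databases;' 2>&1)" && echo 'Hive is ready'
```

Under this reading, the failing run passes the HDFS check and fails the Hive check because the captured output contains "FAILED: SemanticException".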

@danhuawang danhuawang added the bug Something isn't working label Feb 28, 2024
@mchades
Contributor

mchades commented Mar 1, 2024

Can you provide a more detailed log with timestamps for each line, similar to #2270? This will allow us to observe its retry behavior and intervals.

@danhuawang
Contributor Author

> Can you provide a more detailed log with timestamps for each line, similar to #2270? This will allow us to observe its retry behavior and intervals.

In #2270, check-status.sh exited with 1 because HDFS was not ready, but in this issue HDFS is ready. Here check-status.sh failed at the command hive -e 'show databases;' with the error "FAILED: SemanticException org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient'".

The log is the output of /tmp/check-status.sh, which I ran manually in the container, so it has no timestamps.
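
If timestamps are needed for a manual run, one option is to pipe the script's output through a small filter that prefixes each line with the current time. The sketch below is only an illustration; `add_timestamps` is a hypothetical helper name and is not part of the gravitino-ci-hive image.

```shell
#!/usr/bin/env bash
# Prefix every line read from stdin with the current wall-clock time.
# add_timestamps is a hypothetical helper name, not part of the image.
add_timestamps() {
  local line
  # The "|| [[ -n ... ]]" part keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [[ -n "$line" ]]; do
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
  done
}

# Example (inside the container):
#   /tmp/check-status.sh 2>&1 | add_timestamps
```

The timestamp is taken when each line is read, so it reflects when the output arrived rather than when the underlying command started.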

@mchades
Contributor

mchades commented Mar 4, 2024

Did you include #2366 change in your test code?

@danhuawang
Contributor Author

> Did you include #2366 change in your test code?

@mchades
This issue may have been introduced by PR #2090.
When I revert the changes from that PR and test with my personal repo, the problem does not occur.

@mchades
Contributor

mchades commented Mar 6, 2024

@danhuawang you're right, #2090 introduced the bug by moving the mysql-server installation before the MYSQL_PWD variable is set. Those changes were not covered by the integration tests (IT), and the CI on GitHub was not affected because it still uses the 0.1.8 image.

How can we improve?

  1. we should move rm -rf /var/lib/apt/lists/* to the # removed install packages block
  2. test the new image by IT
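
As an illustration of the ordering problem described above, a step that depends on MYSQL_PWD can be made to fail loudly when the variable is not yet set, using bash's `${VAR:?message}` expansion. This is a generic guard sketched under assumptions, not the fix that was merged for this issue; `require_mysql_pwd` is a hypothetical name and does not come from the Gravitino Dockerfile.

```shell
#!/usr/bin/env bash
# Hypothetical guard: abort early if MYSQL_PWD is unset or empty.
# require_mysql_pwd is an illustrative name, not from the Gravitino Dockerfile.
# Note: when the check fails, a non-interactive shell exits, so call it in a
# subshell if the caller should keep running.
require_mysql_pwd() {
  : "${MYSQL_PWD:?MYSQL_PWD must be set before installing mysql-server}"
}

# Example:
#   require_mysql_pwd   # stops here with an error message if MYSQL_PWD is missing
```

A guard like this turns a silent misconfiguration into an immediate build failure, which is easier to spot than an unhealthy container at runtime.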

@mchades mchades added this to the Gravitino 0.5.0 milestone Mar 6, 2024
@charliecheng630
Contributor

@mchades I would like to work on this issue, should I start with the image (datastrato/gravitino-ci-hive:0.1.8) first?

@mchades
Contributor

mchades commented Mar 9, 2024

> @mchades I would like to work on this issue, should I start with the image (datastrato/gravitino-ci-hive:0.1.8) first?

Hi @charliecheng630 , Thank you so much for your interest in contributing to the project!

Before you begin, please ensure that you can successfully execute the command ./gradlew build :integration-test:test --tests "com.datastrato.gravitino.integration.test.catalog.hive.CatalogHiveIT" locally. This validation is necessary before pushing the new image to ensure it runs correctly in your local environment.

@charliecheng630
Contributor

charliecheng630 commented Mar 10, 2024

@mchades This command (./gradlew build :integration-test:test --tests "com.datastrato.gravitino.integration.test.catalog.hive.CatalogHiveIT") can be executed successfully.

However, I am unable to reproduce the issue.
I ran the gravitino-ci-hive container manually and executed /tmp/check-status.sh, but no exception appears in the output.

Here are the three versions I've tried.

docker run datastrato/gravitino-ci-hive:0.1.8
docker run datastrato/gravitino-ci-hive:test
docker run datastrato/gravitino-ci-hive:latest

Output (The output appears to be normal.):

root@e95897b23a5c:/# /tmp/check-status.sh 
++ hdfs dfsadmin -report
++ grep 'Live datanodes'
++ awk '{print $3}'
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
24/03/10 11:21:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
+ hdfs_ready='(1):'
+ [[ (1): == \(\1\)\: ]]
+ echo 'HDFS is ready'
HDFS is ready
++ hive -e 'show databases;'
+ hive_ready='OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.9.jar!/hive-log4j2.properties Async: true
Loading class `com.mysql.jdbc.Driver'\''. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'\''. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
OK
default
Time taken: 2.139 seconds, Fetched: 1 row(s)'
+ [[ OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.9.jar!/hive-log4j2.properties Async: true
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
OK
default
Time taken: 2.139 seconds, Fetched: 1 row(s) == *\F\A\I\L\E\D* ]]
+ echo 'Hive is ready'
Hive is ready
+ exit 0

Was my testing method correct? Are there any other ways to reproduce the issue?
My working branch is main, which already includes PR #2090.

@mchades
Contributor

mchades commented Mar 11, 2024

@charliecheng630 You cannot reproduce it because you didn't use the modified image. You can try to reproduce it locally with the following steps:

  1. enter the directory {gravitino-root}/dev/docker
  2. build the new image locally: ./build-docker.sh --platform linux/arm64 --type hive --image datastrato/gravitino-ci-hive --tag 0.1.9
  3. modify Hive IT image version from 0.1.8 to 0.1.9: https://github.com/datastrato/gravitino/blob/77fcc1309497e084052839bb07e18a417fc80066/integration-test/build.gradle.kts#L321
  4. enter the project root directory and rerun ./gradlew build :integration-test:test --tests "com.datastrato.gravitino.integration.test.catalog.hive.CatalogHiveIT"
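
The steps above can be sketched as a small script. The commands are taken from the list; `GRAVITINO_ROOT`, `DRY_RUN`, `run`, and `reproduce` are hypothetical helper names added for illustration, and step 3 (editing the image version) is left as a manual change. Adjust `--platform` if your host is not arm64.

```shell
#!/usr/bin/env bash
# Sketch of the reproduction steps. GRAVITINO_ROOT and DRY_RUN are assumed
# helper variables; the commands themselves come from the steps listed above.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [[ "${DRY_RUN:-0}" == "1" ]]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

reproduce() {
  local root="${GRAVITINO_ROOT:-.}"

  # Steps 1-2: build the Hive CI image locally with the new tag.
  ( cd "$root/dev/docker" && run ./build-docker.sh --platform linux/arm64 \
      --type hive --image datastrato/gravitino-ci-hive --tag 0.1.9 ) || return 1

  # Step 3 is manual: change the Hive IT image version from 0.1.8 to 0.1.9
  # in integration-test/build.gradle.kts.

  # Step 4: rerun the Hive catalog integration test from the project root.
  ( cd "$root" && run ./gradlew build :integration-test:test --tests \
      "com.datastrato.gravitino.integration.test.catalog.hive.CatalogHiveIT" )
}

# Example:
#   DRY_RUN=1 GRAVITINO_ROOT=/path/to/gravitino reproduce   # only prints the commands
```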

mchades pushed a commit that referenced this issue Mar 14, 2024

### What changes were proposed in this pull request?

Allow the hive user to access MySQL in the gravitino-ci-hive container.

### Why are the changes needed?

Fix: #2379 

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

existing tests