[#198] test: Add Catalog-hive e2e integration test #308
Conversation
```yaml
env:
  HIVE2_IMAGE_NAME: datastrato/hive2
  HIVE2_IMAGE_VERSION: 0.1.0
  HIVE2_IMAGE_LATEST: latest
```
What is the usage of `HIVE2_IMAGE_VERSION` and `HIVE2_IMAGE_LATEST`?
They are used in many places.
My question is that `HIVE2_IMAGE_VERSION` does not seem to be used in this script. Also, `HIVE2_IMAGE_VERSION` and `HIVE2_IMAGE_LATEST` both point to version information; do we need to unify them?
Just a reminder that with ASF projects you can only publish releases to Docker Hub and can't refer to `latest`. If we go with publishing `latest`, we'll need to change that when we enter the ASF incubator.
> Just a reminder that with ASF projects you can only publish releases to Docker Hub and can't refer to `latest`. If we go with publishing `latest`, we'll need to change that when we enter the ASF incubator.
@justinmclean Thank you for the reminder. I have modified this code.
```java
// The specific Hadoop user name that will be used when accessing the Hadoop
// Distributed File System (HDFS).
if (conf.containsKey(HiveCatalogConfig.HADOOP_USER_NAME.getKey())) {
  System.setProperty("HADOOP_USER_NAME", conf.get(HiveCatalogConfig.HADOOP_USER_NAME.getKey()));
}
```
Why do we need this?
It sets the specific Hadoop user name that will be used when accessing the Hadoop Distributed File System (HDFS).

I think using the `HADOOP_USER_NAME` environment variable is not good, because users need to use `HiveCatalogOperations.java` against many different Hive clusters. We plan to add a Graviton User Account System to manage users and groups in the future.
Is it possible to remove this configuration and bypass this issue some other way? I don't think using a configuration here is a good idea.
Is it possible that we set this property in the test?
Modified to set the `HADOOP_USER_NAME` environment variable in the `integration-test` module.
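The resolution discussed above (setting the Hadoop user from test setup code instead of the catalog configuration) could look roughly like the following sketch. This is a hypothetical illustration, not the actual `CatalogHiveIT` code; the class name and the `"anonymous"` user are placeholders.

```java
// Hypothetical sketch: setting HADOOP_USER_NAME from test setup code
// (e.g. a JUnit @BeforeAll method) rather than reading it from the
// catalog configuration, as suggested in the review.
public class HadoopUserSetup {
  static void setHadoopUser(String user) {
    // Hadoop's HDFS client reads this system property (or the environment
    // variable of the same name) to pick the user for simple authentication.
    System.setProperty("HADOOP_USER_NAME", user);
  }

  public static void main(String[] args) {
    setHadoopUser("anonymous"); // placeholder user name
    System.out.println(System.getProperty("HADOOP_USER_NAME"));
  }
}
```

Keeping this in the test module means production catalog code no longer needs a user-name configuration key.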
Resolved discussions on:
- integration-test/src/test/java/com/datastrato/graviton/integration/e2e/CatalogHiveIT.java (outdated)
- integration-test/src/test/java/com/datastrato/graviton/integration/e2e/CatalogHiveIT.java
- integration-test/src/test/java/com/datastrato/graviton/integration/e2e/MetalakeIT.java (outdated)
```java
// TODO (xun) enable this test after [#316] [Bug report] alterSchema throws
// NoSuchSchemaException is fixed
// @Order(3)
// @Test
public void testAlterSchema() throws TException, InterruptedException {
  NameIdentifier ident = NameIdentifier.of(metalakeName, catalogName, schemaName);
  Map<String, String> properties = Maps.newHashMap();
  properties.put("key1", "val1");
  properties.put("key2", "val2");
  String comment = "comment";

  GravitonMetaLake metalake = client.loadMetalake(NameIdentifier.of(metalakeName));
  Catalog catalog = metalake.loadCatalog(NameIdentifier.of(metalakeName, catalogName));
  catalog
      .asSchemas()
      .alterSchema(
          ident,
          SchemaChange.removeProperty("key1"),
          SchemaChange.setProperty("key2", "val2-alter"));

  NameIdentifier[] nameIdentifiers = catalog.asSchemas().listSchemas(ident.namespace());

  Map<String, String> properties2 = catalog.asSchemas().loadSchema(ident).properties();
  Assertions.assertFalse(properties2.containsKey("key1"));
  Assertions.assertEquals("val2-alter", properties2.get("key2"));

  Database database = hiveClientPool.run(client -> client.getDatabase(schemaName));
  Map<String, String> properties3 = database.getParameters();
  Assertions.assertFalse(properties3.containsKey("key1"));
  Assertions.assertEquals("val2-alter", properties3.get("key2"));
}
```
`testAlterSchema()` throws an exception. I created #316 [Bug report] alterSchema throws NoSuchSchemaException to track this bug, and temporarily disabled this test case.
We can debug it together tomorrow.
```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1

- name: Build the hive2 Docker image for AMD64
```
AMD64?
Most of the Docker Official Images on Docker Hub provide a variety of architectures. For example, the busybox image supports amd64, arm32v5, arm32v6, arm32v7, arm64v8, i386, ppc64le, and s390x. When running this image on an x86_64 / amd64 machine, the amd64 variant is pulled and run.
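A side note on the architecture names in the quoted Docker documentation: JVM code can detect the host architecture via the `os.arch` system property, whose values (`x86_64`, `aarch64`, ...) differ from Docker's platform names (`amd64`, `arm64`). The helper below is a hypothetical illustration of that mapping, not part of this PR.

```java
// Hypothetical helper mapping JVM os.arch values to Docker platform
// architecture names (amd64 / arm64), illustrating the naming difference
// discussed above.
public class ArchNames {
  static String toDockerArch(String osArch) {
    switch (osArch) {
      case "x86_64":
      case "amd64":
        return "amd64";
      case "aarch64":
      case "arm64":
        return "arm64";
      default:
        return osArch; // pass unknown values through unchanged
    }
  }

  public static void main(String[] args) {
    // Print the Docker-style name for the current JVM's architecture.
    System.out.println(toDockerArch(System.getProperty("os.arch")));
  }
}
```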
@jerryshao I fixed the problems raised in the comments. Please help review this PR, thanks.
### What changes were proposed in this pull request?

1. Add a Catalog-hive integration test in the `integration-test` module.
2. Run the integration test on the GitHub Actions AMD64 architecture (ARM64 support will follow in the next PR).
3. Add some documentation.
4. Add a `build docker image` label in the PR; it enables/disables rebuilding the Docker image in the pull request CI step.

### Why are the changes needed?

To launch the Graviton server process on the local host or in the GitHub CI environment. The Graviton client connects to the Graviton server and the Hive Docker container to test Catalog-hive.

Fix: #198

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

Tested in GitHub Actions.