[SPARK-3826][SQL]enable hive-thriftserver to support hive-0.13.1 #2685

Closed
wants to merge 22 commits into from

Conversation

scwf
Contributor

@scwf scwf commented Oct 7, 2014

In #2241, hive-thriftserver is not enabled. This patch enables hive-thriftserver to support hive-0.13.1 by using a shim layer, following the approach in #2241.

1 A light shim layer (code in sql/hive-thriftserver/hive-version) for each Hive version to handle API compatibility.

2 New pom profiles "hive-default" and "hive-versions" (copied from #2241) to activate different Hive versions.

3 SBT commands for the different versions are as follows:
hive-0.12.0 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.12.0 assembly
hive-0.13.1 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.13.1 assembly

4 Since hive-thriftserver depends on the hive subproject, this patch should be merged together with #2241 to enable hive-0.13.1 for hive-thriftserver.
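
The per-version commands in item 3 could be wrapped in a small helper. This is only a minimal sketch and not part of the patch: the function name `hive_assembly_cmd` is hypothetical, the profile names are taken from the description above, and the helper prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): print the sbt assembly command
# for one of the two Hive versions named in the PR description.
hive_assembly_cmd() {
  case "$1" in
    0.12.0|0.13.1)
      printf 'sbt/sbt -Phive,hadoop-2.3 -Phive-%s assembly\n' "$1"
      ;;
    *)
      # Any other version is rejected; only the two versions above are covered here.
      printf 'unsupported Hive version: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

hive_assembly_cmd 0.13.1
```

Printing the command keeps the sketch side-effect free; piping the output to `sh` from the root of a Spark checkout would run the actual build.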

@AmplabJenkins

Can one of the admins verify this patch?

@SparkQA

SparkQA commented Oct 7, 2014

QA tests have started for PR 2685 at commit 3a08b14.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 7, 2014

QA tests have finished for PR 2685 at commit 3a08b14.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Can one of the admins verify this patch?

@scwf
Contributor Author

scwf commented Oct 13, 2014

@pwendell, I am resolving the conflicts. Are there any other TODOs here?

<activeByDefault>false</activeByDefault>
</activation>
<modules>
<module>sql/hive-thriftserver</module>
Contributor

After hive-0.13.1 is committed, sql/hive-thriftserver can be moved to the top level instead of being listed here.

Contributor Author

You mean move it to the <modules> section higher up in this pom?

Contributor

Yes. Either move it to modules, or move it to the hive profile, since it is supported in both versions.

@marmbrus
Contributor

Now that Hive 13 is merged in, it would be great to get this in ASAP. I looked over this and it seems pretty good. My only high-level comment is that maybe we should keep all the Hive shim code in a single project instead of having version-specific code in both hive and hive-thriftserver. That would simplify the build and consolidate the places where we have these hacks. It would also allow us to avoid duplicating things like getCommandProcessor in both shims. Thoughts?

@pwendell can you glance over the (limited) build changes.
@liancheng can you look this over as well?

@@ -167,7 +167,7 @@ CURRENT_BLOCK=$BLOCK_SPARK_UNIT_TESTS
# If the Spark SQL tests are enabled, run the tests with the Hive profiles enabled.
# This must be a single argument, as it is.
if [ -n "$_RUN_SQL_TESTS" ]; then
-SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive"
+SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-0.12.0"
Contributor

Is this change necessary? It seems like it might be good to leave it how it is now.

Contributor

Yeah agreed. If Hive13 is the default we should test Hive13.

@pwendell
Contributor

Some minor comments on the build stuff. I think we're close, it's just small stuff.

asfgit pushed a commit that referenced this pull request Oct 26, 2014
The Thrift server is not available in the default (hive13) profile yet, which is breaking all SQL-only PRs. This turns off these tests until #2685 is merged.

Author: Michael Armbrust <[email protected]>

Closes #2950 from marmbrus/fixTests and squashes the following commits:

1a6dfee [Michael Armbrust] [HOTFIX][SQL] Temporarily turn of hive-server tests.
@scwf
Contributor Author

scwf commented Oct 27, 2014

@marmbrus, how about creating a new subproject named hive-shim to hold all the Hive shim code?

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22591/
Test PASSed.

@liancheng
Contributor

Yay! It passed!

@scwf
Contributor Author

scwf commented Oct 31, 2014

So will you re-publish the hive 0.13 jar, @pwendell, or use 0.13.1a?

@pwendell
Contributor

It is not possible to mutate published artifacts after the fact. Let's stick with 0.13.1a - I will fully release it now. But don't remove the extra repository yet (we can remove it later), because the release takes some time to propagate.

echo -e "q\n" \
-| sbt/sbt $BUILD_MVN_PROFILE_ARGS clean package assembly/assembly \
+| sbt/sbt $BUILD_MVN_PROFILE_ARGS clean hive/compile hive-thriftserver/compile \
Contributor

I actually don't think this is compiling against Hive 0.12 right now... is it?

Contributor Author

I think it's against 0.12 because BUILD_MVN_PROFILE_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-0.12.0", right?

Contributor

It should be, BUILD_MVN_PROFILE_ARGS is defined above with -Phive-0.12.0:

  BUILD_MVN_PROFILE_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-0.12.0"
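
The question in this thread (does the argument string really carry the Hive 0.12 profile?) could be checked mechanically. A minimal sketch, assuming the variable names quoted above: the helper `has_profile` is hypothetical and not part of the Spark scripts, and the starting value of `SBT_MAVEN_PROFILES_ARGS` is a stand-in for whatever the surrounding script set earlier.

```shell
#!/bin/sh
# Hypothetical check (not in dev/run-tests): succeed if a space-separated
# argument string contains the flag -P<profile> as a separate word.
has_profile() {
  # $1: argument string, $2: profile id without the -P prefix
  case " $1 " in
    *" -P$2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Stand-in value (assumption); the real script derives this from its environment.
SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3"
BUILD_MVN_PROFILE_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-0.12.0"

if has_profile "$BUILD_MVN_PROFILE_ARGS" "hive-0.12.0"; then
  echo "compiling against Hive 0.12.0"
fi
```

The match is on whole words, so `-Phive` does not accidentally satisfy a check for `hive-0.12.0`. Comma-joined forms such as `-Phive,hadoop-2.3` are not handled by this sketch.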

@zhzhan
Contributor

zhzhan commented Oct 31, 2014

whew, finally.

@SparkQA

SparkQA commented Oct 31, 2014

Test build #22602 has started for PR 2685 at commit f26f3be.

  • This patch merges cleanly.

@liancheng
Contributor

@zhzhan Had a glance at the Kryo issue you pointed out, it should be related to the POM inconsistency problem I mentioned above, but I'm not sure whether they are identical since this issue doesn't mention any specific Kryo version. I'll investigate this later. Thanks for pointing out this issue!

Although Jenkins finally nods, I'm still puzzled by the root cause of the original build failure. We only know that introducing Kryo 2.22 prevents un-shaded Objenesis classes from being included in the assembly jar and thus breaks the core tests, but how and why? Also, is it 100% safe to downgrade Kryo 2.22 to Kryo 2.21 for Hive 0.13.1?
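
One way to confirm the symptom described here is to look for un-shaded Objenesis classes in the assembly jar listing. This is a rough sketch, not something from the PR: the function name `has_unshaded_objenesis` is hypothetical, and the jar path in the usage comment is an assumption about the build layout.

```shell
#!/bin/sh
# Hypothetical check: read a jar listing on stdin (one entry per line, as
# printed by `jar tf`) and succeed if it contains class files whose path
# starts at org/objenesis/, i.e. un-shaded Objenesis classes.
has_unshaded_objenesis() {
  grep -q '^org/objenesis/.*\.class$'
}

# Usage against a real build (jar path is an assumption, adjust to your checkout):
#   jar tf assembly/target/scala-2.10/spark-assembly-*.jar | has_unshaded_objenesis \
#     && echo "un-shaded Objenesis present" || echo "un-shaded Objenesis missing"
```

Anchoring the pattern at the start of the entry means relocated copies under another package prefix do not count, which matches the distinction between shaded and un-shaded classes drawn above.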

@SparkQA

SparkQA commented Oct 31, 2014

Test build #22602 has finished for PR 2685 at commit f26f3be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22602/
Test PASSed.

@marmbrus
Contributor

Thanks guys for all your hard work on this! Merging to master.

@asfgit asfgit closed this in 7c41d13 Oct 31, 2014
@coderfi
Contributor

coderfi commented Nov 2, 2014

In /pom.xml

<properties>
    <hive.version>0.13.1a</hive.version>
</properties>
...
<profile>
  <id>hive-0.13.1</id>
  ...
  <properties>
    <hive.version>0.13.1</hive.version>
    ...
  </properties>
</profile>

The 'hive-0.13.1' profile overrides hive.version back to the regular '0.13.1' version instead of referencing '0.13.1a'.

It seems mvn -Phive would get me a build with hive 0.13.1a containing this pull request's patches.

However, I was building with mvn -Phive -Phive-0.13.1, so I ran into the 'NoClassDefFoundError ... InstantiatorStrategy' issue.
Omitting the 'hive-0.13.1' profile got me past the issue (since the pom defaults effectively give me hive 0.13.1a anyway, which is what I want).
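
The override described above can be pictured with a toy model. This is not Maven itself, only a sketch of the behaviour reported in this comment: the function name `effective_hive_version` is hypothetical, and the only values modelled are the default and the `hive-0.13.1` profile value quoted from the pom.

```shell
#!/bin/sh
# Toy model (not Maven): start from the top-level <properties> default and let
# each active profile override hive.version, as reported in the comment above.
effective_hive_version() {
  version="0.13.1a"                      # default from the top-level <properties>
  for profile in "$@"; do
    case "$profile" in
      hive-0.13.1) version="0.13.1" ;;   # value set inside the profile before the fix
    esac
  done
  printf '%s\n' "$version"
}

effective_hive_version hive               # prints 0.13.1a
effective_hive_version hive hive-0.13.1   # prints 0.13.1
```

On a real checkout, the Maven help plugin's `help:evaluate` goal (for example `mvn help:evaluate -Dexpression=hive.version -Phive -Phive-0.13.1`) should report the effective value, which is a more direct way to confirm which version a given set of profiles selects.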

@liancheng
Contributor

@coderfi This is a good catch. Would you mind filing a JIRA ticket for this? A PR would be even better :)

@scwf scwf deleted the shim-thriftserver1 branch November 3, 2014 04:04
@scwf
Contributor Author

scwf commented Nov 3, 2014

Yes, it should be 0.13.1a here.

@pwendell
Contributor

pwendell commented Nov 3, 2014

Yes - we just need to change that to 0.13.1a


@scwf
Contributor Author

scwf commented Nov 3, 2014

And note: only change hive.version to 0.13.1a; hive.version.short should remain 0.13.1.

@coderfi
Contributor

coderfi commented Nov 3, 2014

@liancheng PR #3072 created (all of one line! :) ).

asfgit pushed a commit that referenced this pull request Nov 3, 2014
instead of `hive.version=0.13.1`.
e.g. mvn -Phive -Phive=0.13.1

Note: `hive.version=0.13.1a` is the default property value. However, when explicitly specifying the `hive-0.13.1` maven profile, the wrong one would be selected.
References:  PR #2685, which resolved a package incompatibility issue with Hive-0.13.1 by introducing a special version Hive-0.13.1a

Author: fi <[email protected]>

Closes #3072 from coderfi/master and squashes the following commits:

7ca4b1e [fi] Fixes the `hive-0.13.1` maven profile referencing `hive.version=0.13.1` instead of the Spark compatible `hive.version=0.13.1a` Note: `hive.version=0.13.1a` is the default version. However, when explicitly specifying the `hive-0.13.1` maven profile, the wrong one would be selected. e.g. mvn -Phive -Phive=0.13.1 See PR #2685

(cherry picked from commit df607da)
Signed-off-by: Michael Armbrust <[email protected]>
10 participants