-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPARK-6433 hive tests to import spark-sql test JAR for QueryTest access #5119
SPARK-6433 hive tests to import spark-sql test JAR for QueryTest access #5119
Conversation
…l & spark-catalyst for its test runs
Can one of the admins verify this patch? |
@@ -158,6 +158,7 @@ | |||
<fasterxml.jackson.version>2.4.4</fasterxml.jackson.version> | |||
<snappy.version>1.1.1.6</snappy.version> | |||
<netlib.java.version>1.1.2</netlib.java.version> | |||
<maven-jar-plugin.version>2.6</maven-jar-plugin.version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, I don't think this needs to be factored out into a property as I don't think it would be overridden. It just occurs one place too. The plugin versions are just written once in the parent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lifted it from the hadoop code. the parent one is @ v 2.4, so it comes down to whether you are happy with what that parent gives you or not. Easy to alter.
-Updated patch with the indentation corrected; plugin version entrusted to the apache parent template |
<plugin> | ||
<groupId>org.apache.maven.plugins</groupId> | ||
<artifactId>maven-jar-plugin</artifactId> | ||
<configuration> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I still wasn't sure what this config was for, what does it do?
+1 from the SQL side. I only duplicated that code because I was afraid of the maven changes to avoid doing as such. I'll defer to Sean's maven expertise on when this is ready to be merged. |
As an experiment, I changed the plugin declaration to exclude the main JAR phase, that is, commented out this bit: <execution>
<id>prepare-jar</id>
<phase>prepare-package</phase>
<goals>
<goal>jar</goal>
</goals>
</execution> Running {{mvn help:effective-pom}} showed that the entire {{}} section had been overridden, with the result that the the main build didn't work: jars weren't being published. <plugin>
<artifactId>maven-jar-plugin</artifactId>
<version>2.4</version>
<executions>
<execution>
<id>prepare-test-jar</id>
<phase>prepare-package</phase>
<goals>
<goal>test-jar</goal>
</goals>
<configuration>
<excludes>
<exclude>log4j.properties</exclude>
</excludes>
<archive>
<manifestEntries>
<mode>development</mode>
<url>http://spark.apache.org/</url>
</manifestEntries>
<manifest>
<addDefaultSpecificationEntries>true</addDefaultSpecificationEntries>
<addDefaultImplementationEntries>true</addDefaultImplementationEntries>
</manifest>
</archive>
</configuration>
</execution>
</executions> To summarise: yes, you do need to re-declare parent executions when adding them. This is not just superstition. |
Hm, so I tried changing the config to:
The effective POM doesn't show a default |
I did a clean build and it didn't work, at least not with the command. I can update this pull with a version without the target & see what Jenkins says. |
Hm, could be a Maven version thing; I'm on 3.2. Are you looking at |
I commented it out again and did work, so I am now concluding I trust maven even less than before. Pushed a new commit with the JAR execution omitted. |
Jenkins, add to whitelist |
ok to test |
Looking good except I still don't think the |
Test build #29341 has started for PR 5119 at commit
|
Test build #29341 has finished for PR 5119 at commit
|
Test PASSed. |
</goals> | ||
<configuration> | ||
<excludes> | ||
<exclude>log4j.properties</exclude> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just being thorough here at the end -- so the idea here is that the tests shouldn't 'publish' their logging config right? I think that will only affect the log4j.properties
in the test dirs, yeah. As long as it doesn't remove the one actually used, in the core
source tree, in the final assembly.
It leaves out the |
To clarify something: do you expect there to be a {{log4j.properties}} file in the slider assembly JAR? because there isn't one, not in trunk@ 0e2753f : $ mvn package -DskipTests -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Dhbase.profile=hadoop2 -Phive
$ pushd assembly/target/scala-2.10/
$ unzip spark-assembly-1.4.0-SNAPSHOT-hadoop2.6.0.jar
$ find . -name log4j.properties
$ Looking in the source tree, there are only such files in the test resources, other than one in $ find . -name log4j.properties
./bagel/src/test/resources/log4j.properties
./bagel/target/scala-2.10/test-classes/log4j.properties
./core/src/test/resources/log4j.properties
./core/target/scala-2.10/test-classes/log4j.properties
./external/flume/src/test/resources/log4j.properties
./external/flume/target/scala-2.10/test-classes/log4j.properties
./external/flume-sink/src/test/resources/log4j.properties
./external/flume-sink/target/scala-2.10/test-classes/log4j.properties
./external/kafka/src/test/resources/log4j.properties
./external/kafka/target/scala-2.10/test-classes/log4j.properties
./external/mqtt/src/test/resources/log4j.properties
./external/mqtt/target/scala-2.10/test-classes/log4j.properties
./external/twitter/src/test/resources/log4j.properties
./external/twitter/target/scala-2.10/test-classes/log4j.properties
./external/zeromq/src/test/resources/log4j.properties
./external/zeromq/target/scala-2.10/test-classes/log4j.properties
./extras/java8-tests/src/test/resources/log4j.properties
./extras/kinesis-asl/src/main/resources/log4j.properties
./extras/kinesis-asl/src/test/resources/log4j.properties
./graphx/src/test/resources/log4j.properties
./graphx/target/scala-2.10/test-classes/log4j.properties
./launcher/src/test/resources/log4j.properties
./launcher/target/scala-2.10/test-classes/log4j.properties
./mllib/src/test/resources/log4j.properties
./mllib/target/scala-2.10/test-classes/log4j.properties
./repl/src/test/resources/log4j.properties
./repl/target/scala-2.10/test-classes/log4j.properties
./sql/catalyst/src/test/resources/log4j.properties
./sql/catalyst/target/scala-2.10/test-classes/log4j.properties
./sql/core/src/test/resources/log4j.properties
./sql/core/target/scala-2.10/test-classes/log4j.properties
./sql/hive/src/test/resources/log4j.properties
./sql/hive/target/scala-2.10/test-classes/log4j.properties
./streaming/src/test/resources/log4j.properties
./streaming/target/scala-2.10/test-classes/log4j.properties
./yarn/src/test/resources/log4j.properties
./yarn/target/scala-2.10/test-classes/log4j.properties This pull request will make sure that none of the test log4j files will make it onto the classpath of any downstream tests importing the test JARs. I manually tested this by
|
Yes that all looks correct with regard to |
I'm ready to merge this but still haven't heard anything on the |
Test build #29488 has started for PR 5119 at commit |
Test FAILed. |
Jenkins, retest this please. (Timeout, but doesn't look like any stuck test; it was almost done with Python tests when it stopped.) |
Test build #29498 has started for PR 5119 at commit |
Test build #29498 has finished for PR 5119 at commit
|
Test PASSed. |
I like this. Let me give @pwendell a day or so to comment since it touches the build but I think this is a good change. |
@@ -1472,6 +1473,25 @@ | |||
<groupId>org.scalatest</groupId> | |||
<artifactId>scalatest-maven-plugin</artifactId> | |||
</plugin> | |||
<!-- Build the JARs--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this say something more useful? Maybe like:
<!-- Build test-jar's for all projects, since some projects depend on tests from others -->
Otherwise it might be hard for someone to understand what is going on here.
The overall approach LGTM, but I would suggest adding a better comment since it's non-obvious what is going on. |
OK |
Test build #29538 has started for PR 5119 at commit |
Test FAILed. |
Jenkins, retest this please. (Timeout again.) |
Test build #29544 has started for PR 5119 at commit |
Test build #29544 has finished for PR 5119 at commit
|
Test PASSed. |
Outstanding issues:
spark-streaming-test.jar
does, but given these are test jars for developers only, especially in the same spark source tree, it's hard to justify.maven-jar-plugin
v 2.6 was explicitly selected; without this the apache-1.4 parent template JAR version (2.4) chosen.