Fix SPARK-1413: Parquet messes up stdout and stdin when used in Spark REPL #325
Conversation
Can one of the admins verify this patch?
ok to test
@@ -81,11 +81,14 @@ private[sql] case class ParquetRelation(val path: String)

private[sql] object ParquetRelation {

  var isEnableLogForwarding = false
We should probably just call this enabledLogForwarding. Thanks for the fix though!
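For context, a minimal sketch of how such a guard flag is typically used, so that the forwarding setup runs at most once no matter how many ParquetRelation instances are created (the object and helper names here are illustrative, not the actual patch):

object LogForwardingGuardSketch {
  // set to true once forwarding has been configured
  private var isEnableLogForwarding = false

  def enableLogForwarding(): Unit = synchronized {
    if (!isEnableLogForwarding) {
      configureParquetLogging() // hypothetical helper doing the actual JUL work
      isEnableLogForwarding = true
    }
  }

  // Placeholder for the real setup; see the handler discussion further down.
  private def configureParquetLogging(): Unit = ()
}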
Jenkins, test this please
Merged build triggered.
Merged build started.
There is a problem, but I do not know what the cause is:
=>
@AndreSchumacher, any thoughts?
Ouch. OK, I'll have a look at this. The default j.u.l. logger writes to stdout. Maybe the log forwarding doesn't work as intended.
Merged build finished. All automated tests passed.
All automated tests passed.
The class parquet.Log has a static block (it adds a default handler in case there is none).
@witgo that alone would not explain a deadlock if the forwarding is initialized multiple times, would it? Unless maybe the log forwarding is called before parquet.Log is fully initialized, I guess. Maybe that is why it did not show up in the tests that don't use the console. Great that you found it, and thanks a lot for the fix. I tried it and it works fine here. For completeness, let's run the Travis checks once more. Could you still remove my now misleading comment about the parent loggers?
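To illustrate the static-initialization pattern @witgo describes: in Scala, an object body runs once when the object is first touched, much like a Java static block. The sketch below is an illustration of that pattern, not parquet's actual source; it shows the kind of one-time setup that installs a default stdout handler when none is configured.

import java.util.logging.{Logger, SimpleFormatter, StreamHandler}

object StaticLogSetupSketch {
  private val logger = Logger.getLogger("parquet")
  // If no handler has been configured yet, install a default one that
  // writes to stdout; that is the behaviour that clobbers the REPL prompt.
  if (logger.getHandlers.isEmpty) {
    logger.addHandler(new StreamHandler(System.out, new SimpleFormatter()))
  }
}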
So is the main problem here that Parquet should not use a static block like this to do initialization? If so, @witgo, do you mind documenting this in the code to explain that it's a workaround? Also, should we reach out to the Parquet team and @julienledem about removing this static initialization? It seems this initialization code breaks the use case where JUL is being routed to SLF4J via a bridge. Or maybe it's sensitive to the order in which the SLF4J bridge and the Parquet static initializer run?
Btw @julienledem, in Spark we also add some default initialization, but we do it lazily instead of statically, because this allows re-routing packages to statically initialize before we try to detect whether any logging has been configured:
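A simplified sketch of that lazy, double-checked initialization pattern, based on the initialized flag and initLock visible in the Logging diff further down; the actual configuration-detection logic is only a placeholder here:

object LazyLoggingInitSketch {
  @volatile private var initialized = false
  private val initLock = new Object()

  def initializeIfNecessary(): Unit = {
    if (!initialized) {
      initLock.synchronized {
        if (!initialized) {
          // Only now look at the logging setup, after other packages (such as
          // an SLF4J bridge) have had a chance to run their static init.
          initializeLogging()
          initialized = true
        }
      }
    }
  }

  private def initializeLogging(): Unit = {
    // placeholder: detect whether any logging configuration exists and,
    // if not, install a default one
  }
}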
@witgo Good idea to move the forwarding setup out of there, but in the process it seems that edd9630 broke the forwarding again. Also: I think it is confusing to set the log level to null (root level). Parquet hides most log statements behind its pre-compiled log level anyway.
@AndreSchumacher
@witgo yes, it seems the parent handlers need to be set for the forwarding to work. My only remaining suggestion concerns the log level. I still think setting it to anything other than the pre-compiled one would be confusing (see my comment above). @pwendell are you OK with having the initialization code where it is now?
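A sketch of the parent-handler approach being discussed, assuming the relevant logger is named "parquet" (illustrative, not the exact patch); note that the log level is deliberately left at parquet's pre-compiled default:

import java.util.logging.Logger

object ParquetParentHandlerSketch {
  def forwardToParent(): Unit = {
    val parquetLogger = Logger.getLogger("parquet")
    // Drop parquet's own handlers and let records bubble up to the root
    // logger, where an SLF4J bridge may be installed.
    parquetLogger.getHandlers.foreach(parquetLogger.removeHandler)
    parquetLogger.setUseParentHandlers(true)
  }
}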
@@ -135,4 +136,6 @@ trait Logging {

private object Logging {
  @volatile private var initialized = false
  val initLock = new Object()
  // SLF4JBridgeHandler.removeHandlersForRootLogger()
Is this line not actually necessary?
If it turns out we need to directly call SLF4JBridgeHandler, let's change the call to just use reflection and fail gracefully if it's not found on the classpath. It's not pretty, but it will be friendlier to downstream users. And we can drop it if Parquet ditches static initializers.
Specifically, I'd first check if the class can be loaded, then if it can, use java reflection to access and call the class methods.
You are right. I don't understand Scala reflection; could you give an example?
For static methods it looks like this:

try {
  // We use reflection here to handle the case where users remove the
  // jul-to-slf4j bridge in order to route their logs to JUL.
  val bridgeClass = Class.forName("org.slf4j.bridge.SLF4JBridgeHandler")
  bridgeClass.getMethod("removeHandlersForRootLogger").invoke(null)
  val installed = bridgeClass.getMethod("isInstalled").invoke(null).asInstanceOf[Boolean]
  if (!installed) {
    bridgeClass.getMethod("install").invoke(null)
  }
} catch {
  case e: Exception => // can't log anything yet, so just fail silently
}
@witgo I gave an example of using reflection - mind adjusting your patch and checking if this works?
@pwendell But this only solves one problem. The Spark dependency set is fixed, so we can only use log4j (see the SparkBuild.scala file). Should the logging system become configurable?
No, we don't need to change the dependencies. The case I wanted to handle is when downstream users remove the dependencies (e.g. they remove "jul-to-slf4j") and do a custom Spark build. This is something people actually do.
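For reference, an illustrative build.sbt fragment showing the kind of dependency declarations involved (not the exact SparkBuild.scala contents; versions are only examples). Removing "jul-to-slf4j" from such a list is the downstream scenario described above:

libraryDependencies ++= Seq(
  "org.slf4j" % "slf4j-api"     % "1.7.5",
  "org.slf4j" % "slf4j-log4j12" % "1.7.5",  // binds SLF4J to log4j
  "org.slf4j" % "jul-to-slf4j"  % "1.7.5",  // the JUL-to-SLF4J bridge discussed here
  "log4j"     % "log4j"         % "1.2.17"
)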
Jenkins, test this please.
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13996/
Jenkins, test this please.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
I'm seeing some Akka-related problems in my local build, but nothing that seems related to this PR. The log forwarding seems to work. @pwendell thanks a lot for the help on this. In the interest of getting saveAsParquetFile working inside the console, I think this should be merged soon.
@AndreSchumacher it's likely related to an accidental merge of a buggy patch into master... try rebasing onto the latest master, which includes a reversion of the bug.
Okay thanks, merged!
… REPL

Author: witgo <[email protected]>

Closes apache#325 from witgo/SPARK-1413 and squashes the following commits:

e57cd8e [witgo] use scala reflection to access and call the SLF4JBridgeHandler methods
45c8f40 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
5e35d87 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
0d5f819 [witgo] review commit
45e5b70 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
fa69dcf [witgo] Merge branch 'master' into SPARK-1413
3c98dc4 [witgo] Merge branch 'master' into SPARK-1413
38160cb [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
ba09bcd [witgo] remove set the parquet log level
a63d574 [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
5231ecd [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
3feb635 [witgo] parquet logger use parent handler
fa00d5d [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
8bb6ffd [witgo] enableLogForwarding note fix
edd9630 [witgo] move to
f447f50 [witgo] merging master
5ad52bd [witgo] Merge branch 'master' of https://github.com/apache/spark into SPARK-1413
76670c1 [witgo] review commit
70f3c64 [witgo] Fix SPARK-1413