
Static method collectAndServe failed for class org.apache.spark.api.python.PythonRDD #698

k2moz opened this issue Nov 23, 2018 · 2 comments



k2moz commented Nov 23, 2018

Please help me. What's wrong?

I can't call Reduce in the "Pi Example".

I followed this guide step by step: https://www.ics.uci.edu/~shantas/Install_Spark_on_Windows10.pdf

## List of my components

  • Windows10
  • JDK:8u60
  • Spark: spark-2.0.2-bin-hadoop2.6
  • Win Utils: unzipped everything into %SPARK_HOME%/bin
  • Mobius: v2.0.200

## All environment variables have been registered

Run in Visual Studio

  1. I changed the Java heap size settings in the sparkclr-submit.cmd file:

...
:debugmode
"%JAVA_HOME%\bin\java" -Xms512m -Xmx1024m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.deploy.csharp.CSharpRunner debug
goto :eof
...
  2. Called sparkclr-submit.cmd debug from the command line.
  3. Loaded the Visual Studio examples and ran the "Pi" project with these settings:

<add key="CSharpBackendPortNumber" value="5567"/>
<add key="CSharpWorkerPath" value="C:/Mobius/runtime/bin/CSharpWorker.exe"/>

and the SparkContext configuration: var _conf = new SparkConf().Set("spark.local.dir", "C:\\tmp\\SparkCLRTemp"); (a fuller driver-setup sketch is below)
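
For context, here is roughly how the driver side is wired up with these settings. This is only a simplified sketch following the structure of the Mobius Pi sample; the class name, slice count, and the helper stub below are illustrative, not copied from the actual sample code.

using System.Linq;
using Microsoft.Spark.CSharp.Core;

class PiDriverSketch
{
    static void Main(string[] args)
    {
        // In debug mode Mobius connects to the CSharpBackend started by
        // "sparkclr-submit.cmd debug"; spark.master and the app name fall back to local defaults.
        var conf = new SparkConf()
            .Set("spark.local.dir", @"C:\tmp\SparkCLRTemp");
        var sc = new SparkContext(conf);

        const int slices = 3;
        const int n = 100000 * slices + 1;                         // 300001 items, matching the log below
        var rdd = sc.Parallelize(Enumerable.Range(0, n), slices);  // 3 partitions

        CalculatePiUsingAnonymousMethod(n, rdd);
    }

    // The failing method; its body is shown further down.
    private static void CalculatePiUsingAnonymousMethod(int n, RDD<int> rdd) { }
}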

The program crashes on the Reduce call in CalculatePiUsingAnonymousMethod:

private static void CalculatePiUsingAnonymousMethod(int n, RDD<int> rdd)
{
    // Map each element to 1 if a random point lands inside the unit circle, 0 otherwise.
    var _preCount = rdd
        .Map(i =>
        {
            var random = new Random();
            var x = random.NextDouble() * 2 - 1;
            var y = random.NextDouble() * 2 - 1;

            return (x * x + y * y) < 1 ? 1 : 0;
        });

    // Sum the hits; this is the action whose collectAndServe call fails (see the log below).
    var _count = _preCount.Reduce((a, b) => a + b);
}

And it shows this:

[2018-11-23 15:18:20,447] [1] [INFO ] [Microsoft.Spark.CSharp.Configuration.ConfigurationService] - ConfigurationService runMode is DEBUG
[2018-11-23 15:18:20,456] [1] [INFO ] [Microsoft.Spark.CSharp.Configuration.ConfigurationService+SparkCLRDebugConfiguration] - CSharpBackend port number read from app config 5567. Using it to connect to CSharpBackend
[2018-11-23 15:18:20,456] [1] [INFO ] [Microsoft.Spark.CSharp.Proxy.Ipc.SparkCLRIpcProxy] - CSharpBackend port number to be used in JvMBridge is 5567
[2018-11-23 15:18:20,545] [1] [INFO ] [Microsoft.Spark.CSharp.Core.SparkConf] - spark.master not set. Assuming debug mode.
[2018-11-23 15:18:20,548] [1] [INFO ] [Microsoft.Spark.CSharp.Core.SparkConf] - Spark master set to local
[2018-11-23 15:18:20,550] [1] [INFO ] [Microsoft.Spark.CSharp.Core.SparkConf] - spark.app.name not set. Assuming debug mode
[2018-11-23 15:18:20,551] [1] [INFO ] [Microsoft.Spark.CSharp.Core.SparkConf] - Spark app name set to debug app
[2018-11-23 15:18:20,552] [1] [INFO ] [Microsoft.Spark.CSharp.Core.SparkConf] - Spark configuration key-value set to spark.local.dir=C:\tmp\SparkCLRTemp
[2018-11-23 15:18:22,063] [1] [INFO ] [Microsoft.Spark.CSharp.Core.SparkContext] - Parallelizing 300001 items to form RDD in the cluster with 3 partitions
[2018-11-23 15:18:22,790] [1] [INFO ] [Microsoft.Spark.CSharp.Core.RDD`1[[System.Int32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]] - Executing Map operation on RDD (preservesPartitioning=False)
[2018-11-23 15:20:14,247] [1] [INFO ] [Microsoft.Spark.CSharp.Core.RDD`1[[System.Int32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]] - Executing Reduce operation on RDD
[2018-11-23 15:20:14,258] [1] [INFO ] [Microsoft.Spark.CSharp.Configuration.ConfigurationService+SparkCLRDebugConfiguration] - Worker path read from setting CSharpWorkerPath in app config
[2018-11-23 15:20:15,011] [1] [ERROR] [Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge] - JVM method execution failed: Static method collectAndServe failed for class org.apache.spark.api.python.PythonRDD when called with 1 parameters ([Index=1, Type=JvmObjectReference, Value=14], )
[2018-11-23 15:20:15,011] [1] [ERROR] [Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge] - org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketException: Connection reset by peer: socket write error
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
        at org.apache.spark.api.python.PythonRDD$.org$apache$spark$api$python$PythonRDD$$write$1(PythonRDD.scala:492)
        at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:504)
        at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:504)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
        at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:504)
        at org.apache.spark.api.python.PythonRunner$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:328)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1953)
        at org.apache.spark.api.python.PythonRunner$WriterThread.run(PythonRDD.scala:269)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1873)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1886)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1899)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1913)
        at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
        at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
        at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:453)
        at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.api.csharp.CSharpBackendHandler.handleMethodCall(CSharpBackendHandler.scala:156)
        at org.apache.spark.api.csharp.CSharpBackendHandler.handleBackendRequest(CSharpBackendHandler.scala:106)
        at org.apache.spark.api.csharp.CSharpBackendHandler.channelRead0(CSharpBackendHandler.scala:32)
        at org.apache.spark.api.csharp.CSharpBackendHandler.channelRead0(CSharpBackendHandler.scala:28)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Connection reset by peer: socket write error
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at java.io.DataOutputStream.write(DataOutputStream.java:107)
        at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
        at org.apache.spark.api.python.PythonRDD$.org$apache$spark$api$python$PythonRDD$$write$1(PythonRDD.scala:492)
        at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:504)
        at org.apache.spark.api.python.PythonRDD$$anonfun$writeIteratorToStream$1.apply(PythonRDD.scala:504)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
        at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:504)
        at org.apache.spark.api.python.PythonRunner$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:328)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1953)
        at org.apache.spark.api.python.PythonRunner$WriterThread.run(PythonRDD.scala:269)

[2018-11-23 15:20:15,021] [1] [ERROR] [Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge] - JVM method execution failed: Static method collectAndServe failed for class org.apache.spark.api.python.PythonRDD when called with 1 parameters ([Index=1, Type=JvmObjectReference, Value=14], )
[2018-11-23 15:20:15,022] [1] [ERROR] [Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge] -
*******************************************************************************************************************************
   at Microsoft.Spark.CSharp.Interop.Ipc.JvmBridge.CallJavaMethod(Boolean isStatic, Object classNameOrJvmObjectReference, String methodName, Object[] parameters)
*******************************************************************************************************************************
@sharpeye096

I encountered a similar issue. There is a commit that upgrades it to Spark 2.3.1:
9aa97b9

It looks like a version-related issue; please make sure your local Spark version is consistent with the one on the cluster. I somehow mitigated it with a workaround, but now I am stuck on another issue.


k2moz commented Dec 26, 2018

> I encountered a similar issue. There is a commit that upgrades it to Spark 2.3.1:
> 9aa97b9
>
> It looks like a version-related issue; please make sure your local Spark version is consistent with the one on the cluster. I somehow mitigated it with a workaround, but now I am stuck on another issue.

I could not solve the problem =(

I had to call Python scripts from my C# code in order to use the PySpark API (Spark 2.4.0). The Python scripts run inside the C# code via IronPython (http://ironpython.net/).
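
Roughly, the hosting side looks like the sketch below. This is only an illustration; the script path and the variable names (num_samples, pi_estimate) are placeholders, not my actual code.

using IronPython.Hosting;
using Microsoft.Scripting.Hosting;

class PySparkViaIronPython
{
    static void Main()
    {
        // Host an IronPython engine inside the C# process.
        ScriptEngine engine = Python.CreateEngine();
        ScriptScope scope = engine.CreateScope();

        // Pass parameters from C# into the Python scope.
        scope.SetVariable("num_samples", 300000);

        // Run a Python script (placeholder path) that uses the PySpark API to do the work.
        engine.ExecuteFile(@"C:\Mobius\scripts\pi.py", scope);

        // Read back a value the script stored into the scope (placeholder name).
        double pi = scope.GetVariable<double>("pi_estimate");
        System.Console.WriteLine("Pi is roughly " + pi);
    }
}

The heavy lifting stays in the Python script; the C# side only passes parameters in and reads the result back.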
