java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
#73 · Open · aashishrtyagi opened this issue on Feb 22, 2018 · 1 comment
Hi,
I used the following code, as given in the example, to connect to HBase, but I am getting a ClassCastException.
package com.gs

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import it.nerdammer.spark.hbase._
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.types.StructField
import org.apache.spark.sql.types.StringType
import org.apache.spark.sql.SparkSession

object Test extends App {

  // Schema definition (declared here but not used by the write below).
  object empSchema {
    val stid = StructField("stid", StringType)
    val name = StructField("name", StringType)
    val subject = StructField("subject", StringType)
    val grade = StructField("grade", StringType)
    val city = StructField("city", StringType)
    val struct = StructType(Array(stid, name, subject, grade, city))
  }

  val sparkConf = new SparkConf().setMaster("spark://myhostname:7077").setAppName("TestApp")
  sparkConf.set("spark.hbase.host", "myhostname")
  val sc = new SparkContext(sparkConf)

  // Each tuple: (row key, column1, column2).
  val rdd = sc.parallelize(1 to 100)
    .map(i => (i.toString, i + 1, "Hello"))

  rdd.toHBaseTable("mytable")
    .toColumns("column1", "column2")
    .inColumnFamily("mycf")
    .save()
}
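For what it's worth, if the app is launched directly (for example from an IDE) with setMaster("spark://...") rather than through spark-submit, the application and connector jars are not shipped to the executors automatically, so the workers may be missing, or loading different versions of, the classes the driver serialized. Below is a minimal sketch of shipping them explicitly with SparkConf.setJars; the jar paths are hypothetical and not taken from this issue.

// Sketch only: the jar paths below are assumptions and must point at the real build output.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://myhostname:7077")
  .setAppName("TestApp")
  .set("spark.hbase.host", "myhostname")
  // Ship the application jar and the connector jar to every executor so that
  // task deserialization on the workers sees the same classes as the driver.
  .setJars(Seq(
    "target/scala-2.11/testapp_2.11-0.1.jar",   // hypothetical application jar
    "lib/spark-hbase-connector_2.10-1.0.3.jar"  // connector jar listed in the versions section below
  ))
val sc = new SparkContext(conf)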
============ Exception stack trace =====================
18/02/22 15:45:46 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 1]
18/02/22 15:45:46 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 2, 192.168.224.116, executor 1, partition 0, PROCESS_LOCAL, 4829 bytes)
18/02/22 15:45:46 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 3, 192.168.224.116, executor 1, partition 1, PROCESS_LOCAL, 4886 bytes)
18/02/22 15:45:46 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 2) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 2]
18/02/22 15:45:46 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 4, 192.168.224.116, executor 1, partition 0, PROCESS_LOCAL, 4829 bytes)
18/02/22 15:45:46 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 3) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 3]
18/02/22 15:45:46 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 5, 192.168.224.116, executor 1, partition 1, PROCESS_LOCAL, 4886 bytes)
18/02/22 15:45:46 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 4) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 4]
18/02/22 15:45:46 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 6, 192.168.224.116, executor 1, partition 0, PROCESS_LOCAL, 4829 bytes)
18/02/22 15:45:46 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 5) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 5]
18/02/22 15:45:46 INFO TaskSetManager: Starting task 1.3 in stage 0.0 (TID 7, 192.168.224.116, executor 1, partition 1, PROCESS_LOCAL, 4886 bytes)
18/02/22 15:45:46 INFO TaskSetManager: Lost task 1.3 in stage 0.0 (TID 7) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 6]
18/02/22 15:45:46 ERROR TaskSetManager: Task 1 in stage 0.0 failed 4 times; aborting job
18/02/22 15:45:46 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/02/22 15:45:46 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 6) on 192.168.224.116, executor 1: java.lang.ClassCastException (cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD) [duplicate 7]
18/02/22 15:45:46 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
18/02/22 15:45:46 INFO TaskSchedulerImpl: Cancelling stage 0
18/02/22 15:45:46 INFO DAGScheduler: ResultStage 0 (runJob at SparkHadoopMapReduceWriter.scala:88) failed in 6.777 s due to Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 7, 192.168.224.116, executor 1): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2024)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
18/02/22 15:45:46 INFO DAGScheduler: Job 0 failed: runJob at SparkHadoopMapReduceWriter.scala:88, took 7.208332 s
18/02/22 15:45:46 ERROR SparkHadoopMapReduceWriter: Aborting job job_20180222154539_0002.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 7, 192.168.224.116, executor 1): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2133)
at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1305)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2024)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:80)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
========== Spark and HBase versions ===================
Spark: spark-2.2.1
HBase: hbase-1.2.6
spark-hbase-connector jar: spark-hbase-connector_2.10-1.0.3.jar
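One thing that stands out in these versions: the connector artifact carries a _2.10 (Scala 2.10) suffix, while the prebuilt Spark 2.2.1 distributions are built against Scala 2.11, and this particular ClassCastException during task deserialization is commonly associated with mismatched Scala or Spark versions between the application classpath and the cluster. Below is an illustrative build.sbt sketch that keeps the binary versions aligned; it assumes sbt is used and that a connector build matching the cluster's Scala version is published, so treat the coordinates as assumptions rather than a confirmed fix.

// Illustrative sketch, not a confirmed fix: coordinates and versions are assumptions.
scalaVersion := "2.11.8"  // should match the Scala binary version of the Spark build on the cluster

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.2.1" % "provided",
  // %% appends the _2.11 suffix; this assumes a matching connector build is available.
  "it.nerdammer.bigdata" %% "spark-hbase-connector" % "1.0.3"
)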