Hi,
Thanks for this repo.
I'm trying to run SparkBWA on an Amazon EMR YARN cluster, but I ran into several errors.
I passed `--master yarn` instead of `yarn-cluster`, and added `--deploy-mode cluster`, as shown in the command below.
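For reference, this is the exact spark-submit invocation (the same command that appears at the top of the log further down), just broken across lines for readability:

```bash
spark-submit \
  --class com.github.sparkbwa.SparkBWA \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 1500m \
  --executor-memory 10g \
  --executor-cores 1 \
  --verbose \
  --num-executors 16 \
  sparkbwa-1.0.jar \
  -m -r -p \
  --index /Data/HumanBase/hg38 \
  -n 16 \
  -w "-R @RG\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589" \
  ERR000589_1.filt.fastq ERR000589_2.filt.fastq Output_ERR000589
```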
Then, I got the following error:

```
[hadoop@ip-172-31-14-100 ~]$ spark-submit --class com.github.sparkbwa.SparkBWA --master yarn --deploy-mode cluster --driver-memory 1500m --executor-memory 10g --executor-cores 1 --verbose --num-executors 16 sparkbwa-1.0.jar -m -r -p --index /Data/HumanBase/hg38 -n 16 -w "-R @RG\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589" ERR000589_1.filt.fastq ERR000589_2.filt.fastq Output_ERR000589 Using properties file: /usr/lib/spark/conf/spark-defaults.conf Adding default property: spark.sql.warehouse.dir=*********(redacted) Adding default property: spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p' Adding default property: spark.history.fs.logDirectory=hdfs:///var/log/spark/apps Adding default property: spark.eventLog.enabled=true Adding default property: spark.shuffle.service.enabled=true Adding default property: spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native Adding default property: spark.yarn.historyServer.address=ip-172-31-14-100.eu-west-2.compute.internal:18080 Adding default property: spark.stage.attempt.ignoreOnDecommissionFetchFailure=true Adding default property: spark.driver.memory=11171M Adding default property: spark.executor.instances=16 Adding default property: spark.default.parallelism=256 Adding default property: spark.resourceManager.cleanupExpiredHost=true Adding default property: spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS=$(hostname -f) Adding default property: spark.driver.extraJavaOptions=-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p' Adding default property: spark.master=yarn Adding default property: spark.blacklist.decommissioning.timeout=1h Adding default property: spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native Adding default property: spark.sql.hive.metastore.sharedPrefixes=com.amazonaws.services.dynamodbv2 Adding default property: spark.executor.memory=10356M Adding default property: spark.driver.extraClassPath=/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar Adding default property: spark.eventLog.dir=hdfs:///var/log/spark/apps Adding default property: spark.dynamicAllocation.enabled=true Adding default property: spark.executor.extraClassPath=/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar Adding default property: spark.executor.cores=8 Adding default property: spark.history.ui.port=18080 Adding default property: spark.blacklist.decommissioning.enabled=true Adding default property: spark.decommissioning.timeout.threshold=20 Adding default 
property: spark.hadoop.yarn.timeline-service.enabled=false Parsed arguments: master yarn deployMode cluster executorMemory 10g executorCores 1 totalExecutorCores null propertiesFile /usr/lib/spark/conf/spark-defaults.conf driverMemory 1500m driverCores null driverExtraClassPath /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar driverExtraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native driverExtraJavaOptions -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p' supervise false queue null numExecutors 16 files null pyFiles null archives null mainClass com.github.sparkbwa.SparkBWA primaryResource file:/home/hadoop/sparkbwa-1.0.jar name com.github.sparkbwa.SparkBWA childArgs [-m -r -p --index /Data/HumanBase/hg38 -n 16 -w -R @RG\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589 ERR000589_1.filt.fastq ERR000589_2.filt.fastq Output_ERR000589] jars null packages null packagesExclusions null repositories null verbose true Spark properties used, including those specified through --conf and those from the properties file /usr/lib/spark/conf/spark-defaults.conf: (spark.blacklist.decommissioning.timeout,1h) (spark.executor.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native) (spark.default.parallelism,256) (spark.blacklist.decommissioning.enabled,true) (spark.hadoop.yarn.timeline-service.enabled,false) (spark.driver.memory,1500m) (spark.executor.memory,10356M) (spark.executor.instances,16) (spark.sql.warehouse.dir,*********(redacted)) (spark.driver.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native) (spark.yarn.historyServer.address,ip-172-31-14-100.eu-west-2.compute.internal:18080) (spark.eventLog.enabled,true) (spark.stage.attempt.ignoreOnDecommissionFetchFailure,true) (spark.history.ui.port,18080) (spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS,$(hostname -f)) (spark.executor.extraJavaOptions,-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p') (spark.resourceManager.cleanupExpiredHost,true) (spark.shuffle.service.enabled,true) (spark.history.fs.logDirectory,hdfs:///var/log/spark/apps) (spark.driver.extraJavaOptions,-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p') (spark.executor.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar) (spark.sql.hive.metastore.sharedPrefixes,com.amazonaws.services.dynamodbv2) (spark.eventLog.dir,hdfs:///var/log/spark/apps) (spark.master,yarn) (spark.dynamicAllocation.enabled,true) 
(spark.executor.cores,8) (spark.decommissioning.timeout.threshold,20) (spark.driver.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar) Main class: org.apache.spark.deploy.yarn.Client Arguments: --jar file:/home/hadoop/sparkbwa-1.0.jar --class com.github.sparkbwa.SparkBWA --arg -m --arg -r --arg -p --arg --index --arg /Data/HumanBase/hg38 --arg -n --arg 16 --arg -w --arg -R @RG\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589 --arg ERR000589_1.filt.fastq --arg ERR000589_2.filt.fastq --arg Output_ERR000589 System properties: (spark.blacklist.decommissioning.timeout,1h) (spark.executor.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native) (spark.default.parallelism,256) (spark.blacklist.decommissioning.enabled,true) (spark.hadoop.yarn.timeline-service.enabled,false) (spark.driver.memory,1500m) (spark.executor.memory,10g) (spark.executor.instances,16) (spark.driver.extraLibraryPath,/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native) (spark.sql.warehouse.dir,*********(redacted)) (spark.yarn.historyServer.address,ip-172-31-14-100.eu-west-2.compute.internal:18080) (spark.eventLog.enabled,true) (spark.stage.attempt.ignoreOnDecommissionFetchFailure,true) (spark.history.ui.port,18080) (spark.yarn.appMasterEnv.SPARK_PUBLIC_DNS,$(hostname -f)) (SPARK_SUBMIT,true) (spark.executor.extraJavaOptions,-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p') (spark.app.name,com.github.sparkbwa.SparkBWA) (spark.resourceManager.cleanupExpiredHost,true) (spark.history.fs.logDirectory,hdfs:///var/log/spark/apps) (spark.shuffle.service.enabled,true) (spark.driver.extraJavaOptions,-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p') (spark.submit.deployMode,cluster) (spark.executor.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar) (spark.eventLog.dir,hdfs:///var/log/spark/apps) (spark.sql.hive.metastore.sharedPrefixes,com.amazonaws.services.dynamodbv2) (spark.master,yarn) (spark.dynamicAllocation.enabled,true) (spark.decommissioning.timeout.threshold,20) (spark.executor.cores,1) (spark.driver.extraClassPath,/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar) Classpath elements: 
file:/home/hadoop/sparkbwa-1.0.jar 18/01/20 15:53:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 18/01/20 15:53:13 INFO RMProxy: Connecting to ResourceManager at ip-172-31-14-100.eu-west-2.compute.internal/172.31.14.100:8032 18/01/20 15:53:13 INFO Client: Requesting a new application from cluster with 16 NodeManagers 18/01/20 15:53:13 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (12288 MB per container) 18/01/20 15:53:13 INFO Client: Will allocate AM container, with 1884 MB memory including 384 MB overhead 18/01/20 15:53:13 INFO Client: Setting up container launch context for our AM 18/01/20 15:53:13 INFO Client: Setting up the launch environment for our AM container 18/01/20 15:53:13 INFO Client: Preparing resources for our AM container 18/01/20 15:53:14 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 18/01/20 15:53:16 INFO Client: Uploading resource file:/mnt/tmp/spark-8adea679-22d7-4945-9708-d61ef96b2c2a/__spark_libs__3181673287761365885.zip -> hdfs://ip-172-31-14-100.eu-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1516463115359_0001/__spark_libs__3181673287761365885.zip 18/01/20 15:53:17 INFO Client: Uploading resource file:/home/hadoop/sparkbwa-1.0.jar -> hdfs://ip-172-31-14-100.eu-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1516463115359_0001/sparkbwa-1.0.jar 18/01/20 15:53:17 INFO Client: Uploading resource file:/mnt/tmp/spark-8adea679-22d7-4945-9708-d61ef96b2c2a/__spark_conf__4991143839440201874.zip -> hdfs://ip-172-31-14-100.eu-west-2.compute.internal:8020/user/hadoop/.sparkStaging/application_1516463115359_0001/__spark_conf__.zip 18/01/20 15:53:17 INFO SecurityManager: Changing view acls to: hadoop 18/01/20 15:53:17 INFO SecurityManager: Changing modify acls to: hadoop 18/01/20 15:53:17 INFO SecurityManager: Changing view acls groups to: 18/01/20 15:53:17 INFO SecurityManager: Changing modify acls groups to: 18/01/20 15:53:17 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set() 18/01/20 15:53:17 INFO Client: Submitting application application_1516463115359_0001 to ResourceManager 18/01/20 15:53:18 INFO YarnClientImpl: Submitted application application_1516463115359_0001 18/01/20 15:53:19 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:19 INFO Client: client token: N/A diagnostics: N/A ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1516463597765 final status: UNDEFINED tracking URL: http://ip-172-31-14-100.eu-west-2.compute.internal:20888/proxy/application_1516463115359_0001/ user: hadoop 18/01/20 15:53:20 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:21 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:22 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:23 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:24 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:25 INFO Client: Application report 
for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:26 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:27 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:28 INFO Client: Application report for application_1516463115359_0001 (state: ACCEPTED) 18/01/20 15:53:29 INFO Client: Application report for application_1516463115359_0001 (state: FAILED) 18/01/20 15:53:29 INFO Client: client token: N/A diagnostics: Application application_1516463115359_0001 failed 2 times due to AM Container for appattempt_1516463115359_0001_000002 exited with exitCode: 1 For more detailed output, check application tracking page:http://ip-172-31-14-100.eu-west-2.compute.internal:8088/cluster/app/application_1516463115359_0001Then, click on links to logs of each attempt. Diagnostics: Exception from container-launch. Container id: container_1516463115359_0001_02_000001 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:582) at org.apache.hadoop.util.Shell.run(Shell.java:479) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Container exited with a non-zero exit code 1 Failing this attempt. Failing the application. ApplicationMaster host: N/A ApplicationMaster RPC port: -1 queue: default start time: 1516463597765 final status: FAILED tracking URL: http://ip-172-31-14-100.eu-west-2.compute.internal:8088/cluster/app/application_1516463115359_0001 user: hadoop Exception in thread "main" org.apache.spark.SparkException: Application application_1516463115359_0001 finished with failed status at org.apache.spark.deploy.yarn.Client.run(Client.scala:1122) at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1168) at org.apache.spark.deploy.yarn.Client.main(Client.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775) at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 18/01/20 15:53:29 INFO ShutdownHookManager: Shutdown hook called 18/01/20 15:53:29 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-8adea679-22d7-4945-9708-d61ef96b2c2a [hadoop@ip-172-31-14-100 ~]$ Broadcast message from root@ip-172-31-14-100 (unknown) at 15:54 ... The system is going down for power off NOW! 
Connection to ec2-35-177-163-135.eu-west-2.compute.amazonaws.com closed by remote host. Connection to ec2-35-177-163-135.eu-west-2.compute.amazonaws.com closed.
```
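The client output above only reports that the ApplicationMaster container exited with code 1, so the real cause should be in the container logs. Assuming log aggregation is enabled on the EMR cluster, they can be pulled with something like:

```bash
# Fetch the aggregated container logs for the failed application
# (application ID taken from the output above)
yarn logs -applicationId application_1516463115359_0001
```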
Any help would be appreciated.
Thank you 🙏
Any word on this?