From 4b8f9b8fcb86acff4542aa70ed006ba75cb83869 Mon Sep 17 00:00:00 2001
From: Cengguang Zhang
Date: Thu, 1 Sep 2022 16:15:50 +0800
Subject: [PATCH 1/4] doc: update hadoop document.
---
 docs/readthedocs/source/doc/UserGuide/hadoop.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/docs/readthedocs/source/doc/UserGuide/hadoop.md b/docs/readthedocs/source/doc/UserGuide/hadoop.md
index aa41789796a..f9ab3b62b76 100644
--- a/docs/readthedocs/source/doc/UserGuide/hadoop.md
+++ b/docs/readthedocs/source/doc/UserGuide/hadoop.md
@@ -119,7 +119,7 @@ Follow the steps below if you need to run BigDL with [spark-submit](https://spar
 --archives environment.tar.gz#environment \
 script.py
 ```
- Note: For `yarn-client`, the Spark driver is running on local and it will use the Python interpreter in the current active conda environment while the executors will use the Python interpreter in `environment.tar.gz`.
+ Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can set SparkConf in submit command line using `--conf spark.yarn.appMasterEnv.HADOOP_USER_NAME=username`.
 For `yarn-client` mode:
@@ -134,4 +134,4 @@ Follow the steps below if you need to run BigDL with [spark-submit](https://spar
 --archives environment.tar.gz#environment \
 script.py
 ```
- Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`.
+ Note: For `yarn-client`, the Spark driver is running on local and it will use the Python interpreter in the current active conda environment while the executors will use the Python interpreter in `environment.tar.gz`.
From fdb6277649fac51e9952bd03e3e0de01bd6d61d8 Mon Sep 17 00:00:00 2001
From: Cengguang Zhang
Date: Thu, 1 Sep 2022 19:02:36 +0800
Subject: [PATCH 2/4] fix: fix wording.
---
 docs/readthedocs/source/doc/UserGuide/hadoop.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/readthedocs/source/doc/UserGuide/hadoop.md b/docs/readthedocs/source/doc/UserGuide/hadoop.md
index f9ab3b62b76..3557d546422 100644
--- a/docs/readthedocs/source/doc/UserGuide/hadoop.md
+++ b/docs/readthedocs/source/doc/UserGuide/hadoop.md
@@ -119,7 +119,7 @@ Follow the steps below if you need to run BigDL with [spark-submit](https://spar
 --archives environment.tar.gz#environment \
 script.py
 ```
- Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can set SparkConf in submit command line using `--conf spark.yarn.appMasterEnv.HADOOP_USER_NAME=username`.
+ Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can set SparkConf using `--conf spark.yarn.appMasterEnv.HADOOP_USER_NAME=username`.
 For `yarn-client` mode:
From 87fdc1f93071a554b94ab6f95d6cd53252d44598 Mon Sep 17 00:00:00 2001
From: Cengguang Zhang
Date: Thu, 1 Sep 2022 19:03:28 +0800
Subject: [PATCH 3/4] fix: fix wording.
---
 docs/readthedocs/source/doc/UserGuide/hadoop.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/readthedocs/source/doc/UserGuide/hadoop.md b/docs/readthedocs/source/doc/UserGuide/hadoop.md
index 3557d546422..c6a683190ef 100644
--- a/docs/readthedocs/source/doc/UserGuide/hadoop.md
+++ b/docs/readthedocs/source/doc/UserGuide/hadoop.md
@@ -119,7 +119,7 @@ Follow the steps below if you need to run BigDL with [spark-submit](https://spar
 --archives environment.tar.gz#environment \
 script.py
 ```
- Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can set SparkConf using `--conf spark.yarn.appMasterEnv.HADOOP_USER_NAME=username`.
+ Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can set SparkConf using `spark.yarn.appMasterEnv.HADOOP_USER_NAME=username`.
 For `yarn-client` mode:
From 69e030b9e6badffbb4ffba61e001e3d1d84527e2 Mon Sep 17 00:00:00 2001
From: Cengguang Zhang
Date: Thu, 1 Sep 2022 19:39:33 +0800
Subject: [PATCH 4/4] fix: fix wording.
---
 docs/readthedocs/source/doc/UserGuide/hadoop.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/readthedocs/source/doc/UserGuide/hadoop.md b/docs/readthedocs/source/doc/UserGuide/hadoop.md
index c6a683190ef..c6c66900346 100644
--- a/docs/readthedocs/source/doc/UserGuide/hadoop.md
+++ b/docs/readthedocs/source/doc/UserGuide/hadoop.md
@@ -119,7 +119,7 @@ Follow the steps below if you need to run BigDL with [spark-submit](https://spar
 --archives environment.tar.gz#environment \
 script.py
 ```
- Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can set SparkConf using `spark.yarn.appMasterEnv.HADOOP_USER_NAME=username`.
+ Note: For `yarn-cluster`, the Spark driver is running in a YARN container as well and thus both the driver and executors will use the Python interpreter in `environment.tar.gz`. If you want to operate HDFS as some certain user, you can add `spark.yarn.appMasterEnv.HADOOP_USER_NAME=username` to SparkConf.
 For `yarn-client` mode: