[SPARK-22648] [K8S] Spark on Kubernetes - Documentation #19946
Conversation
Jenkins OK to test
done
}

function usage {
So likely not for this PR, but I'm wondering for the future what you think about the idea of extending this to package up a usable Python env so we can solve the dependency management issue as well?
Really excited to see the progress :)
We already do that in our fork - and when we get PySpark and R submitted (hopefully in Spark 2.4), we will extend this script.
Mostly LGTM, only some nits.
push) push;;
*) usage;;
esac
fi
nit: add extra empty line
Done
docs/running-on-kubernetes.md
* This will become a table of contents (this text will be scraped).
{:toc}

Spark can run on clusters managed by [Kubernetes](https://kubernetes.io). This features makes use of the new experimental native
"This features" -> "This feature"
Done
docs/running-on-kubernetes.md
spark-submit can be directly used to submit a Spark application to a Kubernetes cluster. The mechanism by which spark-submit happens is as follows:

* Spark creates a spark driver running within a [Kubernetes pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
* The driver creates executors which are also Kubernetes pods and connects to them, and executes application code.
"which are also Kubernetes pods" -> "which are also running within Kubernetes pods"
Done
docs/running-on-kubernetes.md
To launch Spark Pi in cluster mode,

    bin/spark-submit \
nit: add `{% highlight bash %}` over the command.
Done
master string with `k8s://` will cause the Spark application to launch on the Kubernetes cluster, with the API server
being contacted at `api_server_url`. If no HTTP protocol is specified in the URL, it defaults to `https`. For example,
setting the master to `k8s://example.com:443` is equivalent to setting it to `k8s://https://example.com:443`, but to
connect without TLS on a different port, the master would be set to `k8s://http://example.com:8080`.
Maybe I missed something, but where is the logic that handles connecting without TLS that you mentioned here?
It should just be handled by the fabric8 client. We added this in fabric8io/kubernetes-client#652
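As a small sketch of the URL handling described above (the hostnames are placeholders, not taken from this PR):

```shell
# Sketch: the k8s:// prefix is simply prepended to the API server URL;
# when no protocol is given, https is assumed by the client.
api_server="https://example.com:443"
master="k8s://${api_server}"
echo "${master}"          # k8s://https://example.com:443

# A non-TLS endpoint on another port must spell out http explicitly:
master_no_tls="k8s://http://example.com:8080"
echo "${master_no_tls}"   # k8s://http://example.com:8080
```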
docs/running-on-kubernetes.md
### Namespaces

Kubernetes has the concept of [namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/).
Namespaces are a way to divide cluster resources between multiple users (via resource quota). Spark on Kubernetes can
"a way" -> "ways"
Done
docs/running-on-kubernetes.md
# Configuration

See the [configuration page](configuration.html) for information on Spark configurations. The following configuration is
" The following configuration is" -> " The following configurations are"
Done
@jiangxb1987, thanks for the round of review - I updated some links and pointers to the docs as well. PTAL.
docs/running-on-kubernetes.md
[kubectl](https://kubernetes.io/docs/user-guide/prereqs/). If you do not already have a working Kubernetes cluster,
you may setup a test cluster on your local machine using
[minikube](https://kubernetes.io/docs/getting-started-guides/minikube/).
* We recommend using the latest releases of minikube be updated to the most recent version with the DNS addon enabled.
This sentence doesn't read right.
docs/running-on-kubernetes.md
<img src="img/k8s-cluster-mode.png" title="Spark cluster components" alt="Spark cluster components" />
</p>

spark-submit can be directly used to submit a Spark application to a Kubernetes cluster. The mechanism by which spark-submit happens is as follows:
<code>spark-submit</code>
"The submission mechanism works as follows:"
Done
docs/running-on-kubernetes.md
* Spark creates a spark driver running within a [Kubernetes pod](https://kubernetes.io/docs/concepts/workloads/pods/pod/).
* The driver creates executors which are also running within Kubernetes pods and connects to them, and executes application code.
* When the application completes, the executor pods terminate and are cleaned up, but the driver pod persists
logs and remains in "completed" state in the Kubernetes API till it's eventually garbage collected or manually cleaned up.
s/till/until
Done
docs/running-on-kubernetes.md
Kubernetes has the concept of [namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/).
Namespaces are ways to divide cluster resources between multiple users (via resource quota). Spark on Kubernetes can
use namespaces to launch spark applications. This is through the `--conf spark.kubernetes.namespace` argument to spark-submit.
Instead of mentioning it as a spark-submit argument, mention it as a configuration. Users have more than one way to set configuration in their apps.
Done
docs/running-on-kubernetes.md
service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to
be used by the driver pod through the configuration property
`spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example to make the driver pod
to use the `spark` service account, a user simply adds the following option to the `spark-submit` command:
"...to make the driver pod use..."
Done
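For readers skimming the thread, the option under discussion is a configuration fragment like the following (using the `spark` service account name from the quoted doc text):

```shell
# Fragment, not a complete command: have the driver pod authenticate to the
# Kubernetes API server using the "spark" service account.
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
```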
docs/running-on-kubernetes.md
</td>
</tr>
<tr>
<td><code>spark.kubernetes.executor.docker.image</code></td>
Same question.
Yes, it's required.
docs/running-on-kubernetes.md
</tr>
<tr>
<td><code>spark.kubernetes.allocation.batch.delay</code></td>
<td><code>1</code></td>
Hmm, isn't this a time conf? If so, specify the value with the unit and don't mention units in the description.
This is an int conf at the moment - but good point about making it a time config. I think it makes sense to change that. Will draft up a PR.
Fix in #20032
</td>
</tr>
<tr>
<td><code>spark.kubernetes.authenticate.submission.oauthToken</code></td>
If not yet supported, I recommend adding support for reading this from an env variable; command line args are visible to anyone using `ps`.
That makes sense. I think the fabric8 client has a way of consuming that from env-vars, but we need to check that it works.
cc @mccheah
Filing an issue to unblock. I think we can work around this for now - but will be important to document for the future.
docs/submitting-applications.md
@@ -155,6 +165,12 @@ The master URL passed to Spark can be in one of the following formats:
<code>client</code> or <code>cluster</code> mode depending on the value of <code>--deploy-mode</code>.
The cluster location will be found based on the <code>HADOOP_CONF_DIR</code> or <code>YARN_CONF_DIR</code> variable.
</td></tr>
<tr><td> <code>k8s://HOST:PORT</code> </td><td> Connect to a <a href="running-on-kubernetes.html"> Kubernetes </a> cluster in
Remove spaces around "Kubernetes".
Done
docs/running-on-kubernetes.md
<td><code>spark.kubernetes.driver.secrets.[SecretName]</code></td>
<td>(none)</td>
<td>
Mounts the Kubernetes secret named <code>SecretName</code> onto the path specified by the value
What is a "Kubernetes secret"? Do you need a link to explain it somewhere? When would you use this?
Done
ok to test
docs/index.md
@@ -112,7 +113,7 @@ options for deployment:
* [Mesos](running-on-mesos.html): deploy a private cluster using
[Apache Mesos](http://mesos.apache.org)
* [YARN](running-on-yarn.html): deploy Spark on top of Hadoop NextGen (YARN)
* [Kubernetes (experimental)](https://github.com/apache-spark-on-k8s/spark): deploy Spark on top of Kubernetes
* [Kubernetes (experimental)](running-on-kubernetes.html): deploy Spark on top of Kubernetes
We can remove (experimental) here as well?
Done
docs/running-on-kubernetes.md
```
Kubernetes master is running at http://127.0.0.1:6443
```

In the above example, the specific Kubernetes cluster can be used with spark submit by specifying
`spark-submit` instead of `spark submit`?
Done
docs/running-on-kubernetes.md
The local proxy can be started by:

```bash
 kubectl proxy
```
nit: we can remove extra space at the beginning of this line.
Done
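For context, a proxied submission might look like this sketch (port 8001 is `kubectl proxy`'s default listen port; the rest is illustrative, not from this PR):

```shell
# Start a local proxy to the Kubernetes API server (defaults to
# 127.0.0.1:8001), then point spark-submit at the proxied HTTP endpoint.
kubectl proxy &
bin/spark-submit \
  --master k8s://http://127.0.0.1:8001 \
  ...
```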
docs/running-on-kubernetes.md
### Accessing Logs

Logs can be accessed using the kubernetes API and the `kubectl` CLI. When a Spark application is running, it's possible
nit: `Kubernetes API` instead of `kubernetes API` (using a capital letter `K`)?
done
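As an illustration of the log access being documented here (the pod name is a placeholder):

```shell
# Stream logs from a running driver pod via the kubectl CLI.
kubectl logs -f <driver-pod-name>
```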
docs/running-on-kubernetes.md
There may be several kinds of failures. If the Kubernetes API server rejects the request made from spark-submit, or the
connection is refused for a different reason, the submission logic should indicate the error encountered. However, if there
are errors during the running of the application, often, the best way to investigate may be through the kubernetes CLI.
nit: `Kubernetes CLI` instead of `kubernetes CLI`?
Done
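A typical investigation with the Kubernetes CLI might look like this sketch (the pod name is a placeholder):

```shell
# Check pod status and scheduling events, then pull application-level logs.
kubectl describe pod <driver-pod-name>
kubectl logs <driver-pod-name>
```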
docs/running-on-kubernetes.md
Status and logs of failed executor pods can be checked in similar ways. Finally, deleting the driver pod will clean up the entire spark
application, includling all executors, associated service, etc. The driver pod can be thought of as the Kubernetes representation of
the spark application.
nit: `Spark application` instead of `spark application`?
Done
docs/running-on-kubernetes.md
Kubernetes has the concept of [namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/).
Namespaces are ways to divide cluster resources between multiple users (via resource quota). Spark on Kubernetes can
use namespaces to launch spark applications. This is through the `--conf spark.kubernetes.namespace` argument to spark-submit.
ditto.
docs/running-on-kubernetes.md
<td><code>spark.kubernetes.authenticate.driver.oauthToken</code></td>
<td>(none)</td>
<td>
OAuth token to use when authenticating against the against the Kubernetes API server from the driver pod when
`against the against` -> `against`?
Done
Test build #84850 has finished for PR 19946 at commit
I left a few minor comments; overall looks pretty good - thanks for the PR!
docs/running-on-kubernetes.md
{:toc}

Spark can run on clusters managed by [Kubernetes](https://kubernetes.io). This feature makes use of the new experimental native
Kubernetes scheduler that has been added to Spark.
Remove experimental? I think there are other references (a grep should catch them all) and we can remove them all now?
Done. The other references have been removed. Couldn't find any others.
docs/running-on-kubernetes.md
* When the application completes, the executor pods terminate and are cleaned up, but the driver pod persists
logs and remains in "completed" state in the Kubernetes API till it's eventually garbage collected or manually cleaned up.

Note that in the completed state, the driver pod does *not* use any computational or memory resources.
Just curious - what about disk usage ? What is it proportional to ?
Some edit probably buried this query ... would be great if you can opine @foxish, thanks.
decisions for driver and executor pods using advanced primitives like
[node selectors](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodeselector)
and [node/pod affinities](https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity)
in a future release.
As an aside - I should have asked about it in the scheduler review - how does Kubernetes handle prioritization of resources? In particular, if it is pre-empting containers, is there a relative ordering? Will the driver always be the last to go?
If this has to be user specified, do we need to add a doc link to it here?
As of today, preemption is at random.
priority and preemption are in alpha as of now. As soon as they go to beta (in the Spark 2.4 timeframe), we'll add the required pieces to make it honor the rule as you said - driver is the last to go, etc.
docs/running-on-kubernetes.md
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
--conf spark.kubernetes.namespace=default \
Remove this default value from the example? (or set to a non-default value if we want to illustrate its use)
Removed
docs/running-on-kubernetes.md
--class org.apache.spark.examples.SparkPi \
--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
--conf spark.kubernetes.namespace=default \
--conf spark.executor.instances=5 \
`--num-executors` instead?
That is specified currently as a YARN-only option in `spark-submit`. Does it make sense to move it out to suit both K8s and YARN?
You are right, I did not realize Mesos did not honour it and it was a YARN-only option!
It might make sense for k8s to support it too (since it does not right now, my proposed doc change would be incorrect) .. what do you think @foxish?
docs/running-on-kubernetes.md
$ bin/spark-submit \
--deploy-mode cluster \
--class org.apache.spark.examples.SparkPi \
--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
Reorder so that master is above deploy-mode?
Done
docs/running-on-kubernetes.md
--master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
--conf spark.kubernetes.namespace=default \
--conf spark.executor.instances=5 \
--conf spark.app.name=spark-pi \
`--name` instead?
Done
docs/running-on-kubernetes.md
--conf spark.app.name=spark-pi \
--conf spark.kubernetes.driver.docker.image=<driver-image> \
--conf spark.kubernetes.executor.docker.image=<executor-image> \
local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
Just the filename? I want to make sure it is intuitive and easy for users looking at examples - and it looks familiar with how Spark is typically invoked. (Though the scheme is referenced here https://github.com/apache/spark/pull/19946/files#diff-b5527f236b253e0d9f5db5164bdb43e9R108 : I am assuming avoiding it defaults to local)
Fair enough, I just wanted to show that it had to be a path. The protocol `local:///` is necessary to point at container-local files, as opposed to local file paths on the submitting user's machine, which will both be handled separately. Changed it to `local:///path/to/examples.jar`. Is that better?
Just to clarify - `local:///my/path/jar` implies local to the docker container? and `/my/path/jar` points to the submitter's machine?
<td>
In cluster mode, whether to wait for the application to finish before exiting the launcher process. When changed to
false, the launcher has a "fire-and-forget" behavior when launching the Spark job.
</td>
When false, will terminating spark-submit cause the Spark application to also terminate?
No, it wouldn't. It would simply stop spark-submit from continuing to watch for status from the cluster. The application runs the same in either case.
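So a fire-and-forget submission is just a matter of adding this configuration fragment:

```shell
# Fragment: return from spark-submit as soon as the driver pod is created,
# instead of waiting and watching for the application to complete.
--conf spark.kubernetes.submission.waitAppCompletion=false
```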
Test build #84865 has finished for PR 19946 at commit
Test build #84861 has finished for PR 19946 at commit
Test build #84864 has finished for PR 19946 at commit
@@ -49,7 +49,7 @@ To create a Spark distribution like those distributed by the
to be runnable, use `./dev/make-distribution.sh` in the project root directory. It can be configured
with Maven profile settings and so on like the direct Maven build. Example:

./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pyarn
./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes
should we use k8s? I kept bringing this up and that's because I can never spell Kubernetes properly.
It's `k8s://` in the URL scheme we use and the package names are also `k8s` - so, users should never have to type the name.
This is one of the last few holdouts. I'd say that it's consistent here, with the use of other cluster manager names in full in their maven projects. I can change it here if you feel strongly about it.
I actually second @rxin, I do get the spelling wrong at times :-)
Are you both also referring to the config options etc, which are still `spark.kubernetes.*`, or just the maven build target? If it's everything, it would be a fairly large change, doable certainly - but confirming how far the rename should go.
Ping @mridulm, @rxin - can you please confirm the scope of the renaming you were referring to here? Is it just the maven target? Changing all the config options etc would be a considerably large change at this point. Also, a point that was brought up today was - while k8s is common shorthand, it's not universal as is the full name.
I've filed https://issues.apache.org/jira/browse/SPARK-22853 to discuss this and unblock this PR. We should be able to reach consensus by release time. :)
Yea I don't think you need to block this pr with this.
Seems like the PR builders lost their state?
ok to test
## Docker Images

Kubernetes requires users to supply images that can be deployed into containers within pods. The images are built to
be run in a container runtime environment that Kubernetes supports. Docker is a container runtime environment that is
> a container runtime environment that Kubernetes supports

Does the current code support these other container runtimes, or just docker? From my memory (and the property names) it seems that only docker is supported?
the images should be runnable by other container runtimes (cri-o, rkt, etc), although I'm not aware of anybody testing that.
My comment is more about the properties being called "docker" and whether that means only docker images are supported. If you can use any image supported by the k8s cluster, then perhaps the properties should be renamed.
I see your point - although there is flexibility in theory, as of now it's safe to assume that most people are running Docker containers when using k8s, making the name `docker` much more intuitive. If the other runtimes do see traction in future (and we do some testing around them), we can rename to `container.image` instead of `docker.image`. As of now, I can make the documentation clearer that Spark on k8s only supports Docker images. Sound like a reasonable thing to do here?
I wonder if k8s can tell the container runtime from the image name? If it can, we can use `container.image` here, but otherwise, I guess we need another config like `xxx.container` to tell the container runtime and `xxx.${container}.image` to specify the image, e.g. `xxx.container=docker` and `xxx.docker.image=something`.
+1 to @ueshin's comment; if the same property supports more than docker images, it should be renamed to use a more generic name.
Makes sense. I think `container.image` should be fine then - will send a PR changing that. I don't think there are plans to support choosing the runtime on the fly, or letting the CRI abstraction leak into the application layer.
I'd expect the container run-time to be transparent to the particular image, and very definitely transparent to Spark. Using `container.image` makes sense from either standpoint.
#19995 PTAL
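For readers following along, the rename settled on in this thread would surface in a submission roughly as below. This is a sketch only: the exact per-role property names are assumed to follow the existing `spark.kubernetes.{driver,executor}.*` pattern, and the image values are placeholders; the options are printed rather than passed to a real `spark-submit`.

```shell
# Sketch of the renamed properties (docker.image -> container.image).
# Property names and image values here are assumptions, not verified
# against the merged change; we only print the flags.
image_confs() {
  printf '%s\n' \
    "--conf spark.kubernetes.driver.container.image=example.com/spark-driver:v2.3.0" \
    "--conf spark.kubernetes.executor.container.image=example.com/spark-executor:v2.3.0"
}
image_confs
```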
docs/running-on-kubernetes.md
Outdated
./sbin/build-push-docker-images.sh -r <repo> -t my-tag build
./sbin/build-push-docker-images.sh -r <repo> -t my-tag push

Docker files are under the `dockerfiles/` and can be customized further before building the images.
"the dockerfiles/
directory in the Spark distribution archive"?
(Assuming that's true. Need to clarify where to find `dockerfiles/`.)
It is not yet true - it's a TODO item on this PR. Will clarify after we get that change in.
This has been addressed and the script has been fixed now.
docs/running-on-kubernetes.md
Outdated
## Dependency Management

If your application's dependencies are all hosted in remote locations like HDFS or http servers, they may be referred to by their appropriate remote URIs.
"HTTP servers"
Done
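To make the remote-dependency case concrete — with the caveat that all URLs, hosts, and jar names below are invented placeholders, not values from this PR — a submission referencing HTTP- and HDFS-hosted artifacts might look roughly like the following sketch, which assembles and prints the command instead of running it:

```shell
# Sketch only: builds the spark-submit invocation for remotely hosted
# dependencies and prints it rather than executing it. Every URL and
# path here is a placeholder assumption.
remote_deps_cmd() {
  printf '%s ' \
    bin/spark-submit \
    --master "k8s://https://example.com:6443" \
    --deploy-mode cluster \
    --jars "https://example.com/libs/dep.jar" \
    --files "hdfs://namenode:9000/conf/app.properties" \
    "https://example.com/jobs/app.jar"
  printf '\n'
}
remote_deps_cmd
```

Because the dependencies are fetched by their remote URIs, nothing needs to be baked into the image for this case.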
docs/running-on-kubernetes.md
Outdated
<td>
  Docker image to use for the executors. Specify this using the standard
  <a href="https://docs.docker.com/engine/reference/commandline/tag/">Docker tag</a> format.
  This configuration is required and must be provided by the user.
Would it make sense to have this default to the same image used for the driver? Or are they fundamentally different in some way?
They use different commands to run the driver vs. the executor, and also have some differing environment expectations, so defaulting them to the same image would not work.
docs/running-on-yarn.md
Outdated
@@ -18,7 +18,8 @@ Spark application's configuration (driver, executors, and the AM when running in

 There are two deploy modes that can be used to launch Spark applications on YARN. In `cluster` mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In `client` mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.

-Unlike [Spark standalone](spark-standalone.html) and [Mesos](running-on-mesos.html) modes, in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn`.
+Unlike [Spark standalone](spark-standalone.html), [Mesos](running-on-mesos.html) and [Kubernetes](running-on-kubernetes.html) modes,
Should just say "other cluster managers supported by Spark" at this point.
Done
Test build #84929 has finished for PR 19946 at commit
Test build #84934 has finished for PR 19946 at commit
…ith container

## What changes were proposed in this pull request?
Changes discussed in #19946 (comment): docker -> container, since with CRI, we are not limited to running only docker images.

## How was this patch tested?
Manual testing

Author: foxish <[email protected]>
Closes #19995 from foxish/make-docker-container.
873f04d to a7e0c4c
Test build #85167 has finished for PR 19946 at commit
Test build #85178 has finished for PR 19946 at commit
Test build #85179 has finished for PR 19946 at commit
$ bin/spark-submit \
    --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
    --deploy-mode cluster \
    --name spark-pi \
We need to explicitly call out that app names in k8s mode cannot have spaces and special characters.
Good point, will update with caveat.
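A rough sketch of the caveat being discussed: Kubernetes resource names are DNS-1123-style labels, so an app name with spaces or special characters cannot be embedded in them. The check below is an approximation of that rule, not code from this PR:

```shell
# Approximate validity check for a k8s-safe app name: lowercase
# alphanumerics and dashes only, not starting or ending with a dash.
valid_k8s_name() {
  case "$1" in
    *[!a-z0-9-]*) return 1 ;;  # reject spaces, uppercase, special chars
    -*|*-)        return 1 ;;  # must not start or end with a dash
    "")           return 1 ;;  # must be non-empty
    *)            return 0 ;;
  esac
}
valid_k8s_name "spark-pi" && echo "spark-pi: ok"
valid_k8s_name "Spark Pi" || echo "Spark Pi: rejected"
```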
Done
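Putting the thread together, a complete cluster-mode submission might look roughly like the sketch below. The API server address, image name, and jar path are placeholders, the `spark.kubernetes.container.image` property name is an assumption about the final naming, and the command is printed rather than executed:

```shell
# Sketch of a full k8s cluster-mode submission; prints the command
# instead of running it. All host/image/path values are placeholders,
# and the app name sticks to lowercase alphanumerics and dashes.
spark_pi_cmd() {
  printf '%s ' \
    bin/spark-submit \
    --master "k8s://https://example.com:6443" \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=2 \
    --conf spark.kubernetes.container.image=example.com/spark:v2.3.0 \
    "local:///opt/spark/examples/jars/spark-examples.jar"
  printf '\n'
}
spark_pi_cmd
```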
BTW can I ask you guys to start using "K8S" in the PR titles? These are not really scheduler changes.
…ay to take time instead of int

## What changes were proposed in this pull request?
Fixing configuration that was taking an int which should take time. Discussion in apache#19946 (comment). Made the granularity milliseconds as opposed to seconds since there's a use-case for sub-second reactions to scale-up rapidly, especially with dynamic allocation.

## How was this patch tested?
TODO: manual run of integration tests against this PR. PTAL cc/ mccheah liyinan926 kimoonkim vanzin mridulm jiangxb1987 ueshin

Author: foxish <[email protected]>
Closes apache#20032 from foxish/fix-time-conf.
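Assuming the property touched by the commit above is `spark.kubernetes.allocation.batch.delay` (inferred from the branch name `fix-time-conf` and the linked JIRA, not verified here), the value would now be passed as a time string with millisecond granularity rather than a bare integer of seconds:

```shell
# Sketch: the delay expressed as a time value ("500ms") instead of a
# bare int. The property name is an assumption, and we only print the
# flag rather than invoking spark-submit.
delay_conf() {
  echo "--conf spark.kubernetes.allocation.batch.delay=500ms"
}
delay_conf
```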
Makes sense, will be sure to categorize appropriately moving forward.
Thanks!
Addressed comments. This PR should be ready to go, with one separate issue of renaming to be discussed in https://issues.apache.org/jira/browse/SPARK-22853. PTAL @vanzin @mridulm @ueshin @jiangxb1987 @felixcheung @rxin
Test build #85222 has finished for PR 19946 at commit
docs/running-on-yarn.md
Outdated
@@ -18,7 +18,8 @@ Spark application's configuration (driver, executors, and the AM when running in

 There are two deploy modes that can be used to launch Spark applications on YARN. In `cluster` mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In `client` mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.

-Unlike [Spark standalone](spark-standalone.html) and [Mesos](running-on-mesos.html) modes, in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn`.
+Unlike other cluster managers supported by Spark
+in which the master's address is specified in the `--master` parameter, in YARN mode the ResourceManager's address is picked up from the Hadoop configuration. Thus, the `--master` parameter is `yarn`.
nit: why start a new line here?
Done
Test build #85278 has finished for PR 19946 at commit
Ping - any other comments?
LGTM
Merging in master.
The path was recently changed in apache#19946, but the dockerfile was not updated.
## What changes were proposed in this pull request?
The path was recently changed in #19946, but the dockerfile was not updated. This is a trivial 1 line fix.

## How was this patch tested?
`./sbin/build-push-docker-images.sh -r spark-repo -t latest build`

cc/ vanzin mridulm rxin jiangxb1987 liyinan926

Author: Anirudh Ramanathan <[email protected]>
Author: foxish <[email protected]>
Closes #20051 from foxish/patch-1.
What changes were proposed in this pull request?
This PR contains documentation on the usage of the Kubernetes scheduler in Spark 2.3, and a shell script to make it easier to build the Docker images required to use the integration. The changes detailed here are covered by #19717 and #19468, which have merged already.
How was this patch tested?
The script has been in use for releases on our fork. Rest is documentation.
cc @rxin @mateiz (shepherd)
k8s-big-data SIG members & contributors: @foxish @ash211 @mccheah @liyinan926 @erikerlandson @ssuchter @varunkatta @kimoonkim @tnachen @ifilonenko
reviewers: @vanzin @felixcheung @jiangxb1987 @mridulm
TODO: