Cannot launch driver after Spark default CPU value changed to int32 #721
Can you paste your
Same issue here. I'm testing, so I'm using the example Lightbend manifest; the original one uses v1beta1. I removed v1beta1 from our k8s and redeployed the v1beta2 Helm chart, then changed the version on the manifest accordingly. Here is the update.
Error:
@liyinan926 So sorry, I've been away from my affected machine over the holidays; I'll try to get you something sooner rather than later <3
This is due to a recent change in #578 that introduced the int32 type for the cores field.
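For context, a minimal sketch of what the #578 change means for a manifest: in a v1beta2 spec the driver and executor cores values are integers, while the cpu limit stays a string quantity. Field names follow the operator's v1beta2 API; the values below are purely illustrative, not taken from this issue.

```yaml
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: spark-example            # hypothetical name
spec:
  driver:
    cores: 1                     # v1beta2: integer (int32); fractional values are rejected
    coreLimit: "1200m"           # the cpu limit is still a string quantity
    memory: "512m"
  executor:
    instances: 1
    cores: 1
    memory: "512m"
```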
@liyinan926 Thanks buddy!
Attached is the output. Any idea what is causing this?
The error doesn't always appear, but this time I removed the
The change of the type to int32
@liyinan926 That error @damache posted is familiar. What I recall experiencing is that if your default cpu limit for a namespace is 1, then it will also behave as the max limit and throw a different error (related to this issue: kubernetes/kubernetes#51430), but if you set a string value (e.g. "200m") then you get
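To illustrate the namespace-default behavior described above: a LimitRange like the sketch below applies a default cpu limit to any container that does not set one, so a driver that requests a full cpu (the new integer minimum of 1) but inherits the 500m default limit is rejected with the "must be less than or equal to cpu limit" error. This is a generic Kubernetes example with hypothetical names, not taken from the reporter's cluster.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-defaults             # hypothetical name
  namespace: spark-jobs          # hypothetical namespace
spec:
  limits:
  - type: Container
    default:
      cpu: "500m"                # default cpu limit for containers that set none
    defaultRequest:
      cpu: "200m"                # default cpu request for containers that set none
```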
Let's make it clear that:
Whether the operator works out of the box depends on the environment's default cpu limit. To mitigate this issue, I think we should have the operator set this at runtime. Spark 3.0 has a new field
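If the new Spark 3.0 field referred to above is spark.kubernetes.driver.request.cores (an assumption on my part), then on Spark 3.x the driver's cpu request can already be tuned independently of the integer cores value through sparkConf, roughly like this:

```yaml
spec:
  sparkConf:
    # Spark 3.0+ property; that it is the field meant above is an assumption.
    # It lets the driver's cpu *request* be a fractional/millicpu quantity
    # independent of the integer spark.driver.cores.
    "spark.kubernetes.driver.request.cores": "500m"
  driver:
    cores: 1                     # still an integer in v1beta2
```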
In Spark 2.4.x k8s mode the value is extracted as a string; we rely on this today in our Spark services when deploying Spark 2.4.4 to k8s via spark-submit directly. I'm not sure why issue #578 references the standalone Spark configs, but based on this line it appears this CRD deploys in k8s mode. I would recommend letting spark-submit do the validation; otherwise the CRD is going to require more maintenance and possibly make incorrect assumptions.
@damache yes, it's parsed as a string in k8s mode, but it's actually defined and treated as an integer elsewhere (see https://github.com/apache/spark/blob/cdc8fc6233450ed040f2f0272d06510c1eedbefb/core/src/main/scala/org/apache/spark/internal/config/package.scala#L81). That's why the new config
BTW: we added support for
That's only applicable to Spark 3.0, so v1beta2 is only compatible with Spark 3.0 if you don't want to set the driver cores to 1. v1beta1 says it supports Spark 2.4.0, so there's no real support for Spark 2.4.4.
The specific field works with Spark 3.x only, as the documentation clearly indicates. The rest of the API in
Not sure what you meant by this. If you would like to stick to the semantics of treating
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator#version-matrix: that matrix says
OK, the
Hey there!
Firstly thank you for everything you are trying to achieve. When trying to carve your own path, projects like this surely make for great ways for new players in the data engineering space to get started running clusters and forge great data experiences :)
Here's the problem...
I've noticed that the Apache Spark on k8s specification has changed the number of CPUs from a float to an integer. With the release of the latest API version, this has been reflected in #578.
I'm fairly new to Kubernetes, but this seems to create a conflict with the driver runner, where I'm experiencing...
This is likely because of the following error:
Invalid value: "1": must be less than or equal to cpu limit.
So our new minimum value of 1 seems to be the default value, but the error indicates that the CPU limit for Kubernetes must be 1. Perhaps this is related to kubernetes/kubernetes#51430. I'm not sure how to resolve this; it may be my Kubernetes configuration that is at fault here more than anything.
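One possible workaround sketch for the admission error above, assuming the operator's coreLimit field and a namespace whose LimitRange would otherwise impose a lower default limit: set an explicit cpu limit at least as large as the request (provided the namespace's max limit allows it). Values are illustrative.

```yaml
spec:
  driver:
    cores: 1                     # cpu request of 1
    coreLimit: "1"               # explicit cpu limit >= the request, so the
                                 # namespace default limit is not applied
```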
Environment
Here is the environment I'm running (from the pod)...