[SPARK-19688] [STREAMING] Not to read spark.yarn.credentials.file from checkpoint.
#18230
ok to test |
Test build #77804 has finished for PR 18230 at commit
|
@saturday-shi would you please update the title to track SPARK-19688? |
I guess "spark.yarn.credentials.renewalTime" and "spark.yarn.credentials.updateTime" should also be excluded. |
Thank you for pointing that out. I'll check and fix them. |
@jerryshao I've taken a look at Could you describe some elaborate scenes about the changing of |
From my understanding, these two configurations will also point to expired timestamps, since each newly started application resets them, so there is no need to checkpoint them. I think none of the Spark on YARN internal configurations used for tracking and passing internal state need to be checkpointed; they are always updated when the YARN client launches the application. |
@jerryshao Sorry for the delay. This PR attempts to fix the bug with the least code change, so that it can be easily merged into any maintenance branch. I don't mind adding more options to the exclude list, but you'll have to do extra work to cherry-pick a subset for branches before 2.1, as |
I don't agree with your point of view. There are already some potential issues with internal configurations: either they can lead to unexpected state, or they are meaningless to checkpoint. I will reserve my opinion if you insist on a point fix. |
No, I don't mean to insist on my opinion. I'm just curious to know the reason for the change (as it looks like another point fix). |
Test build #77910 has finished for PR 18230 at commit
|
The wording in this code always confuses me... I never know what "reload" means ( Anyway, I think I understand why this is broken. Because of this in
So if you start the second streaming application without providing principal / keytab, So the workaround is to make sure the restarted application also has the principal / keytab arguments. As for the fix, it seems only adding the credential file does fix the problem, since the AM code only looks at it to decide whether to start the credential updater thread. But perhaps a better fix would be to fix the AM to look at the correct config (e.g. |
@vanzin "reload" here means retrieving back This is a streaming-specific operation; the original purpose is to keep the restarted streaming application in the same state (including its configuration) as the previously stopped app. But Spark has some configurations that are only meaningful in the current application, so recovering them for the restarted app is meaningless and leads to issues. For this purpose Spark Streaming maintains a |
That explanation is extremely wrong. But your opinion of what the After restarting from checkpoint, properties in
That's probably right, but that is not the case here. I do submit the principal & keytab when restarting, and the AM does renew the token using the principal successfully. I noticed that the FYI, the log of
It renews the token successfully and saves it to application_1496384469444_0036's dir.
... which says that the credentials file doesn't exist in application_1496384469444_0035's dir. |
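The mismatch between the two log excerpts above can be illustrated with a minimal sketch. It models configurations as plain dictionaries and does not use Spark's API. The path layout and the helper `credentials_path` are hypothetical, chosen only so that the application ID is visible in the path; the two application IDs are the ones quoted in this thread.

```python
# Simplified model, not Spark code: configurations are plain dicts.

def credentials_path(app_id):
    """Hypothetical layout: one credentials file per application directory."""
    return f"/staging/{app_id}/credentials"

previous_app = "application_1496384469444_0035"
restarted_app = "application_1496384469444_0036"

# Configuration saved into the checkpoint by the previous run.
checkpointed_conf = {
    "spark.yarn.credentials.file": credentials_path(previous_app),
}

# The restarted AM renews the token and writes it under its own directory.
am_write_path = credentials_path(restarted_app)

# Executors receive the configuration restored from the checkpoint,
# so they look for the file in the previous application's directory.
executor_read_path = checkpointed_conf["spark.yarn.credentials.file"]

print("AM writes to:      ", am_write_path)
print("Executors read from:", executor_read_path)
```

Under these assumptions the two paths differ, which matches the symptom reported here: the token is renewed successfully, but the executors look for it where nothing is written any more.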
Ok, I think I follow the code path now. The AM uses the conf from the user, while SparkContext (which provides the conf to the executors) loads it from the checkpoint and needs to overwrite the properties listed in the "reload" list. Anyway, current patch looks good. Merging to master / 2.2 / 2.1 / 2.0 / 1.6 (although I wouldn't hold my breath for a new 1.6 release). |
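The "reload" behaviour summarized in this comment can be sketched as follows. This is a simplified model using plain dictionaries, not the Spark Streaming implementation. The names `restore_conf` and `PROPERTIES_TO_RELOAD` are illustrative, and the list shows only the property this PR adds; the list in Spark contains other entries that are not reproduced here.

```python
# Simplified model, not Spark code: configurations are plain dicts.

# Properties that must come from the restarted application rather than
# from the checkpoint. Only the entry added by this PR is shown.
PROPERTIES_TO_RELOAD = [
    "spark.yarn.credentials.file",
]

def restore_conf(checkpointed, current, reload_keys=PROPERTIES_TO_RELOAD):
    """Start from the checkpointed conf, then overwrite every reload key
    that the current application actually defines."""
    restored = dict(checkpointed)
    for key in reload_keys:
        if key in current:
            restored[key] = current[key]
    return restored

checkpointed = {
    "spark.app.name": "streaming-job",            # hypothetical value
    "spark.yarn.credentials.file": "/old/creds",  # hypothetical path
}
current = {
    "spark.yarn.credentials.file": "/new/creds",  # hypothetical path
}

restored = restore_conf(checkpointed, current)
print(restored)
```

Keys outside the reload list keep their checkpointed values, which preserves the original intent of restoring the previous application's state; a reload key that the current application does not set is left as it was in the checkpoint.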
…om checkpoint. ## What changes were proposed in this pull request? Reload the `spark.yarn.credentials.file` property when restarting a streaming application from checkpoint. ## How was this patch tested? Manual tested with 1.6.3 and 2.1.1. I didn't test this with master because of some compile problems, but I think it will be the same result. ## Notice This should be merged into maintenance branches too. jira: [SPARK-21008](https://issues.apache.org/jira/browse/SPARK-21008) Author: saturday_s <[email protected]> Closes #18230 from saturday-shi/SPARK-21008. (cherry picked from commit e92ffe6) Signed-off-by: Marcelo Vanzin <[email protected]>
@saturday-shi I don't know your jira handle, let me know it if you want the bug to be assigned to you. |
@vanzin Xing Shi (saturday_s), thanks. |
…om checkpoint. ## What changes were proposed in this pull request? Reload the `spark.yarn.credentials.file` property when restarting a streaming application from checkpoint. ## How was this patch tested? Manual tested with 1.6.3 and 2.1.1. I didn't test this with master because of some compile problems, but I think it will be the same result. ## Notice This should be merged into maintenance branches too. jira: [SPARK-21008](https://issues.apache.org/jira/browse/SPARK-21008) Author: saturday_s <[email protected]> Closes apache#18230 from saturday-shi/SPARK-21008. (cherry picked from commit e92ffe6) Signed-off-by: Marcelo Vanzin <[email protected]> (cherry picked from commit a233fac)
## What changes were proposed in this pull request?
Reload the `spark.yarn.credentials.file` property when restarting a streaming application from checkpoint.

## How was this patch tested?
Manually tested with 1.6.3 and 2.1.1.
I didn't test this with master because of some compile problems, but I think the result will be the same.

## Notice
This should be merged into maintenance branches too.

jira: SPARK-21008