[CELEBORN-1601] Support revise lost shuffles #2746
@FMX, could you also add a corresponding CLI command for the HTTP endpoint that revises lost shuffles?
Sounds good. I'll add the CLI command.
@SteNicholas Thanks. I have added the CLI command and the API endpoint. Please review this PR when you have time.
cli/src/main/scala/org/apache/celeborn/cli/master/MasterSubcommandImpl.scala
master/src/main/scala/org/apache/celeborn/service/deploy/master/Master.scala
e71ac9e to 1daa172
This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.
service/src/main/scala/org/apache/celeborn/server/common/HttpService.scala
cli/src/main/scala/org/apache/celeborn/cli/master/ReviseLostShuffleOptions.scala
LGTM
Co-authored-by: Nicholas Jiang <[email protected]>
cli/src/main/scala/org/apache/celeborn/cli/common/CommonOptions.scala
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
...c/main/scala/org/apache/celeborn/service/deploy/master/http/api/v1/ApplicationResource.scala
…scala Co-authored-by: Nicholas Jiang <[email protected]>
Co-authored-by: Nicholas Jiang <[email protected]>
…r/http/api/v1/ApplicationResource.scala Co-authored-by: Nicholas Jiang <[email protected]>
…s.scala Co-authored-by: Nicholas Jiang <[email protected]>
Thanks. Merged to main (v0.6.0).
### What changes were proposed in this pull request?

To support revising lost shuffle IDs in a long-running job, such as Flink batch jobs.

### Why are the changes needed?

1. To support revising lost shuffles.
2. To add an HTTP endpoint to revise lost shuffles manually.

### Does this PR introduce _any_ user-facing change?

NO.

### How was this patch tested?

Cluster tests.

Closes apache#2746 from FMX/b1600.

Lead-authored-by: mingji <[email protected]>
Co-authored-by: Ethan Feng <[email protected]>
Signed-off-by: SteNicholas <[email protected]>