
[CELEBORN-1451] HPA support #2776

Open
wants to merge 3 commits into base: main
Conversation

lianneli
Contributor

What changes were proposed in this pull request?

  1. Add a HorizontalPodAutoscaler for the worker to the Helm chart.
  2. Add a HorizontalPodAutoscaler test.
  3. Add a lifecycle preStop hook to the worker StatefulSet: when the HPA scales down a worker, the worker triggers decommission through the HTTP RESTful API (see the sketch after this list).
  4. Delete the duplicated resources key in the worker and master StatefulSets.
  5. Change the app version to 0.6.0.
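
For illustration, a minimal sketch of what the values.yaml switch and the preStop hook could look like. The key names, replica bounds, port, and decommission endpoint path below are assumptions for illustration, not necessarily the exact ones used in this PR:

```yaml
# values.yaml (sketch; key names and defaults are assumptions, not the PR's exact schema)
worker:
  hpa:
    enabled: false            # switch is off by default, so there is no user-facing change
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 70
---
# Fragment of the worker StatefulSet container spec (sketch of the preStop hook).
# The decommission endpoint path and port are assumptions; check the Celeborn worker
# HTTP/REST API documentation for the actual values.
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        - curl -s -X POST http://localhost:9096/api/v1/workers/decommission
```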

Why are the changes needed?

For most of the daytime, the Spark workload is light and there is very little shuffle data, so Celeborn does not need many worker Pods and the idle Pods waste resources. HPA can manage this automatically.

Does this PR introduce any user-facing change?

No. The HPA is behind a switch whose default value is false.

How was this patch tested?

Tested locally and in dev environment.

@s0nskar
Contributor

s0nskar commented Sep 30, 2024

@lianneli This is a great feature. On what metrics will it upscale/downscale? Is there any documentation for this?

@lianneli
Contributor Author

lianneli commented Oct 8, 2024

> @lianneli This is a great feature. On what metrics will it upscale/downscale? Is there any documentation for this?

@s0nskar The official doc is https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/.

For convenience, I provide a default setting in values.yaml: CPU utilization of the worker pods is the decisive metric. When CPU utilization stays above 70% for 10s, the HPA scales up; when it stays below 70% for 300s, it scales down.
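
For reference, these defaults would roughly render to an autoscaling/v2 HorizontalPodAutoscaler along the lines of the sketch below; the resource names, replica bounds, and the exact template produced by this PR may differ:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: celeborn-worker        # name is an assumption for illustration
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: celeborn-worker      # assumed worker StatefulSet name
  minReplicas: 2               # assumed bounds
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 10    # scale up once CPU has stayed above target for ~10s
    scaleDown:
      stabilizationWindowSeconds: 300   # scale down only after ~300s below target
```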

There are still some risks involved, since a worker being scaled down may still have work in progress. Although worker pods trigger decommission before shutting down, it is highly recommended to set celeborn.client.push.replicate.enabled to true for more stable performance.

@RexXiong
Contributor

RexXiong commented Oct 8, 2024

Thanks @lianneli for adding HPA support to Celeborn. But for the Celeborn StatefulSet, I believe there are several shortcomings with HPA and the current implementation:

  1. The worker may still be doing work, even though, as you note, worker pods trigger decommission before they are closed.
  2. Once a worker pod has been decommissioned, there is currently no mechanism to recommission it. Consequently, if the cluster experiences increased demand after decommissioning, new worker pods must be spun up; the cluster may have low resource efficiency, or data may be lost because of point 1.
  3. Settings such as the stabilization window (stabilizationWindowSeconds), or the ability to dynamically enable/disable scaling up/down, are fixed at deployment time. Adjusting these settings or altering the number of replicas in the StatefulSet requires redeployment; IMO we should support changing these parameters/configurations at runtime.
  4. The solution is limited if we want to support custom scaling behavior/metrics (network, disk space, memory, CPU?) or a ResourceManager (internal platform) that does not talk to Kubernetes directly.

Someone in the community has also proposed a scaling solution (it may be sent to the dev mailing list later); we can then discuss these two approaches to scaling Celeborn.

@lianneli
Contributor Author

lianneli commented Oct 8, 2024

@RexXiong That solution sounds great. I will follow up on the discussion through the mailing list.


This PR is stale because it has been open 20 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions bot added the stale label Oct 28, 2024