Add Support For Pods Auto Scaling #817
-
Can you share the details of when you trigger the autoscaling and how you define it right now? It's definitely something we can consider, but it's also not as easy as it sounds (making sure it's triggered at the right time), and in many cases it doesn't actually provide any significant benefit. In fact, it can make performance worse: queues need to be rebalanced and, depending on the configured policies, additional mirrors/followers may need to be created.
-
This would be the HorizontalPodAutoscaler.yaml which defines the policy for scaling:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: $(ENVIRONMENT)-rabbitmq-stateful-set-autoscaler
  namespace: $(ENVIRONMENT)-rabbitmq
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: $(ENVIRONMENT)-rabbitmq-stateful-set
  minReplicas: 3
  maxReplicas: 9
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 75
```

As can be seen, a scale operation occurs when the average CPU or memory utilization across the pods reaches 75%.
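One related knob: since Kubernetes 1.18 the HPA spec also accepts a `behavior` section that can dampen scale-downs, which may matter here given how disruptive removing a RabbitMQ node can be. The values below are illustrative, not a recommendation. (Note also that `autoscaling/v2beta2` was removed in Kubernetes 1.26; the same fields are available under the GA `autoscaling/v2` API.)

```yaml
  # Illustrative addition under the HPA spec above: require 10 minutes of
  # sustained low utilization before scaling down, and remove at most
  # one pod every 5 minutes.
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600
      policies:
        - type: Pods
          value: 1
          periodSeconds: 300
```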
And in the StatefulSet.yaml I'm just using volumeClaimTemplates to claim a volume for each created pod:

```yaml
volumeClaimTemplates:
  - metadata:
      name: $(ENVIRONMENT)-rabbitmq-data
      namespace: $(ENVIRONMENT)-rabbitmq
    spec:
      storageClassName: $(STORAGE_CLASS)
      accessModes: [ReadWriteOnce]
      resources:
        requests:
          storage: 1Gi
```

The way I see it, correct me if I'm wrong, every node has a claim for its own volume and data is not shared between nodes, so rebalancing is not needed when a new node goes live (it will just connect to the cluster and, as soon as it's ready, it will be able to handle incoming messages). The only thing I can think of that might need special treatment is a scale-down operation, but that can be achieved without rebalancing. Maybe I'm not aware of other edge cases, but in my tests I didn't run into problems even without special handling for scale-downs.
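That said, one cheap way to make scale-downs safer without any rebalancing logic would be a preStop hook that blocks pod termination while the node is still needed for quorum. A rough sketch, where the container name and timeout are illustrative (the `rabbitmq-upgrade await_online_quorum_plus_one` command ships with recent RabbitMQ versions):

```yaml
# Illustrative addition to the StatefulSet pod template: before a pod is
# terminated (e.g. during a scale-down), wait until the remaining nodes
# can lose this one without quorum queues losing their majority.
spec:
  template:
    spec:
      containers:
        - name: rabbitmq            # hypothetical container name
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - rabbitmq-upgrade await_online_quorum_plus_one -t 600
```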
-
So there are two key aspects:
Finally, do you have any metrics showing that what you configured helped to handle your traffic? Have you seen increased throughput and/or lower latency after additional nodes were deployed? Have you seen CPU and/or RAM usage going down after the scaling event? You can read more about what's involved just to handle a "simple" scale-down operation here: #223. Best,
-
I want to use the operator to deploy a RabbitMQ cluster on Kubernetes, but I couldn't find any way to define auto scaling.
Is there any reason it's not supported? Is it planned for the future?
When I create my own custom deployment, I just define a HorizontalPodAutoscaler object and it takes care of the rest, so I thought it would be supported in the official operator as well, since it doesn't seem like a complicated matter (see the sketch below).
Insights into it would be appreciated.
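To illustrate, this is roughly the kind of object I mean, pointed at what the operator deploys. I'm assuming here that the operator names the generated StatefulSet `<cluster-name>-server`, and I don't know whether the operator would reconcile the replica count back and fight the HPA; all names below are hypothetical.

```yaml
# Hypothetical HPA targeting the StatefulSet created by the operator
# for a RabbitmqCluster named "myrabbit".
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myrabbit-autoscaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: myrabbit-server           # assumed operator naming convention
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 75
```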