[bitnami/rabbitmq]: cpu use much higher after updating from 11.4.0 to 11.13.0 #16141

Closed
oprudkyi opened this issue Apr 20, 2023 · 3 comments
Labels
rabbitmq solved tech-issues The user has a technical issue about an application triage Triage is needed


@oprudkyi

Name and Version

bitnami/rabbitmq 11.13.0

What architecture are you using?

arm64

What steps will reproduce the bug?

  • gke
  • standalone (1 replica) and ha config (3 replicas, clustering = true)
  • low load (up to 5-50 messages per second)
  • update from 11.4.0 to 11.13.0 (also checked 11.12.1)
  • check CPU load a few hours after the update/restart
kubectl top pods -A | grep rabbit
rabbitmq                  rabbitmq-standalone-0                                            961m         169Mi

Are you using any custom parameters or values?

  values = [
    yamlencode({
      replicaCount          = 1
      podAntiAffinityPreset = "hard"
      auth = {
        username     = "..."
        password     = "..."
        erlangCookie = "..."
      }
      memoryHighWatermark = {
        enabled = true
        # https://www.rabbitmq.com/memory.html
        type  = "relative"
        value = "0.5"
      }
      clustering = {
        enabled   = false
        rebalance = false
      }
      maxAvailableSchedulers = "2"
      onlineSchedulers       = "2"
      resources = {
        requests = {
          memory = "512Mi"
          cpu    = "200m"
        }
        limits = {
          memory = "2Gi"
          cpu    = "1"
        }
      }
      terminationGracePeriodSeconds = 15 //default 120, for preemptible <= 25 is supported
      service = {
        externalTrafficPolicy = "Local"
      }
      persistence = {
        enabled      = true
        storageClass = "premium-rwo"
        size         = "2Gi"
      }
      extraConfiguration = join("\n", [
        "default_vhost = some-name",
        "log.file.level = error",
        "log.console.level = error",
        "consumer_timeout = 1800000",
      ])
      nodeSelector = {
        "kubernetes.io/os" = "linux"
        role               = "common-services"
        preemptible        = "false"
      }
      metrics = {
        enabled = true
      }
      # don't allow to kill it by kube-dns
      priorityClassName = "system-cluster-critical"
    })
  ]

What is the expected behavior?

low cpu use

kubectl top pods -A | grep rabbit
rabbitmq                  rabbitmq-standalone-0                                            148m         239Mi

What do you see instead?

higher CPU use (~1 CPU per process)

kubectl top pods -A | grep rabbit
rabbitmq                  rabbitmq-standalone-0                                            961m         169Mi

Additional information

It's weird: right after the update, CPU usage is low (70-200m), but within an hour or a few it rises to ~1 CPU without any logs, errors, etc.
Here are metrics from a 3-node HA cluster: CPU started low, and then the nodes, one by one, started to use more CPU.
[image]

Also, on some clusters one replica keeps using low CPU for days while the others start to use more CPU.

kubectl top pods -A | grep rabbitmq
rabbitmq                  rabbitmq-cluster-0                                               1019m        320Mi           
rabbitmq                  rabbitmq-cluster-1                                               103m         278Mi           
rabbitmq                  rabbitmq-cluster-2                                               1045m        324Mi   
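
For anyone who wants to spot affected replicas automatically, here is a minimal sketch that parses `kubectl top pods -A` rows like the ones above and lists the pods running close to their CPU limit. The helper names (`parse_cpu`, `parse_top_output`, `hot_pods`), the 1 CPU limit (taken from the standalone values above, not confirmed for the HA cluster), and the 90% threshold are illustrative assumptions, not part of the chart or of kubectl.

```python
from typing import NamedTuple


class PodUsage(NamedTuple):
    namespace: str
    pod: str
    cpu_millicores: int
    memory: str


def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("961m" or "1") to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def parse_top_output(text: str) -> list[PodUsage]:
    """Parse rows of `kubectl top pods -A` (namespace, pod, CPU, memory)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0] == "NAMESPACE":
            continue  # skip blank lines and the header row
        namespace, pod, cpu, memory = fields
        rows.append(PodUsage(namespace, pod, parse_cpu(cpu), memory))
    return rows


def hot_pods(rows: list[PodUsage], limit: str = "1", threshold: float = 0.9) -> list[str]:
    """Return pods whose CPU use is at least `threshold` of `limit` (both assumed values)."""
    limit_m = parse_cpu(limit)
    return [r.pod for r in rows if r.cpu_millicores >= threshold * limit_m]


# Sample rows copied from the output above.
SAMPLE = """\
rabbitmq  rabbitmq-cluster-0  1019m  320Mi
rabbitmq  rabbitmq-cluster-1  103m   278Mi
rabbitmq  rabbitmq-cluster-2  1045m  324Mi
"""

if __name__ == "__main__":
    for pod in hot_pods(parse_top_output(SAMPLE)):
        print(pod)
```

With the sample rows, this flags `rabbitmq-cluster-0` and `rabbitmq-cluster-2` and leaves `rabbitmq-cluster-1` out, matching what is visible by eye.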

I also tried to increase cpu limits, but without success

@oprudkyi oprudkyi added the tech-issues The user has a technical issue about an application label Apr 20, 2023
@github-actions github-actions bot added the triage Triage is needed label Apr 20, 2023
@oprudkyi
Author

I also tried increasing the memory request/limit, but without success.
Rolling back to 11.4.0 helps, though.

@oprudkyi
Author

Correction: rolling back to 11.4.0 didn't help. Either some changes were saved at the filesystem level, or something like that.

@javsalgar
Contributor

Hi,

Thank you for opening the ticket. I believe this is a duplicate of #11116. Feel free to reopen it if you believe it is something different.
