High CPU usage #193

smaznet · 2024-05-12T20:08:44Z

i used caddy-l4 for forwarding data to unix sockets but i have cpu issue with caddy

handler reverse_proxy with proxy_protocol enabled

mohammed90 · 2024-05-12T22:59:42Z

It's hard to resolve such claim without specifics. You can start by reading this page to extract CPU profiles from Caddy. Share the profiles with us.

I've attached a template below that will help make this easier and faster! This will require some effort on your part -- please understand that we will be dedicating time to fix the bug you are reporting if you can just help us understand it and reproduce it easily.

This template will ask for some information you've already provided; that's OK, just fill it out the best you can. 👍 I've also included some helpful tips below the template. Feel free to let me know if you have any questions!

Thank you again for your report, we look forward to resolving it!

Template

## 1. Environment

### 1a. Operating system and version

```
paste here
```


### 1b. Caddy version (run `caddy version` or paste commit SHA)

```
paste here
```


### 1c. Go version (if building Caddy from source; run `go version`)

```
paste here
```


## 2. Description

### 2a. What happens (briefly explain what is wrong)




### 2b. Why it's a bug (if it's not obvious)




### 2c. Log output

```
paste terminal output or logs here
```



### 2d. Workaround(s)




### 2e. Relevant links




## 3. Tutorial (minimal steps to reproduce the bug)

Helpful tips

Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.
Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.
- 2c) Log output: Paste terminal output and/or complete logs in a code block. DO NOT REDACT INFORMATION except for credentials.
- 2d) Workaround: What are you doing to work around the problem in the meantime? This can help others who encounter the same problem, until we implement a fix.
- 2e) Relevant links: Please link to any related issues, pull requests, docs, and/or discussion. This can add crucial context to your report.
Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please:
- Start with an empty config. Add only the lines/parameters that are absolutely required to reproduce the bug.
- Do not run Caddy inside containers.
- Run Caddy manually in your terminal; do not use systemd or other init systems.
- If making HTTP requests, avoid web browsers. Use a simpler HTTP client instead, like curl.
- Do not redact any information from your config (except credentials). Domain names are public knowledge and often necessary for quick resolution of an issue!
- Note that ignoring this advice may result in delays, or even in your issue being closed. 😞 Only actionable issues are kept open, and if there is not enough information or clarity to reproduce the bug, then the report is not actionable.

Example of a tutorial:

Create a config file:
{ ... }
Open terminal and run Caddy:
$ caddy ...
Make an HTTP request:
$ curl ...
Notice that the result is ___ but it should be ___.

smaznet · 2024-05-13T03:35:32Z

1. Environment

Ubuntu 22.04.2 LTS

Caddy version:

v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=

config

......
    {
    "listen": [
            ":5000",
            ":2087"
          ],
          "routes": [
            {
              "match": [
                {
                  "tls": {}
                }
              ],
              "handle": [
                {
                  "handler": "tls",
                  "connection_policies": [
                    {
                      "client_authentication": {
                        "mode": "request"
                      }
                    }
                  ]
                },
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//sockets/r-{l4.tls.server_name}.socket"
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "match": [
                {
                  "http": []
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//sockets/r-{l4.http.host}.socket"
                      ]
                    }
                  ]
                }
              ]
            }
          ]}
  
    ....

profile.svg

mohammed90 · 2024-05-13T05:33:52Z

We need the full file, not just a cropped image. Also share the full config.

smaznet · 2024-05-13T06:29:26Z

Okay my test enviroment
needed tools

haproxy
caddy with caddy-l4 module
vegeta (https://github.com/tsenart/vegeta)

OS

ubuntu 22.04

add this line to /etc/hosts

127.0.0.1 sub.dom.com

Haproxy config:

frontend http_front
    bind /tmp/r-sub.dom.com:2087.socket accept-proxy
    default_backend static_backend

backend static_backend
    mode http
    http-request return status 200 content-type text/plain string "Hello World"

Caddy full Config

{
  "logging": {
    "logs": {
      "default": {
        "level": "DEBUG"
      }
    }
  },
  "apps": {
    "tls": {
      "cache": {
        "capacity": 2000
      },
      "certificates": {
        "automate": [
          "sub.dom.com"
        ]
      },
      "automation": {
        "policies": [
          {
            "issuers": [
              {
                "module": "internal",
                "lifetime": "600d"
              }
            ]
          }
        ]
      }
    },
    "layer4": {
      "servers": {
        "ws": {
          "listen": [
            ":5000",
            ":2087"
          ],
          "routes": [
            {
              "match": [
                {
                  "tls": {}
                }
              ],
              "handle": [
                {
                  "handler": "tls",
                  "connection_policies": [
                    {
                      "client_authentication": {
                        "mode": "request"
                      }
                    }
                  ]
                },
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//tmp/r-{l4.tls.server_name}:2087.socket"
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "match": [
                {
                  "http": []
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//tmp/r-{l4.http.host}.socket"
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    }
  }
}

Test steps

haproxy -f hp.conf &
caddy run --config test.json

with ssl:

echo "GET https://sub.dom.com:2087" | ./vegeta attack -output /dev/null -rate=0 -max-workers 1000 -insecure

without ssl:

echo "GET http://sub.dom.com:2087" | ./vegeta attack -output /dev/null -rate=0 -max-workers 1000

profiles

NOTE: .gz is fake for upload to github remove this from file name

profile with ssl

profile (7).gz

profile without ssl

profile (6).gz

smaznet · 2024-05-13T06:30:36Z

We need the full file, not just a cropped image

its not cropped image its svg file open in other tab

mholt · 2024-05-13T14:31:17Z

I noticed the GitHub app has a hard time with SVGs.

looks like a LOT of time is being spent in syscalls... reading and writing file descriptors (sockets) in the proxy module. I assume these are TCP packets from the client, based on the listen address?

@smaznet If you remove the PROXY protocol config does the CPU usage improve?

smaznet · 2024-05-13T17:46:31Z

i'll try

ydylla · 2024-05-13T19:32:36Z

@smaznet What exactly do you mean with high CPU usage? What do you expect as normal usage?
Sending requests as fast as possible from 1000 workers with vegeta will obviously max out your CPU.

Since I was afraid this had something to do with the recent matching rewrite (see #192) I did some own benchmarking. With my favorite testing config 😅 :

prefetch.json

{
  "admin": {
    "disabled": true
  },
  "logging": {
    "logs": {
      "default": {"level":"DEBUG", "encoder": {"format":"console"}}
    }
  },
  "apps": {
    "tls": {
      "certificates": {
        "automate": ["localhost"]
      },
      "automation": {
        "policies": [{
          "subjects": ["localhost"],
          "issuers": [{
            "module": "internal"
          }]
        }]
      }
    },
    "layer4": {
      "servers": {
        "https": {
          "listen": ["0.0.0.0:10443"],
          "routes": [
            {
              "match": [
                {"proxy_protocol": {}}
              ],
              "handle": [
                {
                  "handler": "proxy_protocol",
                  "timeout": "2s",
                  "allow": ["127.0.0.1/32"]
                }
              ]
            },
            {
              "match": [
                {"tls": {"sni": ["localhost"]}}
              ],
              "handle": [
                {"handler": "tls"}
              ]
            },
            {
              "match": [
                {"http": [{"host": ["localhost"]}]}
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [{"dial": ["127.0.0.1:10080"]}]
                }
              ]
            }
          ]
        }
      }
    },
    "http": {
      "servers": {
        "backend": {
          "protocols": ["h1","h2","h2c"],
          "listen": ["127.0.0.1:10080"],
          "routes": [
            {
              "handle": [{
                "handler": "static_response",
                "status_code": "200",
                "body": "Hello World\n",
                "headers": {
                  "Content-Type": ["text/plain"]
                }
              }]
            }
          ]
        }
      }
    }
  }
}

These are the numbers I get on my machine for the commit 145ec36 (before the matching rewrite) and f049165 is the current master:

PS G:\Temp> echo "GET http://localhost:10443" | ./vegeta.exe attack -rate=0 -max-workers 1000 -duration=60s -output 145ec3-http-1.bin
PS G:\Temp> echo "GET http://localhost:10443" | ./vegeta.exe attack -rate=0 -max-workers 1000 -duration=60s -output 145ec3-http-2.bin
PS G:\Temp> echo "GET http://localhost:10443" | ./vegeta.exe attack -rate=0 -max-workers 1000 -duration=60s -output 145ec3-http-3.bin
PS G:\Temp> .\vegeta.exe report .\145ec3-http-1.bin
Requests      [total, rate, throughput]         9411901, 156865.33, 156855.21
Duration      [total, attack, wait]             1m0s, 1m0s, 3.87ms
Latencies     [min, mean, 50, 90, 95, 99, max]  1.102ms, 3.036ms, 1.614ms, 7.674ms, 10.589ms, 17.863ms, 1.149s
Bytes In      [total, mean]                     112942812, 12.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:9411901
Error Set:
PS G:\Temp> .\vegeta.exe report .\145ec3-http-2.bin
Requests      [total, rate, throughput]         9527806, 158794.31, 158792.91
Duration      [total, attack, wait]             1m0s, 1m0s, 527µs
Latencies     [min, mean, 50, 90, 95, 99, max]  534.7µs, 1.688ms, 595.513µs, 4.06ms, 6.213ms, 12.558ms, 1.119s
Bytes In      [total, mean]                     114333672, 12.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:9527806
Error Set:
PS G:\Temp> .\vegeta.exe report .\145ec3-http-3.bin
Requests      [total, rate, throughput]         9404961, 156749.46, 156749.24
Duration      [total, attack, wait]             1m0s, 1m0s, 84.3µs
Latencies     [min, mean, 50, 90, 95, 99, max]  84.3µs, 3.381ms, 1.777ms, 8.625ms, 11.849ms, 19.253ms, 972.624ms
Bytes In      [total, mean]                     112859532, 12.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:9404961
Error Set:

PS G:\Temp> echo "GET http://localhost:10443" | ./vegeta.exe attack -rate=0 -max-workers 1000 -duration=60s -output f04916-http-1.bin
PS G:\Temp> echo "GET http://localhost:10443" | ./vegeta.exe attack -rate=0 -max-workers 1000 -duration=60s -output f04916-http-2.bin
PS G:\Temp> echo "GET http://localhost:10443" | ./vegeta.exe attack -rate=0 -max-workers 1000 -duration=60s -output f04916-http-3.bin
PS G:\Temp> .\vegeta.exe report .\f04916-http-1.bin
Requests      [total, rate, throughput]         9356041, 155928.36, 155920.48
Duration      [total, attack, wait]             1m0s, 1m0s, 3.035ms
Latencies     [min, mean, 50, 90, 95, 99, max]  22.8µs, 1.815ms, 684.44µs, 4.331ms, 6.731ms, 12.387ms, 1.376s
Bytes In      [total, mean]                     112272492, 12.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:9356041
Error Set:
PS G:\Temp> .\vegeta.exe report .\f04916-http-2.bin
Requests      [total, rate, throughput]         9228292, 153804.43, 153799.90
Duration      [total, attack, wait]             1m0s, 1m0s, 1.767ms
Latencies     [min, mean, 50, 90, 95, 99, max]  110.4µs, 3.416ms, 1.819ms, 8.547ms, 11.795ms, 19.691ms, 1.148s
Bytes In      [total, mean]                     110739504, 12.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:9228292
Error Set:
PS G:\Temp> .\vegeta.exe report .\f04916-http-3.bin
Requests      [total, rate, throughput]         9500202, 158335.41, 158329.55
Duration      [total, attack, wait]             1m0s, 1m0s, 2.22ms
Latencies     [min, mean, 50, 90, 95, 99, max]  606.6µs, 1.762ms, 649.258µs, 4.178ms, 6.483ms, 12.463ms, 1.213s
Bytes In      [total, mean]                     114002424, 12.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:9500202
Error Set:
PS G:\Temp>

So just a <1% difference in average throughput. But running caddy and vegeta on the same machine is not very sensible anyways since they fight over cpu time.

Adding sending and receiving of proxy_protocol to the config did not change the numbers significantly. Running with TLS was like expected a bit slower with an average throughput of 130590.14

mholt · 2024-05-13T19:55:15Z

FWIW I'm also guessing normal operation. Lots of traffic = lots of CPU. 🤷‍♂️

By far the VAST majority of time is in syscalls anyway. Syscalls are out of Caddy's realm.

T4cC0re · 2024-06-10T09:09:54Z

Hey there!

We have also seen this issue, and when grabbing a profile of the process, it seems a lot of time is being wasted in userspace in the layer4 module.

CPU utilization went from ~1% to ~600%. We have isolated this to be a change bewteen commit c5d815d (good, pre 19:00) and ce9789f (awful, post 19:00)

Please find attached cpu profiles captured (no GC, binary, 60s duration each) during regular traffic periods with Caddy at version 2.8.4 and either commit of the module.

profiles.zip

The following configuration was used: (redacted)

We need to use unix sockets as an intermediate because of a use-case we have.

{
  "logging": {
    "logs": {
      "default": {
        "level": "info",
        "writer": {
          "output": "stdout"
        },
        "encoder": {
          "format": "json"
        }
      }
    }
  },
  "apps": {
    "layer4": {
      "servers": {
        "http": {
          "listen": [
            ":80"
          ],
          "routes": [
            {
              "match": [
                {
                  "http": [
                    {
                      "host": [
                        "gitlab.REDACTED"
                      ]
                    }
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//var/opt/caddy/caddy.gitlab"
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "match": [
                {
                  "http": [
                    {
                      "host": [
                        "registry.REDACTED"
                      ]
                    }
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "unix//var/opt/caddy/caddy.registry"
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        },
        "https": {
          "listen": [
            ":443"
          ],
          "routes": [
            {
              "match": [
                {
                  "tls": {}
                }
              ],
              "handle": [
                {
                  "handler": "subroute",
                  "routes": [
                    {
                      "match": [
                        {
                          "tls": {
                            "sni": [
                              "gitlab.REDACTED"
                            ]
                          }
                        }
                      ],
                      "handle": [
                        {
                          "handler": "proxy",
                          "proxy_protocol": "v2",
                          "upstreams": [
                            {
                              "dial": [
                                "unix//var/opt/caddy/caddy.gitlab"
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "match": [
                        {
                          "tls": {
                            "sni": [
                              "registry.REDACTED"
                            ]
                          }
                        }
                      ],
                      "handle": [
                        {
                          "handler": "proxy",
                          "proxy_protocol": "v2",
                          "upstreams": [
                            {
                              "dial": [
                                "unix//var/opt/caddy/caddy.registry"
                              ]
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        },
        "ssh": {
          "listen": [
            ":22"
          ],
          "routes": [
            {
              "handle": [
                {
                  "handler": "proxy",
                  "proxy_protocol": "v2",
                  "upstreams": [
                    {
                      "dial": [
                        "10.0.1.1:2022"
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    },
    "http": {
      "http_port": 80,
      "https_port": 443,
      "servers": {
        "gitlab": {
          "strict_sni_host": true,
          "listen": [
            "unix//var/opt/caddy/caddy.gitlab"
          ],
          "listener_wrappers": [
            {
              "wrapper": "proxy_protocol",
              "timeout": 0
            },
            {
              "wrapper": "http_redirect"
            },
            {
              "wrapper": "tls"
            }
          ],
          "metrics": {},
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "gitlab.REDACTED"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "headers",
                  "response": {
                    "set": {
                      "Server": [
                        "REDACTED"
                      ]
                    }
                  }
                },
                {
                  "encodings": {
                    "gzip": {},
                    "zstd": {}
                  },
                  "handler": "encode",
                  "prefer": [
                    "zstd",
                    "gzip"
                  ]
                },
                {
                  "handler": "reverse_proxy",
                  "load_balancing": {
                    "selection_policy": {
                      "policy": "round_robin"
                    }
                  },
                  "upstreams": [
                    {
                      "dial": "10.0.1.1:8081"
                    }
                  ]
                }
              ],
              "terminal": true
            }
          ]
        },
        "registry": {
          "strict_sni_host": true,
          "listen": [
            "unix//var/opt/caddy/caddy.registry"
          ],
          "listener_wrappers": [
            {
              "wrapper": "proxy_protocol",
              "timeout": 0
            },
            {
              "wrapper": "http_redirect"
            },
            {
              "wrapper": "tls"
            }
          ],
          "metrics": {},
          "routes": [
            {
              "match": [
                {
                  "host": [
                    "registry.REDACTED"
                  ]
                }
              ],
              "handle": [
                {
                  "handler": "headers",
                  "response": {
                    "set": {
                      "Server": [
                        "REDACTED"
                      ]
                    }
                  }
                },
                {
                  "encodings": {
                    "gzip": {},
                    "zstd": {}
                  },
                  "handler": "encode",
                  "prefer": [
                    "zstd",
                    "gzip"
                  ]
                },
                {
                  "handler": "reverse_proxy",
                  "load_balancing": {
                    "selection_policy": {
                      "policy": "round_robin"
                    }
                  },
                  "upstreams": [
                    {
                      "dial": "10.0.1.1:18081"
                    }
                  ]
                }
              ],
              "terminal": true
            }
          ]
        }
      }
    },
    "tls": {
      "certificates": {
        "load_folders": [
          "/var/opt/caddy/certificates"
        ]
      },
      "automation": {
        "policies": [
          {
            "on_demand": true,
            "issuers": [
              {
                "email": "REDACTED",
                "module": "acme",
                "challenges": {
                  "http": {
                    "disabled": true
                  },
                  "tls-alpn": {
                    "disabled": false
                  }
                }
              }
            ]
          }
        ],
        "on_demand": {
          "permission": {
            "endpoint": "http://127.0.0.1:2223",
            "module": "http"
          }
        }
      }
    }
  }
}

mholt · 2024-06-10T13:53:19Z

Thanks for the details. So lots of time spent matching TLS in your case.

Does the following PR help? Can you try this branch? #199

mohammed90 added the need info The issue is missing critical information label May 12, 2024

Repository owner deleted a comment from hndjdjdj Jun 3, 2024

ydylla mentioned this issue Jun 6, 2024

fix: endless matching loop #199

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

High CPU usage #193

High CPU usage #193

smaznet commented May 12, 2024

mohammed90 commented May 12, 2024

smaznet commented May 13, 2024 •

edited

Loading

mohammed90 commented May 13, 2024

smaznet commented May 13, 2024

smaznet commented May 13, 2024

mholt commented May 13, 2024 •

edited

Loading

smaznet commented May 13, 2024

ydylla commented May 13, 2024

mholt commented May 13, 2024 •

edited

Loading

T4cC0re commented Jun 10, 2024

mholt commented Jun 10, 2024

High CPU usage #193

High CPU usage #193

Comments

smaznet commented May 12, 2024

mohammed90 commented May 12, 2024

Template

Helpful tips

smaznet commented May 13, 2024 • edited Loading

1. Environment

Ubuntu 22.04.2 LTS

Caddy version:

v2.7.6 h1:w0NymbG2m9PcvKWsrXO6EEkY9Ru4FJK8uQbYcev1p3A=

config

profile.svg

mohammed90 commented May 13, 2024

smaznet commented May 13, 2024

OS

add this line to /etc/hosts

Haproxy config:

Caddy full Config

Test steps

with ssl:

without ssl:

profiles

NOTE: .gz is fake for upload to github remove this from file name

profile with ssl

profile without ssl

smaznet commented May 13, 2024

mholt commented May 13, 2024 • edited Loading

smaznet commented May 13, 2024

ydylla commented May 13, 2024

mholt commented May 13, 2024 • edited Loading

T4cC0re commented Jun 10, 2024

mholt commented Jun 10, 2024

smaznet commented May 13, 2024 •

edited

Loading

mholt commented May 13, 2024 •

edited

Loading

mholt commented May 13, 2024 •

edited

Loading