This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

Possible memory leak #1742

Closed
jsermeno opened this issue Jan 1, 2017 · 2 comments


jsermeno commented Jan 1, 2017

Issue Report

Bug

CoreOS Version

$ cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=1185.5.0
VERSION_ID=1185.5.0
BUILD_ID=2016-12-07-0937
PRETTY_NAME="CoreOS 1185.5.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

Environment

What hardware/cloud provider/hypervisor is being used to run CoreOS?

Three DigitalOcean droplets with 2 GB of memory each.

Expected Behavior

Running only fleet, etcd2, and flannel should not use enough memory to crash a machine with 2 GB of RAM.

Actual Behavior

After boot, memory usage climbs steadily until it is exhausted. At that point kswapd0 CPU usage spikes and the machine becomes unresponsive. The whole process takes about 1-2 hours. Related to #1424.

Reproduction Steps

  1. Set up a 3-machine CoreOS cluster on DigitalOcean with etcd2, fleet, and flannel.
  2. Wait 1-2 hours.

Other Information

Using TLS with etcd2.

$ top
Tasks:  84 total,   2 running,  82 sleeping,   0 stopped,   0 zombie
%Cpu(s): 42.4 us, 13.7 sy,  0.0 ni, 43.4 id,  0.3 wa,  0.0 hi,  0.0 si,  0.2 st
KiB Mem:   2053072 total,  1935180 used,   117892 free,    72924 buffers
KiB Swap:        0 total,        0 used,        0 free.   305964 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2758 fleet     20   0  682492 380360   6392 R  26.3 18.5   3:19.82 fleetd
    1 root      20   0  178100  47984   6036 S  28.9  2.3   1:51.92 systemd
 2739 etcd      20   0  185564  39660  11424 S   1.0  1.9   0:15.44 etcd2
 1717 core      20   0   83096  36944   5392 S  17.6  1.8   1:04.90 systemd

A little later:

Tasks:  80 total,   3 running,  77 sleeping,   0 stopped,   0 zombie
%Cpu(s): 10.5 us, 88.5 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.8 hi,  0.2 si,  0.0 st
KiB Mem:   2053072 total,  1926704 used,   126368 free,    51004 buffers
KiB Swap:        0 total,        0 used,        0 free.   185272 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2758 fleet     20   0  682876 402224   6228 S  92.8 19.6   5:34.14 fleetd
   34 root      20   0       0      0      0 S  71.2  0.0   3:04.17 kswapd0
   26 root      39  19       0      0      0 R  26.3  0.0   0:02.15 khugepaged
 2739 etcd      20   0  259296  40584  11060 S   2.7  2.0   0:20.41 etcd2

This continues until the machine becomes unresponsive.
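
For anyone trying to reproduce this, the growth can be logged over time rather than watched in top. The following is only a minimal sketch, assuming a Linux /proc filesystem; `rss_kib` and `sample_rss` are hypothetical helper names and not part of fleet or CoreOS.

```shell
#!/bin/sh
# Hypothetical helper: print the resident set size (in KiB) of a PID.
# Assumes Linux, where /proc/<pid>/status has a "VmRSS:" line.
rss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Hypothetical helper: print "<UTC timestamp> <RSS KiB>" for a PID,
# COUNT times, sleeping INTERVAL seconds between samples.
sample_rss() {
    pid=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(rss_kib "$pid")"
        i=$((i + 1))
        # Skip the sleep after the final sample.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_rss "$(systemctl show -p MainPID fleet.service | cut -d= -f2)" 60 120 >> fleetd-rss.log` would record one sample per minute for two hours, which covers the 1-2 hour window described above.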

$ systemctl status fleet
● fleet.service - fleet daemon
   Loaded: loaded (/usr/lib/systemd/system/fleet.service; disabled; vendor preset: disabl
  Drop-In: /run/systemd/system/fleet.service.d
           └─20-cloudinit.conf, 30-certificates.conf
   Active: active (running) since Sun 2017-01-01 04:42:56 UTC; 28min ago
 Main PID: 2758 (fleetd)
    Tasks: 8
   Memory: 388.5M
      CPU: 5min 55.946s
   CGroup: /system.slice/fleet.service
           └─2758 /usr/bin/fleetd

cloud-config:

      #cloud-config

      coreos:
        etcd2:
          # generate a new token for each unique cluster from
          # https://discovery.etcd.io/new:
          discovery: discovery_url
          # multi-region deployments, multi-cloud deployments, and Droplets without
          # private networking need to use $public_ipv4:
          advertise-client-urls: https://$private_ipv4:2379,https://$private_ipv4:4001
          initial-advertise-peer-urls: https://$private_ipv4:2380
          # listen on the official ports 2379, 2380 and one legacy port 4001:
          listen-client-urls: https://0.0.0.0:2379,https://0.0.0.0:4001
          listen-peer-urls: https://$private_ipv4:2380
        fleet:
          etcd_servers: https://$private_ipv4:4001
          public-ip: $private_ipv4   # used for fleetctl ssh command
        flannel:
          etcd_endpoints: "https://127.0.0.1:2379"
          etcd_cafile: /home/core/ca.pem
          etcd_certfile: /home/core/coreos.pem
          etcd_keyfile: /home/core/coreos-key.pem
        update:
          reboot-strategy: off
        units:
          - name: etcd2.service
            command: start
          - name: fleet.service
            command: start
          - name: flanneld.service
            drop-ins:
              - name: 50-network-config.conf
                content: |
                  [Service]
                  ExecStartPre=/usr/bin/etcdctl --endpoints="https://127.0.0.1:2379" --cert-file=/home/core/coreos.pem --key-file=/home/core/coreos-key.pem --ca-file=/home/core/ca.pem set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'
            command: start
      write_files:
        # tell etcd2 and fleet where our certificates are going to live:
        - path: /run/systemd/system/etcd2.service.d/30-certificates.conf
          permissions: 0644
          content: |
            [Service]
            # client environment variables
            Environment=ETCD_CA_FILE=/home/core/ca.pem
            Environment=ETCD_CERT_FILE=/home/core/coreos.pem
            Environment=ETCD_KEY_FILE=/home/core/coreos-key.pem
            # peer environment variables
            Environment=ETCD_PEER_CA_FILE=/home/core/ca.pem
            Environment=ETCD_PEER_CERT_FILE=/home/core/coreos.pem
            Environment=ETCD_PEER_KEY_FILE=/home/core/coreos-key.pem
        - path: /run/systemd/system/fleet.service.d/30-certificates.conf
          permissions: 0644
          content: |
            [Service]
            # client auth certs
            Environment=FLEET_ETCD_CAFILE=/home/core/ca.pem
            Environment=FLEET_ETCD_CERTFILE=/home/core/coreos.pem
            Environment=FLEET_ETCD_KEYFILE=/home/core/coreos-key.pem

@crawford (Contributor)

We are in the process of sun-setting fleet with the intent to eventually remove it from the OS. We are encouraging users to move to Kubernetes instead.

@jsermeno (Author)

Okay, thanks.
