Hi. This is my first post, so I'm not sure I'm doing this right; sorry if not.
I am using a fork of tack in my production environments and would like to share some important points I found.
1 - Disable auto updates (CoreOS).
Today CoreOS and Kubernetes updates are not synchronized, which means nodes get rebooted and you see errors (until the health checks fail and the pods are rescheduled, including in ingress) or downtime. coreos/bugs#1274
units:
  - name: update-engine.service
    mask: true
  - name: locksmithd.service
    mask: true
Or try https://github.com/coreos/container-linux-update-operator (referenced in the bug above); I still need to test it.
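For reference, in a full CoreOS cloud-config the units above sit under the coreos: key. A minimal sketch (assuming the cloud-config format rather than Ignition; adapt it to wherever tack renders its user-data):

#cloud-config
coreos:
  units:
    # Mask update-engine so Container Linux never downloads a new image on its own.
    - name: update-engine.service
      mask: true
    # Mask locksmithd so it never reboots the node to apply an update.
    - name: locksmithd.service
      mask: true

You can check that both services are masked on a node with systemctl status update-engine.service locksmithd.service.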
2 - Very important, this caused a lot of issues for us.
Coredumps are normal and happen a lot. The problem is that systemd's default action is to capture the dump and compress it; while that runs, the machine's CPU is fully consumed, degrading everything else running on the same node. This was the main cause of our first downtime on k8s: some bad threads generated coredumps, the CPU hit 100%, health checks failed for all the other pods, the pods were killed, and so on.
So basically we disable saving coredumps to disk (they are only logged):
/etc/systemd/coredump.conf

[Coredump]
Storage=none
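If you want to bake this in at provisioning time instead of editing the file by hand, one option (a sketch, assuming the same cloud-config mechanism as in point 1; the path and permissions are just the stock systemd defaults) is a write_files entry:

#cloud-config
write_files:
  # Tell systemd-coredump to log that a crash happened but not to store
  # (and compress) the dump itself, which is what was eating the CPU.
  - path: /etc/systemd/coredump.conf
    permissions: "0644"
    content: |
      [Coredump]
      Storage=none

Depending on the systemd version, ProcessSizeMax=0 in the same file is another knob to stop large dumps from being processed at all.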
We are running:
hyperkube-tag = "v1.5.4_coreos.0"