systemd ratelimit being hit after upgrade #505
Comments
@ryanm101 Commenting per your init remark, as I have felt your pain. 😂 I don't have a solution, but maybe a thought or two: systemd has targets to avoid race conditions, so that may be one thing for you. In addition, there are different service types — `Type=oneshot`, for example, which would negate having to run a … Last but not least, something we have to use frequently are overrides. Maybe you can fix another unit to create the directory in … Anyhow, can you explain what prompts you to have to create the directory in the first place? That seems like a small hack, and maybe there is a different way to achieve it.
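A minimal sketch of the `Type=oneshot` idea mentioned above — the unit name, directory path, and ordering are invented for illustration, not taken from the actual setup:

```ini
# Hypothetical unit: create-bootstrap-dir.service
# A oneshot runs once to completion instead of staying resident,
# so there is no long-running process to hit restart rate limits.
[Unit]
Description=Create bootstrap state directory (example)
Before=kubelet.service

[Service]
Type=oneshot
# RemainAfterExit keeps the unit "active" after the command finishes,
# so dependent units see it as up.
RemainAfterExit=yes
ExecStart=/usr/bin/mkdir -p /var/lib/bootstrap

[Install]
WantedBy=multi-user.target
```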
Honestly, I'm not 100% sure why it was done (the previous team wrote this piece of code prior to leaving), so I've inherited it. From the descriptions it looks like (again, I'm not hugely familiar with systemd) the first job does nothing but create some sort of watcher for that file to appear, which it will once our bootstrapping is completed. That allows the second job to trigger, which then stops the static pods used for bootstrapping, to allow our kubelet to start properly. All I can see is this change of behaviour when we've jumped Flatcar versions. I've already had to sort out a systemd DNS issue due to our massive jump, so I am hoping this is a similar change. It's only frustrated by my lack of systemd knowledge.
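The watcher-then-trigger setup described here sounds like a systemd path unit activating a service. A hedged sketch of that pattern — every file, script, and unit name below is an assumption for illustration, not the actual config:

```ini
# Hypothetical: bootstrap-done.path
# The path unit waits for a marker file to appear, then activates
# the service of the same name.
[Unit]
Description=Wait for bootstrap completion marker (example)

[Path]
PathExists=/etc/kubernetes/bootstrap-complete

[Install]
WantedBy=multi-user.target
```

```ini
# Hypothetical: bootstrap-done.service
# Activated by bootstrap-done.path; tears down the bootstrap
# static pods so the real kubelet can take over.
[Unit]
Description=Stop bootstrap static pods (example)

[Service]
Type=oneshot
ExecStart=/opt/bin/stop-bootstrap-pods.sh
```

With this pattern the first unit never "fails"; it just sits waiting for the path condition, so it would not normally trip start rate limiting on its own.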
@ryanm101 Hmm, learned something new and it's only Monday: it seems like your … The other thing I would look into is …
@till So I never actually stop the …
I think it is, else it might never run? I'm not sure how often it 'tries' compared to the … It's not a directory it's looking for, it's a file called … We don't update Flatcar or kubes by 'upgrading'; we just rebuild our servers from scratch in a rolling rebuild, so when we 'upgrade' a server this occurs on first boot. Rebooting and it works as expected (because the file exists, so the criteria is not met). I'll take a look at that extra debug stuff, thanks.
Looks like you relied on some undefined behavior that is no longer supported?
… is the service that does the bootstrapping. I wonder if I added …
@ryanm101 Were you able to test that out? |
Closing this issue. Feel free to re-open if required. |
Upgraded Flatcar from
Flatcar Container Linux by Kinvolk 2512.5.0 (Oklo) 4.19.145-flatcar docker://18.6.3
to
Flatcar Container Linux by Kinvolk 2905.2.3 (Oklo) 5.10.61-flatcar docker://19.3.15
as #286 solves my encryption issues. The cluster is on Kubernetes v1.17.17.
We have two systemd files that are now failing:
On 2512.5.0:

On 2905.2.3:

In both cases, `systemctl status kubelet-prebootstrap.service` …

I've tried tweaking both `StartLimitBurst` and `StartLimitIntervalSec`, to no difference.

PS. I'm no systemd expert; I'm more of an init person, so I'm only starting to look at systemd in any real anger, so it is possible / likely I'm just misunderstanding something / plain wrong on some level.
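For reference, rate-limit tweaks like these are usually applied via a drop-in override. One common gotcha worth checking: since systemd v230, `StartLimitIntervalSec` and `StartLimitBurst` belong in the `[Unit]` section, and placing them under `[Service]` under those names is typically ignored with only a warning, which can make such tweaks appear to do nothing. The file path and values below are hypothetical:

```ini
# Hypothetical drop-in:
# /etc/systemd/system/kubelet-prebootstrap.service.d/10-ratelimit.conf
# Run `systemctl daemon-reload` (and `systemctl reset-failed <unit>`)
# after adding it.
[Unit]
StartLimitIntervalSec=300
StartLimitBurst=10
```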
EDIT: Just to add a datapoint, this has happened on all my 'worker' nodes but not on my 'master' nodes.