design journal persistence for early provisioning #955

Open
cgwalters opened this issue Sep 7, 2021 · 10 comments

Comments

@cgwalters
Member

A general philosophy of Ignition is that we have the system fully configured before switching into the real root and running user code. For example, using Ignition kernel argument support we run systemctl reboot from the initramfs.

However, this means that all logs from this early provisioning time are lost because systemd-journal-flush.service only runs when successfully switching to the real root.

This is likely something to take to upstream systemd, but basically I'd propose as a strawman that we configure journald to persist to /boot/journal or so during this early provisioning phase, and then have it switch over.

@jlebon
Member

jlebon commented Sep 7, 2021

Is this RFE driven by a specific instance you have in mind where the Ignition kargs reboot nuked system logs you cared about?

Related: #928

@cgwalters
Member Author

> Is this RFE driven by a specific instance you have in mind where the Ignition kargs reboot nuked system logs you cared about?

Not me specifically, but from an internal chat where another team member was wondering how to find the logs from coreos-kargs-reboot.service.

@dustymabe
Member

Sounds reasonable to me. Do you want to discuss this at the weekly meeting?

@HuijingHei
Member

> > Is this RFE driven by a specific instance you have in mind where the Ignition kargs reboot nuked system logs you cared about?
>
> Not me specifically, but from an internal chat where another team member was wondering how to find the logs from coreos-kargs-reboot.service.

Thanks!
Actually, I am the one who asked the question. I want to add an automated test case that checks the rhcos-afterburn-checkin.service dependency After=coreos-kargs-reboot.service (openshift/os@e0363da, refer to #BZ1980679), but I cannot find any coreos-kargs-reboot.service logs with journalctl.

Explanation from @cgwalters: the journal isn't persisted here because we reboot before switching root, which is when systemd saves it.
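
For the dependency half of that check, here is a minimal sketch in Python. It assumes the test harness has already captured the output of `systemctl show --property=After rhcos-afterburn-checkin.service`, which prints one `After=` line with space-separated unit names. The function name is hypothetical and not part of any existing test suite.

```python
def unit_ordered_after(show_output: str, dependency: str) -> bool:
    """Return True if `dependency` is listed on the After= line.

    `show_output` is expected to be the text printed by
    `systemctl show --property=After <unit>` (captured elsewhere).
    """
    for line in show_output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key == "After":
            # Unit names on the After= line are separated by whitespace.
            return dependency in value.split()
    return False


# Example with a hand-written sample line (not real output from a node).
sample = "After=network-online.target coreos-kargs-reboot.service\n"
print(unit_ordered_after(sample, "coreos-kargs-reboot.service"))
```

This only verifies the ordering dependency; it says nothing about whether the reboot service actually ran, which is the part the missing journal would have shown.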

@jlebon jlebon added the meeting topics for meetings label Sep 8, 2021
@travier
Member

travier commented Sep 8, 2021

Generally agree that this would be useful, but most probably only for debugging some very specific cases.

We also have to be careful about loading "untrusted" content from /boot into the journal, and about storing unencrypted logs in /boot. That may not be OK for some use cases where users expect everything to be stored encrypted on disk at all times (except /boot content; maybe also in PXE boot use cases?).

This also has implications for measured boot in FCOS (which does not exist/work yet, but this could make it harder to implement).

@jlebon
Member

jlebon commented Sep 8, 2021

We discussed this in today's community meeting. Some things raised:

  • Timothée's points above
  • Doubt about usefulness of those logs vs implementation complexity
    • If something breaks on first boot, then we wouldn't reboot anyway. It seems unlikely we'd hit an error where something in the first boot causes breakage in the second boot, since very few things don't rerun.
  • Log gathering for troubleshooting is still possible via serial console

@cgwalters
Member Author

Maybe instead of the whole journal, we just write a tiny bit of information into /boot such as the fact that we did an early reboot, and then have code that runs in the real root that logs journal messages from that? journalctl --list-boots would still lie, but at least we'd be able to see something there.
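
A rough sketch of that idea, in Python for brevity. Everything named here is an assumption made for illustration: the marker file name, its JSON layout, and both function names are hypothetical, and nothing in Ignition or FCOS defines them. The `log` callback stands in for whatever would actually write to the journal from the real root.

```python
import json
import time
from pathlib import Path

# Hypothetical marker name, chosen only for this sketch.
MARKER_NAME = "early-reboot.json"


def record_early_reboot(boot_dir: Path, reason: str) -> Path:
    """Initramfs side: leave a small note in the boot directory before rebooting."""
    marker = boot_dir / MARKER_NAME
    payload = {"reason": reason, "timestamp": int(time.time())}
    marker.write_text(json.dumps(payload))
    return marker


def replay_early_reboot(boot_dir: Path, log=print) -> bool:
    """Real-root side: log the note once, then delete it.

    Returns True if a marker was found. The file content is treated as
    untrusted: anything malformed is reported as "unknown" rather than raised.
    """
    marker = boot_dir / MARKER_NAME
    try:
        payload = json.loads(marker.read_text())
    except FileNotFoundError:
        return False
    except ValueError:  # covers invalid JSON and undecodable bytes
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    reason = payload.get("reason", "unknown")
    when = payload.get("timestamp", "unknown")
    log(f"early reboot during provisioning: reason={reason} timestamp={when}")
    marker.unlink()
    return True
```

Because only a short, fixed-shape note crosses the reboot, this sidesteps most of the concern about importing arbitrary journal data from /boot, although the note itself would still sit there unencrypted until it is replayed.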

@HuijingHei
Member

I agree with @cgwalters. Maybe we could output simple logs that include coreos-kargs-reboot.service, so that we know it actually triggered the reboot. Does this make sense?

@bgilbert
Contributor

bgilbert commented Sep 9, 2021

@HuijingHei If the goal is just to demonstrate that the system rebooted after adding kernel arguments, you don't need logs for that, since you'll be able to see the new arguments in /proc/cmdline. We already have such a test here.
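
For illustration only, such a check could look roughly like the Python sketch below. It is not the existing test referred to above, and the function name is hypothetical. It takes the kernel command line as a string and reports which expected arguments are absent.

```python
import shlex


def missing_kargs(cmdline: str, expected: list[str]) -> list[str]:
    """Return the expected kernel arguments that do not appear in `cmdline`.

    shlex.split is used so that a quoted value containing spaces stays one
    token; the quotes themselves are dropped, so pass expected values unquoted.
    """
    present = set(shlex.split(cmdline))
    return [arg for arg in expected if arg not in present]


# Example with a hand-written command line (not taken from a real node).
sample = "BOOT_IMAGE=/vmlinuz ro quiet mitigations=off"
print(missing_kargs(sample, ["mitigations=off", "nosmt"]))
```

On a running system the first argument would come from reading /proc/cmdline, for example with `Path("/proc/cmdline").read_text()`.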

@HuijingHei
Member

> @HuijingHei If the goal is just to demonstrate that the system rebooted after adding kernel arguments, you don't need logs for that, since you'll be able to see the new arguments in /proc/cmdline. We already have such a test here.

In that case, the logs about the early reboot are less important; we can just check the final result.

@jlebon jlebon removed the meeting topics for meetings label Sep 15, 2021