We'll need something like this for rpm-ostree's CI too at least, where right now the journal kola collects stops on the first reboot (again, because kola doesn't know the node is being rebooted). I had thought of something similar though possibly using another console instead, and Anyway, WDYT about having this functionality in https://github.com/coreos/fedora-coreos-config directly, and just conditionalizing the unit on |
(To clarify, what I'm suggesting here is making this a streaming thing instead, installing it in both the initrd and the real root, and making it more of a "host API" than something Ignition-specific.)
I think these are strongly related but still orthogonal things. We don't need to stream the journal from the initrd - assuming the initrd works fine, if we do journal streaming in the real root we'll get the logs we need then. Hence, I'd propose merging this PR mostly as is, and doing what you're suggesting as a separate virtio channel owned by fedora-coreos-config (since it's not really related to Ignition).
BTW, I wrote exactly what you're suggesting for gnome-continuous for several reasons, but one of the most interesting is that the default for desktop systems is not to have ssh on. (It could make sense to change mantle to default to 'exec over virtio' but that's a separate discussion.)
Any further thoughts on this one?
> I think these are strongly related but still orthogonal things. We don't need to stream the journal from the initrd - assuming the initrd works fine, if we do journal streaming in the real root we'll get the logs we need then.
Hmm, I don't follow. If the goal is to have a debugging hook like this, IMO it'd be even more useful if it streamed starting from the initrd too. E.g. `systemd.log_level=debug systemd.log_target=console` works on both the initrd systemd and real root systemd. And if we do that, I think it can be used in place of this.
But yeah, this is clearly useful to have today, so no issues from me getting this in meanwhile.
Anyway, a few optional comments, but LGTM as is too.
Yeah...though it would then duplicate what kola is doing with gathering the journals (we could replace that only on qemu of course). We'd also need to handle being killed and restarted across the switchroot and think about how that appears in logs. I guess again my main concern is getting the journal when things go wrong - when things go "right" (at least up till ssh) one has a ton of options. Arguably, we should have a similar service in the real root that also handles failure to reach the default target.
Yeah, the goal would definitely be to make kola use that for qemu (and fixing the rpm-ostree vmcheck test logs case).
This would be tricky to do but not unsolvable I think. E.g. the proxy service could just write out on shutdown the cursor of the last message it proxied?
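To make the cursor idea concrete, here is a minimal sketch of the resume logic, assuming the proxy sees entries in `journalctl -o json` form, where each entry carries a `__CURSOR` field. The function names are hypothetical and this is not code from either PR; where and how the cursor gets persisted across the switchroot is left out (journalctl's own `--cursor-file` option may be a simpler route).

```python
def entries_to_forward(entries, last_cursor=None):
    """Return the entries that still need to be proxied.

    `last_cursor` is the cursor of the last entry forwarded before the
    proxy was stopped. If it is unknown, or no longer present in the
    journal, everything is forwarded again rather than risk dropping logs.
    """
    entries = list(entries)
    if last_cursor is None:
        return entries
    for i, entry in enumerate(entries):
        if entry.get("__CURSOR") == last_cursor:
            return entries[i + 1:]
    return entries


def last_cursor_of(entries, previous=None):
    """Return the cursor to write out on shutdown."""
    cursors = [e["__CURSOR"] for e in entries if "__CURSOR" in e]
    return cursors[-1] if cursors else previous
```

Re-sending everything when the cursor cannot be found means duplicates are possible, but nothing is silently lost, which seems like the right trade-off for a debugging channel.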
The way I'm thinking of it, the contexts in which you would have this set up are also where you want to be ready for things to go wrong (e.g. Ignition debugging, test harnesses, etc.). I don't see it as re-implementing e.g. But again, I definitely see the value of just something that fires on emergency in the initrd. So this WFM!
Pairs with coreos/ignition-dracut#146 What we really want is to use this in kola, will do as a separate followup.
Now pairs with coreos/coreos-assembler#1290 and tested to work (or I guess successfully fail?) together. Will merge both when both are approved.
Actually now that I play with this more...it might be nice if we wrote to the channel just |
I thought about the "generalize this to post-initramfs" and realized we don't necessarily need to bake it into CoreOS by default - it could be injected via Ignition.
Pairs with coreos/ignition-dracut#146 This way, we error out fast if something went wrong in the initramfs rather than timing out. And further, we get the journal as JSON, so we can do something intelligent in the future to analyze it.
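As an illustration of the "something intelligent" that JSON output makes possible, here is a small sketch that pulls error-level entries out of a journal dump. It assumes the data arrives in `journalctl -o json` form (one JSON object per line, with `PRIORITY` as a string); the function name is hypothetical and this is not the mantle implementation.

```python
import json


def failed_messages(stream_text, max_priority=3):
    """Return (unit, message) pairs for entries at priority err (3) or worse."""
    results = []
    for line in stream_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a line truncated when the VM went down
        if not isinstance(entry, dict):
            continue
        try:
            priority = int(entry.get("PRIORITY", 6))
        except (TypeError, ValueError):
            continue
        if priority > max_priority:
            continue
        message = entry.get("MESSAGE")
        if isinstance(message, list):
            # journalctl emits non-UTF-8 messages as an array of byte values
            message = bytes(message).decode("utf-8", errors="replace")
        unit = entry.get("_SYSTEMD_UNIT") or entry.get("UNIT") or "unknown"
        results.append((unit, message))
    return results
```

A test harness could print these pairs as the failure reason instead of dumping the whole journal.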
The way I think of it is that a generalized version of this would be like the serial console output; it just streams from start to end of the VM on the same port. The same output you get from |
Debugging failures in the initrd is annoying; this code looks for a virtio-serial port named `com.coreos.ignition.journal`, and runs as part of `emergency.target`. I plan to change mantle to set up this port by default, so if something fails in the initramfs we'll at least reliably get the journal in a sane parsable format. This is a special targeted subset of coreos/ignition#585
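For readers unfamiliar with named virtio-serial ports, the lookup can be sketched as below. This only illustrates the idea and is not the code in this PR. It assumes the Linux sysfs layout where each port directory under `/sys/class/virtio-ports` has a `name` attribute; the function name and the overridable root parameters are hypothetical, added so the logic can be tried outside a VM.

```python
import os


def find_virtio_port(name, sysfs_root="/sys/class/virtio-ports", dev_root="/dev"):
    """Return the device path of the virtio-serial port called `name`, or None."""
    try:
        ports = sorted(os.listdir(sysfs_root))
    except FileNotFoundError:
        return None  # no virtio-serial device attached at all
    for port in ports:
        try:
            with open(os.path.join(sysfs_root, port, "name")) as f:
                if f.read().strip() == name:
                    return os.path.join(dev_root, port)
        except OSError:
            continue  # port without a name attribute
    return None
```

Calling `find_virtio_port("com.coreos.ignition.journal")` returns `None` when the host did not set up the port, which is the case where the unit should simply do nothing. Where udev's default rules are active, the `/dev/virtio-ports/<name>` symlink gives the same answer without walking sysfs.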
Sure, but post-switchroot any code injected via Ignition to write to a port or do whatever is going to get those logs too - it'll just be delayed until the switchroot happens. Another important thing is that instead of getting all logs the calling code can also use e.g. |
554e8af to 84c89f4
Pairs with coreos/ignition-dracut#146 This way, we error out fast if something went wrong in the initramfs rather than timing out. And further, we get the journal as JSON, so we can do something intelligent in the future to analyze it. And add a test case for this.
OK, last call on this one...if there aren't any further objections/thoughts I plan to merge.
This is similar to: coreos/ignition-dracut#146 For our test system, it generally works really well to inject things via Ignition. That PR was about handling failures in the initramfs *before* Ignition runs. This PR is trying to help us test the scenario where no Ignition is injected into the Live ISO. Let's also use the virtio-channel approach.
This finally unifies the advantages of `cosa run` and `kola spawn`. I kept getting annoyed by how serial console sizing is broken (e.g. trying to use `less` etc.). Using `ssh` via `kola spawn` addresses that, but it means you can't debug the initramfs. Now things work in an IMO pretty cool way; if you do e.g. `cosa run --kargs ignition.config.url=blah://` (or inject a bad Ignition config) to cause a failure in the initramfs, you'll see a nice error (building on coreos/ignition-dracut#146 ) telling you to rerun with `cosa run --devshell-console`. Things are also wired up cleanly so that we support rebooting with the equivalent of `kola spawn --reconnect` (which we should probably remove now). You can exit via *either* quitting SSH cleanly or using `poweroff`, and the lifecycle of ssh and qemu is wired together. And finally, if we detect a cosa workdir we also bind it in by default. More to come here, such as auto-injecting debugging tools and containers.
This came up in coreos/ignition-dracut#146 and since then we've been doing more "ad hoc unit writing to virtio" in mantle, but let's add a general API that streams the journal. This is just better for what devshell wants - we can more precisely watch for sshd starting. And more code in e.g. `testiso.go` could use it too which can come later.
This came up in coreos/ignition-dracut#146 and since then we've been doing more "ad hoc unit writing to virtio" in mantle, but let's add a general API that streams the journal. This is just better for what devshell wants - we can more precisely watch for sshd starting. And more code in e.g. `testiso.go` could use it too which can come later. The immediate motivation here is I may add another kola test which could use this.
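A rough sketch of how a consumer of such a journal stream might watch for sshd starting, assuming line-delimited `journalctl -o json` entries. The function name is hypothetical, and matching on systemd's "unit started" message ID together with the `UNIT` field is an assumption about what the stream contains, not what mantle actually does.

```python
import json

# systemd's message ID for "a start job for a unit finished successfully"
# (SD_MESSAGE_UNIT_STARTED); treated here as an assumption about the stream.
UNIT_STARTED_ID = "39f53479d3a045ac8e11786248231fbf"


def wait_for_unit_start(lines, unit="sshd.service"):
    """Consume journal lines until `unit` is reported started.

    Returns True on the first matching entry, False if the stream ends
    first (for example because the VM shut down).
    """
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if entry.get("UNIT") == unit and entry.get("MESSAGE_ID") == UNIT_STARTED_ID:
            return True
    return False
```

Because `lines` can be any iterable, the same function works on a file object wrapping the host side of the virtio channel or on a saved log.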