Skip to content

Commit

Permalink
daemon: Validate kernel arguments
Browse files Browse the repository at this point in the history
I'm debugging https://bugzilla.redhat.com/show_bug.cgi?id=2075126
and while I haven't verified this is the case, as far as I can tell
from looking through the code and thinking about things, if we somehow
fail to apply the expected kernel arguments (which can occur if
`ostree-finalize-staged` fails) then we will (on the next boot)
drop in to `validateOnDiskState()` which has for a long time
checked that all the expected *files* exist and mark the update as
complete.  But we didn't check the kernel arguments.

That can then cause later problems because in trying to apply further
updates we'll ask rpm-ostree to try to remove kernel arguments that
aren't actually present.

Worse, often these kernel arguments are actually *quite important*
and may even have security relevant properties (e.g. `nosmt`).

Now...I am actually increasingly convinced that we *really* need
to move opinionated kernel argument handling into ostree (and rpm-ostree).

There's ye olde ostreedev/ostree#2217
and the solution may look something like that.  Particularly now
with the layering philosophy that it makes sense to support
e.g. customizations dropping content in `/usr/lib` and such.

For now though, validating that we didn't get the expected kargs
should make things go Degraded, the same as if there was a file conflict.
And *that* in turn should make it easier to debug failures.

As of right now, it will appear that updates are complete, and then
we'll only find out much later that the kargs are actually missing.
And in turn, because kubelet spams the journal, any error messages
from e.g. `ostree-finalize-staged.service` may be lost.
  • Loading branch information
cgwalters committed Apr 26, 2022
1 parent 2a1e3a6 commit ad068c1
Showing 1 changed file with 41 additions and 0 deletions.
41 changes: 41 additions & 0 deletions pkg/daemon/daemon.go
Original file line number Diff line number Diff line change
Expand Up @@ -1759,6 +1759,40 @@ func (dn *Daemon) triggerUpdateWithMachineConfig(currentConfig, desiredConfig *m
return dn.update(currentConfig, desiredConfig)
}

// validateKernelArguments checks that the current boot has all arguments specified
// in the target machineconfig.
func (dn *CoreOSDaemon) validateKernelArguments(currentConfig *mcfgv1.MachineConfig) error {
rpmostreeKargsBytes, err := runGetOut("rpm-ostree", "kargs")
if err != nil {
return err
}
rpmostreeKargs := strings.TrimSpace(string(rpmostreeKargsBytes))
foundArgsArray := strings.Split(rpmostreeKargs, " ")
foundArgs := make(map[string]bool)
for _, arg := range foundArgsArray {
foundArgs[arg] = true
}
expected := parseKernelArguments(currentConfig.Spec.KernelArguments)
missing := []string{}
for _, karg := range expected {
if _, ok := foundArgs[karg]; !ok {
missing = append(missing, karg)
}
}
if len(missing) > 0 {
cmdlinebytes, err := ioutil.ReadFile(CmdLineFile)
if err != nil {
glog.Warningf("Failed to read %s: %v", CmdLineFile, err)
} else {
glog.Infof("Booted command line: %s", string(cmdlinebytes))
}
glog.Infof("Current ostree kargs: %s", rpmostreeKargs)
glog.Infof("Expected MachineConfig kargs: %v", expected)
return fmt.Errorf("Missing expected kernel arguments: %v", missing)
}
return nil
}

// validateOnDiskState compares the on-disk state against what a configuration
// specifies. If for example an admin ssh'd into a node, or another operator
// is stomping on our files, we want to highlight that and mark the system
Expand All @@ -1770,6 +1804,13 @@ func (dn *Daemon) validateOnDiskState(currentConfig *mcfgv1.MachineConfig) error
return fmt.Errorf("expected target osImageURL %q, have %q", currentConfig.Spec.OSImageURL, dn.bootedOSImageURL)
}

if dn.os.IsCoreOSVariant() {
coreOSDaemon := CoreOSDaemon{dn}
if err := coreOSDaemon.validateKernelArguments(currentConfig); err != nil {
return err
}
}

return validateOnDiskState(currentConfig, pathSystemd)
}

Expand Down

0 comments on commit ad068c1

Please sign in to comment.