Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace the Machine ID usage with an "Invocation ID" #4230

Merged
merged 1 commit into from
Apr 10, 2024

Conversation

twz123
Copy link
Member

@twz123 twz123 commented Apr 3, 2024

Description

Instead of relying on an externally maintained "machine identity" that may or may not exist on hosts, use a so-called Invocation ID. The Invocation ID is a randomly generated 32-byte hexadecimal string that is created during process startup. The name is inspired by systemd's INVOCATION_ID.

K0s doesn't need a unique host identifier that is stable across process invocations. It only needs to have unique identifiers per process. This removes a dependency on the k0s host machine (some machine-id file or registry entry), a Go module dependency, Machine ID related documentation and system probes. It also makes the use of a unique ID obvious. The machine ID has been hard-coded into the guts of the leader election and telemetry components. The invocation ID is passed around instead.

Stub out the use of the anonymous ID in telemetry, as the new ID is no longer stable and would be useless for this type of use.

Resolves #3983.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

  • Manual test
  • Auto test added

Checklist:

  • My code follows the style guidelines of this project
  • My commit messages are signed-off
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Instead of relying on an externally maintained "machine identity"
that may or may not exist on hosts, use a so-called Invocation ID. The
Invocation ID is a randomly generated 32-byte hexadecimal string that
is created during process startup. The name is inspired by systemd's
INVOCATION_ID.

K0s doesn't need a unique host identifier that is stable across process
invocations. It only needs to have unique identifiers per process.
This removes a dependency on the k0s host machine (some machine-id
file or registry entry),  a Go module dependency, Machine ID related
documentation and system probes. It also makes the use of a unique ID
obvious. The machine ID has been hard-coded into the guts of the leader
election and telemetry components. The invocation ID is passed around
instead.

Stub out the use of the anonymous ID in telemetry, as the new ID is no
longer stable and would be useless for this type of use.

Signed-off-by: Tom Wieczorek <[email protected]>
@twz123 twz123 added this to the 1.30 milestone Apr 3, 2024
@twz123 twz123 marked this pull request as ready for review April 3, 2024 11:23
@twz123 twz123 requested a review from a team as a code owner April 3, 2024 11:23
@twz123 twz123 requested review from ncopa and jnummelin April 3, 2024 11:23
@ncopa
Copy link
Collaborator

ncopa commented Apr 5, 2024

I think at some point kubelet would fail if the machine-id was missing. Is this no longer the case?

@twz123
Copy link
Member Author

twz123 commented Apr 5, 2024

I think at some point kubelet would fail if the machine-id was missing. Is this no longer the case?

At least the inttests pass, and I removed all machine ID related setup from the bootloose test image. So the Alpine containers shouldn't have a machine ID anymore. I checked the Kubernetes sources, and there are very few traces about machine IDs, as described in #3983 (comment).

@twz123
Copy link
Member Author

twz123 commented Apr 5, 2024

Checked the Kubernetes sources again.

On Linux, MachineID should be empty: MachineID: getInfoFromFiles(filepath.Join(rootFs, *machineIDFilePath)), assigns it, machineIDFilePath lists the files to check, and klog.Warningf("Couldn't collect info from any of the files in %q", filePaths) is logged in case none of the files could be read.

On Windows, there's even no attempt to read if from the registry, it is simply set to the hostname: https://github.com/kubernetes/kubernetes/blob/v1.30.0-rc.1/pkg/kubelet/winstats/perfcounter_nodestats.go#L162

@@ -151,7 +150,7 @@ func (c Component) sendTelemetry(ctx context.Context) {

c.log.WithField("data", data).WithField("hostdata", hostData).Info("sending telemetry")
if err := c.analyticsClient.Enqueue(analytics.Track{
AnonymousId: machineID(),
AnonymousId: "(removed)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnonymousID is the Segment entity: https://segment.com/docs/connections/sources/catalog/libraries/website/javascript/identity/#anonymous-ids

As far as I can see it's not used in our data pipeline now, but are we sure, that we want to remove it completely?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think the unique cluster id is more important for telemetry than the machine id 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, of course, I was wondering if it could break the pipeline, but we checked with Chen and it should be ok.
Chen suggested testing it a bit, but in general we are fine

@twz123 twz123 merged commit b898a16 into k0sproject:main Apr 10, 2024
79 checks passed
@twz123 twz123 deleted the remove-machine-id branch April 10, 2024 08:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace machine ID with k0s-managed ID
4 participants