Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GCE fingerprinting functionality #215

Merged
merged 13 commits into from
Oct 12, 2015

Conversation

dimfeld
Copy link
Contributor

@dimfeld dimfeld commented Oct 5, 2015

Also fixes AWS fingerprinter to not run on GCE

This reads the following:

* hostname
* instance id
* machine-type
* zone
* internal IP
* external IP (if any)
* tags
* attributes

Atributes are placed under the platform.gce.attr.* hierarchy.

Tags are set up as platform.gce.tag.TagName=true.
GCE and AWS both expose metadata servers, and GCE's 404 response
includes the URL in the content, which maatches the regex. So,
check the response code as well and if a 4xx code comes back,
take that to meanit's not AWS.
// provide their own
metadataURL := os.Getenv("GCE_ENV_URL")
if metadataURL == "" {
metadataURL = "http://169.254.169.254/computeMetadata/v1/instance/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this string to a const at the top and put a comment on it (make it look less like a magic value)

node.Links = make(map[string]string)
}

client := NewGCEMetadataClient(f.logger)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put the client in the EnvGCEFingerprint struct so it doesn't have to be made twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I originally did it that way so that the client wouldn't stay around forever once it was no longer needed, but I see that the Fingerprint objects aren't persistent anyway.

This allows easier reuse of the same client across multiple functions.
@dimfeld
Copy link
Contributor Author

dimfeld commented Oct 5, 2015

Above comments should be all taken care of. Just let me know if you have any others :)

}

// Get internal and external IP (if it exits)
value, err := f.Get("network-interfaces/0/ip", false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to look at all the attached network interfaces. So f.Get("network-interfaces/", true)

* No longer setting Device name in the network interface since we can't
match up the info here with real device names.
* Add attributes for all external IPs if more than one exists.
@dimfeld
Copy link
Contributor Author

dimfeld commented Oct 7, 2015

Some comments and outstanding questions on notable changes from the latest push:

  1. Any thoughts on what to do about the network Device names in this case? We potentially end up with multiple interfaces and don't really know what do do about the Device field for each one. The latest code doesn't set Device at all, but I can change that back if you'd like.
  2. Additionally, the fingerprinter is adding a new network interface which overlaps with the interface added by network_unix.go (the AWS fingerprinter has the same problem, I think). A quick browse of the source code makes it look like this isn't actually an issue but it feels a bit odd to end up with two structures that both cover the same network device.
  3. The new format for external IPs is platform.gce.network.NETWORKNAME.external-ip.INDEX where index is 0, 1, 2, etc. This allows us to support multiple values.

FWIW I haven't been able to figure out how to actually attach more than 1 network to a GCE node so most of the above may be theoretical for now.

@cbednarski
Copy link
Contributor

Any thoughts on what to do about the network Device names in this case? We potentially end up with multiple interfaces and don't really know what do do about the Device field for each one. The latest code doesn't set Device at all, but I can change that back if you'd like.

I don't know what this means in GCE, but let me give you an example use-case to put this in context.

In AWS some instances have enhanced networking, which makes the network device on the host directly available to the VM (it is not virtualized at all). This provides lower network latency. In Nomad I might want to schedule a latency-sensitive workload on these types of nodes. If the fingerprinter identifies the device type, I can use that information to make placement decisions.

Additionally, the fingerprinter is adding a new network interface which overlaps with the interface added by network_unix.go (the AWS fingerprinter has the same problem, I think). A quick browse of the source code makes it look like this isn't actually an issue but it feels a bit odd to end up with two structures that both cover the same network device.

In this scenario the fingerprinter that has more information (AWS or GCE in this case) should replace, update, or supplement data into the existing entry.

This pattern means generic fingerprinters can run everywhere and provide baseline information, while more specialized fingerprinter can come in afterward and provide more detailed information.

The new format for external IPs is platform.gce.network.NETWORKNAME.external-ip.INDEX where index is 0, 1, 2, etc. This allows us to support multiple values.

Can you clarify the meaning of "network"? Is that a network subnet, or cidr range, or a security group, or just a GCE identifier? I would lean towards mirroring whatever the representation of this data is in GCE's API so it's more intuitive to write placement rules against it.

@dimfeld
Copy link
Contributor Author

dimfeld commented Oct 9, 2015

In GCE, a "network" is pretty much the equivalent of an AWS VPC. That is, a subnet which is isolated from any other subnets, and with external access through external IPs, address translation, and security-group-like rules.

An instance can only be in a single network; from what I've seen there's no way to create an instance in more than one network/subnet or with more than one interface. I haven't found any options that would expose another network device to the VM such as the AWS option you mentioned.

I think things get confusing here because the GCE metadata API has provisions for instances with multiple network interfaces, and with multiple external IPs per network interface. This was probably done to be forward-compatible, but for now there will never be more than one of each. The main issue with supporting multiple network interfaces is that there isn't really a way to match up the interface for a network named default, for example, to the network device eth0. If we assume there's only one network interface, like it is now, then this is trivial, but otherwise it becomes difficult.

Suggestions

That said, with the current data available there isn't really any extra or better information to add to the Network resources that network_unix.go isn't already finding. So it might be best to just remove all interaction between the GCE fingerprinter and the network resources, and have it only set platform-specific attributes under the platform.gce prefix.

As for the external IP representation, you can see the full data format in the GCEMetadataNetworkInterface struct at the top of the file. The metadata server returns an array of these objects. Part of the issue here is that inside the object is an array of AccessConfigs, but the map[string]string nature of the attributes can't implicitly represent an array, and once again there can only be one right now but that might not be true forever.

Mirroring it exactly would lead to something like platform.gce.network.NETWORKNAME.access-configs.N.external-ip, which we could do but is rather verbose. I'm personally leaning more toward platform.gce.network.external-ip.N right now, but I don't feel strongly about it and will accede to whatever format you'd like.

* cpu-platform
* scheduling.automatic-restart
* scheduling.on-host-maintenance
* network.NETWORKNAME=true
It has nothing to add that the generic fingerprinters aren't
finding on their own already.
@dimfeld
Copy link
Contributor Author

dimfeld commented Oct 9, 2015

I see in feasible.go that there are already plans to implement a "contains" operator. Given that, one way to represent arrays would be as a single attribute containing a comma-separated string. Personally I'm fine with that, but will await your thoughts before messing with it more.

metadataURL string
}

// NewEnvGCEFingerprint is used to create a CPU fingerprint
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update comment.

@dimfeld
Copy link
Contributor Author

dimfeld commented Oct 12, 2015

@dadgar seems like your last comment disappeared (deleted?) but here's the response I wrote:

The data at that link is what's accessible through the full GCE API though, not through the metadata server, and we don't necessarily have access to the full API (I'm not sure if it's even possible without API keys). Unfortunately much less is exposed through the metadata server endpoints: https://cloud.google.com/compute/docs/metadata?hl=en

As for the IPs, network_unix.go should be picking up the internal IP already anyway on GCE, since that's what Linux is aware of and what shows up in the ifconfig output and so on. There isn't any visibility from the VM as to what its external IP is on GCE except through the metadata server since whatever NAT work is done to make that work takes place outside the VM; AWS works the same way I believe.

if err := json.Unmarshal([]byte(value), &interfaces); err != nil {
f.logger.Printf("[WARN] fingerprint.env_gce: Error decoding network interface information: %s", err.Error())
} else {
for _, intf := range interfaces {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimfeld: Yeah I deleted because I realized exactly what you wrote. I spent some time reading the GCE docs and create an instance to play with and I agree with what you did in this PR. We will just use the network_unix.go. Just small comments and then we will merge this!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool sounds great! Just took care of the one above.

@dadgar
Copy link
Contributor

dadgar commented Oct 12, 2015

Awesome! Thanks @dimfeld

dadgar added a commit that referenced this pull request Oct 12, 2015
Add GCE fingerprinting functionality
@dadgar dadgar merged commit 10fe21c into hashicorp:master Oct 12, 2015
@dimfeld dimfeld deleted the gce_fingerprint branch October 13, 2015 01:55
benbuzbee pushed a commit to benbuzbee/nomad that referenced this pull request Jul 21, 2022
Ensure installSnapshot consume stream. fixes issue hashicorp#212
@github-actions
Copy link

github-actions bot commented May 7, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants