Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add podresourcesstore in awscontainerinsightreceiver #167

Merged

Conversation

aditya-purang
Copy link

@aditya-purang aditya-purang commented Feb 22, 2024

Description:

- Add a new store in awscontainerinsightreceiver, this store gets the mapping between pod/container and their allocated devices.
PS : we need to add the mount volume mount and volume for pod-resources in the agent operator package, will ask @ssamartp to add those in his PR.

Testing:

Tested the api in a test cluster with a custom script:

&PodResources{
  Name: trn1-mlp,
  Namespace: default,
  Containers: []*ContainerResources{
    &ContainerResources{
      Name: trn1-mlp,
      Devices: []*ContainerDevices{
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [9], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [8], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [1], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [4], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [0], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [11], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [3], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [7], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [12], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [6], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [13], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [2], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [10], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [14], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [5], Topology: nil},
        &ContainerDevices{ResourceName: aws.amazon.com/neuron, DeviceIds: [15], Topology: nil},
      },
      CpuIds: [],
      Memory: []*ContainerMemory{},
      DynamicResources: []*DynamicResource{},
    },
  },
}

also tested on a the test cluster and printed the map, some of the entries in the map:

2024-03-01T17:06:36Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_2_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:36Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_14_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:36Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_5_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:36Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_9_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:36Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_12_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_14_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_4_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_7_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_3_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_8_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_1_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_5_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}
2024-03-01T17:06:46Z I! {"caller":"stores/podresourcesstore.go:146","msg":"/nContainerInfo : {default_trn1-mlp_trn1-mlp} -> ResourceInfo : {aws.amazon.com/neuron_10_}","kind":"receiver","name":"awscontainerinsightreceiver","data_type":"metrics"}

@aditya-purang aditya-purang self-assigned this Feb 22, 2024

for _, deviceID := range device.GetDeviceIds() {
resourceInfo := ResourceInfo{
resourceName: device.GetResourceName(),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to pass in a set of device resource names that we are interested in, so that we don't waste space storing all devices.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to have a add_resource_name method which adds that to a set which we can then use for lookup and populate the map in the next run?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I like that.

resourceName: device.GetResourceName(),
deviceID: deviceID,
}
_, found := p.resourceNameSet[resourceInfo.resourceName]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another optimization - we don't even need to do the scrape at all if resourceNameSet is empty.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch, updated the PR

straussb
straussb previously approved these changes Mar 5, 2024
Comment on lines +84 to +86
if err != nil {
return
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just to avoid the compiler giving you a warning about ignoring the error?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup

straussb
straussb previously approved these changes Mar 5, 2024
@movence
Copy link

movence commented Mar 6, 2024

Please update the target branch to aws-cwa-dev

@aditya-purang aditya-purang merged commit a4af460 into amazon-contributing:main Mar 6, 2024
265 of 283 checks passed
aditya-purang added a commit to sam6134/opentelemetry-collector-contrib that referenced this pull request Mar 6, 2024
aditya-purang added a commit to sam6134/opentelemetry-collector-contrib that referenced this pull request Mar 6, 2024
aditya-purang added a commit that referenced this pull request Mar 6, 2024
#183)

* [receivers/awscontainerinsightsreceiver] Add podresourcesstore in awscontainerinsightreceiver (#167)

ref : #167
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants