feat: infiniband plugin #304

Merged (1 commit) on Jun 4, 2024

Conversation

Contributor

@spencermckee spencermckee commented Apr 23, 2024

Description

Adds a plugin to collect metrics from the Nvidia Infiniband driver port counters at /sys/class/infiniband/<device>/ports/<port>/counters and debug status parameters at /sys/class/net/<iface>/debug. The data is available as two new metrics, InfinibandCounterStats (labels: counter name, device, and port) and InfinibandStatusParams (labels: status param name and interface).

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Please refer to the CONTRIBUTING.md file for more information on how to contribute to this project.

@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch 5 times, most recently from 278f3ae to 902d328 Compare April 26, 2024 00:29
@spencermckee spencermckee marked this pull request as ready for review April 26, 2024 16:48
@spencermckee spencermckee requested a review from a team as a code owner April 26, 2024 16:48
@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch from 902d328 to 2bf81a5 Compare April 26, 2024 16:49
@rbtr rbtr requested a review from matmerr April 29, 2024 16:18
plugin.sh (outdated, resolved)
test-summary (outdated, resolved)
pkg/plugin/packetparser/packetparser_bpfel_x86.o (outdated, resolved)
pkg/plugin/infiniband/infiniband_stats_linux.go (outdated, resolved)
pkg/plugin/infiniband/infiniband_stats_linux_test.go (outdated, resolved)
pkg/plugin/infiniband/infiniband_stats_linux.go (outdated, resolved)
pkg/plugin/infiniband/infiniband_stats_linux.go (outdated, resolved)
docs/metrics/plugins/infiniband.md (outdated, resolved)
pkg/metrics/types.go (resolved)
pkg/plugin/infiniband/infiniband_linux.go (outdated, resolved)
pkg/plugin/infiniband/infiniband_stats_linux_test.go (outdated, resolved)
plugin.sh (outdated, resolved)
test/plugin/infiniband/main_linux.go (outdated, resolved)
pkg/plugin/infiniband/infiniband_linux.go (outdated, resolved)
estebancams previously approved these changes May 21, 2024

@estebancams estebancams left a comment

Approving. I recommend generating the test files programmatically instead of adding more files to the repo. That also allows for more flexibility, since you can generate different test scenarios.

@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch 3 times, most recently from 7432c7d to 58a6204 Compare May 21, 2024 20:58
@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch 2 times, most recently from 63d7186 to d289b57 Compare June 3, 2024 05:08
@spencermckee spencermckee dismissed matmerr’s stale review June 3, 2024 05:21

Made the requested changes

@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch from d289b57 to aba99ed Compare June 3, 2024 05:23
matmerr previously approved these changes Jun 3, 2024
Member

@matmerr matmerr left a comment

🚀

@spencermckee spencermckee added this pull request to the merge queue Jun 3, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 3, 2024
@spencermckee spencermckee added this pull request to the merge queue Jun 3, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 3, 2024
@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch from d44e5d5 to d7819a7 Compare June 4, 2024 00:07
@spencermckee spencermckee force-pushed the spencermckee/infiniband-plugin branch from d7819a7 to 29ce809 Compare June 4, 2024 17:26
@spencermckee spencermckee added this pull request to the merge queue Jun 4, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 4, 2024
@spencermckee spencermckee added this pull request to the merge queue Jun 4, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jun 4, 2024
@spencermckee spencermckee added this pull request to the merge queue Jun 4, 2024
Merged via the queue into main with commit 98241d8 Jun 4, 2024
21 checks passed
@spencermckee spencermckee deleted the spencermckee/infiniband-plugin branch June 4, 2024 23:41
@nddq nddq linked an issue Jun 6, 2024 that may be closed by this pull request
matmerr pushed a commit to matmerr/retina that referenced this pull request Jul 3, 2024
# Description

Adds a plugin to collect metrics from the Nvidia Infiniband driver port
counters at /sys/class/infiniband/<device>/ports/<port>/counters and
debug status parameters at /sys/class/net/<iface>/debug. The data is
available as two new metrics, InfinibandCounterStats (labels: counter
name, device, and port) and InfinibandStatusParams (labels: status param
name and interface).

## Checklist

- [x] I have read the [contributing
documentation](https://retina.sh/docs/contributing).
- [x] I signed and signed-off the commits (`git commit -S -s ...`). See
[this
documentation](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification)
on signing commits.
- [x] I have correctly attributed the author(s) of the code.
- [x] I have tested the changes locally.
- [x] I have followed the project's style guidelines.
- [x] I have updated the documentation, if necessary.
- [x] I have added tests, if applicable.

---

Please refer to the [CONTRIBUTING.md](../CONTRIBUTING.md) file for more
information on how to contribute to this project.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Add Infiniband support for reading and exposing counters and stats
5 participants