Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MGMT-4718 Capture network latency and packet loss at L3 network connectivity #187

Conversation

jordigilh
Copy link
Contributor

@jordigilh jordigilh commented Mar 30, 2021

This PR extends the L3 connectivity information gathered by the agent by capturing the average latency (RTT) and the packet loss information per host connection.

Code has been refactored and unit tests extended to cover the new functionality as well as the aggregation process to validate the refactored code.

This PR is the counterpart in the assisted-service.

@pkliczewski @jakub-dzon @machacekondra @masayag @YuviGold please review at your discretion in the meanwhile.

/Jordi

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 30, 2021
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch from 8d50ae2 to 02ba00f Compare March 31, 2021 00:28
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch from 02ba00f to c30b1e5 Compare April 3, 2021 16:19
@jordigilh jordigilh changed the title [WIP] MGMT-4718 Capture network latency and packet loss at L3 network connectivity MGMT-4718 Capture network latency and packet loss at L3 network connectivity Apr 3, 2021
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 3, 2021
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch from c30b1e5 to 76a49e9 Compare April 4, 2021 15:29
@jordigilh
Copy link
Contributor Author

/retest

@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch 2 times, most recently from 58f5b4e to 0a69fcd Compare April 5, 2021 16:05
@jordigilh jordigilh changed the title MGMT-4718 Capture network latency and packet loss at L3 network connectivity [WIP] MGMT-4718 Capture network latency and packet loss at L3 network connectivity Apr 5, 2021
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 5, 2021
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch 2 times, most recently from 9a82355 to 3ed01a7 Compare April 8, 2021 14:56
@jordigilh jordigilh changed the title [WIP] MGMT-4718 Capture network latency and packet loss at L3 network connectivity MGMT-4718 Capture network latency and packet loss at L3 network connectivity Apr 8, 2021
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 8, 2021
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 12, 2021
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch 3 times, most recently from c5225b8 to be0517a Compare April 27, 2021 19:06
@pkliczewski
Copy link

@YuviGold please review

@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 29, 2021
src/commands/connectivity_check.go Outdated Show resolved Hide resolved
src/commands/connectivity_check.go Outdated Show resolved Hide resolved
src/commands/connectivity_check.go Outdated Show resolved Hide resolved
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch from be0517a to dfd5212 Compare April 29, 2021 13:29
@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch from dfd5212 to 1499058 Compare April 30, 2021 14:07
src/commands/connectivity_check.go Outdated Show resolved Hide resolved
L3Connectivity: []*models.L3Connectivity{},
}
addresses := getOutgoingAddresses(conCheck.getHost().Nics)
dataCh := make(chan any)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why channel with size 1 ?

Copy link
Contributor Author

@jordigilh jordigilh May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because there is only 1 routine consuming the data from the channel. Having the channel with a bigger queue won't improve the time it takes for this single routine to process all the information generated by all the other routines.

src/commands/connectivity_check.go Outdated Show resolved Hide resolved
allDstMacs = append(allDstMacs, destNic.Mac)
func l2CheckConnectivity(dataCh chan any, conCheck connectivityCmd) {
defer sendDone(dataCh)
doneCh := make(chan any)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to have single method to signal the go-routine terminates - either wait-group or done channel.
It is not relevant to channels that the done information is sent along with the data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I'll refactor this part to use a wait group like in

wg := sync.WaitGroup{}
wg.Add(len(addresses))

It will save the need to use this doneCh to sync.

@jordigilh jordigilh force-pushed the MGMT-4718_capture_network_latency branch from 1499058 to 1cfbb11 Compare May 4, 2021 23:37
@ori-amizur
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 5, 2021
@ori-amizur
Copy link
Contributor

/approve

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jordigilh, ori-amizur

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 6, 2021
@openshift-merge-robot openshift-merge-robot merged commit 3eb4188 into openshift:master May 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants