console: filter out stats referencing dead nodes #62917
Previously, nodes could reference nodes we didn't know about in their stats protos. Our frontend wouldn't know those nodeIDs because they weren't in the list of known nodes that were returned from the DB and would fail to render the network stats table.

Now we filter out any nodeIDs that are unknown to us so that we don't render a broken table.

This is meant to be a targeted fix for a specific problem. Future work will be done that addresses properly accounting for the various node liveness states that have been changed on the backend and have not been fully accounted for in the frontend.

Resolves cockroachdb#59322

Release note (ui change): Fixes a bug where nodes would reference dead cluster nodes in their network stats and cause the network DB Console page to crash.
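The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `NodeStatusLike` and `PeerActivity` shapes and the `filterUnknownPeers` helper are hypothetical stand-ins for the DB Console's real types, and the sketch assumes network activity is keyed by peer node ID.

```typescript
// Hypothetical, simplified shapes; the real DB Console types differ.
interface PeerActivity {
  latency?: number;
}

interface NodeStatusLike {
  nodeId: number;
  // Network activity keyed by peer node ID (assumed layout).
  activity: { [peerId: string]: PeerActivity };
}

// Returns copies of the statuses whose activity maps only reference
// peers present in knownNodeIds. The input is not mutated.
function filterUnknownPeers(
  statuses: NodeStatusLike[],
  knownNodeIds: number[],
): NodeStatusLike[] {
  const known = new Set<number>(knownNodeIds);
  return statuses.map((status) => {
    const activity: { [peerId: string]: PeerActivity } = {};
    Object.keys(status.activity).forEach((peerId) => {
      // Object keys are strings, so convert before the lookup.
      if (known.has(Number(peerId))) {
        activity[peerId] = status.activity[peerId];
      }
    });
    return { nodeId: status.nodeId, activity };
  });
}
```

With a known-node list of `[1, 2]`, a status for node 1 that references peers 2 and 9 would keep only the entry for peer 2, so the table never tries to render a row or column for a node it has no record of.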
7fafc19 to f3a324d
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian and @nathanstilwell)
a discussion (no related file):
The current change looks good to me 👍🏽, but the Network page can also crash when network activity items (in node status) don't contain a "latency" record. According to the backend code, this case is possible; see pkg/server/status/recorder.go, line 401 (getNetworkActivity method).
Please take a look at the PR with the suggested fix: dhartunian#2
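One way to guard the frontend against a missing "latency" record is sketched below. This is an illustration only and is not the fix proposed in dhartunian#2: the `ActivityItem` shape and both helper names are hypothetical, and the sketch assumes latency is reported in nanoseconds.

```typescript
// Hypothetical shape: `latency` may be absent from an activity item.
interface ActivityItem {
  latency?: number;
}

// Assumption: latency values are in nanoseconds.
const NANOS_PER_MILLI = 1e6;

// Returns the latency in milliseconds, or null when the record is missing.
function latencyMillis(item: ActivityItem | undefined): number | null {
  if (!item || typeof item.latency !== "number") {
    return null;
  }
  return item.latency / NANOS_PER_MILLI;
}

// Renders a placeholder instead of throwing when there is no latency.
function formatLatency(item: ActivityItem | undefined): string {
  const ms = latencyMillis(item);
  return ms === null ? "-" : `${ms.toFixed(2)}ms`;
}
```

Returning `null` rather than `0` keeps "no measurement" distinguishable from a genuinely fast connection, so the table can show a placeholder cell.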
Given the "quick fix" nature of this PR, LGTM. I think Andrii's proposal is an interesting one, but I can imagine a cleaner fix on the backend.
Agreed, this proposal isn't the best one, but it's probably a quick one 😃
This PR is stale, and I think #56529 addressed this on the server side, so I'm closing it.