Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ui: Use liveness info to populate decommissioned node lists #76538

Merged
merged 1 commit into from
Feb 15, 2022

Conversation

zachlite
Copy link
Contributor

Previously, the decommissioned node lists considered node status entries
to determine decommissioning and decommissioned status. This changed in #56529,
resulting in empty lists. Now, the node's liveness entry is considered
and these lists are correctly populated.

Release note (bug fix): The list of recently decommissioned nodes
and the historical list of decommissioned nodes correctly display
decommissioned nodes.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@zachlite
Copy link
Contributor Author

zachlite commented Feb 14, 2022

Addresses #61812 and cockroachdb/docs#9968

@zachlite
Copy link
Contributor Author

With these changes, here's what decommissioning via cockroach node looks like:

  1. Done moving replicas
    Screen Shot 2022-02-14 at 3 31 32 PM

  2. Immediately
    Screen Shot 2022-02-14 at 4 04 09 PM

  3. After 5 minutes
    Screen Shot 2022-02-14 at 4 08 00 PM

  4. Full History
    Screen Shot 2022-02-14 at 3 56 46 PM

@zachlite zachlite marked this pull request as ready for review February 14, 2022 21:38
@zachlite zachlite force-pushed the fix-decommissioned-node-list branch from cd39890 to 7b9c8fe Compare February 14, 2022 22:55
Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @Santamaura, @tbg, and @zachlite)


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodesOverview/index.tsx, line 56 at r1 (raw file):

  StatusTooltip,
} from "./tooltips";
import { cockroach } from "oss/src/js/protos";

nit. change import path to avoid oss part (it is added with auto-imports).
Should be: from "src/js/protos"


pkg/ui/workspaces/db-console/src/views/cluster/containers/nodesOverview/index.tsx, line 174 at r1 (raw file):

const NodeNameColumn: React.FC<{
  record: NodeStatusRow | DecommissionedNodeStatusRow;
  shouldLink: boolean;

Can it be defined with default value (shouldLink = true), so it shouldn't be defined explicitly in other places?


pkg/ui/workspaces/db-console/src/views/reports/containers/nodeHistory/decommissionedNodeHistory.tsx, line 21 at r1 (raw file):

import { nodesSummarySelector } from "src/redux/nodes";
import { refreshLiveness, refreshNodes } from "src/redux/apiReducers";
import { cockroach } from "oss/src/js/protos";

nit. the same as above, remove oss part from import.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't reviewed any of the code (since it's all in the UI), but I did go through the PR message and the new behavior looks good. For my education, where does the behaviour to display the node as "unavailable" in the decommissioned section come from? I think if we're being idealistic about the UX then once a node is decommissioned, it shouldn't matter whether it has been offline for five minutes or not.

Thanks!

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What Tobi said, LGTM!

Also, in general I'd encourage always joining kv_node_status data against kv_node_liveness before use. The liveness data is the source of truth for node existence. Using a status entry when a liveness entry does not exist, or ignoring a liveness entry when the status entry does not exist, both yield incorrect and surprising results -- we've had several cases where failing to do this has caused pretty severe downstream problems.

@zachlite
Copy link
Contributor Author

Thanks for the review, all.
@tbg, the unavailable status comes from the enum cockroach.kv.kvserver.liveness.livenesspb.NodeLivenessStatus
@erikgrinaker, ack. I'm treating liveness data as the source of truth.

Previously, the decommissioned node lists considered node status entries
to determine decommissioning and decommissioned status. This changed in cockroachdb#56529,
resulting in empty lists. Now, the node's liveness entry is considered
and these lists are correctly populated.

Release note (bug fix): The list of recently decommissioned nodes
and the historical list of decommissioned nodes correctly display
decommissioned nodes.

fix tests

fix pb import and add default arg value
@zachlite zachlite force-pushed the fix-decommissioned-node-list branch from 7b9c8fe to 7179469 Compare February 15, 2022 16:12
Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @Santamaura and @zachlite)

a discussion (no related file):
:lgtm:


@zachlite
Copy link
Contributor Author

bors r+

@craig craig bot merged commit 3cb7eb0 into cockroachdb:master Feb 15, 2022
@craig
Copy link
Contributor

craig bot commented Feb 15, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants