
Enhance/cache pd status #986

Merged: 19 commits merged into pingcap:master from feature/cache-pd-status on Dec 21, 2020

Conversation

@9547 (Contributor) commented Dec 12, 2020

What problem does this PR solve?

What is changed and how it works?

Check List

Tests

The old implementation needs ~2 minutes:

root@control:/tiup-cluster# time bin/tiup-cluster display c1
Cluster type:    tidb
Cluster name:    c1
Cluster version: v4.0.4
SSH type:        builtin

ID                  Role            Host          Ports                            OS/Arch       Status  Data Dir                                                 Deploy Dir
--                  ----            ----          -----                            -------       ------  --------                                                 ----------
172.19.0.101:9093   alertmanager    172.19.0.101  9093/9094                        linux/x86_64  Up      /home/tidb/deploy/alertmanager-9093/data                 /home/tidb/deploy/alertmanager-9093
172.19.0.103:8300   cdc             172.19.0.103  8300                             linux/x86_64  Down    -                                                        /home/tidb/deploy/cdc-8300
172.19.0.104:8300   cdc             172.19.0.104  8300                             linux/x86_64  Down    -                                                        /home/tidb/deploy/cdc-8300
172.19.0.105:8300   cdc             172.19.0.105  8300                             linux/x86_64  Down    -                                                        /home/tidb/deploy/cdc-8300
172.19.0.101:8249   drainer         172.19.0.101  8249                             linux/x86_64  Up      /home/tidb/data/drainer-8249/data                        /home/tidb/deploy/drainer-8249
172.19.0.101:3000   grafana         172.19.0.101  3000                             linux/x86_64  Up      -                                                        /home/tidb/deploy/grafana-3000
172.19.0.103:2379   pd              172.19.0.103  2379/2380                        linux/x86_64  Down    /home/tidb/deploy/pd-2379/data                           /home/tidb/deploy/pd-2379
172.19.0.104:2379   pd              172.19.0.104  2379/2380                        linux/x86_64  Down    /home/tidb/deploy/pd-2379/data                           /home/tidb/deploy/pd-2379
172.19.0.105:2379   pd              172.19.0.105  2379/2380                        linux/x86_64  Down    /home/tidb/deploy/pd-2379/data                           /home/tidb/deploy/pd-2379
172.19.0.101:9090   prometheus      172.19.0.101  9090                             linux/x86_64  Up      /home/tidb/deploy/prometheus-9090/data                   /home/tidb/deploy/prometheus-9090
172.19.0.102:9090   prometheus      172.19.0.102  9090                             linux/x86_64  Up      /home/tidb/deploy/prometheus-9090/data                   /home/tidb/deploy/prometheus-9090
172.19.0.103:8250   pump            172.19.0.103  8250                             linux/x86_64  Up      /home/tidb/deploy/pump-8250/data                         /home/tidb/deploy/pump-8250
172.19.0.104:8250   pump            172.19.0.104  8250                             linux/x86_64  Up      /home/tidb/deploy/pump-8250/data                         /home/tidb/deploy/pump-8250
172.19.0.105:8250   pump            172.19.0.105  8250                             linux/x86_64  Up      /home/tidb/deploy/pump-8250/data                         /home/tidb/deploy/pump-8250
172.19.0.101:4000   tidb            172.19.0.101  4000/10080                       linux/x86_64  Up      -                                                        /home/tidb/deploy/tidb-4000
172.19.0.102:4000   tidb            172.19.0.102  4000/10080                       linux/x86_64  Up      -                                                        /home/tidb/deploy/tidb-4000
172.19.0.103:9000   tiflash         172.19.0.103  9000/8123/3930/20170/20292/8234  linux/x86_64  Down    /home/tidb/deploy/tiflash-9000/data1,/data/tiflash-data  /home/tidb/deploy/tiflash-9000
172.19.0.101:20160  tikv            172.19.0.101  20160/20180                      linux/x86_64  Down    /home/tidb/deploy/tikv-20160/data                        /home/tidb/deploy/tikv-20160
172.19.0.103:20160  tikv            172.19.0.103  20160/20180                      linux/x86_64  Down    /home/tidb/my_kv_data                                    /home/tidb/deploy/tikv-20160
172.19.0.104:20160  tikv            172.19.0.104  20160/20180                      linux/x86_64  Down    /home/tidb/deploy/tikv-20160/data                        /home/tidb/deploy/tikv-20160
172.19.0.105:20160  tikv            172.19.0.105  20160/20180                      linux/x86_64  Down    /home/tidb/deploy/tikv-20160/data                        /home/tidb/deploy/tikv-20160
172.19.0.103:7077   tispark-master  172.19.0.103  7077/8080                        linux/x86_64  Up      -                                                        /home/tidb/deploy/tispark-master-7077
172.19.0.104:7078   tispark-worker  172.19.0.104  7078/8081                        linux/x86_64  Up      -                                                        /home/tidb/deploy/tispark-worker-7078
Total nodes: 23

real    1m56.029s
user    0m0.287s
sys     0m0.133s

With the cache, the time is limited to about 3*10s:

root@control:/tiup-cluster# time bin/tiup-cluster display c1
Cluster type:       tidb
Cluster name:       c1
Cluster version:    v4.0.4
SSH type:           builtin
ID                  Role            Host          Ports                            OS/Arch       Status  Data Dir                                                 Deploy Dir
--                  ----            ----          -----                            -------       ------  --------                                                 ----------
172.19.0.101:9093   alertmanager    172.19.0.101  9093/9094                        linux/x86_64  Up      /home/tidb/deploy/alertmanager-9093/data                 /home/tidb/deploy/alertmanager-9093
172.19.0.103:8300   cdc             172.19.0.103  8300                             linux/x86_64  Down    -                                                        /home/tidb/deploy/cdc-8300
172.19.0.104:8300   cdc             172.19.0.104  8300                             linux/x86_64  Down    -                                                        /home/tidb/deploy/cdc-8300
172.19.0.105:8300   cdc             172.19.0.105  8300                             linux/x86_64  Down    -                                                        /home/tidb/deploy/cdc-8300
172.19.0.101:8249   drainer         172.19.0.101  8249                             linux/x86_64  Up      /home/tidb/data/drainer-8249/data                        /home/tidb/deploy/drainer-8249
172.19.0.101:3000   grafana         172.19.0.101  3000                             linux/x86_64  Up      -                                                        /home/tidb/deploy/grafana-3000
172.19.0.103:2379   pd              172.19.0.103  2379/2380                        linux/x86_64  Down    /home/tidb/deploy/pd-2379/data                           /home/tidb/deploy/pd-2379
172.19.0.104:2379   pd              172.19.0.104  2379/2380                        linux/x86_64  Down    /home/tidb/deploy/pd-2379/data                           /home/tidb/deploy/pd-2379
172.19.0.105:2379   pd              172.19.0.105  2379/2380                        linux/x86_64  Down    /home/tidb/deploy/pd-2379/data                           /home/tidb/deploy/pd-2379
172.19.0.101:9090   prometheus      172.19.0.101  9090                             linux/x86_64  Up      /home/tidb/deploy/prometheus-9090/data                   /home/tidb/deploy/prometheus-9090
172.19.0.102:9090   prometheus      172.19.0.102  9090                             linux/x86_64  Up      /home/tidb/deploy/prometheus-9090/data                   /home/tidb/deploy/prometheus-9090
172.19.0.103:8250   pump            172.19.0.103  8250                             linux/x86_64  Up      /home/tidb/deploy/pump-8250/data                         /home/tidb/deploy/pump-8250
172.19.0.104:8250   pump            172.19.0.104  8250                             linux/x86_64  Up      /home/tidb/deploy/pump-8250/data                         /home/tidb/deploy/pump-8250
172.19.0.105:8250   pump            172.19.0.105  8250                             linux/x86_64  Up      /home/tidb/deploy/pump-8250/data                         /home/tidb/deploy/pump-8250
172.19.0.101:4000   tidb            172.19.0.101  4000/10080                       linux/x86_64  Up      -                                                        /home/tidb/deploy/tidb-4000
172.19.0.102:4000   tidb            172.19.0.102  4000/10080                       linux/x86_64  Up      -                                                        /home/tidb/deploy/tidb-4000
172.19.0.103:9000   tiflash         172.19.0.103  9000/8123/3930/20170/20292/8234  linux/x86_64  N/A     /home/tidb/deploy/tiflash-9000/data1,/data/tiflash-data  /home/tidb/deploy/tiflash-9000
172.19.0.101:20160  tikv            172.19.0.101  20160/20180                      linux/x86_64  N/A     /home/tidb/deploy/tikv-20160/data                        /home/tidb/deploy/tikv-20160
172.19.0.103:20160  tikv            172.19.0.103  20160/20180                      linux/x86_64  N/A     /home/tidb/my_kv_data                                    /home/tidb/deploy/tikv-20160
172.19.0.104:20160  tikv            172.19.0.104  20160/20180                      linux/x86_64  N/A     /home/tidb/deploy/tikv-20160/data                        /home/tidb/deploy/tikv-20160
172.19.0.105:20160  tikv            172.19.0.105  20160/20180                      linux/x86_64  N/A     /home/tidb/deploy/tikv-20160/data                        /home/tidb/deploy/tikv-20160
172.19.0.103:7077   tispark-master  172.19.0.103  7077/8080                        linux/x86_64  Up      -                                                        /home/tidb/deploy/tispark-master-7077
172.19.0.104:7078   tispark-worker  172.19.0.104  7078/8081                        linux/x86_64  Up      -                                                        /home/tidb/deploy/tispark-worker-7078
Total nodes: 23

real    0m26.010s
user    0m0.209s
sys     0m0.122s
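
The speedup comes from fetching PD's status once and caching it for the whole display run, combined with an HTTP client that fails fast on unreachable hosts. Below is a minimal Go sketch of that caching pattern; the names (pdStatusCache, Get) and the PD health endpoint are illustrative assumptions, not the actual code in this PR.

package main

import (
	"fmt"
	"net"
	"net/http"
	"sync"
	"time"
)

// pdStatusCache remembers each PD endpoint's health for one display run, so
// an endpoint is probed at most once no matter how many components ask.
type pdStatusCache struct {
	mu     sync.Mutex
	status map[string]string // endpoint -> "Up" / "Down"
	client *http.Client
}

func newPDStatusCache() *pdStatusCache {
	return &pdStatusCache{
		status: make(map[string]string),
		client: &http.Client{
			// Fail fast on unreachable hosts: a 5-second dial timeout
			// instead of the OS default.
			Transport: &http.Transport{
				DialContext: (&net.Dialer{Timeout: 5 * time.Second}).DialContext,
			},
		},
	}
}

// Get returns the cached status of a PD endpoint, probing it only on the
// first call. "/pd/api/v1/health" is an assumed health endpoint.
func (c *pdStatusCache) Get(endpoint string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.status[endpoint]; ok {
		return s
	}
	s := "Down"
	resp, err := c.client.Get(fmt.Sprintf("http://%s/pd/api/v1/health", endpoint))
	if err == nil {
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			s = "Up"
		}
	}
	c.status[endpoint] = s
	return s
}

func main() {
	cache := newPDStatusCache()
	// The second call returns immediately from the cache.
	fmt.Println(cache.Get("172.19.0.103:2379"))
	fmt.Println(cache.Get("172.19.0.103:2379"))
}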

One more thing

For the DM cluster, I've highlighted the status, similar to tidb-cluster's, as shown below:

[screenshot: highlighted DM component status]

Previously, if the dm-masters were down, the dm-workers' status showed as below:

[screenshot: dm-worker status shown as Down]

After this PR it changes from Down to N/A, meaning a worker's status depends on the masters, and all masters were down:

[screenshot: dm-worker status shown as N/A]
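
Both DM tweaks are small display-side changes: colorize the status string, and report a worker as N/A instead of Down when no dm-master can be reached, since a worker's state is only known through a master. A minimal, hypothetical Go sketch of the idea, not the code in this PR:

package main

import "fmt"

// colorizeStatus highlights a status string with ANSI colors, the way
// tidb-cluster's display does: green for Up, red for Down, plain otherwise.
func colorizeStatus(status string) string {
	const (
		red   = "\033[31m"
		green = "\033[32m"
		reset = "\033[0m"
	)
	switch status {
	case "Up":
		return green + status + reset
	case "Down":
		return red + status + reset
	default:
		return status
	}
}

// workerStatus derives a dm-worker's displayed status. With every dm-master
// down there is nobody to ask, so the answer is "N/A" (unknown), not "Down".
func workerStatus(anyMasterUp, workerReportedUp bool) string {
	switch {
	case !anyMasterUp:
		return "N/A"
	case workerReportedUp:
		return "Up"
	default:
		return "Down"
	}
}

func main() {
	fmt.Println(colorizeStatus("Up"), colorizeStatus("Down"), colorizeStatus("N/A"))
	fmt.Println(workerStatus(false, true)) // N/A: all masters are down
	fmt.Println(workerStatus(true, false)) // Down: a master answered and the worker is offline
}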

Dashboard URL

`display` output with the dashboard URL appended:

root@control:/tiup-cluster# tiup-cluster display xxx
Cluster type:       tidb
Cluster name:       xxx
Cluster version:    v4.0.4
SSH type:           builtin
Dashboard URL:      http://n5:2379/dashboard
ID        Role            Host  Ports                            OS/Arch       Status  Data Dir                                  Deploy Dir
...
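
For the Dashboard URL line, the general idea is to ask a PD member which address serves the dashboard and print it under the cluster header. Below is a minimal sketch under the assumption that PD exposes its config (including pd-server.dashboard-address) at /pd/api/v1/config; the function name and error handling are illustrative, not this PR's code.

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// pdConfig models only the part of PD's config needed here.
type pdConfig struct {
	PDServer struct {
		DashboardAddress string `json:"dashboard-address"`
	} `json:"pd-server"`
}

// dashboardURL asks one PD endpoint for its config and derives the dashboard
// URL from the configured dashboard address.
func dashboardURL(pdEndpoint string) (string, error) {
	resp, err := http.Get(fmt.Sprintf("http://%s/pd/api/v1/config", pdEndpoint))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	var cfg pdConfig
	if err := json.NewDecoder(resp.Body).Decode(&cfg); err != nil {
		return "", err
	}
	addr := cfg.PDServer.DashboardAddress
	// Anything that is not an http(s) address (empty, "none", ...) means
	// there is no dashboard to show.
	if !strings.HasPrefix(addr, "http") {
		return "", fmt.Errorf("no dashboard address configured: %q", addr)
	}
	return strings.TrimSuffix(addr, "/") + "/dashboard", nil
}

func main() {
	if url, err := dashboardURL("n5:2379"); err == nil {
		fmt.Println("Dashboard URL:     ", url)
	}
}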

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change
  • Has persistent data change

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release notes:

* enhancement: speed up `{tiup|dm} cluster display`;
* feature: add the dashboard URL to `tiup cluster display`;
* feature: highlight DM component status in `dm cluster display`;

@ti-chi-bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Dec 12, 2020
@codecov-io commented Dec 12, 2020

Codecov Report

Merging #986 (9590b25) into master (edb12b8) will increase coverage by 0.00%.
The diff coverage is 88.42%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #986   +/-   ##
=======================================
  Coverage   55.64%   55.65%           
=======================================
  Files         279      279           
  Lines       19715    19743   +28     
=======================================
+ Hits        10970    10987   +17     
- Misses       7037     7044    +7     
- Partials     1708     1712    +4     
Flag Coverage Δ
cluster 43.55% <85.55%> (-0.02%) ⬇️
dm 23.97% <55.78%> (+0.07%) ⬆️
integrate 49.92% <88.42%> (+0.01%) ⬆️
playground 20.32% <0.00%> (-0.01%) ⬇️
tiup 16.50% <0.00%> (+0.02%) ⬆️
unittest 23.01% <0.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/cluster/spec/util.go 81.39% <0.00%> (-0.83%) ⬇️
pkg/cluster/spec/spec.go 88.57% <75.00%> (-0.40%) ⬇️
pkg/cluster/spec/pd.go 69.87% <77.77%> (-2.53%) ⬇️
pkg/cluster/manager/display.go 82.67% <91.30%> (+1.04%) ⬆️
components/dm/spec/topology_dm.go 82.40% <100.00%> (+0.56%) ⬆️
pkg/cluster/api/pdapi.go 60.06% <100.00%> (-1.86%) ⬇️
pkg/utils/http_client.go 67.56% <100.00%> (-4.66%) ⬇️
pkg/repository/store/txn.go 62.01% <0.00%> (+2.32%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update edb12b8...9590b25.

@9547 force-pushed the feature/cache-pd-status branch from 895545e to 3c20b65 on December 13, 2020 00:01
@breezewish (Member):

Hi, could you update your PR description about the Dashboard URL change?

@9547 (Contributor, Author) commented Dec 14, 2020

> Hi, could you update your PR description about the Dashboard URL change?

Sorry, I'll add it to the Release notes. Is there anything else I'm missing?

@breezewish (Member):

> Hi, could you update your PR description about the Dashboard URL change?
>
> Sorry, I'll add it to the Release notes. Is there anything else I'm missing?

Actually, I only want to know what this Dashboard URL change looks like in the terminal output :)

@AstroProfundis added the category/usability label (categorizes an issue or PR as a usability enhancement) on Dec 15, 2020
@9547 (Contributor, Author) commented Dec 16, 2020

@breeswish PTAL

root@control:/tiup-cluster# tiup-cluster display xxx
Cluster type:       tidb
Cluster name:       xxx
Cluster version:    v4.0.4
SSH type:           builtin
Dashboard URL:      http://n5:2379/dashboard
ID        Role            Host  Ports                            OS/Arch       Status  Data Dir                                  Deploy Dir
...

@lucklove (Member):

Maybe we can remove the --dashboard flag if we merge it into the normal display? @breeswish

@9547 (Contributor, Author) commented Dec 16, 2020

> Maybe we can remove the --dashboard flag if we merge it into the normal display? @breeswish

LGTM

Review thread on pkg/cluster/spec/pd.go (outdated, resolved).
@9547 requested a review from lucklove on December 19, 2020 03:38
@lucklove (Member):

/lgtm

@ti-chi-bot added the status/LGT1 label (indicates that a PR has LGTM 1) on Dec 21, 2020
@lucklove (Member):

/merge

@ti-chi-bot added the status/can-merge label (indicates a PR has been approved by a committer) on Dec 21, 2020
@ti-chi-bot (Member):

Can merge label has been added.

Git tree hash: 9590b25

@ti-chi-bot merged commit aee2af0 into pingcap:master on Dec 21, 2020
@lucklove changed the title from "Feature/cache pd status" to "Enhance/cache pd status" on Dec 31, 2020
@lucklove added this to the v1.3.1 milestone on Dec 31, 2020
lucklove added a commit that referenced this pull request on Dec 31, 2020
* typo(cluster/pdapi): misused arg name

* enhance(cluster/pd): pd.Status check by itself

* feat(cluster/spec): add GetDashardAddress

* enhance(utils): set http clien't dial timeout to 5second

* feat(cluster/manager): cache pd's status

* feat(cluster/manager): display dashboard url

* tests(cluster): add display result check

* feat(dm): get master's status by local host

* feat(cluster/manager): cache dm's master status

* feat(cluster/manager): highlight display dm's status

* feat(cluster/manager): highlight display dm's status

* refact(cluster/pd): rm notused suffix

* typo(spec/cluster): UP -> Up

Co-authored-by: SIGSEGV <[email protected]>
@9547 deleted the feature/cache-pd-status branch on April 6, 2021 23:05
Labels: category/usability, size/L, status/can-merge, status/LGT1

Successfully merging this pull request may close these issues.

TiUP cluster display <cluster-name> takes too long