Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add shell command for query_disk_info #498

Merged
merged 51 commits into from
Mar 24, 2020

Conversation

foreverneverer
Copy link
Contributor

@foreverneverer foreverneverer commented Mar 13, 2020

Important things

The PR depends XiaoMi/rdsn#409 and XiaoMi/rdsn#416, it should not be merged until XiaoMi/rdsn#409 and XiaoMi/rdsn#416 are merged

What problem does this PR solve?

add shell command for query_disk_info. related pr: XiaoMi/rdsn#409

What is changed and how it works?

Check List

Tests

  • Manual test (add detailed scripts or steps below)

Use 127.0.0.1 instead of real node address.

test disk capacity info

>>> disk_capacity  # defaut show all node overall disk capacity info
node                total_capacity(MB)  avalable_capacity(MB)  avalable_ratio(%)  capacity_balance  
127.0.0.1:34801         3183565             3150795                99                 0                 
127.0.0.1:34802         3183565             3150795                99                 0                 
127.0.0.1:34803         3183565             3150793                99                 0                        

>>> disk_capacity -d # show all nodes detail disk capacity info
[node:127.0.0.1:34801]        
disk  total_capacity(MB)  avalable_capacity(MB)  avalable_ratio(%)   capacity_balance
ssd1  454795                      450109                 99                 0             
ssd2  454795                      450113                 99                 0             
ssd3  454795                      450113                 99                 0             
total 3183565                     3150795                99                 0      

[node:127.0.0.1:34802] 
disk  total_capacity(MB)  avalable_capacity(MB)  avalable_ratio(%)  capacity_balance  
ssd1  454795                      450109                 99                 0             
ssd2  454795                      450113                 99                 0             
ssd3  454795                      450113                 99                 0             
total 3183565                     3150795                99                 0              

>>> disk_capacity -n 127.0.0.1:34801 # show one node capacity info
node                total_capacity(MB)  avalable_capacity(MB)  avalable_ratio(%)  capacity_balance  
127.0.0.1:34801          3183565             3150795                99                 0                 

>>> disk_capacity -n 127.0.0.1:34801 -d # show one node detail capacity info
[node:127.0.0.1:34801]        
disk  total_capacity(MB)  avalable_capacity(MB)  avalable_ratio(%)   capacity_balance
ssd1  454795                      450109                 99                 0             
ssd2  454795                      450113                 99                 0             
ssd3  454795                      450113                 99                 0             
total 3183565                     3150795                99                 0                            

test disk replica count info

>>> disk_replica # show all nodes disk, all app replica_count
[node: 127.0.0.1:34801]            
disk  primary_count  secondary_count  replica_count  
ssd1  1              4                5              
ssd2  2              4                6              
ssd3  2              4                6          

[node: 127.0.0.1:34802]    
disk  primary_count  secondary_count  replica_count  
ssd1  1              4                5              
ssd2  2              4                6              
ssd3  2              4                6             

>>> disk_replica -n 127.0.0.1:34801 # show one nodes disk, all app replica_count
[node: 127.0.0.1:34801]             
disk  primary_count  secondary_count  replica_count  
ssd1  2              8                10              
ssd2  4              8                12              
ssd3  4              8                12  

>>> disk_replica -n 127.0.0.1:34801 -a app_name # show one nodes disk, one app 
[nodes: 127.0.0.1:34801]                
disk  primary_count  secondary_count  replica_count  
ssd1  1              4                5              
ssd2  2              4                6              
ssd3  2              4                6  

test invalid arguments and err response (take disk_replica as example)

>>> disk_replica -n 12345 # if node address invalid(not belong to node groups or illegal ip:port), return false and info
please input valid target node_address!
USAGE: 	disk_replica              [-n|--node node_address] [-a|--app app_name]

>>> disk_replica -n # if miss node address , return false and info
missing param <node_address>
USAGE: 	disk_replica              [-n|--node node_address] [-a|--app app_name]

>>> disk_replica -n 127.0.0.1:34801 -a not_existed_app # if app is not existed, return false and info
disk of node[127.0.0.1:34801] info skiped because request failed, error=ERR_OBJECT_NOT_FOUND
USAGE:  disk_replica              [-n|--node node_address] [-a|--app app_name]

>>> disk_replica  # if one node error, show success node info
disk of node[127.0.0.1:34801] replica_count info skiped because request failed, error=ERR_NETWORK_FAILURE: unable to send rpc to server
[node: 127.0.0.1:34802]  
disk  primary_count  secondary_count  replica_count  
ssd1  1              4                5              
ssd2  2              4                6              
ssd3  2              4                6              

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation
  • Need to be included in the release note

Copy link
Member

@acelyc111 acelyc111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • When I query a node's replica distribution by add -a parameter, disk capacity details will disappear?
  • How can I query all nodes disk capacity details or replica distribution details, I have to query them one by one?

It's too puzzled, I suggest you to design the shell command well and we reach an agreement at first.

src/shell/main.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Show resolved Hide resolved
src/shell/commands/disk_rebalance.cpp Outdated Show resolved Hide resolved
@foreverneverer
Copy link
Contributor Author

  • When I query a node's replica distribution by add -a parameter, disk capacity details will disappear?
  • How can I query all nodes disk capacity details or replica distribution details, I have to query them one by one?

It's too puzzled, I suggest you to design the shell command well and we reach an agreement at first.

OK, Current design reaosns are:

  • capacity info and replica_count info are different point of view. I expect focus one aspect when query, -a means query app replica info, you can think it is flag or mod. so if input -a means you query one or all app replica count info;
  • separate replica count info and capacity info has some other consideration:
    • disk_info meas overall, if show replica count info meanwhile, it will same nodes -d. so I design disk_info without -a means default is capacity to be different from nodes -d.
    • disk_info -a can show one app replica count info. if you show capacity meanwhile but can not show one app capacity info, you will get all app capacity and one app replica count view (disk_info -a 0 can't show app (id = 0) disk capacity but can replcia count in current design ), I think it's puzzled. Actually, show one app disk capticy is app_disk, so here no need to do same work.
  • I think that one node has many ssd, so if show all node detail info, it will show too many ssd info (node_count * ssd_count), I don't think it is good way to focus user want. Actually, user often get overall info by disk_info and -n to focus he want detail node info. Of course, add -d to give choice may be better.

acelyc111
acelyc111 previously approved these changes Mar 23, 2020
@neverchanje neverchanje added the component/shell pegasus shell label Mar 23, 2020
@acelyc111 acelyc111 merged commit 47cea63 into apache:master Mar 24, 2020
@neverchanje neverchanje mentioned this pull request Mar 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants