This repository has been archived by the owner on May 10, 2022. It is now read-only.

close session when it's not responding for a while #32

Merged 4 commits on Jan 17, 2019



@neverchanje neverchanje commented Jan 17, 2019

fix #25
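
A minimal sketch of the idea in the title, for readers who have not seen the diff: treat a session as dead once it has produced only timeouts for some period, and close it so the client can reconnect and refresh its routing. This is not the code in this PR. The class name `SessionHealth`, the `max_silence_sec` threshold, and the injected `clock` are all assumptions made for illustration.

```python
import time


class SessionHealth:
    """Hypothetical tracker that flags a session once it stops responding."""

    def __init__(self, max_silence_sec=10.0, clock=time.monotonic):
        self._max_silence = max_silence_sec  # assumed threshold, not from the PR
        self._clock = clock                  # injectable so tests can fake time
        self._first_timeout_at = None        # start of the current timeout streak

    def on_response(self):
        # Any reply proves the peer is alive, so the streak is reset.
        self._first_timeout_at = None

    def on_timeout(self):
        """Record a timed-out request; return True if the session should be closed."""
        now = self._clock()
        if self._first_timeout_at is None:
            self._first_timeout_at = now
        return now - self._first_timeout_at >= self._max_silence
```

A caller would invoke `on_timeout()` from its request-timeout path and close the session when it returns `True`. A time window is used here rather than a raw timeout count so that the decision does not depend on the request rate.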

Tested with YCSB at threadcount=1, 10, and 20:

  1. Set up a Docker onebox and run YCSB.
>>> nodes -d
address               status              replica_count       primary_count       secondary_count     
172.21.0.21:34801     ALIVE               4                   1                   3                   
172.21.0.22:34801     ALIVE               5                   2                   3                   
172.21.0.23:34801     ALIVE               5                   2                   3                   
172.21.0.24:34801     ALIVE               5                   1                   4                   
172.21.0.25:34801     ALIVE               5                   2                   3                   

total_node_count   : 5
alive_node_count   : 5
unalive_node_count : 0
  2. Partition replica1 from the rest of the nodes.
docker run -it --rm -v /var/run/docker.sock:/var/run/docker.sock gaiaadm/pumba netem --duration 1h --tc-image gaiadocker/iproute2 loss --percent 100 pegasus_replica1_1

>>> nodes -d
address               status              replica_count       primary_count       secondary_count     
172.21.0.21:34801     UNALIVE             0                   0                   0                   
172.21.0.22:34801     ALIVE               5                   3                   2                   
172.21.0.23:34801     ALIVE               5                   2                   3                   
172.21.0.24:34801     ALIVE               5                   1                   4                   
172.21.0.25:34801     ALIVE               5                   2                   3                   

total_node_count   : 5
alive_node_count   : 4
unalive_node_count : 1
  3. The client recovers in a short time, as expected.
2019-01-17 11:44:27:239 10 sec: 35953 operations; 3595.3 current ops/sec; est completion in 7 hours 43 minutes [INSERT: Count=35954, Max=99327, Min=191, Avg=272.59, 90=329, 99=507, 99.9=2625, 99.99=5851] 
2019-01-17 11:44:37:238 20 sec: 60071 operations; 2411.8 current ops/sec; est completion in 9 hours 14 minutes [INSERT: Count=24118, Max=64831, Min=279, Avg=410.72, 90=541, 99=629, 99.9=2685, 99.99=16231] 
2019-01-17 11:44:47:238 30 sec: 83800 operations; 2372.9 current ops/sec; est completion in 9 hours 56 minutes [INSERT: Count=23728, Max=45951, Min=236, Avg=418.61, 90=548, 99=646, 99.9=3277, 99.99=9167] 
2019-01-17 11:44:57:239 40 sec: 87318 operations; 351.8 current ops/sec; est completion in 12 hours 42 minutes [INSERT: Count=3518, Max=3018751, Min=223, Avg=1999.53, 90=292, 99=526, 99.9=3831, 99.99=3018751] 
Retrying insertion, retry count: 1
2019-01-17 11:45:07:239 50 sec: 87333 operations; 1.5 current ops/sec; est completion in 15 hours 53 minutes [INSERT: Count=15, Max=3018751, Min=308, Avg=202048.07, 90=1420, 99=3018751, 99.9=3018751, 99.99=3018751] [INSERT-FAILED: Count=1, Max=5009407, Min=5005312, Avg=5007360, 90=5009407, 99=5009407, 99.9=5009407, 99.99=5009407] 
Retrying insertion, retry count: 2
2019-01-17 11:45:17:239 60 sec: 87333 operations; 0 current ops/sec; est completion in 19 hours 4 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 3
2019-01-17 11:45:27:238 70 sec: 100171 operations; 1283.8 current ops/sec; est completion in 19 hours 23 minutes [INSERT: Count=12839, Max=2678783, Min=167, Avg=444.76, 90=265, 99=457, 99.9=1351, 99.99=3909] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
2019-01-17 11:45:37:238 80 sec: 143605 operations; 4343.4 current ops/sec; est completion in 15 hours 27 minutes [INSERT: Count=43434, Max=10351, Min=172, Avg=228.61, 90=245, 99=311, 99.9=1130, 99.99=7731] [INSERT-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] 
2019-01-17 11:45:47:238 90 sec: 186595 operations; 4299 current ops/sec; est completion in 13 hours 22 minutes [INSERT: Count=42990, Max=7927, Min=172, Avg=231.06, 90=250, 99=328, 99.9=1195, 99.99=3267] [INSERT-FAILED: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] 
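
To read the stall window off a log like the one above without scanning it by eye, the status lines can be parsed for elapsed time and throughput. The following is a rough sketch: the regular expression is fitted to the line format shown here, and the 100 ops/sec cutoff is an arbitrary assumption, not a value from the PR.

```python
import re

# Matches e.g. "50 sec: 87333 operations; 1.5 current ops/sec"
STATUS_RE = re.compile(r"(\d+) sec: (\d+) operations; ([\d.]+) current ops/sec")


def parse_status(lines):
    """Return (elapsed_sec, ops_per_sec) for every YCSB status line."""
    samples = []
    for line in lines:
        m = STATUS_RE.search(line)
        if m:  # skips lines such as "Retrying insertion, retry count: 1"
            samples.append((int(m.group(1)), float(m.group(3))))
    return samples


def stalled_seconds(samples, threshold):
    """Return the elapsed-time marks whose throughput fell below threshold."""
    return [sec for sec, rate in samples if rate < threshold]


# Status lines from the run above, truncated after the throughput field.
log = [
    "2019-01-17 11:44:57:239 40 sec: 87318 operations; 351.8 current ops/sec;",
    "Retrying insertion, retry count: 1",
    "2019-01-17 11:45:07:239 50 sec: 87333 operations; 1.5 current ops/sec;",
    "Retrying insertion, retry count: 2",
    "2019-01-17 11:45:17:239 60 sec: 87333 operations; 0 current ops/sec;",
    "Retrying insertion, retry count: 3",
    "2019-01-17 11:45:27:238 70 sec: 100171 operations; 1283.8 current ops/sec;",
]
stalled = stalled_seconds(parse_status(log), threshold=100.0)  # [50, 60]
```

With that cutoff, only the 50 sec and 60 sec reports count as stalled, and throughput is back by the 70 sec report.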

@neverchanje neverchanje merged commit 3f0f2e6 into XiaoMi:thrift-0.11.0-inlined Jan 17, 2019

neverchanje commented Jun 5, 2019

With 1.11.4, the client does not recover even after retrying for a long time.

2019-06-05 09:21:50:548 70 sec: 238328 operations; 1994.3 current ops/sec; est completion in 8 hours 8 minutes [INSERT: Count=19942, Max=168703, Min=180, Avg=411.84, 90=530, 99=604, 99.9=4167, 99.99=73151] 
Retrying insertion, retry count: 1
2019-06-05 09:22:00:549 80 sec: 238331 operations; 0.3 current ops/sec; est completion in 9 hours 18 minutes [INSERT: Count=3, Max=3160063, Min=901, Avg=2057857.67, 90=3160063, 99=3160063, 99.9=3160063, 99.99=3160063] [INSERT-FAILED: Count=1, Max=5009407, Min=5005312, Avg=5007360, 90=5009407, 99=5009407, 99.9=5009407, 99.99=5009407] 
Retrying insertion, retry count: 2
2019-06-05 09:22:10:549 90 sec: 238331 operations; 0 current ops/sec; est completion in 10 hours 27 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 3
2019-06-05 09:22:20:549 100 sec: 238331 operations; 0 current ops/sec; est completion in 11 hours 37 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 4
2019-06-05 09:22:30:548 110 sec: 238331 operations; 0 current ops/sec; est completion in 12 hours 47 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 5
Retrying insertion, retry count: 6
2019-06-05 09:22:40:548 120 sec: 238331 operations; 0 current ops/sec; est completion in 13 hours 57 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=2, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 7
2019-06-05 09:22:50:549 130 sec: 238331 operations; 0 current ops/sec; est completion in 15 hours 6 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 8
2019-06-05 09:23:00:549 140 sec: 238331 operations; 0 current ops/sec; est completion in 16 hours 16 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 9
2019-06-05 09:23:10:549 150 sec: 238331 operations; 0 current ops/sec; est completion in 17 hours 26 minutes [INSERT: Count=0, Max=0, Min=9223372036854775807, Avg=NaN, 90=0, 99=0, 99.9=0, 99.99=0] [INSERT-FAILED: Count=1, Max=5005311, Min=5001216, Avg=5003264, 90=5005311, 99=5005311, 99.9=5005311, 99.99=5005311] 
Retrying insertion, retry count: 10

Successfully merging this pull request may close these issues.

query meta when an amount of ERR_TIMEOUT occurred but no ERR_SESSION_RESET