balance: some changes to make balance more reliable. #560
Conversation
server/cache.go
Outdated
@@ -225,6 +225,13 @@ func (r *regionsInfo) randFollowerRegion(storeID uint64) *regionInfo {
	return randRegion(r.followers[storeID])
}

func (r *regionsInfo) randomRegion() *regionInfo {
	for _, region := range r.regions {
Why not use a random index to get the random region?
Then we would have to use another []uint64 to record all the region IDs and keep it in sync with the regions map. I think we can rely on the fact that map iteration starts from a random position.
Do you mean that even if the regions are unchanged, every for-range loop will start from a random region?
Maybe we should add a test to verify it.
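For what it's worth, a tiny standalone sketch (hypothetical, not the test added in this PR) of how one could eyeball that behaviour, taking only the first key of each for-range pass the way randomRegion does:

```go
package main

import "fmt"

func main() {
	// Hypothetical stand-in for r.regions: the keys play the role of region IDs.
	m := map[uint64]string{1: "a", 2: "b", 3: "c", 4: "d"}
	counts := make(map[uint64]int)
	for i := 0; i < 1000; i++ {
		for id := range m {
			counts[id]++ // keep only the first key, as randomRegion does
			break
		}
	}
	// Go randomizes the starting point of map iteration, so every key should
	// show up roughly equally often even though the map never changes.
	fmt.Println(counts)
}
```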
You are right. We should not rely on the randomness of map iteration. I'm working on another PR to correct it.
Any test?
@siddontang I've added a test case for 2. 1 is already covered by many cases. 3 basically won't happen, so it's not easy to test.
server/coordinator_test.go
Outdated
		}
		time.Sleep(time.Millisecond * 100)
	}
}
Do we need a panic here?
# Conflicts:
#	server/cache.go
server/scheduler.go
Outdated
	return nil, nil
}

region := cluster.randLeaderRegion(source.GetId())
if region == nil {
leaderStore := cluster.getStore(region.Leader.GetStoreId())
Picking a random region gives a hot leader store less chance to transfer its leaders away. I see maxScheduleRetries is 10; if there are more than 100,000 regions, we only sample about 0.01 percent of them, and most scores are almost the same. Isn't it very likely that we get a nil operator, so the schedule interval quickly increases to its maximum? I think we also need to adjust the schedule retries based on the total region count and on how the schedule interval grows. Or maybe we can directly add a schedule for the store with the least leaders.
It seems we should add a metric to count useless selections.
I forgot that the score is the leader count of that store. Maybe we can test it.
@nolouch Assume the peers are perfectly balanced and only leaders need to be transferred. The rate of successfully finding a region that needs to be balanced is mainly determined by the node count and by how many nodes are not well balanced.
Assume we have 100,000 regions and 10 nodes; then each node holds 10,000 leaders. When a region is randomly selected, the chance that its leader is on a specific node is 10%.
I have checked that it works well on a 10-node cluster; it may be inefficient if there are more nodes. I'll continue to do more tests :)
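To make that estimate concrete, here is a small standalone sketch (not part of the PR; it only encodes the even-leader assumption stated above) computing the chance that at least one of maxScheduleRetries random region picks has its leader on one given store:

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const maxScheduleRetries = 10 // retry count mentioned in the review above

	for _, nodes := range []int{3, 10, 50, 100} {
		perPick := 1.0 / float64(nodes)                             // leader of a random region sits on this node
		hit := 1 - math.Pow(1-perPick, float64(maxScheduleRetries)) // at least one hit within the retries
		fmt.Printf("nodes=%3d  per-pick=%.3f  hit-within-%d-retries=%.3f\n",
			nodes, perPick, maxScheduleRetries, hit)
	}
}
```

With 10 nodes the per-pick chance is 10% and a full round of retries finds a usable region about 65% of the time; with 100 nodes that drops to roughly 10%, which is the inefficiency mentioned above for larger clusters.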
got it
So how about adding a schedule that takes the store with the least leader count as the target? It isn't affected by the node count.
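For reference, a rough sketch of that alternative (this is not the PR's code; getStores, isUp, and a cluster-level randFollowerRegion wrapper are assumed helpers, loosely modeled on names that appear elsewhere in this diff):

```go
// Hypothetical sketch: fix the target first as the store with the fewest
// leaders, then pick a region that already has a follower there and transfer
// its leadership to it. The hit rate of this approach does not depend on the
// total node count.
func scheduleToLeastLeaderStore(cluster *clusterInfo) (*regionInfo, *storeInfo) {
	var target *storeInfo
	for _, s := range cluster.getStores() { // assumed helper
		if !s.isUp() { // assumed helper
			continue
		}
		if target == nil || cluster.getStoreLeaderCount(s.GetId()) < cluster.getStoreLeaderCount(target.GetId()) {
			target = s
		}
	}
	if target == nil {
		return nil, nil
	}
	// Any region with a follower on the target can move its leader there.
	region := cluster.randFollowerRegion(target.GetId()) // assumed wrapper around regionsInfo.randFollowerRegion
	return region, target
}
```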
Any update? @disksing
server/scheduler.go
Outdated
	return nil, nil
var averageLeader float64
for _, s := range stores {
	averageLeader += float64(cluster.getStoreLeaderCount(s.GetId())) / float64(len(stores))
Use s.leaderCount() directly.
In fact I intentionally avoid using s.leaderCount() here, because the leader count in the cache is more up to date.
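As an illustration of that tradeoff, a sketch only (reusing the helper names from the hunk above, not new PR code):

```go
// Sketch: two ways to read a store's leader count when computing the average.
// The cache-based count is preferred here because, as noted above, it is more
// up to date than the count reported by the store itself.
func averageLeaderCount(cluster *clusterInfo, stores []*storeInfo) float64 {
	var avg float64
	for _, s := range stores {
		avg += float64(cluster.getStoreLeaderCount(s.GetId())) / float64(len(stores)) // cache-based count
		// store-reported alternative would be: float64(s.leaderCount())
	}
	return avg
}
```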
got it
server/scheduler.go
Outdated
	return nil, nil
}

return region, region.GetStorePeer(target.GetId())
if leastLeaderStore == nil || math.Abs(mostLeaderStore.leaderScore()-averageLeader) > math.Abs(leastLeaderStore.leaderScore()-averageLeader) {
If mostLeaderStore is nil, this will panic here.
I have already checked whether both leastLeaderStore and mostLeaderStore are nil.
You're right. I'll fix it.
if target == nil {
	var mostLeaderDistance, leastLeaderDistance float64
	if mostLeaderStore != nil {
		mostLeaderDistance = math.Abs(mostLeaderStore.leaderScore() - averageLeader)
Should we also use the cache to get the score here?
You are right, but we need to merge #575 before fixing it.
PTAL @nolouch @siddontang
LGTM
PTAL @andelf
PTAL @siddontang
LGTM
1. Randomly select a region for leader balance. Fixes #545 (select the target store first for leader transfer when some nodes fall behind too much). The old way makes balance slow when several stores fall behind too much, and it also has an over-scheduling problem because scheduling is always concentrated on the store with the most leaders.
2. The balance limit should exclude down/tombstone stores. With this patch, the balance speed will not grow too fast when several stores are down.
3. Add storageThresholdFilter for regionBalancer: do not move regions to a store that is almost full. This normally won't happen, but it is safer to check explicitly (a rough sketch of the idea follows below).
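A hedged, self-contained sketch of that last idea (all names here are hypothetical; this is not the PR's storageThresholdFilter implementation):

```go
package main

import "fmt"

// storeSpace is a hypothetical stand-in for the space stats a store reports.
type storeSpace struct {
	capacity  uint64 // total bytes
	available uint64 // bytes still free
}

// filtered reports whether a store should be rejected as a balance target
// because its used space is already above the allowed ratio of its capacity.
func filtered(s storeSpace, maxUsedRatio float64) bool {
	if s.capacity == 0 {
		return true // no stats yet; play safe and skip this store
	}
	used := float64(s.capacity-s.available) / float64(s.capacity)
	return used > maxUsedRatio
}

func main() {
	almostFull := storeSpace{capacity: 100 << 30, available: 5 << 30}
	roomy := storeSpace{capacity: 100 << 30, available: 60 << 30}
	fmt.Println(filtered(almostFull, 0.8)) // true: 95% used, not a valid target
	fmt.Println(filtered(roomy, 0.8))      // false: 40% used, fine as a target
}
```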
PTAL @siddontang @nolouch