
balance: some changes to make balance more reliable. #560

Merged 15 commits into master on Mar 29, 2017

Conversation

@disksing disksing commented Mar 8, 2017

  1. Randomly select a region for leader balance.
    Fixes "Select target store first for leader transfer when some nodes fall behind too much" (#545). The old approach makes balance slow when several stores fall behind too much, and it also has an over-scheduling problem because scheduling is always concentrated on the store with the most leaders.

  2. The balance limit should exclude down/tombstone stores.
    With this patch, balance will not go too fast when several stores are down.

  3. Add storageThresholdFilter for regionBalancer.
    Do not move regions to a store that is almost full. This normally won't happen, but it is safer to check explicitly. A hedged sketch of such a filter follows this list.
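A minimal sketch, in Go, of what such a filter could look like. The storageThresholdFilter name comes from this PR's description; the storeInfo fields, method names, and the 0.8 threshold are illustrative assumptions, not PD's actual implementation.

package main

import "fmt"

// Hypothetical store shape for illustration; the real PD type differs.
type storeInfo struct {
    capacity  uint64 // bytes
    available uint64 // bytes
}

// storageRatio returns the fraction of capacity that is already used.
func (s *storeInfo) storageRatio() float64 {
    if s.capacity == 0 {
        return 0
    }
    return 1 - float64(s.available)/float64(s.capacity)
}

// storageThresholdFilter rejects an almost-full store as a balance target,
// as described in item 3 above. The 0.8 threshold is an assumption.
type storageThresholdFilter struct{ threshold float64 }

func (f *storageThresholdFilter) filterTarget(s *storeInfo) bool {
    return s.storageRatio() > f.threshold // true means "do not move regions here"
}

func main() {
    f := &storageThresholdFilter{threshold: 0.8}
    almostFull := &storeInfo{capacity: 100 << 30, available: 5 << 30}
    roomy := &storeInfo{capacity: 100 << 30, available: 60 << 30}
    fmt.Println(f.filterTarget(almostFull), f.filterTarget(roomy)) // true false
}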

PTAL @siddontang @nolouch

server/cache.go Outdated
@@ -225,6 +225,13 @@ func (r *regionsInfo) randFollowerRegion(storeID uint64) *regionInfo {
return randRegion(r.followers[storeID])
}

func (r *regionsInfo) randomRegion() *regionInfo {
for _, region := range r.regions {
Contributor

Why not use a random index to get a random region?

Contributor Author

Then we would have to use another []uint64 to record all region IDs and keep it in sync with the regions map. I think we can rely on the fact that map iteration starts from a random position.

Contributor

Do you mean that if the regions are not changed, every for-range over the map will still get a random region?

Maybe we should add a test to verify it.

Contributor Author

You are right. We should not rely on the randomness of map iteration. I'm working on another PR to correct it.
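For reference, a rough sketch of the index-based alternative mentioned earlier in this thread: keep a []uint64 of region IDs in sync with the map and pick a random index with rand.Intn instead of relying on map iteration order. Names and structure are illustrative, not the follow-up PR's actual code.

package main

import (
    "fmt"
    "math/rand"
)

type regionInfo struct{ id uint64 }

// regionsIndex keeps an ID slice alongside the map so a uniformly random
// region can be picked with rand.Intn. Go only promises that map iteration
// order is unspecified, not that it is uniformly random.
type regionsIndex struct {
    regions map[uint64]*regionInfo
    ids     []uint64
}

func newRegionsIndex() *regionsIndex {
    return &regionsIndex{regions: make(map[uint64]*regionInfo)}
}

// add inserts a region; removal (not shown) would also need to drop the ID
// from the slice, e.g. by swapping it with the last element.
func (r *regionsIndex) add(region *regionInfo) {
    if _, ok := r.regions[region.id]; !ok {
        r.ids = append(r.ids, region.id)
    }
    r.regions[region.id] = region
}

func (r *regionsIndex) randomRegion() *regionInfo {
    if len(r.ids) == 0 {
        return nil
    }
    return r.regions[r.ids[rand.Intn(len(r.ids))]]
}

func main() {
    idx := newRegionsIndex()
    for id := uint64(1); id <= 5; id++ {
        idx.add(&regionInfo{id: id})
    }
    fmt.Println(idx.randomRegion().id) // prints one of 1..5
}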

@siddontang
Contributor

any test?

@disksing disksing changed the title from "balance: some changes to make balance more reliable." to "[wip] balance: some changes to make balance more reliable." on Mar 8, 2017
@disksing
Contributor Author

disksing commented Mar 9, 2017

@siddontang I've added a test case for 2. 1 is covered by many existing cases. 3 basically won't happen, so it's not easy to test.

@disksing disksing changed the title from "[wip] balance: some changes to make balance more reliable." to "balance: some changes to make balance more reliable." on Mar 9, 2017
}
time.Sleep(time.Millisecond * 100)
}
}
Contributor

Do we need a panic here?

return nil, nil
}

region := cluster.randLeaderRegion(source.GetId())
if region == nil {
leaderStore := cluster.getStore(region.Leader.GetStoreId())
Contributor

@nolouch nolouch Mar 14, 2017

Random region selection gives a hot leader store less chance to transfer its leaders away. I see that maxScheduleRetries is 10; if there are more than 100,000 regions, that samples only 0.01% of them, and most scores are almost the same. Isn't it very likely that we get a nil operator and the schedule interval quickly increases to its maximum? I think we also need to adjust the schedule retries according to the total region count and the way the schedule interval grows, or maybe we can directly add a schedule for the store with the least leaders.

Contributor

It seems we should add a metric to count useless selections.
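A hedged sketch of such a metric using the Prometheus Go client (client_golang); the metric name, labels, and the place where it would be incremented are assumptions for illustration, not the counter actually added to PD.

package main

import (
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

// Illustrative counter for selections that yield no operator; the name and
// label are assumptions, not PD's actual metrics.
var uselessSelectCounter = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Namespace: "pd",
        Subsystem: "scheduler",
        Name:      "useless_select_total",
        Help:      "Random region selections that produced no operator.",
    },
    []string{"scheduler"},
)

func main() {
    prometheus.MustRegister(uselessSelectCounter)

    // Wherever a randomly selected region cannot produce an operator,
    // bump the counter so the useless-select rate becomes visible.
    uselessSelectCounter.WithLabelValues("balance-leader").Inc()

    http.Handle("/metrics", promhttp.Handler())
    _ = http.ListenAndServe(":8080", nil)
}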

Contributor

@nolouch nolouch Mar 14, 2017

I forgot that the score is the leader count on that store. Maybe we can test it.

Contributor Author

@nolouch Assume the peers are perfectly balanced and only leaders need to be transferred. The rate of successfully finding a region that needs to be balanced is mainly determined by the node count and by how many nodes are not well balanced.
Assume we have 100,000 regions and 10 nodes, so each node holds 10,000 leaders. When a region is randomly selected, the probability that its leader is on a specific node is 10%.
I have checked that it works well on a 10-node cluster; it may be inefficient if there are more nodes. I'll continue to do more tests :)
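As a rough back-of-the-envelope check of the argument above (a sketch that reuses the 10 retries from the earlier comment and assumes leaders are spread evenly):

package main

import (
    "fmt"
    "math"
)

// With leaders spread evenly over `nodes` stores, each random pick lands on
// a specific store with probability 1/nodes, so within `retries` picks the
// chance of hitting that store at least once is 1 - (1 - 1/nodes)^retries.
func main() {
    const retries = 10 // maxScheduleRetries mentioned in the review
    for _, nodes := range []int{10, 50, 100} {
        p := 1 - math.Pow(1-1.0/float64(nodes), retries)
        fmt.Printf("nodes=%d retries=%d hit probability=%.2f\n", nodes, retries, p)
    }
}

For 10 nodes this gives about 0.65; for 100 nodes it drops to roughly 0.10, consistent with the concern that the approach may become inefficient as the node count grows.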

Contributor

got it

Contributor

@nolouch nolouch Mar 14, 2017

So how about adding a schedule that takes the store with the least leader count as the target? That isn't affected by the node count.

@siddontang
Contributor

any update? @disksing

return nil, nil
var averageLeader float64
for _, s := range stores {
averageLeader += float64(cluster.getStoreLeaderCount(s.GetId())) / float64(len(stores))
Contributor

Use s.leaderCount() directly?

Contributor Author

In fact I intentionally avoided using s.leaderCount() here, because the leader count in the cache is more up to date.

Contributor

got it

return nil, nil
}

return region, region.GetStorePeer(target.GetId())
if leastLeaderStore == nil || math.Abs(mostLeaderStore.leaderScore()-averageLeader) > math.Abs(leastLeaderStore.leaderScore()-averageLeader) {
Contributor

If mostLeaderStore is nil, this will panic here.

Contributor Author

I have already checked whether both leastLeaderStore and mostLeaderStore are nil.

Contributor Author

You're right. I'll fix it.

if target == nil {
var mostLeaderDistance, leastLeaderDistance float64
if mostLeaderStore != nil {
mostLeaderDistance = math.Abs(mostLeaderStore.leaderScore() - averageLeader)
Contributor

Should we also use the cached value to get the score here?

Contributor Author

You are right, but we need to merge #575 before fixing it.
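For context, a self-contained sketch of the nil-guard pattern that the updated diff above starts to show; the storeInfo type, leaderScore, and the surrounding decision are simplified stand-ins, not the actual PD code.

package main

import (
    "fmt"
    "math"
)

// Hypothetical stand-in for the store type in the diff; only what the
// example needs.
type storeInfo struct{ leaderCount float64 }

func (s *storeInfo) leaderScore() float64 { return s.leaderCount }

// pickByDistance mirrors the nil-guarded pattern from the updated diff:
// compute each distance only when the corresponding store is non-nil, then
// compare the two distances, so a nil mostLeaderStore no longer panics.
func pickByDistance(mostLeaderStore, leastLeaderStore *storeInfo, averageLeader float64) string {
    var mostLeaderDistance, leastLeaderDistance float64
    if mostLeaderStore != nil {
        mostLeaderDistance = math.Abs(mostLeaderStore.leaderScore() - averageLeader)
    }
    if leastLeaderStore != nil {
        leastLeaderDistance = math.Abs(leastLeaderStore.leaderScore() - averageLeader)
    }
    if mostLeaderDistance == 0 && leastLeaderDistance == 0 {
        return "nothing to schedule"
    }
    if mostLeaderDistance > leastLeaderDistance {
        return "balance away from the most-loaded store"
    }
    return "balance toward the least-loaded store"
}

func main() {
    avg := 100.0
    fmt.Println(pickByDistance(&storeInfo{leaderCount: 130}, nil, avg)) // no panic with a nil store
    fmt.Println(pickByDistance(&storeInfo{leaderCount: 101}, &storeInfo{leaderCount: 60}, avg))
}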

@disksing
Contributor Author

PTAL @nolouch @siddontang

@nolouch
Contributor

nolouch commented Mar 24, 2017

LGTM

@siddontang
Contributor

PTAL @andelf

@disksing
Contributor Author

PTAL @siddontang

Contributor

@siddontang siddontang left a comment

LGTM

@disksing disksing merged commit 1ce6782 into master Mar 29, 2017
@disksing disksing deleted the disksing/balance branch March 29, 2017 02:23