Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

evict-leader-scheduler lost config after PD leader transfer #6897

Closed
kennytm opened this issue Aug 6, 2023 · 1 comment · Fixed by #7108
Closed

evict-leader-scheduler lost config after PD leader transfer #6897

kennytm opened this issue Aug 6, 2023 · 1 comment · Fixed by #7108
Labels
affects-5.3 affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. severity/major type/bug The issue is confirmed as a bug.

Comments

@kennytm
Copy link

kennytm commented Aug 6, 2023

Bug Report

What did you do?

$ tiup playground v7.1.1 --mode tikv-slim --pd 3 --kv 3 --without-monitor
$ # first check what stores are available
$ tiup ctl:v7.1.1 pd store --jq .stores[].store.id
1
3
2

$ # add the evict-leader-scheduler
$ tiup ctl:v7.1.1 pd scheduler add evict-leader-scheduler 1
$ tiup ctl:v7.1.1 pd scheduler add evict-leader-scheduler 2
$ tiup ctl:v7.1.1 pd scheduler add evict-leader-scheduler 3
$ tiup ctl:v7.1.1 pd scheduler config evict-leader-scheduler
{"store-id-ranges":{"1":[{"start-key":"","end-key":""}],"2":[{"start-key":"","end-key":""}],"3":[{"start-key":"","end-key":""}]}}

$ # transfer leader 
$ tiup ctl:v7.1.1 pd member leader show | jq .name
"pd-2"
$ tiup ctl:v7.1.1 pd member leader transfer pd-0

$ # wait until the schedulers are ready...
$ tiup ctl:v7.1.1 pd scheduler config evict-leader-scheduler
[404] scheduler not found
...
$ tiup ctl:v7.1.1 pd scheduler config evict-leader-scheduler
{"store-id-ranges":{"1":[{"start-key":"","end-key":""}],"2":[{"start-key":"","end-key":""}],"3":[{"start-key":"","end-key":""}]}}

$ # transfer leader again
$ tiup ctl:v7.1.1 pd member leader transfer pd-1
$ tiup ctl:v7.1.1 pd scheduler config evict-leader-scheduler
[404] scheduler not found
...
$ tiup ctl:v7.1.1 pd scheduler config evict-leader-scheduler # ?!
{"store-id-ranges":{"1":[{"start-key":"","end-key":""}]}}

What did you expect to see?

The evict-leader-scheduler retains the config showing all stores 1, 2, 3.

What did you see instead?

After the second transfer the scheduler can only see the first added store 1.

What version of PD are you using (pd-server -V)?

v6.5.3, v7.1.1

@kennytm kennytm added the type/bug The issue is confirmed as a bug. label Aug 6, 2023
@kennytm
Copy link
Author

kennytm commented Aug 6, 2023

After the first PD leader transfer (pd-2 → pd-0 in the example), (*coordinator).run() is executed for the first time, in particular executing these two lines:

s, err := schedule.CreateScheduler(cfg.Type, c.opController, c.cluster.storage, schedule.ConfigJSONDecoder([]byte(data)))

s, err := schedule.CreateScheduler(schedulerCfg.Type, c.opController, c.cluster.storage, schedule.ConfigSliceDecoder(schedulerCfg.Type, schedulerCfg.Args))

The first (line 376) is the new way to create scheduler with the independent configuration. It has correctly added the scheduler with config store = 1,2,3

The second (line 397) is the old way to create the scheduler, which computes the config from the initial args, which is just [1] (stores 2,3 are included into the scheduler through UpdateConfig).

The second "old way" CreateScheduler call persisted the incomplete config, which is why after the second PD leader transfer (pd-0 → pd-1) we are left with only 1 store for evict-leader-scheduler.

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Sep 27, 2023
@HuSharp HuSharp added affects-5.3 affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. labels Sep 27, 2023
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Sep 27, 2023
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Sep 27, 2023
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue Sep 27, 2023
ti-chi-bot bot pushed a commit that referenced this issue Sep 28, 2023
close #6897

Signed-off-by: husharp <[email protected]>

Co-authored-by: husharp <[email protected]>
Co-authored-by: Hu# <[email protected]>
ti-chi-bot bot added a commit that referenced this issue Oct 7, 2023
close #6897

Signed-off-by: husharp <[email protected]>

Co-authored-by: husharp <[email protected]>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ti-chi-bot bot pushed a commit that referenced this issue Oct 11, 2023
rleungx pushed a commit to rleungx/pd that referenced this issue Dec 1, 2023
close tikv#6897

Signed-off-by: husharp <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
ti-chi-bot bot pushed a commit that referenced this issue Sep 14, 2024
close #6897

Signed-off-by: husharp <[email protected]>

Co-authored-by: husharp <[email protected]>
Co-authored-by: Hu# <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-5.3 affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. severity/major type/bug The issue is confirmed as a bug.
Projects
Development

Successfully merging a pull request may close this issue.

4 participants