Over-replicated ranges after zone config change #18911

Closed
jseldess opened this issue Sep 29, 2017 · 3 comments

@jseldess
Contributor

I'm running a 5-node cluster on Digital Ocean. The version running on each node is:

root@docs-1:~# cockroach version
Build Tag:    v1.1-beta.20170921
Build Time:   2017/09/21 15:48:53
Distribution: CCL
Platform:     linux amd64
Go Version:   go1.8.3
C Compiler:   gcc 6.3.0
Build SHA-1:  be60cc5a4ccfb6a597a47f5eb9d0a88a0e1b1289
Build Type:   release-gnu

I was running tpcc against the cluster for a while. Then I created a zone config to replicate the tpcc.customer table 5 times instead of 3:

echo 'num_replicas: 5' | cockroach zone set tpcc.customer --insecure --host=67.205.145.32 -f -

After a few minutes, I checked to make sure the table was, in fact, replicated 5 times:

[email protected]:26257/> show testing_ranges from table tpcc.customer;
+-----------+---------+-------------+--------------+
| Start Key | End Key |  Replicas   | Lease Holder |
+-----------+---------+-------------+--------------+
| NULL      | NULL    | {1,2,3,4,5} |            3 |
+-----------+---------+-------------+--------------+
(1 row)

Time: 938.839773ms

A little while later, I changed the table's replication factor back to 3:

echo 'num_replicas: 3' | cockroach zone set tpcc.customer --insecure --host=67.205.145.32 -f -

I confirmed that the zone config was updated:

root@docs-1:~# cockroach zone get tpcc.customer --insecure --host=67.205.145.32
# Server version: CockroachDB CCL v1.1-beta.20170921 (linux amd64, built 2017/09/21 15:48:53, go1.8.3) (same version as client)
# Cluster ID: c3779182-655d-4c0e-955b-5b4e814e47b4
tpcc.customer
range_min_bytes: 1048576
range_max_bytes: 67108864
gc:
  ttlseconds: 90000
num_replicas: 3
constraints: []
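
For anyone scripting this check, the expected replica count can be pulled out of the `cockroach zone get` output. A minimal Python sketch, assuming the output format shown above; `parse_zone_config` and `ZONE_OUTPUT` are names made up for illustration and are not part of any CockroachDB tooling:

```python
def parse_zone_config(text):
    """Extract top-level integer fields from `cockroach zone get` output.

    Hypothetical helper: assumes the plain "key: value" layout shown in
    this issue, not a general YAML document.
    """
    fields = {}
    for line in text.splitlines():
        # Skip blank lines, "#" comment lines, and indented (nested) keys.
        if not line or line.startswith("#") or line[0].isspace():
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        # Keep only "key: <integer>" pairs; the zone name line has no colon.
        if sep and value.lstrip("-").isdigit():
            fields[key.strip()] = int(value)
    return fields


# Output copied from the `cockroach zone get` call above.
ZONE_OUTPUT = """\
# Server version: CockroachDB CCL v1.1-beta.20170921 (linux amd64, built 2017/09/21 15:48:53, go1.8.3) (same version as client)
# Cluster ID: c3779182-655d-4c0e-955b-5b4e814e47b4
tpcc.customer
range_min_bytes: 1048576
range_max_bytes: 67108864
gc:
  ttlseconds: 90000
num_replicas: 3
constraints: []
"""

if __name__ == "__main__":
    print(parse_zone_config(ZONE_OUTPUT)["num_replicas"])
```

Nested keys such as `ttlseconds` are deliberately ignored, since only the top-level `num_replicas` matters for this comparison.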

But after another 20 min or so, the table still has 5 replicas:

[email protected]:26257/> show testing_ranges from table tpcc.customer;
+-----------+---------+-------------+--------------+
| Start Key | End Key |  Replicas   | Lease Holder |
+-----------+---------+-------------+--------------+
| NULL      | NULL    | {1,2,3,4,5} |            3 |
+-----------+---------+-------------+--------------+
(1 row)

Time: 1.008673392s
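
The mismatch can also be detected mechanically by counting the entries in the Replicas column and comparing against the configured `num_replicas`. A rough Python sketch, assuming the table layout printed above; `replica_sets` and `over_replicated` are hypothetical helper names:

```python
import re


def replica_sets(table_text):
    """Return the list of replica IDs for each row of the printed table.

    Assumes the Replicas column is rendered as "{1,2,3}" as in this issue.
    """
    sets = []
    for line in table_text.splitlines():
        match = re.search(r"\{([\d,\s]*)\}", line)
        if match:
            ids = [int(x) for x in match.group(1).split(",") if x.strip()]
            sets.append(ids)
    return sets


def over_replicated(table_text, num_replicas):
    """Return the replica sets that have more members than num_replicas."""
    return [ids for ids in replica_sets(table_text) if len(ids) > num_replicas]


# Table copied from the second `show testing_ranges` call above.
RANGES_OUTPUT = """\
+-----------+---------+-------------+--------------+
| Start Key | End Key |  Replicas   | Lease Holder |
+-----------+---------+-------------+--------------+
| NULL      | NULL    | {1,2,3,4,5} |            3 |
+-----------+---------+-------------+--------------+
(1 row)
"""

if __name__ == "__main__":
    # With num_replicas set to 3, the single range is flagged.
    print(over_replicated(RANGES_OUTPUT, 3))
```

Against the zone config of 3, the one range in the table is reported as over-replicated; against the earlier setting of 5, nothing is flagged.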

jseldess added the C-bug label (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.) on Sep 29, 2017
@a-robinson
Contributor

@BramGruneir you mentioned earlier that this was due to replicas not being GC'ed after being removed from the raft group (rather than being due to not removing any of the replicas from the raft group). How did you determine that from this info? Were you inspecting the cluster directly? And if so, is it perhaps still up to take a look at?

@tbg
Member

tbg commented Apr 19, 2018

We should bake this specific type of operation into a replication-minded roachtest to see if this is a reproducible issue.
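
One possible shape for such a check, sketched in Python rather than against the real roachtest API (which is not shown in this thread): poll the per-range replica counts after the zone config change and report failure if they do not converge before a deadline. `wait_for_replica_count` and the `get_replica_counts` callable are hypothetical; in a real test the callable would query the cluster.

```python
import time


def wait_for_replica_count(get_replica_counts, want, timeout_s,
                           poll_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until every range reports exactly `want` replicas.

    get_replica_counts: hypothetical callable returning a list with one
        replica count per range (for example, parsed from a ranges query).
    Returns True if all ranges converge before the timeout, else False.
    """
    deadline = clock() + timeout_s
    while True:
        counts = get_replica_counts()
        if counts and all(c == want for c in counts):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # Simulated cluster that down-replicates from 5 to 3 over a few polls.
    observed = iter([[5], [5], [4], [3]])
    ok = wait_for_replica_count(lambda: next(observed), want=3,
                                timeout_s=60, sleep=lambda s: None)
    print("converged" if ok else "still over-replicated")
```

The `sleep` and `clock` parameters are injectable so the loop can be exercised without real waiting; the scenario reported in this issue corresponds to the callable returning 5 on every poll until the timeout expires.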

tbg removed the C-bug label (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.) on Apr 19, 2018
tbg added the A-kv-distribution label (Relating to rebalancing and leasing.) on May 15, 2018
tbg added the C-cleanup label (Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.) on Jul 22, 2018
@tbg
Member

tbg commented Jul 22, 2018

Closing for #17000.

@tbg tbg closed this as completed Jul 22, 2018