docs: rework section on node storage #9652
Conversation
- cpu/network load per store
- ranges that are used together often in queries
- number of active ranges per store
Add number of range leases held per store.
done
- cpu/network load per store
- ranges that are used together often in queries
- number of active ranges per store
- ranges that already have some overlap (to decrease the chance of data loss)
Can you elaborate on this? I'm not sure what it means so it probably needs to be rephrased.
A less strict way of doing copysets.
I just removed it. It's theoretical and hard to sum up quickly. If we add it, we can update the doc.
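For illustration, here is a minimal Go sketch of how the per-store factors quoted above (CPU/network load, active ranges, range leases held) might be folded into a single rebalance score. The `StoreStats` type, field names, and weights are hypothetical placeholders, not the actual allocator code.

```go
// Illustrative sketch only: combines the per-store factors listed in the doc
// into one number. Names and weights are hypothetical, not CockroachDB's code.
package main

import "fmt"

// StoreStats is a hypothetical summary of one store's load.
type StoreStats struct {
	CPUUtilization float64 // fraction of CPU busy, 0.0-1.0
	NetworkMBps    float64 // recent network throughput
	ActiveRanges   int     // ranges with recent traffic
	LeasesHeld     int     // range leases held by this store
}

// rebalanceScore returns a higher value for stores that are better
// candidates to receive a new replica (i.e. less loaded stores).
func rebalanceScore(s StoreStats) float64 {
	return -(2.0*s.CPUUtilization +
		0.01*s.NetworkMBps +
		0.05*float64(s.ActiveRanges) +
		0.05*float64(s.LeasesHeld))
}

func main() {
	idle := StoreStats{CPUUtilization: 0.10, NetworkMBps: 5, ActiveRanges: 20, LeasesHeld: 8}
	busy := StoreStats{CPUUtilization: 0.85, NetworkMBps: 90, ActiveRanges: 400, LeasesHeld: 150}
	fmt.Printf("idle store score: %.2f\n", rebalanceScore(idle))
	fmt.Printf("busy store score: %.2f\n", rebalanceScore(busy))
}
```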
[here](http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/).

# Stores and Storage

Nodes contain one ore more stores. And each of those stores should be on a
s/ ore / or /g
done
# Stores and Storage

Nodes contain one ore more stores. And each of those stores should be on a
unique disk and contains an instance of RocksDB. And these stores in turn have
The second sentence is slightly odd. We create the RocksDB instances. Perhaps: Each store should be placed on a unique disk. Internally, each store contains a single instance of RocksDB with a block cache shared amongst all of the stores in a node.
much better, done
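For illustration, a minimal Go sketch of the relationship the suggested wording describes: a node owning several stores, each on its own disk path, all wired to one shared block cache. The `Node`, `Store`, and `BlockCache` types are hypothetical stand-ins, not the real storage package or RocksDB bindings.

```go
// Illustrative sketch only: one node, several stores, one shared block cache.
package main

import "fmt"

// BlockCache stands in for the RocksDB block cache shared across stores.
type BlockCache struct {
	SizeBytes int64
}

// Store stands in for a single on-disk RocksDB instance.
type Store struct {
	Path  string      // each store should live on its own disk
	Cache *BlockCache // shared with every other store on the node
}

// Node owns one or more stores.
type Node struct {
	Stores []*Store
}

// newNode opens one store per path, wiring all of them to the same cache.
func newNode(cacheBytes int64, paths ...string) *Node {
	cache := &BlockCache{SizeBytes: cacheBytes}
	n := &Node{}
	for _, p := range paths {
		n.Stores = append(n.Stores, &Store{Path: p, Cache: cache})
	}
	return n
}

func main() {
	n := newNode(1<<30, "/mnt/disk1", "/mnt/disk2", "/mnt/disk3")
	fmt.Printf("node has %d stores sharing one %d-byte block cache\n",
		len(n.Stores), n.Stores[0].Cache.SizeBytes)
}
```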
Nodes contain one ore more stores. And each of those stores should be on a
unique disk and contains an instance of RocksDB. And these stores in turn have
a collection of range replicas. More than one replica for a range will never
occur on the same store or even the same node.
s/occur/be placed/g
done
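For illustration, the placement rule quoted above, written as a small Go check: a candidate store is rejected if it shares a store or node with an existing replica of the same range. `ReplicaLocation` and `canPlace` are hypothetical names for this sketch.

```go
// Illustrative sketch only: no two replicas of a range on the same store or node.
package main

import "fmt"

// ReplicaLocation identifies where one replica of a range lives.
type ReplicaLocation struct {
	NodeID  int
	StoreID int
}

// canPlace reports whether a new replica may be placed on the candidate
// store without colliding with an existing replica's store or node.
func canPlace(existing []ReplicaLocation, candidate ReplicaLocation) bool {
	for _, r := range existing {
		if r.StoreID == candidate.StoreID || r.NodeID == candidate.NodeID {
			return false
		}
	}
	return true
}

func main() {
	replicas := []ReplicaLocation{{NodeID: 1, StoreID: 1}, {NodeID: 2, StoreID: 4}}
	fmt.Println(canPlace(replicas, ReplicaLocation{NodeID: 2, StoreID: 5})) // false: same node
	fmt.Println(canPlace(replicas, ReplicaLocation{NodeID: 3, StoreID: 7})) // true
}
```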
Early on, when a cluster is first initialized, the few default starting ranges
will only have a single replica, but as soon as other nodes are available they
will replicate to them until they've reached their desired replication factor,
the default being 3. Ranges can have different replication factors and when set
Mention zone configs?
Yeah, I've re-written the section to include them.
the default being 3. Ranges can have different replication factors and when set
they will up or down replicate to the appropriate number of replicas. Since
ranges are only created via splits, all replicas for a range split at the same
time so the replication factors rarely need to be adjusted.
I'm not sure how the first part of the sentence connects to the second. Why does the splitting of a range to create new ranges affect the desire to change replication factors?
removed
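For illustration, a minimal Go sketch of the up/down-replication decision described in the quoted text: compare a range's current replica count against the replication factor set by its zone config (default 3). `ZoneConfig` and `replicationDelta` are hypothetical names chosen for this sketch.

```go
// Illustrative sketch only: how many replicas to add or remove to reach the
// zone config's target replication factor.
package main

import "fmt"

// ZoneConfig carries the desired replication factor for a range
// (default 3, per the doc text above).
type ZoneConfig struct {
	NumReplicas int
}

// replicationDelta returns how many replicas to add (positive) or
// remove (negative) to reach the zone's target.
func replicationDelta(zone ZoneConfig, currentReplicas int) int {
	return zone.NumReplicas - currentReplicas
}

func main() {
	zone := ZoneConfig{NumReplicas: 3}
	fmt.Println(replicationDelta(zone, 1)) // +2: up-replicate a freshly initialized range
	fmt.Println(replicationDelta(zone, 5)) // -2: down-replicate after the factor was lowered
	fmt.Println(replicationDelta(zone, 3)) //  0: already at the desired factor
}
```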
To combat this and to spread the overall load across the full cluster, replicas
will be moved between stores maintaining the desired replication factor. The
heuristics used to perform this rebalancing are in flux, but they take into
account a number of different factors:
I'd leave out the in flux part and just describe the currently used heuristics and the load-based ones we're considering for the future.
done
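As a companion to the load-based score sketched earlier, here is a Go sketch of a simple count-based rebalancing pass in the spirit of the "currently used heuristics" the reviewer asked to describe: stores well above the mean range count shed a replica to stores well below it. The threshold, function names, and map layout are assumptions made for this sketch.

```go
// Illustrative sketch only: pair overfull stores with underfull stores based
// on range counts relative to the cluster mean.
package main

import "fmt"

// rebalanceTargets returns, for each overfull store, an underfull store to
// move one replica to. threshold is the allowed fractional deviation from
// the mean before a store is considered over- or underfull.
func rebalanceTargets(rangeCounts map[string]int, threshold float64) map[string]string {
	total := 0
	for _, c := range rangeCounts {
		total += c
	}
	mean := float64(total) / float64(len(rangeCounts))

	var overfull, underfull []string
	for store, c := range rangeCounts {
		switch {
		case float64(c) > mean*(1+threshold):
			overfull = append(overfull, store)
		case float64(c) < mean*(1-threshold):
			underfull = append(underfull, store)
		}
	}

	// Pair each overfull store with one underfull store.
	moves := make(map[string]string)
	for i, src := range overfull {
		if i >= len(underfull) {
			break
		}
		moves[src] = underfull[i]
	}
	return moves
}

func main() {
	counts := map[string]int{"s1": 120, "s2": 95, "s3": 40, "s4": 105}
	fmt.Println(rebalanceTargets(counts, 0.10)) // e.g. map[s1:s3]
}
```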
This adds a little bit on repair and rebalancing as well. Part of cockroachdb#9634.
Made the requested changes.
LGTM