Faster shard loading #5372
@dswarbrick @toddboom Would be interested to see if this improves startup times of some larger DBs.
@jwilder tested this on a smaller database this morning with inconclusive results (there's one large shard that takes all of the time, so parallelization didn't matter), but will test it on the larger datasets this afternoon.
This needs to be rebased FYI.
Yes. I may need to revert some changes in here as well.
0c8716c to cacb6fe
cacb6fe to 779d5ea
@@ -203,23 +209,35 @@ func (f *FileStore) Remove(paths ...string) {
	sort.Sort(tsmReaders(f.files))
}

func (f *FileStore) Keys() []string {
func (f *FileStore) WalkKeys(fn func(key string, typ byte) error) error {
Exported method needs a comment.
LGTM
	values = getFloat64Values(nvals)
	for i := 0; i < nvals; i++ {
		values[i] = &FloatValue{}
	}
case integerEntryType:
	values = getIntegerValues(nvals)
Delete line 571?
fixed.
When many shards load concurrently, they block trying to acquire a write lock in the sync pool, which adds a new source of contention. Since this code flow always needs to allocate a buffer, the pool is not really buying us much.
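To make the trade-off concrete, here is a minimal sketch, not the code from this PR. `bufPool`, `copyPooled`, and `copyDirect` are hypothetical names. It assumes a flow where the result must outlive the call, so the pooled variant still allocates on every call and only adds a trip through the shared pool.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool is a hypothetical shared scratch-buffer pool (not from the PR).
var bufPool = sync.Pool{
	New: func() interface{} { return make([]byte, 0, 4096) },
}

// copyPooled borrows a scratch buffer from the shared pool, but it still has
// to allocate the slice it returns, because the scratch buffer goes back to
// the pool before the caller sees the result.
func copyPooled(src []byte) []byte {
	buf := bufPool.Get().([]byte)
	buf = append(buf[:0], src...)
	out := make([]byte, len(buf)) // allocation happens regardless of the pool
	copy(out, buf)
	bufPool.Put(buf)
	return out
}

// copyDirect allocates exactly what it needs and touches no shared state.
func copyDirect(src []byte) []byte {
	out := make([]byte, len(src))
	copy(out, src)
	return out
}

func main() {
	src := []byte("shard")
	fmt.Println(string(copyPooled(src)), string(copyDirect(src)))
}
```

Both functions return the same bytes; the direct version simply skips the shared pool, which is the kind of change the commit message describes.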
Since loading a shard can allocate a lot of memory, loading them all at once could OOM the process. This limits the number of shards loaded concurrently to 4. This will be changed to a config option provided the approach helps.
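One common way to cap concurrent loads is a buffered channel used as a counting semaphore. The sketch below illustrates that technique under assumptions: `loadAll`, `measure`, and the fake shard paths are hypothetical and may not match how the PR implements the limit.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// loadAll calls load once per path, with at most limit calls in flight.
// It returns the first error reported by any load, or nil.
func loadAll(paths []string, limit int, load func(path string) error) error {
	sem := make(chan struct{}, limit) // counting semaphore
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, p := range paths {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit loads are already running
		go func(p string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := load(p); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return firstErr
}

// measure runs n fake loads through loadAll and reports how many ran and the
// highest number that were in flight at the same time.
func measure(n, limit int) (calls, peak int) {
	var mu sync.Mutex
	inFlight := 0
	paths := make([]string, n)
	for i := range paths {
		paths[i] = fmt.Sprintf("shard-%d", i) // hypothetical shard paths
	}
	_ = loadAll(paths, limit, func(string) error {
		mu.Lock()
		inFlight++
		calls++
		if inFlight > peak {
			peak = inFlight
		}
		mu.Unlock()
		time.Sleep(time.Millisecond) // stand-in for real loading work
		mu.Lock()
		inFlight--
		mu.Unlock()
		return nil
	})
	return calls, peak
}

func main() {
	calls, peak := measure(20, 4)
	fmt.Println(calls, peak <= 4)
}
```

Turning the hard-coded 4 into a config option would only mean passing a different `limit`.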
Avoids allocating a big map of all keys.
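The callback style can be sketched as follows. The `walkKeys` signature mirrors the `WalkKeys` signature in the diff, but the `index` type, its fields, and `errStop` are hypothetical stand-ins rather than the real `FileStore`.

```go
package main

import (
	"errors"
	"fmt"
)

// index is a hypothetical stand-in for a file index: keys and their block
// types stored side by side.
type index struct {
	keys  []string
	types []byte
}

// errStop is a hypothetical sentinel a callback can return to end the walk.
var errStop = errors.New("stop walking")

// walkKeys calls fn once per key, in order, and stops at the first error.
// No intermediate slice or map of keys is built for the caller.
func (ix *index) walkKeys(fn func(key string, typ byte) error) error {
	for i, k := range ix.keys {
		if err := fn(k, ix.types[i]); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	ix := &index{keys: []string{"cpu", "disk", "mem"}, types: []byte{0, 1, 0}}
	_ = ix.walkKeys(func(key string, typ byte) error {
		fmt.Println(key, typ)
		return nil
	})
}
```

A caller that needs only a count, or only the first match, never pays for a collection holding every key.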
This PR improves startup times for databases with many shards. The main changes are:
Should help #5311.