[Remote Store] Parallel segment downloads in IndexShard #8236

Open
shourya035 opened this issue Jun 23, 2023 · 2 comments
Labels
enhancement Enhancement or improvement to existing feature or request Storage:Durability Issues and PRs related to the durability framework Storage Issues and PRs relating to data and metadata storage

Comments

@shourya035
Member

shourya035 commented Jun 23, 2023

Is your feature request related to a problem? Please describe.
Today, segment files are copied from the remote directory by calling Directory#copyFrom synchronously, one file at a time, inside a simple for loop. Because each copy must finish before the next one starts, this can slow down the shard recovery process.

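for (String file : uploadedSegments.keySet()) {
    long checksum = Long.parseLong(uploadedSegments.get(file).getChecksum());
    // Copy the file unless it is already present locally (checked against the checksum).
    if (overrideLocal || localDirectoryContains(storeDirectory, file, checksum) == false) {
        if (localSegmentFiles.contains(file)) {
            // Remove the existing local copy before copying it again.
            storeDirectory.deleteFile(file);
        }
        // Synchronous copy: the loop blocks here until this file is done.
        storeDirectory.copyFrom(remoteDirectory, file, file, IOContext.DEFAULT);
        downloadedSegments.add(file);
    } else {
        skippedSegments.add(file);
    }
}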

We should look into running these copyFrom invocations in parallel to speed up segment downloads during shard recovery.

Describe the solution you'd like
Use a new thread pool called remote_download to download these segment files in parallel.
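
To make the idea concrete, here is a minimal, self-contained sketch of how the per-file copies could be fanned out to an executor. It is not the OpenSearch implementation. The class ParallelSegmentDownload, the SegmentCopier interface, and the downloadAll method are hypothetical names, and a plain fixed-size ExecutorService stands in for the proposed remote_download thread pool. SegmentCopier is a placeholder for the storeDirectory.copyFrom(remoteDirectory, file, file, IOContext.DEFAULT) call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ParallelSegmentDownload {

    /** Hypothetical stand-in for the per-file Directory#copyFrom call. */
    @FunctionalInterface
    interface SegmentCopier {
        void copy(String file) throws Exception;
    }

    /**
     * Submits one copy task per file to the executor and blocks until all
     * tasks have completed. Throws CompletionException if any copy failed.
     */
    static void downloadAll(List<String> files, SegmentCopier copier, ExecutorService executor) {
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (String file : files) {
            futures.add(CompletableFuture.runAsync(() -> {
                try {
                    copier.copy(file);
                } catch (Exception e) {
                    // Wrap checked exceptions so the failure surfaces from join().
                    throw new RuntimeException("Failed to download " + file, e);
                }
            }, executor));
        }
        // Wait for every copy to finish before returning to the caller.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }

    public static void main(String[] args) {
        // Assumption: a fixed pool of 4 threads in place of "remote_download".
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            // Thread-safe set, since copies complete on different threads.
            Set<String> downloadedSegments = ConcurrentHashMap.newKeySet();
            downloadAll(List.of("_0.cfs", "_0.si", "segments_1"), downloadedSegments::add, executor);
            System.out.println("Downloaded " + downloadedSegments.size() + " files");
        } finally {
            executor.shutdown();
        }
    }
}
```

Two points that a real change would need to handle are visible even in this sketch: collections such as downloadedSegments are written from several threads and so need to be thread-safe, and recovery must not proceed until every copy has completed or one has failed.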

@shourya035 shourya035 added enhancement Enhancement or improvement to existing feature or request untriaged labels Jun 23, 2023
@Xtansia Xtansia added the Storage:Durability Issues and PRs related to the durability framework label Jun 23, 2023
@anasalkouz
Member

Seems related to #8187

@kotwanikunal
Member

> Seems related to #8187

It's actually related to #8596

@Bukhtawar Bukhtawar added the Storage Issues and PRs relating to data and metadata storage label Jul 27, 2023