Parallelize closing of files on write
This change parallelizes the closing of files on writes. It fixes a
performance problem seen with S3 and other object stores, where
multi-part writes are buffered. If the user's data is smaller than the
buffer size, no I/O occurs until close, when the buffers are flushed.
Closing was therefore much slower than expected, because up to three
files per field had to be uploaded serially.
Shelnutt2 committed Jan 27, 2021
1 parent 73c7cc4 commit 97b6fc6
Showing 2 changed files with 22 additions and 5 deletions.
2 changes: 2 additions & 0 deletions HISTORY.md
@@ -10,6 +10,8 @@

 ## Improvements

+* Parallelize across attributes when closing a write [#2048](https://github.com/TileDB-Inc/TileDB/pull/2048)
+
 ## Deprecations

 ## Bug fixes
25 changes: 20 additions & 5 deletions tiledb/sm/query/writer.cc
@@ -1308,15 +1308,30 @@ void Writer::clear_coord_buffers() {

 Status Writer::close_files(FragmentMetadata* meta) const {
   // Close attribute and dimension files
-  for (const auto& it : buffers_) {
-    const auto& name = it.first;
-    RETURN_NOT_OK(storage_manager_->close_file(meta->uri(name)));
+  const auto buffer_name = buffer_names();
+
+  std::vector<URI> file_uris;
+  file_uris.reserve(buffer_name.size() * 3);
+
+  for (const auto& name : buffer_name) {
+    file_uris.emplace_back(meta->uri(name));
     if (array_schema_->var_size(name))
-      RETURN_NOT_OK(storage_manager_->close_file(meta->var_uri(name)));
+      file_uris.emplace_back(meta->var_uri(name));
     if (array_schema_->is_nullable(name))
-      RETURN_NOT_OK(storage_manager_->close_file(meta->validity_uri(name)));
+      file_uris.emplace_back(meta->validity_uri(name));
   }

+  auto statuses = parallel_for(
+      storage_manager_->io_tp(), 0, file_uris.size(), [&](uint64_t i) {
+        const auto& file_ur = file_uris[i];
+        RETURN_NOT_OK(storage_manager_->close_file(file_ur));
+        return Status::Ok();
+      });
+
+  // Check all statuses
+  for (auto& st : statuses)
+    RETURN_NOT_OK(st);
+
   return Status::Ok();
 }
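
The two-phase shape of the new `close_files` (first gather every file to close, then close them concurrently and check each result) can be sketched outside of TileDB. The following is only an illustration under stated assumptions: `Field`, `Status`, `close_file`, `collect_file_uris`, `close_files_parallel`, and the file-name suffixes are hypothetical stand-ins, and `std::async` is swapped in for TileDB's `parallel_for` over the I/O thread pool, so it starts one task per file instead of using a bounded pool.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for an attribute or dimension being written.
struct Field {
  std::string name;
  bool var_sized;
  bool nullable;
};

// Hypothetical stand-in for a status object: an empty string means success.
using Status = std::string;

// Phase 1: list every file that must be closed. Each field has one file,
// plus one more if it is var-sized and one more if it is nullable, so at
// most three per field. The suffixes here are illustrative only.
std::vector<std::string> collect_file_uris(const std::vector<Field>& fields) {
  std::vector<std::string> uris;
  uris.reserve(fields.size() * 3);
  for (const auto& f : fields) {
    uris.push_back(f.name + ".tdb");
    if (f.var_sized)
      uris.push_back(f.name + "_var.tdb");
    if (f.nullable)
      uris.push_back(f.name + "_validity.tdb");
  }
  return uris;
}

// Placeholder for a blocking close that would flush buffered data to the
// object store. Here it only rejects an empty URI.
Status close_file(const std::string& uri) {
  return uri.empty() ? Status("empty URI") : Status();
}

// Phase 2: start one asynchronous close per file, wait for all of them,
// and return the first error encountered (or an empty Status on success).
Status close_files_parallel(const std::vector<std::string>& uris) {
  std::vector<std::future<Status>> tasks;
  tasks.reserve(uris.size());
  for (const auto& uri : uris)
    tasks.push_back(std::async(std::launch::async, close_file, uri));

  Status first_error;
  for (auto& task : tasks) {
    Status st = task.get();
    if (first_error.empty() && !st.empty())
      first_error = st;
  }
  return first_error;
}
```

Calling `close_files_parallel(collect_file_uris(fields))` waits for every close to finish before reporting a failure, which mirrors how the commit checks all statuses only after `parallel_for` returns. When each close is dominated by an upload, the total wait is then closer to the slowest single upload than to the sum of all uploads.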
