The lock in BlobStore::removePosFromStats introducing performance jitter #8650

Closed
JaySon-Huang opened this issue Jan 3, 2024 · 0 comments · Fixed by #8651
Assignees
Labels
affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. component/storage severity/major type/bug The issue is confirmed as a bug.

Comments

@JaySon-Huang
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

When all page data in a BlobFile has been removed by the GC thread, removePosFromStats removes the BlobFile from disk. However, it holds the lock mtx_blob_files while removing the file from disk. If BlobStore::write happens to write page data to a new BlobFile at the same time, BlobStore::getBlobFile is blocked by the GC thread until the removal finishes.

According to local tests, removing a BlobFile takes tens of milliseconds, so the blocked write introduces performance jitter.
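
The stall described above can be reproduced in isolation. The following is a minimal sketch, not TiFlash code: `measureWriterStallMs` and `slow_remove` are hypothetical names, the 50 ms sleep is an assumed stand-in for the disk removal, and the plain `std::mutex` plays the role of mtx_blob_files.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Returns how long (in milliseconds) a writer waits for the mutex while a
// GC thread performs a simulated slow file removal.
// Hypothetical helper; the 50 ms sleep stands in for removing a BlobFile.
long long measureWriterStallMs(bool remove_under_lock)
{
    using Clock = std::chrono::steady_clock;
    const auto slow_remove = [] { std::this_thread::sleep_for(std::chrono::milliseconds(50)); };

    std::mutex mtx_blob_files;
    std::promise<void> gc_started;
    std::future<void> gc_started_future = gc_started.get_future();

    std::thread gc([&] {
        if (remove_under_lock)
        {
            std::lock_guard<std::mutex> guard(mtx_blob_files);
            gc_started.set_value();
            slow_remove(); // simulated disk I/O while the lock is held
        }
        else
        {
            {
                std::lock_guard<std::mutex> guard(mtx_blob_files);
                // only the in-memory map update would happen here
            }
            gc_started.set_value();
            slow_remove(); // simulated disk I/O after the lock is released
        }
    });

    gc_started_future.get(); // wait until the GC thread has reached its removal step
    const auto begin = Clock::now();
    {
        // This is the acquisition a getBlobFile-like call would perform.
        std::lock_guard<std::mutex> guard(mtx_blob_files);
    }
    const auto waited = std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - begin);

    gc.join();
    return waited.count();
}
```

On an otherwise idle machine, `measureWriterStallMs(true)` should report a wait close to the simulated removal time, while `measureWriterStallMs(false)` should report a wait close to zero.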

{
    // Remove the blob file from disk and memory
    std::lock_guard files_gurad(mtx_blob_files);
    if (auto iter = blob_files.find(blob_id); iter != blob_files.end())
    {
        auto blob_file = iter->second;
        blob_file->remove();
        blob_files.erase(iter);
    }
    // If the blob_id does not exist, the blob_file is never
    // opened for read/write. It is safe to ignore it.
}

SCOPE_EXIT({
    GET_METRIC(tiflash_storage_page_write_duration_seconds, type_blob_write).Observe(watch.elapsedSeconds());
});
auto blob_file = getBlobFile(blob_id);
blob_file->write(buffer, offset_in_file, all_page_data_size, write_limiter);

template <typename Trait>
BlobFilePtr BlobStore<Trait>::getBlobFile(BlobFileId blob_id)
{
    std::lock_guard files_gurad(mtx_blob_files);
    if (auto iter = blob_files.find(blob_id); iter != blob_files.end())
        return iter->second;
    auto file = std::make_shared<BlobFile>(getBlobFileParentPath(blob_id), blob_id, file_provider, delegator);
    blob_files.emplace(blob_id, file);
    return file;
}
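
One possible way to shorten the critical section in the removal block of removePosFromStats is to detach the entry from the map while holding the lock and remove the file from disk only after the lock is released. The sketch below illustrates that idea under assumptions; it is not necessarily what #8651 does. `FakeBlobFile`, `FakeBlobStore`, and `removeBlobFile` are hypothetical stand-ins, and `remove()` only sets a flag instead of touching the disk.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical stand-in for BlobFile; remove() represents the slow disk removal.
struct FakeBlobFile
{
    bool removed = false;
    void remove() { removed = true; }
};
using FakeBlobFilePtr = std::shared_ptr<FakeBlobFile>;

// Hypothetical stand-in for the parts of BlobStore that matter here.
struct FakeBlobStore
{
    std::mutex mtx_blob_files;
    std::map<int, FakeBlobFilePtr> blob_files;

    // Detach the entry under the lock, then run the slow removal without it.
    // Returns the removed file, or nullptr if blob_id was never opened.
    FakeBlobFilePtr removeBlobFile(int blob_id)
    {
        FakeBlobFilePtr blob_file;
        {
            std::lock_guard<std::mutex> files_guard(mtx_blob_files);
            auto iter = blob_files.find(blob_id);
            if (iter != blob_files.end())
            {
                blob_file = std::move(iter->second);
                blob_files.erase(iter);
            }
        } // lock released here, before any disk I/O

        if (blob_file)
            blob_file->remove();
        return blob_file;
    }
};
```

This ordering is only safe if no concurrent getBlobFile call can recreate the same blob_id between the erase and the removal from disk. The sketch assumes a blob_id is not reused once its stats have been removed, which would need to be checked against the real code.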

4. What is your TiFlash version? (Required)

master

@JaySon-Huang JaySon-Huang added the type/bug The issue is confirmed as a bug. label Jan 3, 2024
@JaySon-Huang JaySon-Huang self-assigned this Jan 3, 2024
@JaySon-Huang JaySon-Huang added severity/major component/storage affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. labels Jan 3, 2024
@ti-chi-bot ti-chi-bot bot closed this as completed in #8651 Jan 3, 2024