-
Notifications
You must be signed in to change notification settings - Fork 412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: Use mmap to view vector index #9313
storage: Use mmap to view vector index #9313
Conversation
f09cde7
to
e9a6e68
Compare
dbms/src/Interpreters/Context.cpp
Outdated
@@ -1385,16 +1385,16 @@ void Context::dropMinMaxIndexCache() const | |||
{ | |||
auto lock = getLock(); | |||
if (shared->minmax_index_cache) | |||
shared->minmax_index_cache->reset(); | |||
shared->minmax_index_cache.reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change it to call shared_ptr::reset()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to destroy the instance. shared_ptr::reset()
let reference count - 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we destroy the instance, getMinMaxIndexCache
and getVectorIndexCache
will not re-create the cache. Is it as expected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original ->reset()
is calling LRUCache::reset()
to release all cached objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is implemented but not used. Since it is called drop
rather than clear
, I think destroying the instance is more acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, it is drop rather than clear. After TearDown
, all statuses should be reset so that all cases start from the same statuses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think TearDown
just want to clear the data in cache instread of destroy the cache.
Otherwise, the remaining cases will not go to the cache.
Normally, a query should run to the first branch shown in the screenshot below.
If the shared pointer index_cache
is reset, it will go to the else
branch, which means 'no cache'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
dbms/src/Storages/DeltaMerge/File/VectorColumnFromIndexReader.cpp
Outdated
Show resolved
Hide resolved
dbms/src/Storages/DeltaMerge/File/VectorColumnFromIndexReader.cpp
Outdated
Show resolved
Hide resolved
4d60cba
to
ff13cae
Compare
I have tried to use @coderabbitai to review this PR. And it outputs the following recommendations, some of which is reasonable IMO. Could you also check CalvinNeo#9 for those suggestions? @Lloyd-Pottiger |
fixed typo |
dbms/src/Storages/DeltaMerge/File/DMFileWithVectorIndexBlockInputStream.cpp
Outdated
Show resolved
Hide resolved
dbms/src/Storages/S3/FileCache.cpp
Outdated
{ | ||
// Space not enough. | ||
GET_METRIC(tiflash_storage_remote_cache, type_dtfile_full).Increment(); | ||
LOG_DEBUG( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this log level is DEBUG here? It throws later, why not append this log into exception itself, or change to INFO level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exception will be caught by the caller, will not print to log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the caller hits i <= 1
, then it will be thrown?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so the log is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the exception is thrown, and the log level is INFO which is normal, then we can get no information like (capacity={} used={} estimzted_size={})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
dbms/src/Storages/DeltaMerge/File/DMFileWithVectorIndexBlockInputStream.cpp
Show resolved
Hide resolved
#ifdef __clang__ | ||
#pragma clang diagnostic push | ||
#pragma clang diagnostic ignored "-Wdeprecated-declarations" | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why need these macros?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
Signed-off-by: Wish <[email protected]>
Signed-off-by: Wish <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
Signed-off-by: Lloyd-Pottiger <[email protected]>
35f1aaf
to
ec80dec
Compare
Signed-off-by: Lloyd-Pottiger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
void cleanOutdatedLoop(); | ||
|
||
// TODO(vector-index): Use task on BackgroundProcessingPool instead of a raw thread | ||
std::thread cleaner_thread; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this change be in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CalvinNeo, JaySon-Huang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
What problem does this PR solve?
Issue Number: ref #9032
Problem Summary:
What is changed and how it works?
Pick https://github.com/tidbcloud/tiflash-cse/pull/165, https://github.com/tidbcloud/tiflash-cse/pull/166 and https://github.com/tidbcloud/tiflash-cse/pull/175.
Check List
Tests
Side effects
Documentation
Release note