how to search fast with ramchunk(rt table) in High-Frequency Write Systems #2787
Comments
The point is that searching in a RAM chunk doesn't utilize some of the performance optimization techniques that Manticore uses when searching in disk chunks, such as:
You're correct about the workaround if the RAM chunk's performance isn't sufficient. We've been considering improving it by flushing the RAM chunk to disk as soon as there are no changes in the table for a certain period. Since smaller disk chunks are merged first, we can expect the merging overhead to be minimal in this case.
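Until such a feature exists in the server, the same idea can be approximated on the client side. The following is only a minimal sketch under stated assumptions: `IdleFlusher`, `on_write`, and `tick` are hypothetical names, not part of Manticore, and the `flush` callback is assumed to issue `FLUSH RAMCHUNK rt_table` through whatever client the application already uses.

```python
import time
from typing import Callable


class IdleFlusher:
    """Hypothetical helper: call `flush` once after the table has been
    write-idle for `idle_seconds`. Not part of Manticore itself."""

    def __init__(self, flush: Callable[[], None], idle_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._flush = flush            # assumed to run FLUSH RAMCHUNK <table>
        self._idle_seconds = idle_seconds
        self._clock = clock            # injectable so the logic can be tested
        self._last_write = clock()
        self._dirty = False            # True when unflushed writes exist

    def on_write(self) -> None:
        """Call after every INSERT/REPLACE/DELETE sent to the table."""
        self._last_write = self._clock()
        self._dirty = True

    def tick(self) -> bool:
        """Call periodically; flushes and returns True when the table has
        pending writes and has been idle long enough."""
        idle_for = self._clock() - self._last_write
        if self._dirty and idle_for >= self._idle_seconds:
            self._flush()
            self._dirty = False
            return True
        return False
```

The writer calls `on_write()` after each modification and a timer calls `tick()` every second or so. Flushing only when unflushed writes exist avoids issuing repeated flush commands during long quiet periods.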
This is likely because queries on disk chunks are parallelized using pseudo sharding. In the profile, you're seeing the combined sums from all threads.
We discussed your issue in detail during today's call. It seems there might be other ways to improve performance in this case. Could you share a reproducible example with us so we can test and investigate the issue locally?
@sanikolaev Thank you for your reply!
I will provide you with a non-sensitive dataset for testing purposes. However, it will take a few days to prepare and share this data with you.
I am eager to contribute to this enhancement by implementing the functionality that flushes the RAM chunk when there are no changes in the table for a certain period. I am currently studying the code and working on this feature.
@zhangdapao745 Hello. Did you have a chance to prepare the test case?
@sanikolaev Of course, if there are no unexpected problems, I will upload the test case tomorrow. Additionally, is the following content enough for the upload?
Yes, please!
@sanikolaev Here is the reproducible test case.
3. Query SQL: generate the variable ${id_1000} with the following Python script and substitute it into the aforementioned SQL.
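The original script is not preserved in this copy of the thread. As a stand-in, here is a minimal sketch of how such an ID list could be generated. It assumes `mid` values are integers between 1 and 2,000,000, which is a guess based on the stated dataset size and not something the issue confirms. The helper names `make_id_list` and `build_query` are hypothetical; the query text is the one quoted in the issue.

```python
import random
from typing import Optional

# Query text as quoted in the issue; {ids} stands in for the placeholder
# written as ${id_1000} / ${mids_1000} in the thread.
QUERY_TEMPLATE = (
    "SELECT mid FROM rt_table WHERE mid IN ({ids}) group by mid LIMIT 1000;"
)


def make_id_list(count: int = 1000, max_id: int = 2_000_000,
                 seed: Optional[int] = None) -> str:
    """Return `count` distinct random IDs as a comma-separated string.

    Assumption: IDs are integers in 1..max_id. Adjust to the real data.
    """
    rng = random.Random(seed)
    ids = rng.sample(range(1, max_id + 1), count)
    return ",".join(str(i) for i in ids)


def build_query(id_list: str) -> str:
    """Substitute the generated ID list into the query template."""
    return QUERY_TEMPLATE.format(ids=id_list)


if __name__ == "__main__":
    print(build_query(make_id_list(seed=42)))
```

Passing a fixed `seed` keeps the generated list identical between runs, so timings taken before and after a flush use the same query.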
Querying the entire dataset in Manticore Search right after loading the file, without performing any other actions (all data still in the RAM chunk), takes 160 ms. After calling 'FLUSH RAMCHUNK rt_table', the query time drops to 15 ms.
Thanks for preparing the test case, @zhangdapao745. @tomatolog, please reproduce it, profile it, and suggest what we could do to improve the performance.
Confirmation Checklist:
Your question:
I found that queries will slow down if a table has a RAM chunk. After flushing the RAM chunk to a new disk chunk by using the command 'FLUSH RAMCHUNK rt_table', the query will return quickly.
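One way to measure this effect is to time the same query before and after the flush. The sketch below is illustrative only: `time_query` and `compare_before_after_flush` are hypothetical helpers, and `run_sql` is an assumed callable that sends one SQL statement to the server through whatever client is in use (for example a MySQL-protocol connection). No specific client API is implied.

```python
import time
from typing import Callable, Tuple


def time_query(run_sql: Callable[[str], object], query: str,
               repeats: int = 5) -> float:
    """Return the best wall-clock time in milliseconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        run_sql(query)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best


def compare_before_after_flush(run_sql: Callable[[str], object], query: str,
                               table: str = "rt_table") -> Tuple[float, float]:
    """Time `query`, flush the table's RAM chunk, then time it again."""
    before_ms = time_query(run_sql, query)
    run_sql(f"FLUSH RAMCHUNK {table}")
    after_ms = time_query(run_sql, query)
    return before_ms, after_ms
```

Taking the best of several runs reduces noise from caching and scheduling, so the two numbers reflect the RAM chunk versus disk chunk difference more than run-to-run variance.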
Server version: 6.3.2
create table :
data : 2 million rows (200w)
query sql
SELECT mid FROM rt_table WHERE mid IN (${mids_1000}) group by mid LIMIT 1000;
I have three questions: