# Proposal: Redesign of Pegasus Scanner #723
## Comments
Redesign of Pegasus Scanner, to solve the scan timeout problem. Why did the comparator use the default ByteWiseComparator in the first place? 1. To support postfix filtering we would still have to scan all the data, at the same cost as before, so maybe the filter is not that important.
Changing the comparator will be a pain, as none of the old data could be read any more. Should we introduce a table-level flag to indicate whether to use the customized comparator? We also need to test the performance impact of using a customized comparator.
First of all, we use the default ByteWiseComparator because we designed the key schema based on it.
With the default comparator, the two keys are seen as distinct:
So we chose this method, but didn't consider that one day we would need prefix filtering of the hashkey. So now the problem is:
So let's change the comparator and check the performance impact first?
If there are no compatibility issues, I think changing the comparator is feasible. Looking forward to your PR @shenxingwuying
@Apache9 @Smityz XiaoMi/pegasus-java-client#156 and XiaoMi/pegasus-go-client#86 have fixed this
## Proposal: Redesign of Pegasus Scanner

### Background
Pegasus provides three interfaces, `on_get_scanner`, `on_scan` and `on_clear_scanner`, for clients to execute scanning tasks.

If we want to scan the whole table, the client first calls `on_get_scanner` on each partition, and each partition returns a `context_id`: a random number generated by the server that records parameters such as `hash_key_filter_type` and `batch_size`, along with the context of this scanning task.

Secondly, the client uses this `context_id` to call `on_scan` and completes the scan in the corresponding partition in turn. The server scans the whole table's data on disk and returns matching values to the client in batches.

If the task ends or any error happens, the client calls `on_clear_scanner` to clear its `context_id` on the server.

### Problem Statement
In actual use, such a design will cause some problems.
If we execute this scanning task:
1. The server will scan all the data in the table and then return the keys that prefix-match the pattern. But we can speed this up by using the prefix-seeking features of RocksDB.
2. Although we have a batch size to limit the scan time, it does not work if the data is sparse. In the case above, we may need to scan almost the whole partition even though no row matches the prefix, so it is easy to time out.
### Proposal

#### For problem 1

The key schema is `[hashkey_len(2 bytes)][hashkey][sortkey]`, so we can't directly use prefix seeking. But we can prefix-seek `[01][prefix_pattern]`, `[02][prefix_pattern]`, `[03][prefix_pattern]` ... `[65535][prefix_pattern]`
in RocksDB.

#### For problem 2

We can set up a heartbeat check during scanning, like HBase's `StoreScanner`: the Pegasus server periodically sends heartbeat packets to avoid timeouts, so the scan behaves like a stream.

We can also change what the batch size counts: from the number of matching values returned to the number of values already scanned.