Expose GRFilter's methods that accept a cached stream reader #1186
This allows matching the same filter against different sets of elements without re-reading the filter data, which is really expensive.
/// <param name="key">Key used for hashing the data elements.</param>
/// <param name="reader">Golomb-Rice stream reader.</param>
/// <returns>true if at least one of the elements is in the filter, otherwise false.</returns>
public bool MatchAny(IEnumerable<byte[]> data, byte[] key, GRCodedStreamReader reader)
I guess this can be a static internal method?
I think this method can be very useful in many scenarios. Just one example:
public void ProcessFilter(GolombRiceFilter filter, uint256 blockId, IEnumerable<Wallet> openedWallets)
{
    // Create the cached reader once and reuse it for every wallet.
    var reader = filter.GetGRStreamReader();
    foreach (var wallet in openedWallets)
    {
        // filterKey is assumed to be the key the filter was built with.
        if (filter.MatchAny(wallet.GetAllScriptPubKeys(), filterKey, reader))
        {
            Console.WriteLine($"The block {blockId} contains transactions relevant for wallet {wallet.Name}");
        }
    }
}
I guess this can be static internal method?
There are P, N, and M referenced in InternalMatchAny, so it would require additional refactoring.
Just a few comments.
/// <returns>A new cached Golomb-Rice stream reader instance</returns>
public CachedGRCodedStreamReader GetGRStreamReader()
{
    return new CachedGRCodedStreamReader(new BitStream(Data), P, 0);
Stream data is a sorted set of ulong numbers, so it seems that caching on the fly is better than caching everything immediately. Is that the idea here?
A GR filter is a set of "compressed" data items. Uncompressing an item is an expensive operation. The matching algorithm uncompresses items as it needs them and stops immediately after a match is found, so it doesn't uncompress items unnecessarily (all items are uncompressed only in the worst case, where no matching element is found in the filter).
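To illustrate the lazy caching idea, here is a minimal, self-contained sketch. It is not the code in this PR: `LazyCachedReader` and `LazyMatch` are hypothetical names, the "expensive decoder" is simulated by any ascending `IEnumerable<ulong>`, and the step that hashes the data elements with the key is left out. Values are decoded only as far as a query needs, and a later query against the same reader reuses what was already decoded.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a cached reader: wraps an expensive, forward-only
// decoder and remembers every value it has already produced.
public sealed class LazyCachedReader
{
    private readonly IEnumerator<ulong> _decoder; // assumed to yield ascending values
    private readonly List<ulong> _cache = new List<ulong>();

    public LazyCachedReader(IEnumerable<ulong> decoder)
    {
        _decoder = decoder.GetEnumerator();
    }

    // Number of values decoded so far (exposed only to show the saving).
    public int DecodedCount => _cache.Count;

    // Returns the value at the given position, decoding only as far as needed.
    public bool TryGet(int index, out ulong value)
    {
        while (_cache.Count <= index)
        {
            if (!_decoder.MoveNext())
            {
                value = 0;
                return false; // the underlying stream is exhausted
            }
            _cache.Add(_decoder.Current);
        }
        value = _cache[index];
        return true;
    }
}

public static class LazyMatch
{
    // Merge-style scan over two ascending sequences; stops at the first match.
    public static bool MatchAny(IEnumerable<ulong> targets, LazyCachedReader reader)
    {
        int i = 0;
        foreach (ulong target in targets.OrderBy(t => t))
        {
            ulong value;
            while (true)
            {
                if (!reader.TryGet(i, out value)) return false; // filter exhausted
                if (value >= target) break;
                i++;
            }
            if (value == target) return true;
        }
        return false;
    }
}
```

With a reader over the values 3, 8, 15, 21, 40, matching against { 8 } decodes only two values; a second call with { 15, 100 } on the same reader decodes just one more, because the first two come from the cache.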