[Kernel] Fix RoaringBitmapArray create/add methods. Closes issue #3881 #3882
base: master
- The bitmaps array should be initialized in the `create` path.
- The `expandBitMaps` method should set new bitmaps from the old length up to the new length, instead of overwriting the old ones.
- Adds a `toArray` method mimicking the one provided by the Scala version on which the class is based.

Signed-off-by: Antoni Reus <[email protected]>
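The second bullet can be illustrated with a minimal sketch. It is not the PR's code: `java.util.BitSet` stands in for `RoaringBitmap` so the example needs no external dependency, and the class name `ExpandSketch` and the static `expand` helper are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

// Sketch only: BitSet is an assumed stand-in for RoaringBitmap.
final class ExpandSketch {

    // Returns an array of at least newLength slots that keeps every existing
    // bitmap and creates empty bitmaps only for the newly added slots.
    static BitSet[] expand(BitSet[] bitmaps, int newLength) {
        if (newLength <= bitmaps.length) {
            return bitmaps; // nothing to grow
        }
        int oldLength = bitmaps.length;
        BitSet[] newBitmaps = Arrays.copyOf(bitmaps, newLength);
        // Start at oldLength: a loop starting at 0 would replace the
        // populated bitmaps with empty ones.
        for (int i = oldLength; i < newLength; i++) {
            newBitmaps[i] = new BitSet();
        }
        return newBitmaps;
    }

    public static void main(String[] args) {
        BitSet first = new BitSet();
        first.set(7);
        BitSet[] grown = expand(new BitSet[] {first}, 3);
        System.out.println("old entry kept: " + grown[0].get(7));
        System.out.println("new slot empty: " + grown[2].isEmpty());
    }
}
```

The copy preserves the populated slots, so values added before an expansion remain visible afterwards.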
Thanks for doing this!
      newBitmaps[i] = new RoaringBitmap();
    }
    bitmaps = newBitmaps;
  }

  public static RoaringBitmapArray create(long... values) {
    RoaringBitmapArray bitmap = new RoaringBitmapArray();
    bitmap.bitmaps = new RoaringBitmap[0];
Instead of doing this here, we should just initialize `bitmaps` to an empty array in the class. Since we don't have a constructor, you can just do this on line 103.
👍 Good point
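A minimal sketch of the suggestion, under stated assumptions: `BitSet` stands in for `RoaringBitmap`, and the class name `FieldInitSketch` and the 16-bit bucket width are hypothetical choices made for illustration only.

```java
import java.util.BitSet;

// Sketch only: BitSet is an assumed stand-in for RoaringBitmap.
final class FieldInitSketch {
    // Hypothetical bucket width, chosen to keep the example small.
    private static final int BUCKET_BITS = 16;

    // Initialized at declaration, so every instance starts with a non-null
    // (empty) array no matter which code path created it.
    private BitSet[] bitmaps = new BitSet[0];

    // Safe on a fresh instance: the length check never dereferences null.
    // Assumes small non-negative values.
    boolean contains(long value) {
        int bucket = Math.toIntExact(value >>> BUCKET_BITS);
        int offset = (int) (value & ((1L << BUCKET_BITS) - 1));
        return bucket < bitmaps.length && bitmaps[bucket].get(offset);
    }

    public static void main(String[] args) {
        System.out.println(new FieldInitSketch().contains(5L));
    }
}
```

With the initializer on the field, a factory method such as `create` no longer has to assign the array itself.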
    for (long value : values) {
      bitmap.add(value);
    }
    return bitmap;
  }

  public long[] toArray() {
Any particular reason for adding this?
I think I would prefer to either
(1) not add this for now since we don't use it
(2) if we are adding it, it needs more tests
I would lean toward (1) since it's not used but if you have a good reason for adding it lmk
One reason, I think, is that with the methods currently available it is not possible to assert that the contents of the data structure are exactly what is expected.
The unit test I added checks the `contains` method for particular values that are expected or not expected to be there, but it cannot completely rule out that an unwanted value is present. With this method it can be asserted that the wanted values are there, and only those.
Another reason is somewhat related. Without this method, any implementation of file scanning is constrained to a filter approach to reading the data, calling `contains` for every single row. Depending on the ratio of deleted rows to total rows and on how the file is processed, I think it could be useful to also allow the approach of reading all the rows and then discarding the deleted ones.
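The two scanning approaches described above can be sketched as follows. This is an illustration under assumptions, not the Kernel's scan code: `DeletedRows` is a hypothetical stand-in backed by a sorted array, and its `toArray` is assumed to return indexes in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ScanSketch {

    // Hypothetical stand-in for the set of deleted row indexes.
    static final class DeletedRows {
        private final long[] sorted;

        DeletedRows(long... rows) {
            sorted = rows.clone();
            Arrays.sort(sorted);
        }

        boolean contains(long row) {
            return Arrays.binarySearch(sorted, row) >= 0;
        }

        long[] toArray() {
            return sorted.clone();
        }
    }

    // Approach 1: probe the set once for every row.
    static List<Long> filterByContains(long rowCount, DeletedRows deleted) {
        List<Long> kept = new ArrayList<>();
        for (long row = 0; row < rowCount; row++) {
            if (!deleted.contains(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Approach 2: materialize the deleted indexes once, then walk the rows
    // and the deletions together and drop the matches.
    static List<Long> discardAfterRead(long rowCount, DeletedRows deleted) {
        long[] del = deleted.toArray(); // assumed ascending
        List<Long> kept = new ArrayList<>();
        int next = 0;
        for (long row = 0; row < rowCount; row++) {
            while (next < del.length && del[next] < row) {
                next++; // skip duplicates and out-of-range indexes
            }
            if (next < del.length && del[next] == row) {
                next++;
                continue; // row is deleted
            }
            kept.add(row);
        }
        return kept;
    }

    public static void main(String[] args) {
        DeletedRows deleted = new DeletedRows(4, 1);
        // An exact-contents check, which contains() alone cannot express.
        System.out.println(Arrays.equals(deleted.toArray(), new long[] {1, 4}));
        System.out.println(filterByContains(6, deleted).equals(discardAfterRead(6, deleted)));
    }
}
```

Which approach is cheaper depends on the data and on how rows are read; the sketch only shows that both produce the same surviving rows.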
Okay I'm still okay with either (1) or (2). But if we are going with (2) can you please add full test coverage?
i.e. cover things like
- various numbers of underlying bitmaps
- empty case
- etc
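The cases listed above could look roughly like the checks below. This is a sketch under assumptions, not the PR's test suite: `BitSet` stands in for `RoaringBitmap`, the class `ToArraySketch` and its 16-bit bucket width are hypothetical, and only small non-negative values are supported.

```java
import java.util.Arrays;
import java.util.BitSet;

// Sketch only: BitSet is an assumed stand-in for RoaringBitmap.
final class ToArraySketch {
    private static final int BUCKET_BITS = 16; // hypothetical bucket width
    private BitSet[] bitmaps = new BitSet[0];

    static ToArraySketch create(long... values) {
        ToArraySketch sketch = new ToArraySketch();
        for (long value : values) {
            sketch.add(value);
        }
        return sketch;
    }

    void add(long value) {
        int bucket = Math.toIntExact(value >>> BUCKET_BITS);
        if (bucket >= bitmaps.length) {
            int oldLength = bitmaps.length;
            bitmaps = Arrays.copyOf(bitmaps, bucket + 1);
            for (int i = oldLength; i < bitmaps.length; i++) {
                bitmaps[i] = new BitSet();
            }
        }
        bitmaps[bucket].set((int) (value & ((1L << BUCKET_BITS) - 1)));
    }

    // Returns all stored values in ascending order.
    long[] toArray() {
        int size = 0;
        for (BitSet b : bitmaps) {
            size += b.cardinality();
        }
        long[] out = new long[size];
        int pos = 0;
        for (int bucket = 0; bucket < bitmaps.length; bucket++) {
            BitSet b = bitmaps[bucket];
            for (int bit = b.nextSetBit(0); bit >= 0; bit = b.nextSetBit(bit + 1)) {
                out[pos++] = ((long) bucket << BUCKET_BITS) | bit;
            }
        }
        return out;
    }

    private static void check(boolean ok, String what) {
        if (!ok) {
            throw new AssertionError(what);
        }
    }

    public static void main(String[] args) {
        // Empty case: no underlying bitmaps at all.
        check(create().toArray().length == 0, "empty");
        // Single bitmap, with a duplicate value.
        check(Arrays.equals(create(5, 1, 5).toArray(), new long[] {1, 5}), "single");
        // Several bitmaps, including one that stays empty in between.
        check(Arrays.equals(create(70000, 3, 65536, 200000).toArray(),
                new long[] {3, 65536, 70000, 200000}), "multiple");
        System.out.println("all checks passed");
    }
}
```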
Which Delta project/connector is this regarding?
Description
Resolves #3881
- The bitmaps array should be initialized in the `create` path.
- The `expandBitMaps` method should set new bitmaps from the old length up to the new length, instead of overwriting the old ones.

Also adds a `toArray` method mimicking the one provided by the Scala version on which the class is based.
How was this patch tested?
Added unit tests.
Does this PR introduce any user-facing changes?
No