[Kernel] Fix RoaringBitmapArray create/add methods. Closes issue #3881 #3882

Open: wants to merge 2 commits into master

YotillaAntoni

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Resolves #3881

  • The bitmaps array should be initialized in the create path.
  • The expandBitMaps method should set new bitmaps from the old length up to the new length, instead of overwriting the old ones.

Also adds a toArray method mimicking the one provided by the Scala version on which this class is based.
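The two fixes described above can be illustrated with a minimal, self-contained sketch. It is not the Kernel implementation: `BitmapArraySketch` and `Bucket` are hypothetical names, `Bucket` stands in for `RoaringBitmap` so the example runs without dependencies, and the split of each value into a high 32-bit bucket index and a low 32-bit entry is an assumption about how the class is laid out. The array is initialized at declaration, which is one way to guarantee the `create` path never sees a null array.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Sketch only: Bucket is a hypothetical stand-in for RoaringBitmap.
class BitmapArraySketch {
    static final class Bucket {
        final TreeSet<Long> lows = new TreeSet<>();
    }

    // Initialized at declaration, so create() and add() never see null.
    private Bucket[] bitmaps = new Bucket[0];

    // Grow the array, keeping existing buckets and filling only the new slots.
    private void expandBitmaps(int newLength) {
        if (newLength <= bitmaps.length) {
            return;
        }
        Bucket[] newBitmaps = Arrays.copyOf(bitmaps, newLength);
        // Start at the OLD length: lower indexes already hold live data.
        for (int i = bitmaps.length; i < newLength; i++) {
            newBitmaps[i] = new Bucket();
        }
        bitmaps = newBitmaps;
    }

    void add(long value) {
        if (value < 0) {
            // Assumption of this sketch: negative values are rejected.
            // There is no upper-bound check; very large values would
            // allocate a very large array.
            throw new IllegalArgumentException("negative value: " + value);
        }
        int high = (int) (value >>> 32);   // assumed: selects the bucket
        long low = value & 0xFFFFFFFFL;    // assumed: stored in the bucket
        if (high >= bitmaps.length) {
            expandBitmaps(high + 1);
        }
        bitmaps[high].lows.add(low);
    }

    boolean contains(long value) {
        if (value < 0) {
            return false;
        }
        int high = (int) (value >>> 32);
        return high < bitmaps.length
            && bitmaps[high].lows.contains(value & 0xFFFFFFFFL);
    }

    static BitmapArraySketch create(long... values) {
        BitmapArraySketch bitmap = new BitmapArraySketch();
        for (long value : values) {
            bitmap.add(value);
        }
        return bitmap;
    }
}
```

If the fill loop in `expandBitmaps` started at 0 instead of the old length, the second `add` in a sequence such as `create(7L)` followed by `add((2L << 32) + 1)` would replace the bucket that holds 7, which is the overwrite the description refers to.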

How was this patch tested?

Added unit tests.

Does this PR introduce any user-facing changes?

No

 - The bitmaps array should be initialized in the `create` path.
 - The `expandBitMaps` method should set new bitmaps from the old length up to the new length, instead of overwriting the old ones.
 - Adds a `toArray` method mimicking the one provided by the scala version which the class is based on.

Signed-off-by: Antoni Reus <[email protected]>
Collaborator

@allisonport-db allisonport-db left a comment

Thanks for doing this!

newBitmaps[i] = new RoaringBitmap();
}
bitmaps = newBitmaps;
}

public static RoaringBitmapArray create(long... values) {
RoaringBitmapArray bitmap = new RoaringBitmapArray();
bitmap.bitmaps = new RoaringBitmap[0];
Collaborator

Instead of doing this here we should just initialize bitmaps to an empty array in the class. Since we don't have a constructor you can just do this on line 103

Author

👍 Good point

for (long value : values) {
bitmap.add(value);
}
return bitmap;
}

public long[] toArray() {
Collaborator

Any particular reason for adding this?

I think I would prefer to either
(1) not add this for now since we don't use it
(2) if we are adding it, it needs more tests

I would lean toward (1) since it's not used but if you have a good reason for adding it lmk

Author

One reason, I think, is that with the methods currently available it is not possible to assert that the contents of the data structure are exactly what is expected.
The unit test I added checks the contains method for particular values that are expected or not expected to be there, but it cannot rule out that an unwanted value is present. With this method, a test can assert that the wanted values are there, and only those.
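The difference between the two kinds of assertion can be sketched as follows. `ExactContentsSketch` is a hypothetical stand-in backed by a sorted set, not the Kernel class, and `spotCheck` and `exactCheck` are illustrative helper names.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in exposing only contains() and toArray().
class ExactContentsSketch {
    private final TreeSet<Long> values = new TreeSet<>();

    void add(long value) {
        values.add(value);
    }

    boolean contains(long value) {
        return values.contains(value);
    }

    // Returns the stored values in ascending order.
    long[] toArray() {
        return values.stream().mapToLong(Long::longValue).toArray();
    }

    // contains-based check: only as strong as the probes it is given.
    static boolean spotCheck(ExactContentsSketch s, long[] present, long[] absent) {
        for (long v : present) {
            if (!s.contains(v)) {
                return false;
            }
        }
        for (long v : absent) {
            if (s.contains(v)) {
                return false;
            }
        }
        return true;
    }

    // toArray-based check: passes only if the contents match exactly.
    static boolean exactCheck(ExactContentsSketch s, long... expected) {
        long[] sorted = expected.clone();
        Arrays.sort(sorted);
        return Arrays.equals(s.toArray(), sorted);
    }
}
```

A structure holding 1, 5, 9 and a stray 42 passes a spot check that probes 2 and 3 as absent, because 42 is never probed, while the exact check against 1, 5, 9 fails.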

Another reason is somewhat related. Without this method, any implementation of file scanning is constrained to a filter approach: read the data and call contains for every single row. Depending on the ratio of deleted rows to total rows, and on how the file is processed, I think it could be useful to also allow the approach of reading all the rows and then discarding the deleted ones.
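The two scanning approaches mentioned here might look roughly like the sketch below. All names are hypothetical, rows are modeled as plain indexes from 0 to `numRows - 1`, and the second method assumes the deleted indexes are available as a sorted array, such as one a `toArray` method could return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

class DeletedRowsSketch {
    // Filter approach: one membership test per row.
    static List<Long> keepByFilter(long numRows, LongPredicate isDeleted) {
        List<Long> kept = new ArrayList<>();
        for (long row = 0; row < numRows; row++) {
            if (!isDeleted.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Read-then-discard approach: walk the sorted deleted indexes once,
    // in step with the rows, instead of probing the set for each row.
    static List<Long> keepByMerge(long numRows, long[] sortedDeleted) {
        List<Long> kept = new ArrayList<>();
        int next = 0;
        for (long row = 0; row < numRows; row++) {
            // Skip deleted indexes that are already behind the current row.
            while (next < sortedDeleted.length && sortedDeleted[next] < row) {
                next++;
            }
            if (next < sortedDeleted.length && sortedDeleted[next] == row) {
                next++;
                continue; // this row is deleted
            }
            kept.add(row);
        }
        return kept;
    }
}
```

Both methods return the same surviving rows; which one is cheaper would depend on the factors listed in the comment above and is not something this sketch measures.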

Collaborator

Okay I'm still okay with either (1) or (2). But if we are going with (2) can you please add full test coverage?

i.e. cover things like

  • various numbers of underlying bitmaps
  • empty case
  • etc
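A rough sketch of the cases such coverage could exercise is given below. `ToArrayCoverageSketch` is a hypothetical stand-in, not the Kernel class: each bucket is a sorted set playing the role of one underlying bitmap, and the high/low 32-bit split is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class ToArrayCoverageSketch {
    // One sorted set of low 32-bit values per high 32-bit index (assumed layout).
    private final List<TreeSet<Long>> buckets = new ArrayList<>();

    void add(long value) {
        if (value < 0) {
            // Assumption of this sketch: negative values are rejected.
            throw new IllegalArgumentException("negative value: " + value);
        }
        int high = (int) (value >>> 32);
        while (buckets.size() <= high) {
            buckets.add(new TreeSet<>());
        }
        buckets.get(high).add(value & 0xFFFFFFFFL);
    }

    int bucketCount() {
        return buckets.size();
    }

    // Reassembles the values bucket by bucket, in ascending order.
    long[] toArray() {
        int size = 0;
        for (TreeSet<Long> bucket : buckets) {
            size += bucket.size();
        }
        long[] out = new long[size];
        int i = 0;
        for (int high = 0; high < buckets.size(); high++) {
            for (long low : buckets.get(high)) {
                out[i++] = ((long) high << 32) | low;
            }
        }
        return out;
    }

    static ToArrayCoverageSketch of(long... values) {
        ToArrayCoverageSketch s = new ToArrayCoverageSketch();
        for (long value : values) {
            s.add(value);
        }
        return s;
    }
}
```

Cases worth asserting against a structure like this include the empty case (`of()` yields an empty array), a single bucket with unsorted or duplicate input, and several buckets with an empty one in between, for example `of((2L << 32) | 7, 4)`.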

Successfully merging this pull request may close these issues.

[BUG][Kernel] RoaringBitmapArray create/add test methods are broken
2 participants