-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When removeNullBytes is set, length calculations did not take into account null bytes. #17232
Conversation
…ccount null bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be cleaner if the tests are added in the following way:
- Remove the inherited file
StringFieldWritersWithNullHandlingTest.java
since it's not checking stuff with "null-handling". We are checking null-bytes, which is completely different than what Druid associates the term "null-handling" for. - In the
setUp
ofStringFieldWriterTest,
there are a couple of field writers. There should be two more -fieldWriterWithRemoveNullBytes
,fieldWriterUtf8WithRemoveNullBytes
. - Replicate the
doTest
method, withdoTestWithRemoveNullBytes
. Call that method along withdoTest
in the test cases that are present already. - Add a new test case that tests a String like "abc\u0000". I am surprised that you are able to replicate the behavior with the added test case since that is not what the patch is for.
doTest
should fail with this test case, because of null bytes in the string (which is not happening in this PR), and we should have an assertion for that error. ThedoTestWithRemoveNullBytes
will pass. - Moreover, I think the memory is initialized with null bytes , so you'd have to fill up the memory with junk bytes that are not null. I am not sure if this is strictly required. We can check this, but I don't see any downside of having some randomly generated garbage data (without null) in the memory where the field writer is writing that will ensure that the test fails regardless of the initialization conditions.
What do you think of the above? Also, I have a couple of questions:
- Have I missed any reason for having a new class overriding the before method for the testcases and making the variables protected?
- Are you running into any errors with the test case that has been added, without the changes. I expect the test case to run fine because that is not what the patch is intended for, and it is certainly worth looking into if it is failing without these changes.
@@ -106,6 +106,12 @@ public void testMultiValueStringContainingNulls() | |||
doTest(Arrays.asList("foo", NullHandling.emptyToNullIfNeeded(""), "bar", null)); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is misleading because it tests "nulls" instead of "null bytes". The string should be like "foo\u0000" in order for it to have the null byte. Does this test fail without the changes to the StringFieldWriter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed up a new patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think a new file is required for the test case. This should be a helper method like doTest
in the StringFieldWriterTest
.
This will allow the reader to visualize the different test cases together. Both the helper methods would exhibit the same behavior except for strings where null bytes are present, where the doTest
would throw and the newer one would silently eat up the null bytes.
I have added a new test case and machinery to test individual field writers. The new test case should be hitting the new code changes now. |
…count null bytes. (apache#17232) * When replaceNullBytes is set, length calculations did not take into account null bytes.
…count null bytes. (#17232) (#17266) * When replaceNullBytes is set, length calculations did not take into account null bytes. Co-authored-by: Karan Kumar <[email protected]>
Fix for errros like :
Patch was prepared with help from @LakshSingla .
The test cases added here does not completely check the new code flow.