[SPARK-23284][SQL] Document the behavior of several ColumnVector's get APIs when accessing null slot #20455
Once map support is added later, we should also document
Test build #86878 has finished for PR 20455 at commit
LGTM so far. Let's wait for map type support.
LGTM so far. One thing I wanna add is to also document the behavior of accessing null primitive values, e.g.
BTW we should also update
LGTM for this behavior and comments.
Since map support has been added, I'll make the related change later.
We also need to add
@ueshin Yes, I missed it. Thanks. I'll add it.
@@ -1261,4 +1261,140 @@ class ColumnarBatchSuite extends SparkFunSuite {
    batch.close()
    allocator.close()
  }

  testVector("getUTF8String should return null for null slot", 4, StringType) {
We already have test cases for each type. Can we just change the existing test cases a little to add this null check?
That sounds ok to me. I'll commit the change by tonight.
Seems we don't have individual tests for binary type and decimal type?
Test build #86918 has finished for PR 20455 at commit
Test build #86921 has finished for PR 20455 at commit
@@ -1261,4 +1269,38 @@ class ColumnarBatchSuite extends SparkFunSuite {
    batch.close()
    allocator.close()
  }

  testVector("getDecimal should return null for null slot", 4, DecimalType.IntDecimal) {
Shall we make it a normal test case for decimal type? We can follow the other tests, e.g. create a decimal array and check that the value of the column vector at each index matches the array.
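The test shape suggested in this comment could look roughly like the following. This is a minimal, self-contained Java sketch and not the actual `ColumnarBatchSuite` code: `ToyDecimalVector` and `countMismatches` are hypothetical names, and `java.math.BigDecimal` stands in for Spark's decimal type. It only illustrates the pattern of filling a vector from an expected array (including null entries) and then comparing slot by slot.

```java
import java.math.BigDecimal;
import java.util.Objects;

// Hypothetical toy model for illustration; NOT Spark's ColumnVector classes.
final class DecimalSlotCheck {

    /** Toy decimal column: a null entry in the backing array marks a null slot. */
    static final class ToyDecimalVector {
        private final BigDecimal[] slots;

        ToyDecimalVector(int capacity) { slots = new BigDecimal[capacity]; }

        void putDecimal(int rowId, BigDecimal value) { slots[rowId] = value; }
        void putNull(int rowId) { slots[rowId] = null; }
        boolean isNullAt(int rowId) { return slots[rowId] == null; }

        /** Object getter: returns null when the slot is null. */
        BigDecimal getDecimal(int rowId) { return slots[rowId]; }
    }

    /** Fills a vector from the expected array and counts slots that differ. */
    static int countMismatches(BigDecimal[] expected) {
        ToyDecimalVector vector = new ToyDecimalVector(expected.length);
        for (int i = 0; i < expected.length; i++) {
            if (expected[i] == null) {
                vector.putNull(i);
            } else {
                vector.putDecimal(i, expected[i]);
            }
        }
        int mismatches = 0;
        for (int i = 0; i < expected.length; i++) {
            // Objects.equals treats (null, null) as equal, so null slots are
            // checked in the same loop as ordinary values.
            if (!Objects.equals(expected[i], vector.getDecimal(i))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        BigDecimal[] expected = {
            new BigDecimal("1.50"), null, new BigDecimal("-3.25"), null
        };
        System.out.println("mismatches: " + countMismatches(expected));
    }
}
```

Folding the null check into the per-index comparison keeps one test per type, which is what the earlier review comment asked for.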
Test build #86923 has finished for PR 20455 at commit
Test build #86926 has finished for PR 20455 at commit
Test build #86935 has finished for PR 20455 at commit
retest this please
ok to test
Test build #86941 has finished for PR 20455 at commit
LGTM.
thanks, merging to master/2.3!
…t APIs when accessing null slot

## What changes were proposed in this pull request?

For some ColumnVector get APIs, such as getDecimal, getBinary, getStruct, getArray, getInterval, and getUTF8String, we should clearly document their behavior when accessing a null slot. They should return null in this case. Then we can remove null checks from the places that use these APIs.

For the APIs of primitive values like getInt, getInts, etc., this also documents their behavior when accessing null slots. Their return values are undefined and can be anything.

## How was this patch tested?

Added tests into `ColumnarBatchSuite`.

Author: Liang-Chi Hsieh <[email protected]>

Closes #20455 from viirya/SPARK-23272-followup.

(cherry picked from commit 90848d5)
Signed-off-by: Wenchen Fan <[email protected]>
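To make the documented contract concrete, here is a small Java sketch of how a caller might treat the two kinds of getters. The classes `ToyStringVector` and `ToyIntVector` and the helper `readInt` are hypothetical toy models written for this illustration; they are not Spark's `ColumnVector` implementation. The sketch assumes only what the description states: object-returning getters yield null for a null slot, while primitive getters return an unspecified value, so callers have to check `isNullAt` first.

```java
import java.util.BitSet;

// Hypothetical toy model for illustration; NOT Spark's ColumnVector classes.
final class NullSlotSketch {

    /** Toy string column: the object getter returns null for a null slot. */
    static final class ToyStringVector {
        private final String[] data;

        ToyStringVector(int capacity) { data = new String[capacity]; }

        void putString(int rowId, String value) { data[rowId] = value; }
        void putNull(int rowId) { data[rowId] = null; }
        boolean isNullAt(int rowId) { return data[rowId] == null; }

        /** Callers can use the result directly; null means "null slot". */
        String getString(int rowId) { return data[rowId]; }
    }

    /** Toy int column: values plus a separate null bitmap. */
    static final class ToyIntVector {
        private final int[] data;
        private final BitSet nulls = new BitSet();

        ToyIntVector(int capacity) { data = new int[capacity]; }

        void putInt(int rowId, int value) { data[rowId] = value; nulls.clear(rowId); }
        void putNull(int rowId) { nulls.set(rowId); }
        boolean isNullAt(int rowId) { return nulls.get(rowId); }

        /** Primitive getter: the value returned for a null slot is unspecified. */
        int getInt(int rowId) { return data[rowId]; }
    }

    /** Safe read of a primitive slot: check the null bitmap before the getter. */
    static Integer readInt(ToyIntVector vector, int rowId) {
        return vector.isNullAt(rowId) ? null : Integer.valueOf(vector.getInt(rowId));
    }

    public static void main(String[] args) {
        ToyStringVector strings = new ToyStringVector(2);
        strings.putString(0, "a");
        strings.putNull(1);
        // No separate isNullAt call is needed for the object getter.
        System.out.println(strings.getString(0) + ", " + strings.getString(1));

        ToyIntVector ints = new ToyIntVector(2);
        ints.putInt(0, 7);
        ints.putNull(1);
        // ints.getInt(1) must not be relied on; go through the null check.
        System.out.println(readInt(ints, 0) + ", " + readInt(ints, 1));
    }
}
```

In this toy model the null slot of `ToyIntVector` happens to hold whatever was last written to the backing array, which is one way the "can be anything" wording can arise in practice.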