-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support fixed_size_list for make_array #6759
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I create fixedsizelist parquet from my own script.
Is there any existing way for us to easily create/update this kind of test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you look at the data sets available in https://github.com/apache/parquet-testing (which is a submodule of this repo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this contribution @jayzhan211 -- I think other than the patch to crates.io this PR is ready to go.
I left some suggestions, which I think would improve the PR
Cargo.toml
Outdated
@@ -53,6 +53,14 @@ arrow-array = { version = "42.0.0", default-features = false, features = ["chron | |||
parquet = { version = "42.0.0", features = ["arrow", "async", "object_store"] } | |||
sqlparser = { version = "0.35", features = ["visitor"] } | |||
|
|||
[patch.crates-io] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will have to wait for this code to be released to crates.io (in about 10 days) before merging this PR
@@ -3646,29 +3675,9 @@ impl fmt::Display for ScalarValue { | |||
ScalarValue::TimestampNanosecond(e, _) => format_option!(f, e)?, | |||
ScalarValue::Utf8(e) => format_option!(f, e)?, | |||
ScalarValue::LargeUtf8(e) => format_option!(f, e)?, | |||
ScalarValue::Binary(e) => match e { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you look at the data sets available in https://github.com/apache/parquet-testing (which is a submodule of this repo)
@@ -486,6 +497,8 @@ impl<'a> Tokenizer<'a> { | |||
"Date32" => Token::SimpleType(DataType::Date32), | |||
"Date64" => Token::SimpleType(DataType::Date64), | |||
|
|||
"List" => Token::List, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
FYI @izveigor |
Marking as a draft as it looks like this PR is not waiting on review -- it is waiting on an upstream release (scheduled early next week) as well as to address some other comments |
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
* feat: supports NULL in arrays * feat: supports NULL in array functions * fix: array_fill error * fix: merge * fix: cargo fmt --------- Co-authored-by: Andrew Lamb <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
ecc93dd
to
c6ba92a
Compare
@@ -125,7 +125,7 @@ fn array_array(args: &[ArrayRef], data_type: DataType) -> Result<ArrayRef> { | |||
} | |||
|
|||
let list_data_type = | |||
DataType::List(Arc::new(Field::new("item", data_type, false))); | |||
DataType::List(Arc::new(Field::new("item", data_type, true))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert the change from #6662 to avoid failure on the schema mismatch
Signed-off-by: jayzhan211 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jayzhan211
cc @izveigor |
* support make_array for fixed_size_list Signed-off-by: jayzhan211 <[email protected]> * add arrow-typeof in test Signed-off-by: jayzhan211 <[email protected]> * fix schema mismatch Signed-off-by: jayzhan211 <[email protected]> * cleanup code Signed-off-by: jayzhan211 <[email protected]> * create array data with correct len Signed-off-by: jayzhan211 <[email protected]> --------- Signed-off-by: jayzhan211 <[email protected]>
* support make_array for fixed_size_list Signed-off-by: jayzhan211 <[email protected]> * add arrow-typeof in test Signed-off-by: jayzhan211 <[email protected]> * fix schema mismatch Signed-off-by: jayzhan211 <[email protected]> * cleanup code Signed-off-by: jayzhan211 <[email protected]> * create array data with correct len Signed-off-by: jayzhan211 <[email protected]> --------- Signed-off-by: jayzhan211 <[email protected]>
Which issue does this PR close?
Ref #6560
Rationale for this change
Add FixedSizeList support for array methods
What changes are included in this PR?
Only MakeArray is done for this PR.
Are these changes tested?
sqllogictest
array.slt
unit test in type_coercion.rs
Are there any user-facing changes?
Task