Unparse struct to sql #13493

Merged
merged 8 commits into from
Nov 21, 2024
Changes from 7 commits
61 changes: 60 additions & 1 deletion datafusion/sql/src/unparser/expr.rs
@@ -462,7 +462,9 @@ impl Unparser<'_> {
match func_name {
"make_array" => self.make_array_to_sql(args),
"array_element" => self.array_element_to_sql(args),
- // TODO: support for the construct and access functions of the `map` and `struct` types
+ "named_struct" => self.named_struct_to_sql(args),
+ "get_field" => self.get_field_to_sql(args),
+ // TODO: support for the construct and access functions of the `map` type
_ => self.scalar_function_to_sql_internal(func_name, args),
}
}
@@ -514,6 +516,57 @@ impl Unparser<'_> {
})
}

fn named_struct_to_sql(&self, args: &[Expr]) -> Result<ast::Expr> {
if args.len() % 2 != 0 {
return internal_err!("named_struct must have an even number of arguments");
}

let args = args
.chunks_exact(2)
.map(|chunk| {
let key = match &chunk[0] {
Expr::Literal(lit) => self.new_ident_quoted_if_needs(lit.to_string()),
_ => return internal_err!("named_struct expects even arguments to be strings, but received: {:?}", &chunk[0])
};

Ok(ast::DictionaryField {
key,
value: Box::new(self.expr_to_sql(&chunk[1])?),
})
})
.collect::<Result<Vec<_>>>()?;

Ok(ast::Expr::Dictionary(args))
}

fn get_field_to_sql(&self, args: &[Expr]) -> Result<ast::Expr> {
if args.len() != 2 {
return internal_err!("get_field must have exactly 2 arguments");
}

let mut id = match &args[0] {
Expr::Column(col) => match self.col_to_sql(col)? {
ast::Expr::Identifier(ident) => vec![ident],
ast::Expr::CompoundIdentifier(idents) => idents,
other => return internal_err!("expected col_to_sql to return an Identifier or CompoundIdentifier, but received: {:?}", other),
},
_ => return internal_err!("get_field expects first argument to be column, but received: {:?}", &args[0]),
};
Review comment (Contributor) on lines +547 to +554:
We should also consider more complex cases, such as a struct that contains a struct field. Consider the following SQL, which accesses the struct in a different way:

> explain SELECT s.a['a1'] FROM (SELECT {a:{a1:c1}, b:{a1:c1}} AS s from t1);
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                      |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: get_field(get_field(named_struct(Utf8("a"), __common_expr_1, Utf8("b"), __common_expr_1), Utf8("a")), Utf8("a1")) AS s[a][a1] |
|               |   Projection: named_struct(Utf8("a1"), t1.c1) AS __common_expr_1                                                                          |
|               |     TableScan: t1 projection=[c1]   

As the plan shows, get_field can take another get_field call as its first argument. However, I don't think we can add a roundtrip test for this case yet, because SQL like select s.a.a1 isn't supported (apache/datafusion-sqlparser-rs#1541 will address it).

We can file a follow-up issue for the nested case; it doesn't need to be supported in this PR.
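To make the nested case concrete, here is a minimal, std-only sketch of how a chain of get_field calls could be flattened into a dotted path. It is not DataFusion code: the `Expr` enum below is a hypothetical toy stand-in and `get_field_path` is an assumed helper name. It also only handles chains rooted at a column, not the `named_struct(...)` base that appears in the plan above.

```rust
// Toy stand-in for DataFusion's `Expr` (hypothetical; only the shapes needed here).
#[derive(Debug)]
enum Expr {
    /// A column reference, already split into identifier parts.
    Column(Vec<String>),
    /// A string literal, used here as a field name.
    Literal(String),
    /// get_field(base, name)
    GetField(Box<Expr>, Box<Expr>),
}

/// Hypothetical helper: walk a chain of `get_field` calls down to the base
/// column and return the full path, outermost field last.
fn get_field_path(expr: &Expr) -> Result<Vec<String>, String> {
    match expr {
        Expr::Column(parts) => Ok(parts.clone()),
        Expr::GetField(base, field) => {
            // Resolve the base first so the parts come out left to right.
            let mut path = get_field_path(base)?;
            match field.as_ref() {
                Expr::Literal(name) => path.push(name.clone()),
                other => {
                    return Err(format!(
                        "get_field expects second argument to be a string, but received: {:?}",
                        other
                    ))
                }
            }
            Ok(path)
        }
        other => Err(format!(
            "expected a column or a nested get_field, but received: {:?}",
            other
        )),
    }
}

fn main() {
    // get_field(get_field(s, "a"), "a1")
    let nested = Expr::GetField(
        Box::new(Expr::GetField(
            Box::new(Expr::Column(vec!["s".to_string()])),
            Box::new(Expr::Literal("a".to_string())),
        )),
        Box::new(Expr::Literal("a1".to_string())),
    );
    println!("{}", get_field_path(&nested).unwrap().join("."));
}
```

In the real unparser the resulting parts would presumably feed `ast::Expr::CompoundIdentifier`, as the single-level code in this PR does; whether that is the right AST node for the nested case depends on the sqlparser change referenced above.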


let field = match &args[1] {
Expr::Literal(lit) => self.new_ident_quoted_if_needs(lit.to_string()),
_ => {
return internal_err!(
"get_field expects second argument to be a string, but received: {:?}",
&args[1]
)
}
};
id.push(field);

Ok(ast::Expr::CompoundIdentifier(id))
}

pub fn sort_to_sql(&self, sort: &Sort) -> Result<ast::OrderByExpr> {
let Sort {
expr,
@@ -1524,6 +1577,7 @@ mod tests {
Signature, Volatility, WindowFrame, WindowFunctionDefinition,
};
use datafusion_expr::{interval_month_day_nano_lit, ExprFunctionExt};
use datafusion_functions::expr_fn::{get_field, named_struct};
use datafusion_functions_aggregate::count::count_udaf;
use datafusion_functions_aggregate::expr_fn::sum;
use datafusion_functions_nested::expr_fn::{array_element, make_array};
@@ -1937,6 +1991,11 @@ mod tests {
array_element(make_array(vec![lit(1), lit(2), lit(3)]), lit(1)),
"[1, 2, 3][1]",
),
(
named_struct(vec![lit("a"), lit("1"), lit("b"), lit(2)]),
"{a: '1', b: 2}",
),
(get_field(col("a.b"), "c"), "a.b.c"),
];

for (expr, expected) in tests {
4 changes: 3 additions & 1 deletion datafusion/sql/tests/cases/plan_to_sql.rs
@@ -188,7 +188,9 @@ fn roundtrip_statement() -> Result<()> {
"SELECT ARRAY[1, 2, 3][1]",
"SELECT [1, 2, 3]",
"SELECT [1, 2, 3][1]",
- "SELECT left[1] FROM array"
+ "SELECT left[1] FROM array",
+ "SELECT {a:1, b:2}",
+ "SELECT s.a FROM (SELECT {a:1, b:2} AS s)"
];

// For each test sql string, we transform as follows: