-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: basic support for executing prepared statements #13242
Conversation
@@ -176,27 +175,6 @@ async fn prepared_statement_type_coercion() -> Result<()> { | |||
Ok(()) | |||
} | |||
|
|||
#[tokio::test] | |||
async fn prepared_statement_invalid_types() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this test because now parameters will be cast to the target type.
This is also the behavior of PostgreSQL.
psql=> prepare p(int) as select 100+$1;
PREPARE
psql=> execute p('100');
?column?
----------
200
(1 row)
psql=> execute p(20.12);
?column?
----------
120
(1 row)
@@ -106,9 +105,9 @@ async fn test_prepare_statement() -> Result<()> { | |||
let ctx = create_ctx_with_partition(&tmp_dir, partition_count).await?; | |||
|
|||
// sql to statement then to prepare logical plan with parameters | |||
// c1 defined as UINT32, c2 defined as UInt64 but the params are Int32 and Float64 | |||
let dataframe = | |||
ctx.sql("PREPARE my_plan(INT, DOUBLE) AS SELECT c1, c2 FROM test WHERE c1 > $2 AND c1 < $1").await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PREPARE now returns an empty DataFrame and can't call with_param_values
on it. I think this change is reasonable because the result of PREPARE
is not a real relation, as it doesn't contain valid rows. Its result is more like DDL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree -- PREPARE is a statement in my mind (and has no results)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much @jonahgao -- this PR looks basically perfect. My only concern is the security implications of executing prepared statements when the SQL is defined to be "read only"
It would also be neat to have some way to deallocate prepared statements, but that would be a natural follow on ticket (perhaps adding support for DEALLOCATE
for example)
@@ -687,7 +688,31 @@ impl SessionContext { | |||
LogicalPlan::Statement(Statement::SetVariable(stmt)) => { | |||
self.set_variable(stmt).await | |||
} | |||
|
|||
LogicalPlan::Prepare(Prepare { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am worried that LogicalPlan::Prepare
seems to be runnable even if we run sql_with_options
and disable statements.
datafusion/datafusion/core/src/execution/context/mod.rs
Lines 605 to 614 in aad7744
pub async fn sql_with_options( | |
&self, | |
sql: &str, | |
options: SQLOptions, | |
) -> Result<DataFrame> { | |
let plan = self.state().create_logical_plan(sql).await?; | |
options.verify_plan(&plan)?; | |
self.execute_logical_plan(plan).await | |
} |
I realize that this PR does not change if Prepare is a statement or not, but it is the first that actually makes Prepare
do something
Could you please add a test like this (but for PREPARE
statements) and make sure the statement can't be executed if statements are disabled?
datafusion/datafusion/core/tests/sql/sql_api.rs
Lines 99 to 114 in 6692382
async fn unsupported_statement_returns_error() { | |
let ctx = SessionContext::new(); | |
ctx.sql("CREATE TABLE test (x int)").await.unwrap(); | |
let options = SQLOptions::new().with_allow_statements(false); | |
let sql = "set datafusion.execution.batch_size = 5"; | |
let df = ctx.sql_with_options(sql, options).await; | |
assert_eq!( | |
df.unwrap_err().strip_backtrace(), | |
"Error during planning: Statement not supported: SetVariable" | |
); | |
let options = options.with_allow_statements(true); | |
ctx.sql_with_options(sql, options).await.unwrap(); | |
} |
Perhaps (as a follow on PR) we can make LogicalPlan::Prepare
a statement https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.Statement.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Temporarily added PREPARE to BadPlanVisitor and added a test for this in 4729766
I will make LogicalPlan::Prepare a statement in a follow-up PR.
.collect::<Result<_>>()?; | ||
} | ||
|
||
let params = ParamValues::List(params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
Ok(()) | ||
} | ||
Entry::Occupied(e) => { | ||
exec_err!("Prepared statement '{}' already exists", e.key()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this errors if an existing prepared statement should we add some way to erase / clear prepared plans? Now there is no way to avoid accumulating them over time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan to implement DEALLOCATE in next PR.
@@ -106,9 +105,9 @@ async fn test_prepare_statement() -> Result<()> { | |||
let ctx = create_ctx_with_partition(&tmp_dir, partition_count).await?; | |||
|
|||
// sql to statement then to prepare logical plan with parameters | |||
// c1 defined as UINT32, c2 defined as UInt64 but the params are Int32 and Float64 | |||
let dataframe = | |||
ctx.sql("PREPARE my_plan(INT, DOUBLE) AS SELECT c1, c2 FROM test WHERE c1 > $2 AND c1 < $1").await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree -- PREPARE is a statement in my mind (and has no results)
@@ -30,56 +30,175 @@ select * from person; | |||
# Error due to syntax and semantic violation | |||
|
|||
# Syntax error: no name specified after the keyword prepare | |||
statement error | |||
statement error DataFusion error: SQL error: ParserError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
2 Bar | ||
|
||
|
||
# Test issue: https://github.com/apache/datafusion/issues/12294 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jonahgao -- this looks great
@@ -1705,6 +1773,14 @@ impl<'n, 'a> TreeNodeVisitor<'n> for BadPlanVisitor<'a> { | |||
LogicalPlan::Statement(stmt) if !self.options.allow_statements => { | |||
plan_err!("Statement not supported: {}", stmt.name()) | |||
} | |||
// TODO: Implement PREPARE as a LogicalPlan::Statement | |||
LogicalPlan::Prepare(_) if !self.options.allow_statements => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Thanks @alamb for the review. |
Which issue does this PR close?
Closes #4549.
Rationale for this change
Store the logical plan of the prepared statement in the session state. During
EXECUTE
, retrieve the prepared logical plan, replace the placeholders with execution parameters, and then execute.What changes are included in this PR?
Support executing prepared statements.
Are these changes tested?
Yes
Are there any user-facing changes?
Yes.
ctx.sql("Prepare p(int) as query")
now returns an empty DataFrame, similar to Create View and other DDL statements. Previously, it would return a DataFrame containing the Prepare logical plan.