-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Placeholders in prepared statements #678
Conversation
@@ -299,7 +300,7 @@ impl SessionContext { | |||
&mut self, | |||
name: String, | |||
stmt: Option<StatementWithExtensions>, | |||
params: Vec<i32>, | |||
_params: Vec<i32>, // TODO: We can use these for providing types for parameters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Holding off on this for now. I would like to follow up in another pr later this week or next.
crates/pgrepr/src/reader.rs
Outdated
fn read_bool(buf: &[u8]) -> Result<bool> { | ||
TextReader::parse::<_, SqlBool>(buf).map(|b| b.0) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn read_bool(buf: &[u8]) -> Result<bool> { | |
TextReader::parse::<_, SqlBool>(buf).map(|b| b.0) | |
} | |
fn read_bool(buf: &[u8]) -> Result<bool> { | |
Self::parse::<_, SqlBool>(buf).map(|b| b.0) | |
} |
Just a personal preference! No need to change for the PR :)
crates/pgrepr/src/types.rs
Outdated
} | ||
} | ||
|
||
fn decode_not_null_value<R: Reader>(buf: &[u8], arrow_type: &ArrowType) -> Result<ScalarValue> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to do it like from: &PgType
, to: &ArrowType
? Leaves room for other kinds of transformations. All the read and write methods are corresponding to the PG datatypes.
For eg: we will need to cast PgType::JSON
to ArrowType::Utf8
but the read method will differ when reading JSON (it would be read_json
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might?
Postgres docs have this to say:
first sends a Parse message, which contains a textual query string, optionally some information about data types of parameter placeholders, [...] Parameter data types can be specified by OID; if not given, the parser attempts to infer the data types in the same way as it would do for untyped literal string constants.
Currently I'm ignoring the oids being provided during parse and inferring everything. I think we'll want this function to accept something like PostgresOrArrowType
or just convert the arrow type that we get from inferring into the associated postgres type. I'll mess around with using the oids provided during parse and see what makes sense here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should have the PG type even when inferred which then should be translated to arrow type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making that change right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest change swaps out the arrow types with pg types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Can we add the todo comment? I'll look into making the changes when working on filters!
fn decode_not_null_value<R: Reader>(buf: &[u8], pg_type: &PgType) -> Result<ScalarValue> { | ||
Ok(match pg_type { | ||
&PgType::BOOL => R::read_bool(buf)?.into(), | ||
&PgType::INT2 => R::read_int2(buf)?.into(), | ||
&PgType::INT4 => R::read_int4(buf)?.into(), | ||
&PgType::INT8 => R::read_int8(buf)?.into(), | ||
&PgType::FLOAT4 => R::read_float4(buf)?.into(), | ||
&PgType::FLOAT8 => R::read_float8(buf)?.into(), | ||
&PgType::TEXT => ScalarValue::Utf8(Some(R::read_text(buf)?)), | ||
other => return Err(PgReprError::UnsupportedPgTypeForDecode(other.clone())), | ||
}) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks quite good. Can we add a little TODO that we infer types as SQL or PG types (instead of arrow rolling with) because this is where the issue lies, arrow doesn't really work well with all the data types when inferring. I'll see what we can do, if we can work around arrow or actually implement something like inferring PG types (and not convert them from Arrow). Also might want to carry both the types along -- Arrow and PG (will be helpful in catching bugs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add that todo.
What benefits do you see with providing the arrow type here as well? Since converting from an arrow type to a pg type is unambiguous, I'm not sure what we'd be able to use the arrow type for in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so I am just recalling it that the ScalarValue
enum won't be sufficient. It was required if we could've used another scalar to represent a value but yeah, if I'm going to change the scalar enum, it will be redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think trying to switch out the scalar enum will be very tricky given how intertwined datafusion planning and execution are and how much we rely on those. Even inferring pg types directly from a query with parameters will be difficult since the produced logical plan will be constrained to arrow types.
I'm going to merge this as-is for now, but I'm down to add in the arrow types once they're needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I do need to look at the scalar values. What I’m hoping is we can switch the scalar just for PG conversions (to and fro). The problem with Arrow scalars is that there exists an array, the value is valid, but Arrow just doesn’t have a scalar to represent it!
Provides an initial placeholder implementation. Much of this is relying on datafusion's ability to extract to types from the logical plan. In some cases, it's not able to infer types, leading to us emitting an error to the client. I believe this will get better as datafusion continues to improve its support for placeholders.
See the added protocol tests for some examples of things we do support and things we error on.
Closes #122. Note that we don't support "everything", but "everything" is quite large. This PR is a base to work on, and we'll want to open up more specific issues moving forward.