feat: Placeholders in prepared statements #678

scsmithr · 2023-02-22T15:41:04Z

Provides an initial placeholder implementation. Much of this is relying on datafusion's ability to extract to types from the logical plan. In some cases, it's not able to infer types, leading to us emitting an error to the client. I believe this will get better as datafusion continues to improve its support for placeholders.

See the added protocol tests for some examples of things we do support and things we error on.

Closes #122. Note that we don't support "everything", but "everything" is quite large. This PR is a base to work on, and we'll want to open up more specific issues moving forward.

scsmithr · 2023-02-22T20:43:05Z

crates/sqlexec/src/context.rs

@@ -299,7 +300,7 @@ impl SessionContext {
        &mut self,
        name: String,
        stmt: Option<StatementWithExtensions>,
-        params: Vec<i32>,
+        _params: Vec<i32>, // TODO: We can use these for providing types for parameters.


Holding off on this for now. I would like to follow up in another pr later this week or next.

vrongmeal · 2023-02-23T15:35:35Z

crates/pgrepr/src/reader.rs

+    fn read_bool(buf: &[u8]) -> Result<bool> {
+        TextReader::parse::<_, SqlBool>(buf).map(|b| b.0)
+    }


Suggested change

fn read_bool(buf: &[u8]) -> Result<bool> {

TextReader::parse::<_, SqlBool>(buf).map(|b| b.0)

}

fn read_bool(buf: &[u8]) -> Result<bool> {

Self::parse::<_, SqlBool>(buf).map(|b| b.0)

}

Just a personal preference! No need to change for the PR :)

vrongmeal · 2023-02-23T15:47:44Z

crates/pgrepr/src/types.rs

+    }
+}
+
+fn decode_not_null_value<R: Reader>(buf: &[u8], arrow_type: &ArrowType) -> Result<ScalarValue> {


Does it make sense to do it like from: &PgType, to: &ArrowType? Leaves room for other kinds of transformations. All the read and write methods are corresponding to the PG datatypes.

For eg: we will need to cast PgType::JSON to ArrowType::Utf8 but the read method will differ when reading JSON (it would be read_json).

It might?

Postgres docs have this to say:

first sends a Parse message, which contains a textual query string, optionally some information about data types of parameter placeholders, [...] Parameter data types can be specified by OID; if not given, the parser attempts to infer the data types in the same way as it would do for untyped literal string constants.

Currently I'm ignoring the oids being provided during parse and inferring everything. I think we'll want this function to accept something like PostgresOrArrowType or just convert the arrow type that we get from inferring into the associated postgres type. I'll mess around with using the oids provided during parse and see what makes sense here.

I think we should have the PG type even when inferred which then should be translated to arrow type

Making that change right now.

Latest change swaps out the arrow types with pg types.

vrongmeal

Looks good. Can we add the todo comment? I'll look into making the changes when working on filters!

vrongmeal · 2023-02-23T17:52:48Z

crates/pgrepr/src/types.rs

+fn decode_not_null_value<R: Reader>(buf: &[u8], pg_type: &PgType) -> Result<ScalarValue> {
+    Ok(match pg_type {
+        &PgType::BOOL => R::read_bool(buf)?.into(),
+        &PgType::INT2 => R::read_int2(buf)?.into(),
+        &PgType::INT4 => R::read_int4(buf)?.into(),
+        &PgType::INT8 => R::read_int8(buf)?.into(),
+        &PgType::FLOAT4 => R::read_float4(buf)?.into(),
+        &PgType::FLOAT8 => R::read_float8(buf)?.into(),
+        &PgType::TEXT => ScalarValue::Utf8(Some(R::read_text(buf)?)),
+        other => return Err(PgReprError::UnsupportedPgTypeForDecode(other.clone())),
+    })
+}
+


This looks quite good. Can we add a little TODO that we infer types as SQL or PG types (instead of arrow rolling with) because this is where the issue lies, arrow doesn't really work well with all the data types when inferring. I'll see what we can do, if we can work around arrow or actually implement something like inferring PG types (and not convert them from Arrow). Also might want to carry both the types along -- Arrow and PG (will be helpful in catching bugs).

Will add that todo.

What benefits do you see with providing the arrow type here as well? Since converting from an arrow type to a pg type is unambiguous, I'm not sure what we'd be able to use the arrow type for in this function.

OK, so I am just recalling it that the ScalarValue enum won't be sufficient. It was required if we could've used another scalar to represent a value but yeah, if I'm going to change the scalar enum, it will be redundant.

I do think trying to switch out the scalar enum will be very tricky given how intertwined datafusion planning and execution are and how much we rely on those. Even inferring pg types directly from a query with parameters will be difficult since the produced logical plan will be constrained to arrow types.

I'm going to merge this as-is for now, but I'm down to add in the arrow types once they're needed here.

Yeah, I do need to look at the scalar values. What I’m hoping is we can switch the scalar just for PG conversions (to and fro). The problem with Arrow scalars is that there exists an array, the value is valid, but Arrow just doesn’t have a scalar to represent it!

scsmithr added 6 commits February 22, 2023 09:40

parameters stubs

7f7116d

start reader

8a3d22a

slowly but surely

23d375f

Simple case working

81f2796

Merge remote-tracking branch 'origin/main' into sean/parameters

d69bec9

cleanup

100e2d8

scsmithr requested review from vrongmeal and RustomMS February 22, 2023 20:41

scsmithr marked this pull request as ready for review February 22, 2023 20:41

scsmithr commented Feb 22, 2023

View reviewed changes

remove unused error variant

795e3d6

RustomMS approved these changes Feb 22, 2023

View reviewed changes

vrongmeal reviewed Feb 23, 2023

View reviewed changes

scsmithr added 2 commits February 23, 2023 10:26

Merge remote-tracking branch 'origin/main' into sean/parameters

f7c67f4

Use pg types when decoding params

cd9ba52

scsmithr requested a review from vrongmeal February 23, 2023 17:35

vrongmeal approved these changes Feb 23, 2023

View reviewed changes

Add todo

d57696b

scsmithr merged commit af6d189 into main Feb 23, 2023

scsmithr deleted the sean/parameters branch February 23, 2023 19:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Placeholders in prepared statements #678

feat: Placeholders in prepared statements #678

scsmithr commented Feb 22, 2023 •

edited

Loading

scsmithr Feb 22, 2023

vrongmeal Feb 23, 2023

vrongmeal Feb 23, 2023

scsmithr Feb 23, 2023

vrongmeal Feb 23, 2023

scsmithr Feb 23, 2023

scsmithr Feb 23, 2023

vrongmeal left a comment

vrongmeal Feb 23, 2023

scsmithr Feb 23, 2023

vrongmeal Feb 23, 2023

scsmithr Feb 23, 2023

vrongmeal Feb 23, 2023

feat: Placeholders in prepared statements #678

feat: Placeholders in prepared statements #678

Conversation

scsmithr commented Feb 22, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vrongmeal left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

scsmithr commented Feb 22, 2023 •

edited

Loading