Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support placeholders when planning queries #122

Closed
6 tasks
scsmithr opened this issue Sep 22, 2022 · 4 comments · Fixed by #678
Closed
6 tasks

Support placeholders when planning queries #122

scsmithr opened this issue Sep 22, 2022 · 4 comments · Fixed by #678
Assignees
Labels
feat New feature or request

Comments

@scsmithr
Copy link
Member

scsmithr commented Sep 22, 2022

Particularly relevant with prepared statements.

I am not yet sure how we want to store prepared statements (e.g. do we want to store the statement itself, or is there some ahead of time planning we should try to do).

'SELECT * FROM test_table WHERE a = $1'
Parse results:
[
    Query(
        Query {
            with: None,
            body: Select(
                Select {
                    distinct: false,
                    top: None,
                    projection: [
                        Wildcard,
                    ],
                    into: None,
                    from: [
                        TableWithJoins {
                            relation: Table {
                                name: ObjectName(
                                    [
                                        Ident {
                                            value: "test_table",
                                            quote_style: None,
                                        },
                                    ],
                                ),
                                alias: None,
                                args: None,
                                with_hints: [],
                            },
                            joins: [],
                        },
                    ],
                    lateral_views: [],
                    selection: Some(
                        BinaryOp {
                            left: Identifier(
                                Ident {
                                    value: "a",
                                    quote_style: None,
                                },
                            ),
                            op: Eq,
                            right: Value(
                                Placeholder(
                                    "$1",
                                ),
                            ),
                        },
                    ),
                    group_by: [],
                    cluster_by: [],
                    distribute_by: [],
                    sort_by: [],
                    having: None,
                    qualify: None,
                },
            ),
            order_by: [],
            limit: None,
            offset: None,
            fetch: None,
            lock: None,
        },
    ),
]

tracking

  • Parse
    • accept optional list of parameter oids
    • if not provided, infer data types
  • Bind
    • the values for the statement's placeholders are given and must match the statement's
    • Convert bytes to values to insert into placeholder slots
@justinrubek justinrubek self-assigned this Sep 28, 2022
@justinrubek
Copy link
Contributor

justinrubek commented Sep 28, 2022

Following https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

It appears the intent is that the statement is parsed during the Parse message to find placeholders and it is planned during Bind. There is some suggested room for saving work:

Query planning typically occurs when the Bind message is processed. If the prepared statement has no parameters, or is executed repeatedly, the server might save the created plan and re-use it during subsequent Bind messages for the same prepared statement. However, it will do so only if it finds that a generic plan can be created that is not much less efficient than a plan that depends on the specific parameter values supplied. This happens transparently so far as the protocol is concerned.

Placeholder support needs to come in during both Parse and Bind. The current implementations don't pay attention to the parameters: Parse passes the statement through and Bind creates a plan directly from the statement.

postgres docs

The postgres docs have the most nuance on this. Here are some highlights of the important parts for this

Parse

In the extended protocol, the frontend first sends a Parse message, which contains a textual query string, optionally some information about data types of parameter placeholders, and the name of a destination prepared-statement object (an empty string selects the unnamed prepared statement). The response is either ParseComplete or ErrorResponse. Parameter data types can be specified by OID; if not given, the parser attempts to infer the data types in the same way as it would do for untyped literal string constants.

A parameter data type can be left unspecified by setting it to zero, or by making the array of parameter type OIDs shorter than the number of parameter symbols ($n) used in the query string. Another special case is that a parameter's type can be specified as void (that is, the OID of the void pseudo-type). This is meant to allow parameter symbols to be used for function parameters that are actually OUT parameters. Ordinarily there is no context in which a void parameter could be used, but if such a parameter symbol appears in a function's parameter list, it is effectively ignored. For example, a function call such as foo($1,$2,$3,$4) could match a function with two IN and two OUT arguments, if $3 and $4 are specified as having type void.

bind

Once a prepared statement exists, it can be readied for execution using a Bind message. The Bind message gives the name of the source prepared statement (empty string denotes the unnamed prepared statement), the name of the destination portal (empty string denotes the unnamed portal), and the values to use for any parameter placeholders present in the prepared statement. The supplied parameter set must match those needed by the prepared statement. (If you declared any void parameters in the Parse message, pass NULL values for them in the Bind message.) Bind also specifies the format to use for any data returned by the query; the format can be specified overall, or per-column. The response is either BindComplete or ErrorResponse.

@greyscaled
Copy link
Contributor

greyscaled commented Feb 17, 2023

I think this one is still relevant despite our shift towards analytics, as for example I couldn't use sqlc generated queries that used placeholders.

I think it's feasible a customer will hit this if they're using glaredb within application code.

I know it's on hold, but I think this might be a near-term thing. It doesn't block alpha, but it is a known limitation for what kind of clients we can support.

@scsmithr
Copy link
Member Author

Definitely still relevant.

Datafusion 17 & 18 bring in more pieces that make this more tenable to work on in the short term. https://glaredb.slack.com/archives/C03UMSRFGP4/p1676652962924899

@greyscaled
Copy link
Contributor

Ran into this on cloud unable to use code generated with sqlc.

A lot more reasonable to do with DF 18

@scsmithr scsmithr self-assigned this Feb 22, 2023
scsmithr added a commit that referenced this issue Nov 20, 2024
* crate stub

* make bin

* router

* docker

* obviously

* omfg

* simple k8s yaml

* attach hybrid client

* cors, bump version

* start annotating serde

* compile, some skips

* handler stubs

* server session stub

* table bind list

* slt replacement vars

* sink

* clarify io traits

* FileProvider trait

* ok

* fixup! ok

* delete code

* delete more code

* parse copy to

* copy to wip

* some serialization

* make finalize asyncable, also wrestle with serde

* fixup! make finalize asyncable, also wrestle with serde

* fixup! make finalize asyncable, also wrestle with serde

* idk

* more copy to impl

* ok

* bump version

* compiles

* serde

* serializable planned scalar funcs

* deserialize scalar from state

* serialize planned aggs

* deserialize planned agg

* remove specialized table function

* serialize table functions

* impl filter for remaining types except list & struct, bump version

* fix indexing logic, bump version

* typetag link

* lint

* Add csv tests to ci
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants