[FlightSQL] Stateless prepared statement with parameter support #37720
Comments
I think this alludes to something that would make the current protocol hard to implement even if you could guarantee that requests are routed to a particular server: the server has no reliable mechanism to tell when it can clean up after a session. This is because, unlike the protocols on which FlightSQL appears to be based, gRPC is not a connection-oriented transport. In Postgres, for example, session state can be associated with a particular TCP connection; there is no equivalent concept for gRPC. Most (if not all) gRPC implementations do not expose connection-level hooks for this reason, as clients should be free to multiplex as many or as few "sessions" on the underlying HTTP/2 transport as they see fit, without this having any implications for the application protocol.
The ADBC driver has long supported parameter binding.
Most things rely on a client-side token already (transactions, prepared statements), I think parameters are the only exception.
This seems reasonable to me. I suppose you're planning to embed the parameters directly into the client-side handle? This is a little unfortunate since then you're bouncing through Protobuf, but I suppose most systems are only thinking about 'small' sets of parameters anyways (and this can always be optimized if needed).
CC @zeroshade @ywc88 to comment as well
I agree that this seems to be fairly low overhead, provided that the parameters are few and small (which isn't always the case, e.g. for a parameterized INSERT statement). I guess my only real concern is that we end up losing a lot of performance or will require externally preserved state to manage that handle (of course these are orthogonal to FlightSQL, as it would be application defined). I wouldn't want the protocol to encourage having to embed the parameters directly in the client-side handle, but I also don't see a better solution to this offhand.
I agree that the protocol should not require having the parameters in the handle and should not require a new handle to be sent in the response to In terms of the performance penalty of passing parameter values back and forth, as you say that will likely be an application-level tradeoff. For example, many systems have special (non-SQL) bulk ingest APIs that can avoid
So the semantics would be that the response may contain an updated handle and it is the responsibility of the client to check whether it is set, in which case the new handle must be used for future requests, or it is unset, in which case the client continues to use the existing handle?
That was my reading, yes. Whether the parameters are embedded there or not, the client would have no clue.
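To make the semantics discussed above concrete, here is a minimal client-side sketch. It is only an illustration: `BindResult`, `PreparedStatement`, and `apply_bind_result` are hypothetical names invented for this example, not part of any FlightSQL client library, and an unset field is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BindResult:
    """Hypothetical stand-in for the server's response to binding parameters."""

    # None models an unset field: the server did not issue a new handle.
    prepared_statement_handle: Optional[bytes] = None


class PreparedStatement:
    """Minimal client-side view of a prepared statement (illustrative only)."""

    def __init__(self, handle: bytes):
        self.handle = handle

    def apply_bind_result(self, result: BindResult) -> None:
        # Replace the handle only if the server supplied one; otherwise keep
        # using the existing handle for later requests.
        if result.prepared_statement_handle is not None:
            self.handle = result.prepared_statement_handle


stmt = PreparedStatement(b"handle-v1")
stmt.apply_bind_result(BindResult())              # unset: handle unchanged
unchanged = stmt.handle
stmt.apply_bind_result(BindResult(b"handle-v2"))  # set: handle replaced
updated = stmt.handle
```

The client treats the handle as opaque in both cases, so it cannot tell whether the server embedded the parameters in it.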
I agree that the proposed approach will work. I thought of some more approaches:
I like the idea of DoExchange |
I think this is a good idea as it reduces the number of required network roundtrips for some cases. However, I still think it is important that we change the existing Perhaps we can discuss adding a |
I asked about it on the original proposal but there didn't seem to be much interest. Though I am somewhat concerned about the number of special cases that might need to be implemented and how servers/clients will negotiate all this.
That said, let's file the ticket and discuss there.
Personally I do like the |
Filed #37741 to track the |
I believe @kallisti-dev is working on a specific proposal for this item.
apache/arrow-rs#5433 - client implementation for Rust
#40243 - documentation updates
I will submit a PR soon for the Go client and then look at server implementations.
…s prepared statements
Issue resolved by pull request 40311.
Thank you @lidavidm
…t result (apache#40311) ### Rationale for this change See discussion on apache#37720 and mailing list: https://lists.apache.org/thread/3kb82ypx99q96g84qv555l6x8r0bppyq ### What changes are included in this PR? Changes the Go FlightSQL client and server implementations to support returning an updated prepared statement handle to the client as part of the `DoPut(PreparedStatement)` RPC call. ### Are these changes tested? ### Are there any user-facing changes? See parent issue and docs PR apache#40243 for details of user facing changes. **This PR includes breaking changes to public APIs.** * GitHub Issue: apache#37720 Lead-authored-by: Adam Curtis <[email protected]> Co-authored-by: David Li <[email protected]> Signed-off-by: David Li <[email protected]>
…ents Part fixed caching of statementContext
Describe the enhancement requested
[email protected] mailing list thread: https://lists.apache.org/thread/f0xb61z4yw611rw0v8vf9rht0qtq8opc
Usecase
InfluxDB IOx / 3.0 would like to allow customers to create prepared SQL statements with parameters so they can send parameterized queries and parameter values to the server. Without this feature, they have to do the parameter substitution on the client side, which is subject to possible SQL injection attacks and (if they use a pre-existing library) may not have the same parameter typing as our SQL implementation.
Given that the JDBC driver doesn't yet support binding parameters to prepared statements (see #33961), I am not sure how widely used the parameter support is, but I think interest is growing. For example, apache/arrow-rs#4797 adds client-side support to the Rust implementation.
Background: Stateless services
A common design pattern in cloud services is that the request from a client can be handled by one of a number of identical backend servers as shown in the diagram below.
Subsequent requests may be processed by different backend servers. Any state needed to continue a session is sent to the client which passes it back in subsequent requests.
This design can be used to support features such as zero-downtime deployments and automatic workload-based scaling. It also has the nice property that there is no server-side state to clean up (via timeout or other mechanism).
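As a rough illustration of this pattern (not taken from IOx or any FlightSQL implementation), the sketch below has two interchangeable backends that keep no per-session state. The `Backend` class, the `start`/`advance` methods, and the JSON-plus-base64 token format are all assumptions made for this example.

```python
import base64
import json
from typing import Tuple


class Backend:
    """One of several identical servers; it keeps no per-session state."""

    def __init__(self, name: str):
        self.name = name

    def start(self, query: str) -> bytes:
        # Pack everything needed to continue into an opaque token for the client.
        state = {"query": query, "step": 0}
        return base64.urlsafe_b64encode(json.dumps(state).encode())

    def advance(self, token: bytes) -> Tuple[str, bytes]:
        # Any backend can resume, because the token carries the full state.
        state = json.loads(base64.urlsafe_b64decode(token))
        state["step"] += 1
        new_token = base64.urlsafe_b64encode(json.dumps(state).encode())
        message = f"{self.name} ran step {state['step']} of {state['query']!r}"
        return message, new_token


server_a, server_b = Backend("server-a"), Backend("server-b")
token = server_a.start("SELECT 1")
msg1, token = server_b.advance(token)  # a different server picks up the session
msg2, token = server_a.advance(token)
```

A production service would likely also sign or encrypt such a token so that clients cannot tamper with it; that is omitted here for brevity.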
Problem
As currently specified, I don't think we can implement FlightSQL prepared statements with parameters with such a stateless design.
In IOx, the `handle` returned from `ActionCreatePreparedStatementRequest` contains the original SQL query text among other things. Thus the subsequent call to `ActionPreparedStatementExecute` has access to the SQL query.
However, the `CommandPrepareStatementQuery` message to bind parameters does not return anything to the client that is sent to calls to `ActionPreparedStatementExecute`. Thus there is no way for the server that processes `ActionPreparedStatementExecute` to know the values of the parameters.
FlightSQL sequence Diagram
Here is the sequence diagram from https://arrow.apache.org/docs/format/FlightSql.html for reference
Strawman Proposal
One way to support a stateless implementation of prepared statements with bind parameters would be to extend the response returned from calling `DoPut` with `CommandPrepareStatementQuery` to include a new `CommandPrepareStatementQueryResponse`, similar to arrow/format/FlightSql.proto, lines 1782 to 1792 in 15a8ac3.
I think this would be a fairly low-overhead and easy extension. Existing clients that support bind parameters would require an update, but existing servers would not. Given that bind parameters are just starting to be used more, I think the overall ecosystem impact would be low.
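The flow this proposal enables could look roughly like the sketch below. It is a simulation under stated assumptions, not the FlightSQL wire protocol: `StatelessServer` and its methods are hypothetical, the handle is simply base64-encoded JSON, and embedding the parameters in the handle is one possible server choice rather than something the protocol would require.

```python
import base64
import json
from typing import Any, Dict, List


def _encode(state: Dict[str, Any]) -> bytes:
    return base64.urlsafe_b64encode(json.dumps(state).encode())


def _decode(handle: bytes) -> Dict[str, Any]:
    return json.loads(base64.urlsafe_b64decode(handle))


class StatelessServer:
    """Hypothetical server that stores nothing between requests."""

    def create_prepared_statement(self, sql: str) -> bytes:
        # The handle carries the SQL text, as described for IOx above.
        return _encode({"sql": sql, "params": None})

    def bind_parameters(self, handle: bytes, params: List[Any]) -> bytes:
        # The proposed extension: respond with an updated handle, here one
        # that also carries the bound parameter values.
        state = _decode(handle)
        state["params"] = params
        return _encode(state)

    def execute(self, handle: bytes) -> Dict[str, Any]:
        # A real server would plan and run the query; this just returns
        # what it recovered from the handle.
        state = _decode(handle)
        if state["params"] is None:
            raise ValueError("no parameters bound in this handle")
        return {"sql": state["sql"], "params": state["params"]}


# Each request may land on a different, identical server.
server_a, server_b, server_c = StatelessServer(), StatelessServer(), StatelessServer()
original = server_a.create_prepared_statement("SELECT * FROM t WHERE id = $1")
updated = server_b.bind_parameters(original, [42])
result = server_c.execute(updated)
```

If the client ignored the updated handle and executed with `original`, the executing server would have no way to recover the parameter values, which is the problem described above.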
See also
See this discussion for more context: https://github.com/apache/arrow-rs/pull/4797/files#r1319807938
Component(s)
FlightRPC