-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-33475: [Java][FlightRPC] Send prepared statement parameters #33961
Conversation
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
or
In the case of PARQUET issues on JIRA the title also supports:
See also: |
Oh cool, you worked this out. I had half a PR but never finished it. |
return false; | ||
} | ||
|
||
static int toJdbcType(ArrowType.ArrowTypeID type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am uncertain where these methods should go.
case Duration: | ||
return Timestamp.class.getCanonicalName(); | ||
} | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FlightSql schema are quite the superset of JDBC, so the mapping of deeply nested structs and such should probably be a subsequent PR?
* limitations under the License. | ||
*/ | ||
|
||
package org.apache.arrow.vector.table; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to delete this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Temporarily yes. I could not get it to compile and will put it back before I un-draft.
try(PreparedStatement ps = con.prepareStatement("select * from person where id=$1")) { | ||
ParameterMetaData md = ps.getParameterMetaData(); | ||
assertEquals(1, md.getParameterCount()); | ||
assertEquals("Int", md.getParameterTypeName(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These assertions pass against our server implementation, which is part #1 of this PR.
ParameterMetaData md = ps.getParameterMetaData(); | ||
assertEquals(1, md.getParameterCount()); | ||
assertEquals("Int", md.getParameterTypeName(1)); | ||
ps.setInt(1, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not seem to affect parameterBindingRoot
so the parameters are not sent back to the server.
I am a bit lost and would appreciate some help following the flow of how those parameters should be sent and why they are not going through at present.
In general you can look at the arrow-jdbc module for existing mapping between Arrow and JDBC types, as you've noted, not all types are easily handled |
@@ -1,856 +0,0 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll put this back before I un-draft the PR. I could not get it to compile locally, so I deleted it for now.
@@ -133,6 +126,28 @@ public void testShouldConnectWhenProvidedWithValidUrl() throws Exception { | |||
} | |||
} | |||
|
|||
@Test | |||
public void testParameters() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy to receive pointers on how to properly unit test this, since we will probably need to remove this integration test before submitting as a non-draft PR.
I also happen to be working on support for parameterized statements in InfluxDB IOx (WIP at https://github.com/influxdata/influxdb_iox/pull/6790) and I had just hit this. I plan to try and test / review this PR tomorrow |
note there was a prior PR in the issue description, I don't know if that answers any of your questions here |
FieldVector v = new | ||
} | ||
VectorSchemaRoot parameters = new VectorSchemaRoot(); | ||
this.preparedStatement.setParameters(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the missing piece. If we can translate the AvaticaParameter
s into a VectorSchemaRoot
the server will get what it needs.
Previous PR #14627 |
} | ||
} | ||
VectorSchemaRoot parameters = new VectorSchemaRoot(fields); | ||
this.preparedStatement.setParameters(parameters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step #2 of this PR seems complete: now we can both:
- Get the schema from the server
- Set the parameter values to pass to the server
Latest update: everything is working (if not a little ugly) Unfortunately it's working against a proprietary server, so I'll begin the work to put it into the example server in Then I could use some help getting this PR ready for merge. |
The prior PR uses a mocked server for unit testing - we can do that here, right? |
Yes, I'll look into it. Happy to pull over those changes from your PR if that's the easiest way. |
bdf71ae
to
6277f58
Compare
@avantgardnerio was there a problem with the PR? |
Oh, you're right - I hadn't looked at this branch in so long I thought it was stale, but I do think this needs to go in. |
Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍 |
Will prepared statements be supported anytime in the near future? |
Contributions are welcome. I'm not aware of any active development on the JDBC driver. |
This will be a priority for our team within the next few months. If no one has done it by then I will likely take this up. |
@aiguofer what I have in this branch basically works, I think it just needs to cleanup and tests to merge. Sorry I don't have time to help right now... |
Sweet that's what I'm hoping for. I'd likely strip out all the metrics stuff since it's unrelated, but I'm hoping between this PR and @lidavidm's original PR that it should be pretty easy to put together. We have some other higher priorities right now but in a month or so I might be able to get to this (or if I somehow find myself extremely bored on a weekend before then haha). |
…C driver (#38404) This PR is a combination of #33961 and #14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver. An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of a `AvaticaParameterConverter` interface for each Arrow type. This interface which provides 2 methods: - createParameter: Create an `AvaticaParameter` from the given Arrow `Field`. - bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index. This PR purposely leaves out a few features: - We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types. - We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed. - Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tets. * Closes: #33475 * Closes: #35536 Authored-by: Diego Fernandez <[email protected]> Signed-off-by: David Li <[email protected]>
…in JDBC driver (apache#38404) This PR is a combination of apache#33961 and apache#14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver. An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of a `AvaticaParameterConverter` interface for each Arrow type. This interface which provides 2 methods: - createParameter: Create an `AvaticaParameter` from the given Arrow `Field`. - bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index. This PR purposely leaves out a few features: - We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types. - We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed. - Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tets. * Closes: apache#33475 * Closes: apache#35536 Authored-by: Diego Fernandez <[email protected]> Signed-off-by: David Li <[email protected]>
…in JDBC driver (apache#38404) This PR is a combination of apache#33961 and apache#14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver. An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of a `AvaticaParameterConverter` interface for each Arrow type. This interface which provides 2 methods: - createParameter: Create an `AvaticaParameter` from the given Arrow `Field`. - bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index. This PR purposely leaves out a few features: - We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types. - We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed. - Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tets. * Closes: apache#33475 * Closes: apache#35536 Authored-by: Diego Fernandez <[email protected]> Signed-off-by: David Li <[email protected]>
Rationale for this change
Presently, the
flight-sql-jdbc-driver
does not send prepared statement parameter values to the server. This prevents Arrow FlightSql server implementors for receiving these values and users from being able to use prepared statements properly. This causes some significant issues, in our case not being able to performance test our database with TPC-C.Will resolve: #33475
What changes are included in this PR?
Are these changes tested?
Currently, only by an integration test against our proprietary, unreleased database.
Are there any user-facing changes?
Parameterized queries should work when this PR is completed.