GH-33475: [Java] Add parameter binding for Prepared Statements in JDBC driver #38404
I refactored this quite a bit to put Arrow <-> Avatica conversions in the same class for each Arrow type. This should help ensure we do this correctly for all Arrow types. Currently, testing is pretty basic and we're missing conversions for complex types, but I think it's a pretty good starting point to implement the rest. I'm not sure why the one test failed and can't figure out how to re-run it.
@lidavidm @jduo Hey all, I've spent a lot of time on this PR. It's a little frustrating that a PR that started well after mine has merged and now I have to deal with merge conflicts while mine still hasn't been thoroughly reviewed. We need this for one of our partner integrations and there's still been no movement on this. Could I at least get some eyes on this?
@aiguofer sorry about that - I'll give this a review over the weekend once GitHub is working again.
Hi @aiguofer. It looks good. I didn't have additional comments.
None of my comments are critical - if you want to just file an issue to track them that would be fine
We currently naively cast the TypedValue values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types.
We can file a followup for this too.
We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed.
Ditto.
Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tests.
I would be in favor of trying to set up integration suites with systems that depend on the driver, in the same way we have integration tests with Spark and Pandas (two other projects that heavily depend on Arrow). But this is also something we can defer.
 *
 * @param vector FieldVector that the parameter should be bound to.
 * @param typedValue TypedValue to bind as a parameter.
 * @param index Vector index that the TypedValue should be bound to.
0-index, right? (I ask because JDBC is often 1-indexed)
Correct! That would be the index on the Arrow vector. I'll add that to the docstring to make it clearer.
@Override
public boolean bindParameter(FieldVector vector, TypedValue typedValue, int index) {
  byte[] value = (byte[]) typedValue.toJdbc(null);
Do we need to handle value == null? Or in general how does Avatica handle null bind parameters?
Ohhh good catch. Avatica doesn't really do anything special for null values so it just returns null.
For example:
public Object toJdbc(Calendar calendar) {
  return this.value == null ? null : serialToJdbc(this.type, this.componentType, this.value, calendar);
}
I guess we need to call setNull if the value is null?
Yeah - do you want to handle that here, or do you want to file a follow-up for now?
Actually, let's just get this in before something else conflicts with it
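The null case raised in this thread could be handled along the following lines. This is only a sketch: `SketchBinaryVector` and `NullAwareBinder` are hypothetical stand-ins written so the example runs without Arrow or Avatica on the classpath, and the `setNull` call mirrors the idea floated above rather than the code that was merged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an Arrow binary vector; NOT the real Arrow API.
final class SketchBinaryVector {
  private final List<byte[]> slots = new ArrayList<>();

  void set(int index, byte[] value) {
    grow(index);
    slots.set(index, value.clone());
  }

  void setNull(int index) {
    grow(index);
    slots.set(index, null);
  }

  boolean isNull(int index) {
    return slots.get(index) == null;
  }

  byte[] get(int index) {
    return slots.get(index);
  }

  // Pad with nulls so that `index` is a valid slot.
  private void grow(int index) {
    while (slots.size() <= index) {
      slots.add(null);
    }
  }
}

// Hypothetical binder: checks for null before casting the JDBC value.
final class NullAwareBinder {
  static boolean bindBinary(SketchBinaryVector vector, Object jdbcValue, int index) {
    if (jdbcValue == null) {
      // Avatica hands back null for a null parameter, so mark the slot as null.
      vector.setNull(index);
      return true;
    }
    if (!(jdbcValue instanceof byte[])) {
      return false; // assumption: the caller reports unsupported values
    }
    vector.set(index, (byte[]) jdbcValue);
    return true;
  }
}
```

Checking for null before the cast keeps the null decision in one place; whether that check belongs in each converter or in the shared caller is a design choice this thread leaves open.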
@Override
public boolean bindParameter(FieldVector vector, TypedValue typedValue, int index) {
  // FIXME: how should we handle TZ? Do we need to convert the value to the TZ on the vector?
The timezone on the vector is purely for display; the underlying value is always UTC in Arrow.
cool that's what I was thinking. At this point Avatica has already converted to UTC so that's perfect.
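To illustrate the point about time zones, here is a small sketch using only `java.time`. It assumes a millisecond-precision timestamp; `TimestampZoneSketch` and its methods are hypothetical names, not part of the driver. The stored number depends only on the instant, while the zone changes only how that instant is rendered.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper: models "value is UTC-based, zone is display metadata".
final class TimestampZoneSketch {
  // The value that would be written to the vector: milliseconds since the UTC epoch.
  static long storedValue(ZonedDateTime value) {
    return value.toInstant().toEpochMilli();
  }

  // How the same stored value reads on a wall clock in a given zone.
  static LocalDateTime display(long epochMillis, String zone) {
    return Instant.ofEpochMilli(epochMillis).atZone(ZoneId.of(zone)).toLocalDateTime();
  }
}
```

Two `ZonedDateTime` values that denote the same instant in different zones produce the same `storedValue`, which is why no conversion to the vector's zone is needed before binding.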
private void bind(FieldVector vector, TypedValue typedValue, int index) {
  try {
    if (!vector.getField().getType().accept(new BinderVisitor(vector, typedValue, index))) {
      throw new RuntimeException(
nit: is there a more appropriate exception than RuntimeException?
I was thinking some type of NotImplementedException would be ideal but I couldn't find an exception like that. Any thoughts on an alternative? UnsupportedOperationException seems to imply that it will not be supported, which might not be what we want.
IMO UnsupportedOperationException is fine for this sort of thing
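A minimal sketch of the suggestion, assuming a binder that returns `false` when it cannot handle a type. `BindOrThrowSketch`, its `Binder` interface, and the message wording are hypothetical and only stand in for the visitor-based code quoted above.

```java
// Hypothetical sketch: swap the generic RuntimeException for UnsupportedOperationException.
final class BindOrThrowSketch {
  // Stand-in for a per-type binder; returns false when the type is not handled.
  interface Binder {
    boolean bind(Object value);
  }

  static void bind(String arrowTypeName, Object value, Binder binder) {
    if (!binder.bind(value)) {
      // "not yet" in the message signals that support may be added later.
      throw new UnsupportedOperationException(
          String.format("Binding to vector type %s is not yet supported", arrowTypeName));
    }
  }
}
```

Wording the message as "not yet supported" is one way to address the concern that the exception name alone suggests a permanent limitation.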
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit fc8c6b7. There was 1 benchmark result with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.
…in JDBC driver (apache#38404)

This PR is a combination of apache#33961 and apache#14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver.

An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of an `AvaticaParameterConverter` interface for each Arrow type. This interface provides two methods:
- createParameter: Create an `AvaticaParameter` from the given Arrow `Field`.
- bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index.

This PR purposely leaves out a few features:
- We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types.
- We currently don't provide conversions for complex types such as List, Map, Struct, Union, Interval, and Duration. The stubs are there so they can be implemented as needed.
- Tests for specific types have not been implemented. I'm not very familiar with a lot of these JDBC types so it's hard to implement rigorous tests.

* Closes: apache#33475
* Closes: apache#35536

Authored-by: Diego Fernandez <[email protected]>
Signed-off-by: David Li <[email protected]>
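The two-method converter contract described in this commit message can be pictured roughly as follows. This is a sketch under stated assumptions: every `Sketch*` type is a hypothetical stand-in for the Arrow and Avatica classes so the example compiles on its own, and returning `false` on a type mismatch is an illustrative choice, not necessarily what the driver does.

```java
// Hypothetical stand-ins for Arrow's Field/FieldVector and Avatica's AvaticaParameter.
final class SketchField {
  final String name;
  final String arrowType;

  SketchField(String name, String arrowType) {
    this.name = name;
    this.arrowType = arrowType;
  }
}

final class SketchParameter {
  final String name;
  final String sqlTypeName;

  SketchParameter(String name, String sqlTypeName) {
    this.name = name;
    this.sqlTypeName = sqlTypeName;
  }
}

final class SketchVector {
  final Object[] slots;

  SketchVector(int capacity) {
    this.slots = new Object[capacity];
  }
}

// One converter per Arrow type, covering both directions of the mapping.
interface SketchParameterConverter {
  // Arrow -> JDBC side: describe the parameter for the given field.
  SketchParameter createParameter(SketchField field);

  // JDBC -> Arrow side: write the user-supplied value at a 0-based vector index.
  boolean bindParameter(SketchVector vector, Object value, int index);
}

// Example converter for a 64-bit integer field.
final class SketchBigIntConverter implements SketchParameterConverter {
  @Override
  public SketchParameter createParameter(SketchField field) {
    return new SketchParameter(field.name, "BIGINT");
  }

  @Override
  public boolean bindParameter(SketchVector vector, Object value, int index) {
    if (!(value instanceof Long)) {
      return false; // assumption: mismatched types are reported to the caller
    }
    vector.slots[index] = value;
    return true;
  }
}
```

Keeping both directions in one class per Arrow type means adding support for a new type touches a single file, which matches the motivation given for the refactor earlier in the conversation.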
This PR is a combination of #33961 and #14627. The goal is to support parametrized queries through the Arrow Flight SQL JDBC driver.

An Arrow Flight SQL server returns a Schema for the `PreparedStatement` parameters. The driver then converts the `Field` list associated with the Schema into a list of `AvaticaParameter`. When the user sets values for the parameters, Avatica generates a list of `TypedValue`, which we then bind to each parameter vector. This conversion between Arrow and Avatica is handled by implementations of an `AvaticaParameterConverter` interface for each Arrow type. This interface provides two methods:
- createParameter: Create an `AvaticaParameter` from the given Arrow `Field`.
- bindParameter: Cast the given `TypedValue` and bind it to the `FieldVector` at the specified index.

This PR purposely leaves out a few features:
- We currently naively cast the `TypedValue` values assuming users set the type correctly. If this cast fails, we raise an exception letting the user know that the cast is not supported. This could be improved in subsequent PRs to do smarter conversions from other types.