-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate S3 Select to use AWS Java SDK v2 #18270
Migrate S3 Select to use AWS Java SDK v2 #18270
Conversation
33010d7
to
358cb16
Compare
358cb16
to
a21b6e3
Compare
Hi, can anyone please review this? |
Can you rebase? Then I'll kick off the tests with secrets |
@@ -89,6 +90,7 @@ public void configure(Binder binder) | |||
.setDefault().toInstance(ImmutableList::of); | |||
|
|||
binder.bind(TrinoS3ClientFactory.class).in(Scopes.SINGLETON); | |||
binder.bind(S3SelectStats.class).toInstance(new S3SelectStats()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of this? The metrics aren't exposed anywhere. Do you intend to export these with JMX?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did intend that, I will go over this once again and make the metric system proper. Although, do we really need to collect these metrics? If not, we can remove this and avoid the complexity entirely.
@@ -82,4 +103,118 @@ public boolean isRequestComplete() | |||
{ | |||
return requestComplete; | |||
} | |||
|
|||
/** | |||
* Below classes are required for compatibility between AWS Java SDK 1.x and 2.x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does compatibility mean here? Since we're not supporting the 1.x SDK anymore, could we simplify the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SDKv2 does not return an InputStream
for the data like SDK v1. I think the Hadoop LineReader above this layer requires an InputStream. This glue code takes the event stream and converts it into an InputStream. Maybe in the new implementation, we can remove the need for an InputStream and directly use the event stream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be out of our hands, but it really feels like this is something the SDK should be able to provide for us. Getting object content back as an input stream seems like a common enough use case to have it be baked in. It's not like InputStreams are a Hadoop specific thing.
{ | ||
return ClientOverrideConfiguration.builder() | ||
.putAdvancedOption(SdkAdvancedClientOption.USER_AGENT_PREFIX, userAgentPrefix) | ||
.putAdvancedOption(SdkAdvancedClientOption.USER_AGENT_SUFFIX, enabled ? "Trino-select" : "Trino") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this code isn't needed and never was, since this client factory is only for S3 Select. If S3 Select is not enabled, then this factory won't be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense, will remove this code.
Closing since we removed support for S3 Select. |
Description
The PR aims to migrate AWS Java SDK usage in S3 Select to V2 from V1.
AWS SDK v2 is a rewrite of the V1 SDK, thus there is a lot of refactoring of the code.
As a part of this PR:
Additional context and related issues
Some more context around the PR can be found on the conversation at #17522
Release notes
(x) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: