S3 Ingestion from non-default endpoints #11798

Merged
merged 9 commits into from
Jul 15, 2022
Changes from 1 commit
@@ -22,6 +22,8 @@
import com.amazonaws.services.s3.S3ClientOptions;
import com.fasterxml.jackson.annotation.JsonProperty;

import java.util.Objects;

public class AWSClientConfig
{
  @JsonProperty
@@ -55,4 +57,37 @@ public boolean isForceGlobalBucketAccessEnabled()
  {
    return forceGlobalBucketAccessEnabled;
  }

  @Override
  public String toString()
  {
    return "AWSClientConfig{" +
           "protocol='" + protocol + '\'' +
           ", disableChunkedEncoding=" + disableChunkedEncoding +
           ", enablePathStyleAccess=" + enablePathStyleAccess +
           ", forceGlobalBucketAccessEnabled=" + forceGlobalBucketAccessEnabled +
           '}';
  }

  @Override
  public boolean equals(Object o)
  {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    AWSClientConfig that = (AWSClientConfig) o;
    return disableChunkedEncoding == that.disableChunkedEncoding
           && enablePathStyleAccess == that.enablePathStyleAccess
           && forceGlobalBucketAccessEnabled == that.forceGlobalBucketAccessEnabled
           && Objects.equals(protocol, that.protocol);
  }

  @Override
  public int hashCode()
  {
    return Objects.hash(protocol, disableChunkedEncoding, enablePathStyleAccess, forceGlobalBucketAccessEnabled);
  }
}
@@ -22,6 +22,7 @@
import com.fasterxml.jackson.annotation.JsonProperty;

import javax.annotation.Nullable;
import java.util.Objects;

public class AWSEndpointConfig
{
@@ -44,4 +45,32 @@ public String getSigningRegion()
  {
    return signingRegion;
  }

  @Override
  public String toString()
  {
    return "AWSEndpointConfig{" +
           "url='" + url + '\'' +
           ", signingRegion='" + signingRegion + '\'' +
           '}';
  }

  @Override
  public boolean equals(Object o)
  {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    AWSEndpointConfig that = (AWSEndpointConfig) o;
    return Objects.equals(url, that.url) && Objects.equals(signingRegion, that.signingRegion);
  }

  @Override
  public int hashCode()
  {
    return Objects.hash(url, signingRegion);
  }
}
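
The `equals`/`hashCode` pair in this hunk gives the config value semantics: two instances with the same `url` and `signingRegion` compare equal and hash alike. The following is a minimal sketch of what that enables, using a hypothetical stand-in class `EndpointConfigSketch` (an assumption for illustration, not the Druid class) with the same two fields and a hypothetical helper `countDistinct`:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for illustration only; not the Druid AWSEndpointConfig class.
final class EndpointConfigSketch
{
  private final String url;
  private final String signingRegion;

  EndpointConfigSketch(String url, String signingRegion)
  {
    this.url = url;
    this.signingRegion = signingRegion;
  }

  @Override
  public boolean equals(Object o)
  {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    EndpointConfigSketch that = (EndpointConfigSketch) o;
    // Objects.equals is null-safe, so unset (null) fields compare cleanly.
    return Objects.equals(url, that.url) && Objects.equals(signingRegion, that.signingRegion);
  }

  @Override
  public int hashCode()
  {
    return Objects.hash(url, signingRegion);
  }

  // Counts distinct configs; correct only because equals and hashCode agree.
  static int countDistinct(EndpointConfigSketch... configs)
  {
    Set<EndpointConfigSketch> distinct = new HashSet<>();
    for (EndpointConfigSketch config : configs) {
      distinct.add(config);
    }
    return distinct.size();
  }
}
```

Without the overrides, each instance would be equal only to itself, so comparisons of otherwise identical configs (for example in a serialization round-trip check) would fail.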
@@ -21,6 +21,8 @@

import com.fasterxml.jackson.annotation.JsonProperty;

import java.util.Objects;

public class AWSProxyConfig
{
  @JsonProperty
@@ -54,4 +56,37 @@ public String getPassword()
  {
    return password;
  }

  @Override
  public String toString()
  {
    return "AWSProxyConfig{" +
           "host='" + host + '\'' +
           ", port=" + port +
           ", username='" + username + '\'' +
           ", password='" + password + '\'' +

Review comment (Contributor): Minor nit: should we include the password in `toString` here?

Reply (Contributor Author): Good catch, removed it.

           '}';
  }

  @Override
  public boolean equals(Object o)
  {
    if (this == o) {
      return true;
    }
    if (o == null || getClass() != o.getClass()) {
      return false;
    }
    AWSProxyConfig that = (AWSProxyConfig) o;
    return port == that.port
           && Objects.equals(host, that.host)
           && Objects.equals(username, that.username)
           && Objects.equals(password, that.password);
  }

  @Override
  public int hashCode()
  {
    return Objects.hash(host, port, username, password);
  }
}
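
The review thread on `toString` concerns the proxy password ending up in logs. The later commit that removed it is not shown in this view, so the following is only a sketch of one way such a `toString` could look, using a hypothetical stand-in class `ProxyConfigSketch` (an assumption for illustration, not the Druid class):

```java
// Hypothetical stand-in for illustration only; not the Druid AWSProxyConfig class.
final class ProxyConfigSketch
{
  private final String host;
  private final int port;
  private final String username;
  private final String password;

  ProxyConfigSketch(String host, int port, String username, String password)
  {
    this.host = host;
    this.port = port;
    this.username = username;
    this.password = password;
  }

  String getPassword()
  {
    return password;
  }

  @Override
  public String toString()
  {
    // The password is deliberately left out so it cannot leak into logs.
    return "ProxyConfigSketch{" +
           "host='" + host + '\'' +
           ", port=" + port +
           ", username='" + username + '\'' +
           '}';
  }
}
```

The password stays available through its getter for building the client; it is only omitted from the string form.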
42 changes: 42 additions & 0 deletions docs/ingestion/native-batch.md
@@ -943,13 +943,55 @@ Sample specs:
},
...
```
```json
...
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "s3",
        "uris": ["s3://foo/bar/file.json", "s3://bar/foo/file2.json"],
        "endpointConfig": {
          "url" : "s3-store.aws.com",
          "signingRegion" : "us-west-2"
        },
        "clientConfig": {
          "protocol" : "http",
          "disableChunkedEncoding" : true,
          "enablePathStyleAccess" : true,
          "forceGlobalBucketAccessEnabled" : false
        },
        "proxyConfig": {
          "host" : "proxy-s3.aws.com",
          "port" : 8888,
          "username" : "admin",
          "password" : "admin"
        },
        "properties": {
          "accessKeyId": "KLJ78979SDFdS2",
          "secretAccessKey": "KLS89s98sKJHKJKJH8721lljkd",
          "assumeRoleArn": "arn:aws:iam::2981002874992:role/role-s3"
        }
      },
      "inputFormat": {
        "type": "json"
      },
      ...
    },
...
```

Review comment (Contributor, on the `accessKeyId` line): Replace with xxxx or something?

Reply (Contributor Author): Thanks for taking a look. These are fake credentials.


|property|description|default|required?|
|--------|-----------|-------|---------|
|type|This should be `s3`.|None|yes|
|uris|JSON array of URIs where S3 objects to be ingested are located.|None|`uris` or `prefixes` or `objects` must be set|
|prefixes|JSON array of URI prefixes for the locations of S3 objects to be ingested. Empty objects starting with one of the given prefixes will be skipped.|None|`uris` or `prefixes` or `objects` must be set|
|objects|JSON array of S3 Objects to be ingested.|None|`uris` or `prefixes` or `objects` must be set|
|endpointConfig|Config for overriding the default S3 endpoint and signing region. This allows ingesting data from a different S3 store. See below for more information.|None|No (defaults will be used if not given)|
|clientConfig|S3 client properties for the overridden S3 endpoint. This is used in conjunction with `endpointConfig`. See [S3 config](../development/extensions-core/s3.md#connecting-to-s3-configuration) for more information.|None|No (defaults will be used if not given)|
|proxyConfig|Properties for specifying proxy information for the overridden S3 endpoint. This is used in conjunction with `clientConfig`. See [S3 config](../development/extensions-core/s3.md#connecting-to-s3-configuration) for more information.|None|No (defaults will be used if not given)|
|properties|Properties object for overriding the default S3 configuration. See below for more information.|None|No (defaults will be used if not given)|

Note that the S3 input source will skip all empty objects only when `prefixes` is specified.
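
The rule in the note can be sketched as a small filter. This is not Druid's implementation; the class `EmptyObjectFilterSketch`, the type `ObjectSummary`, and the method `selectForIngestion` are hypothetical names introduced only to illustrate the stated behavior: zero-byte objects are dropped when the listing came from `prefixes`, and kept otherwise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not Druid's S3 input source code.
final class EmptyObjectFilterSketch
{
  static final class ObjectSummary
  {
    final String key;
    final long sizeBytes;

    ObjectSummary(String key, long sizeBytes)
    {
      this.key = key;
      this.sizeBytes = sizeBytes;
    }
  }

  // Returns the objects to ingest. Empty objects are skipped only when the
  // listing came from `prefixes`; objects named via `uris` or `objects` are kept.
  static List<ObjectSummary> selectForIngestion(List<ObjectSummary> listed, boolean fromPrefixes)
  {
    List<ObjectSummary> selected = new ArrayList<>();
    for (ObjectSummary object : listed) {
      if (fromPrefixes && object.sizeBytes == 0) {
        continue;
      }
      selected.add(object);
    }
    return selected;
  }
}
```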