apacheGH-38663: [C++] Add support for service-specific endpoint for S3 using `AWS_ENDPOINT_URL_S3` (apache#39160)

### Rationale for this change

See apache#38663

### What changes are included in this PR?

Set `endpoint_override` from the environment, preferring the service-specific endpoint URL (`AWS_ENDPOINT_URL_S3`) over the global endpoint URL (`AWS_ENDPOINT_URL`).
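
The precedence rule can be sketched in isolation. This is a minimal illustration under stated assumptions, not Arrow's implementation: `ReadEnv` and `ResolveS3Endpoint` are hypothetical helpers built on `std::getenv`, whereas the actual change calls `arrow::internal::GetEnvVar` inside `S3Options::FromUri`.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper (not part of Arrow): read an environment variable,
// returning std::nullopt when the variable is not set at all.
std::optional<std::string> ReadEnv(const char* name) {
  const char* value = std::getenv(name);
  if (value == nullptr) return std::nullopt;
  return std::string(value);
}

// Hypothetical helper (not part of Arrow): pick the endpoint override,
// trying the service-specific variable first and the global one second.
std::optional<std::string> ResolveS3Endpoint() {
  if (auto s3_endpoint = ReadEnv("AWS_ENDPOINT_URL_S3")) {
    return s3_endpoint;
  }
  return ReadEnv("AWS_ENDPOINT_URL");
}
```

In this sketch a variable that is set but empty still counts as set, so an empty `AWS_ENDPOINT_URL_S3` would shadow a non-empty `AWS_ENDPOINT_URL`. When neither variable is set, the function returns `std::nullopt` and the caller leaves `endpoint_override` untouched.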

### Are these changes tested?

Yes, with a unit test in `s3fs_test.cc`.
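
The test in the diff scopes each environment variable to a block with `EnvVarGuard`. A rough sketch of how such a guard can work is shown here; `ScopedEnvVar` is a hypothetical stand-in written for illustration, assumes a POSIX platform (`setenv`/`unsetenv`), and is not Arrow's `EnvVarGuard`.

```cpp
#include <cstdlib>
#include <optional>
#include <string>
#include <utility>

// Hypothetical RAII guard (POSIX only): sets an environment variable on
// construction and restores the previous state on destruction.
class ScopedEnvVar {
 public:
  ScopedEnvVar(std::string name, const std::string& value)
      : name_(std::move(name)) {
    // Remember the old value, if any, so it can be restored later.
    if (const char* old_value = std::getenv(name_.c_str())) {
      previous_ = std::string(old_value);
    }
    ::setenv(name_.c_str(), value.c_str(), /*overwrite=*/1);
  }

  ~ScopedEnvVar() {
    if (previous_.has_value()) {
      ::setenv(name_.c_str(), previous_->c_str(), /*overwrite=*/1);
    } else {
      ::unsetenv(name_.c_str());
    }
  }

  ScopedEnvVar(const ScopedEnvVar&) = delete;
  ScopedEnvVar& operator=(const ScopedEnvVar&) = delete;

 private:
  std::string name_;
  std::optional<std::string> previous_;
};
```

Scoping the variable this way keeps one test case from leaking its endpoint setting into the next, which matters when cases for `AWS_ENDPOINT_URL_S3` and `AWS_ENDPOINT_URL` sit side by side.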

### Are there any user-facing changes?

No

* Closes: apache#38663

Lead-authored-by: messense <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
2 people authored and dgreiss committed Feb 17, 2024
1 parent 3910e58 commit d274984
Showing 3 changed files with 22 additions and 3 deletions.
13 changes: 10 additions & 3 deletions cpp/src/arrow/filesystem/s3fs.cc
@@ -156,6 +156,7 @@ using internal::ToURLEncodedAwsString;

 static const char kSep = '/';
 constexpr char kAwsEndpointUrlEnvVar[] = "AWS_ENDPOINT_URL";
+constexpr char kAwsEndpointUrlS3EnvVar[] = "AWS_ENDPOINT_URL_S3";

 // -----------------------------------------------------------------------
 // S3ProxyOptions implementation
@@ -366,9 +367,15 @@ Result<S3Options> S3Options::FromUri(const Uri& uri, std::string* out_path) {
   } else {
     options.ConfigureDefaultCredentials();
   }
-  auto endpoint_env = arrow::internal::GetEnvVar(kAwsEndpointUrlEnvVar);
-  if (endpoint_env.ok()) {
-    options.endpoint_override = *endpoint_env;
+  // Prefer AWS service-specific endpoint url
+  auto s3_endpoint_env = arrow::internal::GetEnvVar(kAwsEndpointUrlS3EnvVar);
+  if (s3_endpoint_env.ok()) {
+    options.endpoint_override = *s3_endpoint_env;
+  } else {
+    auto endpoint_env = arrow::internal::GetEnvVar(kAwsEndpointUrlEnvVar);
+    if (endpoint_env.ok()) {
+      options.endpoint_override = *endpoint_env;
+    }
   }

   bool region_set = false;
5 changes: 5 additions & 0 deletions cpp/src/arrow/filesystem/s3fs_test.cc
@@ -303,6 +303,11 @@ TEST_F(S3OptionsTest, FromUri) {
   ASSERT_RAISES(Invalid, S3Options::FromUri("s3://mybucket/?xxx=zzz", &path));

   // Endpoint from environment variable
+  {
+    EnvVarGuard endpoint_guard("AWS_ENDPOINT_URL_S3", "http://127.0.0.1:9000");
+    ASSERT_OK_AND_ASSIGN(options, S3Options::FromUri("s3://mybucket/", &path));
+    ASSERT_EQ(options.endpoint_override, "http://127.0.0.1:9000");
+  }
   {
     EnvVarGuard endpoint_guard("AWS_ENDPOINT_URL", "http://127.0.0.1:9000");
     ASSERT_OK_AND_ASSIGN(options, S3Options::FromUri("s3://mybucket/", &path));
7 changes: 7 additions & 0 deletions docs/source/cpp/env_vars.rst
@@ -163,6 +163,13 @@ that changing their value later will have an effect.
 .. envvar:: AWS_ENDPOINT_URL

    Endpoint URL used for S3-like storage, for example Minio or s3.scality.
+   Alternatively, one can set :envvar:`AWS_ENDPOINT_URL_S3`.
+
+.. envvar:: AWS_ENDPOINT_URL_S3
+
+   Endpoint URL used for S3-like storage, for example Minio or s3.scality.
+   This takes precedence over :envvar:`AWS_ENDPOINT_URL` if both variables
+   are set.

 .. envvar:: GANDIVA_CACHE_SIZE
