Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/cloud/azure: Support specifying Azure environments in storage URLs #80511

Conversation

nlowe-sx
Copy link
Contributor

The Azure Storage cloud provider learned a new parameter, AZURE_ENVIRONMENT,
which specifies which azure environment the storage account in question
belongs to. This allows cockroach to backup and restore data to Azure
Storage Accounts outside the main Azure Public Cloud. For backwards
compatibility, this defaults to "AzurePublicCloud" if AZURE_ENVIRONMENT
is not specified.

Fixes #47163

Verification Evidence

I spun up a single node cluster:

nlowe@nlowe-z4l:~/projects/github/cockroachdb/cockroach [feat/47163-azure-storage-support-multiple-environments L|✚ 2] [🗓  2022-04-22 08:25:49]
$ bazel run //pkg/cmd/cockroach:cockroach -- start-single-node --insecure
WARNING: Option 'host_javabase' is deprecated
WARNING: Option 'javabase' is deprecated
WARNING: Option 'host_java_toolchain' is deprecated
WARNING: Option 'java_toolchain' is deprecated
INFO: Invocation ID: 11504a98-f767-413a-8994-8f92793c2ecf
INFO: Analyzed target //pkg/cmd/cockroach:cockroach (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //pkg/cmd/cockroach:cockroach up-to-date:
  _bazel/bin/pkg/cmd/cockroach/cockroach_/cockroach
INFO: Elapsed time: 0.358s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
INFO: Build completed successfully, 1 total action
*
* WARNING: ALL SECURITY CONTROLS HAVE BEEN DISABLED!
*
* This mode is intended for non-production testing only.
*
* In this mode:
* - Your cluster is open to any client that can access any of your IP addresses.
* - Intruders with access to your machine or network can observe client-server traffic.
* - Intruders can log in without password and read or write any data in the cluster.
* - Intruders can consume all your server's resources and cause unavailability.
*
*
* INFO: To start a secure server without mandating TLS for clients,
* consider --accept-sql-without-tls instead. For other options, see:
*
* - https://go.crdb.dev/issue-v/53404/dev
* - https://www.cockroachlabs.com/docs/dev/secure-a-cluster.html
*
*
* WARNING: neither --listen-addr nor --advertise-addr was specified.
* The server will advertise "nlowe-z4l" to other nodes, is this routable?
*
* Consider using:
* - for local-only servers:  --listen-addr=localhost
* - for multi-node clusters: --advertise-addr=<host/IP addr>
*
*
CockroachDB node starting at 2022-04-22 15:25:55.461315977 +0000 UTC (took 2.1s)
build:               CCL unknown @  (go1.17.6)
webui:               http://nlowe-z4l:8080/
sql:                 postgresql://root@nlowe-z4l:26257/defaultdb?sslmode=disable
sql (JDBC):          jdbc:postgresql://nlowe-z4l:26257/defaultdb?sslmode=disable&user=root
RPC client flags:    /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach <client cmd> --host=nlowe-z4l:26257 --insecure
logs:                /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/logs
temp dir:            /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/cockroach-temp4100501952
external I/O path:   /home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data/extern
store[0]:            path=/home/nlowe/.cache/bazel/_bazel_nlowe/cf6ed4d0d14c8e474a5c30d572846d8a/execroot/cockroach/bazel-out/k8-fastbuild/bin/pkg/cmd/cockroach/cockroach_/cockroach.runfiles/cockroach/cockroach-data
storage engine:      pebble
clusterID:           bb3942d7-f241-4d26-aa4a-1bd0d6556e4d
status:              initialized new cluster
nodeID:              1

I was then able to view the contents of a backup hosted in an azure
government storage account:

root@:26257/defaultdb> SELECT DISTINCT object_name FROM [SHOW BACKUP 'azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ACCOUNT_KEY=***&AZURE_ENVIRONMENT=AzureUSGovernmentCloud'] WHERE object_type = 'database';
               object_name
------------------------------------------
  example_database
  ...
(17 rows)
 
Time: 5.859632889s

Omitting the AZURE_ENVIRONMENT parameter, we can see cockroach
defaults to the public cloud where my storage account does not exist:

root@:26257/defaultdb> SELECT DISTINCT object_name FROM [SHOW BACKUP 'azure://container/path/to/backup?AZURE_ACCOUNT_NAME=account&AZURE_ACCOUNT_KEY=***'] WHERE object_type = 'database';
ERROR: reading previous backup layers: unable to list files for specified blob: Get "https://account.blob.core.windows.net/container?comp=list&delimiter=path%2Fto%2Fbackup&restype=container&timeout=61": dial tcp: lookup account.blob.core.windows.net on 8.8.8.8:53: no such host

Tests

Two new tests are added to verify that the storage account URL is correctly
built from the provided Azure Environment name, and that the Environment
defaults to the Public Cloud if unspecified for backwards compatibility. I
verified the existing tests pass against a government storage account after
specifying AZURE_ENVIRONMENT as AzureUSGovernmentCloud in the backup URL
query parameters:

nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:38:26]
$ export AZURE_ACCOUNT_NAME=account
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:38:42]
$ export AZURE_ACCOUNT_KEY=***
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:39:25]
$ export AZURE_CONTAINER=container
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:39:48]
$ export AZURE_ENVIRONMENT=AzureUSGovernmentCloud
nlowe@nlowe-mbp:~/projects/github/cockroachdb/cockroachdb [feat/47163-azure-storage-support-multiple-environments| …3] [🗓  2022-04-22 17:40:15]
$ bazel test --test_output=streamed --test_arg=-test.v --action_env=AZURE_ACCOUNT_NAME --action_env=AZURE_ACCOUNT_KEY --action_env=AZURE_CONTAINER --action_env=AZURE_ENVIRONMENT //pkg/cloud/azure:azure_test
INFO: Invocation ID: aa88a942-f3c7-4df6-bade-8f5f0e18041f
WARNING: Streamed test output requested. All tests will be run locally, without sharding, one at a time
INFO: Build option --action_env has changed, discarding analysis cache.
INFO: Analyzed target //pkg/cloud/azure:azure_test (468 packages loaded, 16382 targets configured).
INFO: Found 1 test target...
initialized metamorphic constant "span-reuse-rate" with value 28
=== RUN   TestAzure
=== RUN   TestAzure/simple_round_trip
=== RUN   TestAzure/exceeds-4mb-chunk
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#00
    cloud_test_helpers.go:226: read 3345 of file at 4778744
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#01
    cloud_test_helpers.go:226: read 7228 of file at 226589
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#02
    cloud_test_helpers.go:226: read 634 of file at 256284
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#03
    cloud_test_helpers.go:226: read 7546 of file at 3546208
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#04
    cloud_test_helpers.go:226: read 24123 of file at 4821795
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#05
    cloud_test_helpers.go:226: read 16899 of file at 403428
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#06
    cloud_test_helpers.go:226: read 29467 of file at 4886370
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#07
    cloud_test_helpers.go:226: read 11700 of file at 1876920
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#08
    cloud_test_helpers.go:226: read 2928 of file at 489781
=== RUN   TestAzure/exceeds-4mb-chunk/rand-readats/#09
    cloud_test_helpers.go:226: read 19933 of file at 1483342
=== RUN   TestAzure/read-single-file-by-uri
=== RUN   TestAzure/write-single-file-by-uri
=== RUN   TestAzure/file-does-not-exist
=== RUN   TestAzure/List
=== RUN   TestAzure/List/root
=== RUN   TestAzure/List/file-slash-numbers-slash
=== RUN   TestAzure/List/root-slash
=== RUN   TestAzure/List/file
=== RUN   TestAzure/List/file-slash
=== RUN   TestAzure/List/slash-f
=== RUN   TestAzure/List/nothing
=== RUN   TestAzure/List/delim-slash-file-slash
=== RUN   TestAzure/List/delim-data
--- PASS: TestAzure (34.81s)
    --- PASS: TestAzure/simple_round_trip (9.66s)
    --- PASS: TestAzure/exceeds-4mb-chunk (16.45s)
        --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats (6.41s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#00 (0.15s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#01 (0.64s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#02 (0.65s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#03 (0.60s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#04 (0.75s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#05 (0.80s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#06 (0.75s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#07 (0.65s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#08 (0.65s)
            --- PASS: TestAzure/exceeds-4mb-chunk/rand-readats/#09 (0.77s)
    --- PASS: TestAzure/read-single-file-by-uri (0.60s)
    --- PASS: TestAzure/write-single-file-by-uri (0.60s)
    --- PASS: TestAzure/file-does-not-exist (1.05s)
    --- PASS: TestAzure/List (2.40s)
        --- PASS: TestAzure/List/root (0.30s)
        --- PASS: TestAzure/List/file-slash-numbers-slash (0.30s)
        --- PASS: TestAzure/List/root-slash (0.30s)
        --- PASS: TestAzure/List/file (0.30s)
        --- PASS: TestAzure/List/file-slash (0.30s)
        --- PASS: TestAzure/List/slash-f (0.30s)
        --- PASS: TestAzure/List/nothing (0.15s)
        --- PASS: TestAzure/List/delim-slash-file-slash (0.15s)
        --- PASS: TestAzure/List/delim-data (0.30s)
=== RUN   TestAntagonisticAzureRead
--- PASS: TestAntagonisticAzureRead (103.90s)
=== RUN   TestParseAzureURL
=== RUN   TestParseAzureURL/Defaults_to_Public_Cloud_when_AZURE_ENVIRONEMNT_unset
=== RUN   TestParseAzureURL/Can_Override_AZURE_ENVIRONMENT
--- PASS: TestParseAzureURL (0.00s)
    --- PASS: TestParseAzureURL/Defaults_to_Public_Cloud_when_AZURE_ENVIRONEMNT_unset (0.00s)
    --- PASS: TestParseAzureURL/Can_Override_AZURE_ENVIRONMENT (0.00s)
=== RUN   TestMakeAzureStorageURLFromEnvironment
=== RUN   TestMakeAzureStorageURLFromEnvironment/AzurePublicCloud
=== RUN   TestMakeAzureStorageURLFromEnvironment/AzureUSGovernmentCloud
--- PASS: TestMakeAzureStorageURLFromEnvironment (0.00s)
    --- PASS: TestMakeAzureStorageURLFromEnvironment/AzurePublicCloud (0.00s)
    --- PASS: TestMakeAzureStorageURLFromEnvironment/AzureUSGovernmentCloud (0.00s)
PASS
Target //pkg/cloud/azure:azure_test up-to-date:
  _bazel/bin/pkg/cloud/azure/azure_test_/azure_test
INFO: Elapsed time: 159.865s, Critical Path: 152.35s
INFO: 66 processes: 2 internal, 64 darwin-sandbox.
INFO: Build completed successfully, 66 total actions
//pkg/cloud/azure:azure_test                                             PASSED in 139.9s
 
INFO: Build completed successfully, 66 total actions

@nlowe-sx nlowe-sx requested a review from a team as a code owner April 25, 2022 20:47
@nlowe-sx nlowe-sx requested review from a team and stevendanna and removed request for a team April 25, 2022 20:47
@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Apr 25, 2022

CLA assistant check
All committers have signed the CLA.

@blathers-crl
Copy link

blathers-crl bot commented Apr 25, 2022

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

Before a member of our team reviews your PR, I have some potential action items for you:

  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Apr 25, 2022
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@irfansharif irfansharif requested a review from a team April 25, 2022 20:54
@blathers-crl
Copy link

blathers-crl bot commented Apr 25, 2022

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@nlowe-sx nlowe-sx force-pushed the feat/47163-azure-storage-support-multiple-environments branch from ef46a67 to 918c7a9 Compare April 25, 2022 22:47
@blathers-crl
Copy link

blathers-crl bot commented Apr 25, 2022

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

The Azure Storage cloud provider learned a new parameter, AZURE_ENVIRONMENT,
which specifies which azure environment the storage account in question
belongs to. This allows cockroach to backup and restore data to Azure
Storage Accounts outside the main Azure Public Cloud. For backwards
compatibility, this defaults to "AzurePublicCloud" if AZURE_ENVIRONMENT
is not specified.

Fixes cockroachdb#47163

Release note (general): When using Azure Cloud Storage for data operations,
cockroach now calculates the Storage Account URL from the provided
AZURE_ENVIRONMENT query parameter. This defaults to AzurePublicCloud if not
specified to maintain backwards compatibility.
@nlowe-sx nlowe-sx force-pushed the feat/47163-azure-storage-support-multiple-environments branch from 918c7a9 to 4b97370 Compare April 26, 2022 22:57
@tbg tbg removed the request for review from a team April 27, 2022 06:09
Copy link
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nlowe-sx thank you for this contribution and the thorough testing, this looks good to me! I'll merge it for you.

@adityamaru
Copy link
Contributor

bors r=adityamaru

@craig
Copy link
Contributor

craig bot commented Apr 29, 2022

Build succeeded:

@nlowe-sx
Copy link
Contributor Author

@adityamaru Thanks for the review!

Is it too late in the 22.1 release process to get this included?

Also, what's the chance of getting this back-ported to 21.2? We'd love to start making use of this change but won't be upgrading to 22.1 for at least 6 months due to our upgrade policy, but we are planning on upgrading to 21.2 in a few weeks.

@adityamaru
Copy link
Contributor

Is it too late in the 22.1 release process to get this included?

We are discussing this internally as it is quite late in our release process to include this in our 22.1.0 GA. We might consider backporting it to a 22.1 dot release instead.

what's the chance of getting this back-ported to 21.2

We're discussing this as well 😋 Given that the change is pretty well understood, I think it's possible. I'll open a backport and will update you with our decision early next week!

@adityamaru
Copy link
Contributor

blathers backport 21.2

@adityamaru
Copy link
Contributor

blathers backport 22.1

@dt
Copy link
Member

dt commented May 2, 2022

backports sound fine to me as along as the release note on the backport is pretty strongly worded about avoiding usage in the mixed version / upgrade state (i.e. since any non-upgraded nodes will still send requests to the default URI, i.e. public cloud).

@adityamaru
Copy link
Contributor

@nlowe-sx just to close the loop here, we did backport the changes to 21.2 and the future 22.1.1. You should be able to pick this up in 21.2.11 which should be out in a couple of weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IMPORT PGDUMP cannot import from Azure Blob Storage in Government
4 participants