Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add authorization options for azure storage backend #486

Merged
merged 21 commits into from
Nov 16, 2021

Conversation

roeap
Copy link
Collaborator

@roeap roeap commented Nov 1, 2021

Description

This PR extends the available options to authorize the azure storage backend against a storage account.

  • add AZURE_STORAGE_CONNECTION_STRING
  • Fall back on DefaultCredential if no other option is available
  • AZURE_STORAGE_ACCOUNT is still required for all bur connection string.
  • use latest azure crates

The handling of the container client, specifically handling client updates on token expiry is heavily inspired by how the rusoto crate managed credentials internally.

Related Issue(s)

the somewhat awkward implementation in list_objs may be related to Azure/azure-sdk-for-rust#377, but I lack expereince with the lifetime handling in rust, so I may just have missed something trivial.

While technically not being worked on, if we decide to adopt the test setup using azurite, the currently ignored azure test could be removed, which may be considered a fix for #59 as well.

Documentation

@roeap
Copy link
Collaborator Author

roeap commented Nov 1, 2021

@houqp , @fvaleye I tried implementing more storage options which turned out to be a bit more involved then expected. Since this includes many thins fairly new to be - especially in the rust world - I'd be very happy to get some feedback if this is going in the right direction. Specifically when it comes to handling the refresh of the container client.

@houqp
Copy link
Member

houqp commented Nov 2, 2021

@roeap I am not very familiar with azure sdk, so I will need bit more time to read through the code this weekend.

@houqp
Copy link
Member

houqp commented Nov 5, 2021

cc @thovoll since he is the expert of Azure rust binding :)

@thovoll
Copy link
Contributor

thovoll commented Nov 12, 2021

@roeap please update PR description based on removing Azurite, thanks!

@thovoll
Copy link
Contributor

thovoll commented Nov 12, 2021

@roeap to what degree can we use the current testing approach for this new code?

build/setup_azurite.sh Outdated Show resolved Hide resolved
docker-compose.yml Outdated Show resolved Hide resolved
Copy link
Contributor

@thovoll thovoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, thanks for tackling this! I'd like to see the remaining Azurite code removed and the existing test modified, if possible.

@roeap
Copy link
Collaborator Author

roeap commented Nov 12, 2021

I removed the azurite testing code and had a look into how tests are currently done. To me it seems that the account being used may actually not even exist any more. As is the test is ignored, and when trying to ruin it, I get dns not found. Maybe @rtyler can tell us if that account still exists, and if so, if it should still be used.

Beyond that there are some challenges left to actually test the core of this PR. The "static" authorization schemes are actually quite similar. However the most interesting part - authenticating with a service principal - would not be tested. What I can say is that it worked locally using a work account, but that does not help much ... @thovoll any ideas how to test this scenario as well?

@thovoll
Copy link
Contributor

thovoll commented Nov 12, 2021

Thanks @roeap. Ultimately, an approach like the mock testing framework that is being used by the Azure Rust SDK would make sense for integration tests. Until then we are stuck with e2e testing using real Azure resources.

I found the same DNS issue when running the ignored test last night and pinged @rtyler as well, but instead of resurrecting this, I propose we use the same approach that is used in the Azure Rust SDK where the (e2e) tests have to be run manually by developers on their local machine, using their own Azure resources (requires setting env variables to make the various auth schemes work). In CI, the integration tests are just cargo checked (the integration tests using the mock testing framework are actually executed in CI). PR submitters would have to attest to the fact that they ran the e2e tests locally, like you did above. This is not ideal, but a incremental improvement nonetheless.

@thovoll
Copy link
Contributor

thovoll commented Nov 12, 2021

BTW, testing Azure Managed Identity auth is currently being discussed over in the Azure Rust SDK here.

@thovoll
Copy link
Contributor

thovoll commented Nov 12, 2021

@roeap please also update the PR description to remove the Azurite references. Thanks!

@houqp
Copy link
Member

houqp commented Nov 15, 2021

The change looks good to me overall, agree with @thovoll on testing :)

@roeap
Copy link
Collaborator Author

roeap commented Nov 15, 2021

Thanks for the feedback. Just for me to understand, in terms of testing to idea would now be to mirgate some of the S3 tests, mark them with ignored, and maybe write a few sentences for devs to understand how to do e2e testing locally. In subsequent PRs we would then gradually improve testing to eventually have it run in CI - ideally using the pipelines architecture from the azure sdk repo? @thovoll @houqp

@thovoll
Copy link
Contributor

thovoll commented Nov 15, 2021

I think we would leave the S3 tests alone, since they can work with the localstack emulator.

The Azure test would be run locally using your own Azure Storage account, I think you have already done this correct?

If you're up for it as part of this PR, you could modify this to not hard code those 3 values and document how to run this test locally (something light, just saying that "the following 3 env variables have to be set").

@thovoll
Copy link
Contributor

thovoll commented Nov 15, 2021

And yes, if we have consensus we can eventually add a mock testing framework for the Azure tests so we can run them in CI without needing an emulator or real Azure resources.

@roeap roeap changed the title WIP: Add authorization options for azure storage backend Add authorization options for azure storage backend Nov 15, 2021
@roeap roeap marked this pull request as ready for review November 15, 2021 16:33
@roeap
Copy link
Collaborator Author

roeap commented Nov 15, 2021

@thovoll @houqp - i updated the azure_test file and included a description how to setup and execute tests locally.

@thovoll
Copy link
Contributor

thovoll commented Nov 15, 2021

Thanks @roeap, looks good! I think this is good to merge. @houqp

@houqp
Copy link
Member

houqp commented Nov 16, 2021

awesome, @roeap do you mind resolving the conflicts in Cargo.lock so we can merge it?

@roeap
Copy link
Collaborator Author

roeap commented Nov 16, 2021

Thanks @houqp @thovoll for your support/reviews. The conflicts should be resolved now.

@houqp houqp enabled auto-merge (squash) November 16, 2021 07:23
Copy link
Member

@houqp houqp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@houqp houqp merged commit 7e5aaa7 into delta-io:main Nov 16, 2021
@roeap roeap deleted the azure-auth branch November 16, 2021 07:27
@houqp
Copy link
Member

houqp commented Nov 16, 2021

Thank you @roeap for the hard work and @thovoll for the review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants