Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[datalake] More Authorization policies #673

Merged
merged 13 commits into from
Mar 17, 2022
Merged

Conversation

roeap
Copy link
Contributor

@roeap roeap commented Mar 2, 2022

This PR adds more authorization options to the datalake crate. Specifically using a token credential.

In many scenarios having auto refresh available for tokens is quite handy. Thus I added an auto refreshing token credential, which wraps around token credentials from the core crate. To get the refresh to work, I had to use the tokio Mutex - I believe this is required due to the refresh being an async operation and the regular mutex cannot be used across an await statement. But I may also just have missed the simpler solution. Due to that I put the implementation behind a feature gate, since I seem to remember reading pulling in a specific runtime as dependency was not desired.

I still need to add tests and clean up a bit, but would appreciate some feedback on the general approach - spcifically the use of tokio.

@thovoll @rylev @ctaggart

Update: usage of futures::lock::Mutext solves the Tokio issue (thanks @bmc-msft)

@roeap roeap changed the title [datalake] More Authrization policies [datalake] More Authorization policies Mar 2, 2022
@bmc-msft
Copy link
Contributor

bmc-msft commented Mar 3, 2022

Are you able to use futures::lock::Mutex for a runtime agnostic Mutex implementation?

@roeap
Copy link
Contributor Author

roeap commented Mar 3, 2022

Indeed, that works :) Thanks for the advice!

of course the next question follows suit... when returning an error inside the "refresh loop" I could not get it to work to return a Box'ed error due to lifetime constraints. The new errors using an ErrorKind and message would work fine, but this would require adjusting the return type of the TokenCredential.get_token function. Also I was hesitant to introduce a new variant to the "old" errors, and changing the signature would have a much bigger impact. Making the errors clone-able could also work I guess.

So again asking for some guidance.

Right now an HeaderNotFound erro is returned since it only needs a string as input, but that obviously is more misleading then helpful :).

@cataggar cataggar requested a review from yoshuawuyts March 3, 2022 16:16
@thovoll thovoll requested a review from rylev March 8, 2022 16:18
@yoshuawuyts
Copy link
Contributor

Also I was hesitant to introduce a new variant to the "old" errors

I discussed this with @thovoll today, and we think this would be the best option. There is no "right" answer here, but this would allow us to start incrementally migrating, the way we've been doing with other types as well.

Also we on reading the code we realized using Mutex might cause contention between readers; instead using an async RwLock might make reads cheaper.

Hope that's helpful!

@roeap
Copy link
Contributor Author

roeap commented Mar 9, 2022

Thanks @yoshuawuyts, @thovoll for the support. The code now uses RwLock from the newly introduced async_lock crate (did not find one in futures), and also added a new variant to the error enum.

The last remaining concerns was testing - locally I gave the credential a try and it seemed to work. Looked a bit around the identity crate and found that specific credentials are not tested there as well - not sure how to proceed with this.

Otherwise I hope we are approaching a merge-able state :).

@roeap roeap marked this pull request as ready for review March 9, 2022 05:39
@yoshuawuyts
Copy link
Contributor

I'm not super familiar with how we're testing things; perhaps @cataggar can answer that question?

@@ -42,6 +42,8 @@ pub enum Error {
Json(#[source] serde_json::Error),
#[error("Other error")]
Other(#[source] Box<dyn std::error::Error + Send + Sync + 'static>),
#[error("Other error")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better description?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better description need here

}
}

impl AutoRefreshingTokenCredential {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go in azure_core?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will eventually. I know @roeap wanted to do it in stages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree :). If we can move this there now, great. Similarly, the pipeline authorization policies should likely end up at least in azure_storage, or if they are useful in other crates as well maybe also core?

If this is OK, I am happy to move this to core in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the AutoRefreshingCredential into the identity crate to the other TokenCredentials. Was considering to also move the policies to the storage_core crate, but thought it better to do this when migrating blob to the pipeline architecture.

@thovoll
Copy link
Contributor

thovoll commented Mar 11, 2022

I'm not super familiar with how we're testing things; perhaps @cataggar can answer that question?

@roeap and I discussed the testing plan and came up with these incremental steps:

  1. Add an example (this PR)
  2. Add an e2e record/playback test (maybe another PR)
  3. Add a deterministic test for the loop with the RwLock (definitely another PR)

@roeap
Copy link
Contributor Author

roeap commented Mar 11, 2022

Moved the refreshing credential in the identity crate and updated the client constructors to also offer an option for using a token credential. With respect to client initialization I tried to keep changes minimal. Main reason being that from what I gather discussion is still ongoing on what the specifics of client configuration should look like.

*guard = Some(res);
}
Some(Err(err)) => {
return Err(Error::AuthorizationPolicy(err.to_string()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err.to_string does not feel right

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping @rylev has better advice for a solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. my understanding was that this is sort of a stop gap solution, until we are migrating to the new error structure. Given that authorization is quite pervasive throughout all crates, maybe it a good idea to do that rather sooner then later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested above changing the error to take a boxed error trait object which would clean this up a bit.

@@ -42,6 +42,8 @@ pub enum Error {
Json(#[source] serde_json::Error),
#[error("Other error")]
Other(#[source] Box<dyn std::error::Error + Send + Sync + 'static>),
#[error("Other error")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better description need here

Copy link
Contributor

@rylev rylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments but largely this looks good.

Comment on lines 45 to 46
#[error("Authorization policy error")]
AuthorizationPolicy(String),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[error("Authorization policy error")]
AuthorizationPolicy(String),
#[error("authorization policy error: {0}")]
AuthorizationPolicy(Box<dyn std::error::Error + Send + Sync + 'static>),

Nit: can you also move this above the Other variant so that Other is last?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched the variants, but was not able to work with the boxed error. See comment below.

@@ -19,6 +19,10 @@ impl From<crate::errors::Error> for Error {
crate::errors::Error::GetToken(e) => Error::new(ErrorKind::Credential, e),
crate::errors::Error::HttpPrepare(e) => e.into(),
crate::errors::Error::Other(e) => Error::new(ErrorKind::Other, e),
crate::errors::Error::AuthorizationPolicy(msg) => Error::with_message(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If AuthorizationPolicy takes a boxed error trait object, you can just use Error::new instead of Error::with_message which doesn't really add much value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that when working with the guard from RwLock, we need to get a reference (guard.as_ref()) , which in turn ties the lifetime of the error to the TokenCredential itself, which then would need to be constrained by the 'static lifetime.

At least that's more or less what i gathered from the compiler errors, I get when trying to pass a box'ed error into the variant. Tried around different variations of de-referencing the error, but with no success. Somehow these lifetimes still manage to escape my grasp every so often :).

*guard = Some(res);
}
Some(Err(err)) => {
return Err(Error::AuthorizationPolicy(err.to_string()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested above changing the error to take a boxed error trait object which would clean this up a bit.

sdk/storage_datalake/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@rylev rylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working hard on this! I still don't love the way that errors are being handled, but since we'll do a pass on error handling when we want to introduce the new error handling style, I think this is fine to merge.

@rylev rylev dismissed cataggar’s stale review March 17, 2022 15:46

Dismissing this review since it was asking for the same things I was asking for.

@rylev rylev merged commit 0ed3e65 into Azure:main Mar 17, 2022
@roeap
Copy link
Contributor Author

roeap commented Mar 17, 2022

Agreed with the errors, but was not able to get the box thing to work..

@roeap roeap deleted the authorization branch March 17, 2022 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants