feat: integration authz headers in Dirac client #87

Merged
merged 4 commits into from
Sep 18, 2023

Contributor

@aldbr aldbr commented Sep 12, 2023

This PR enhances the autorest Dirac client by encapsulating configuration options such as the endpoint and the authorization headers. The main goal is to simplify the code that CLI developers have to write.

  • writing a new CLI command currently:
async with Dirac(endpoint=get_diracx_preferences().url) as api:
    jobs = await api.jobs.search(
        parameters=None if all else parameter,
        search=condition if condition else None,
        headers=get_auth_headers(),
    )
  • with this PR:
async with Dirac() as api:
    jobs = await api.jobs.search(
        parameters=None if all else parameter,
        search=condition if condition else None,
    )

Note: the Dirac client is able to auto refresh the access token.
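
To illustrate what an automatic refresh could look like, here is a minimal sketch. It assumes cached credentials stored as JSON with `access_token` and `expires_on` keys (both appear in the diff snippets of this PR); the 60-second margin, the function names, and the `refresh` callback are hypothetical and are not taken from the DiracX code.

```python
import json
import time
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical safety margin: refresh slightly before the token really expires.
EXPIRY_MARGIN_SECONDS = 60


def needs_refresh(
    expires_on: float,
    now: Optional[float] = None,
    margin: float = EXPIRY_MARGIN_SECONDS,
) -> bool:
    """Return True when the token is expired or about to expire."""
    current = time.time() if now is None else now
    return expires_on - current <= margin


def get_access_token(path: Path, refresh: Callable[[Dict], Dict]) -> str:
    """Load cached credentials and refresh them first if necessary.

    `refresh` is a hypothetical callback that exchanges the cached
    credentials for new ones (for example with a refresh token).
    """
    credentials = json.loads(path.read_text())
    if needs_refresh(credentials["expires_on"]):
        credentials = refresh(credentials)
        path.write_text(json.dumps(credentials))
    return credentials["access_token"]
```

With this shape, the caller never handles expiry explicitly: every request simply asks for the current access token.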

Chosen solution

The solution is based on the azure.core architecture.
We redefine 3 classes:

  • Dirac: to encapsulate the endpoint and an authentication policy
  • AsyncTokenCredential: provide OAuth tokens
  • AsyncBearerTokenCredentialPolicy: an authentication policy which aims at adding the authorization headers to the requests

This solution was chosen after thorough research and analysis, as it was deemed (by myself) the simplest way to incorporate authorization headers into requests.
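
The way the three pieces fit together can be sketched with standard-library stand-ins. This is not the azure.core API: `AccessToken`, `TokenCredential`, `StaticCredential`, `Request`, `BearerTokenPolicy`, and `Client` below are simplified, hypothetical equivalents that only show the division of responsibilities (the credential provides tokens, the policy adds the header, the client owns the endpoint and the policy).

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, NamedTuple, Protocol


class AccessToken(NamedTuple):
    token: str
    expires_on: int


class TokenCredential(Protocol):
    """Anything that can provide an access token."""

    async def get_token(self) -> AccessToken: ...


class StaticCredential:
    """Hypothetical credential that always returns the same token."""

    def __init__(self, token: str) -> None:
        self._token = token

    async def get_token(self) -> AccessToken:
        return AccessToken(self._token, 0)


@dataclass
class Request:
    url: str
    headers: Dict[str, str] = field(default_factory=dict)


class BearerTokenPolicy:
    """Adds the Authorization header to each outgoing request."""

    def __init__(self, credential: TokenCredential) -> None:
        self._credential = credential

    async def on_request(self, request: Request) -> None:
        token = await self._credential.get_token()
        request.headers["Authorization"] = f"Bearer {token.token}"


class Client:
    """Encapsulates the endpoint and the authentication policy."""

    def __init__(self, endpoint: str, policy: BearerTokenPolicy) -> None:
        self._endpoint = endpoint
        self._policy = policy

    async def send(self, path: str) -> Request:
        request = Request(url=f"{self._endpoint}{path}")
        await self._policy.on_request(request)
        return request  # a real client would now send it over HTTP


client = Client("https://example.invalid", BearerTokenPolicy(StaticCredential("abc")))
prepared = asyncio.run(client.send("/api/jobs/search"))
```

Because the policy runs on every request, callers of the client no longer pass `headers=` themselves.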

Considered alternatives

azure-identity

The python autorest documentation recommends using a credential type from the azure identity package to initialize the client.

DeviceCodeCredential seems suited to our use case. It can obtain a token using the device_code flow and cache it; the token is then refreshed automatically when needed.

Problem: it is tightly coupled with Azure applications. DeviceCodeCredential inherits from InteractiveCredential, which inherits from MsalCredential. There are mandatory non-standard parameters such as tenant_id. Setting it to "" or None does not help. The only option would have been to create new classes inheriting from the mentioned classes, which would have been inefficient.

AzureAD authentication library

microsoft-authentication-library-for-python seems to provide a generic OAuth2 library, but it does not automatically refresh access tokens. Furthermore, it would probably be tricky to integrate into the Dirac autorest client.

authlib in combination with azure.core

authlib would help manage tokens and can even refresh access tokens automatically.
However, it would be of limited use in our context, since the Dirac client is initialized each time a new command starts.

@aldbr aldbr mentioned this pull request Sep 12, 2023
4 tasks
@chrisburr
Member

I think the general approach looks good and like the right way to go 👍

I'll do a specific review once you figure out the CI.

@aldbr aldbr force-pushed the main_FEAT_autorestAuth branch 3 times, most recently from 533540a to adb2ddd Compare September 12, 2023 15:12
@aldbr
Contributor Author

aldbr commented Sep 12, 2023

mypy was tough but probably fair.

Here are the changes I performed to make mypy happy:

  • solved src/diracx/cli/__init__.py:32: error: "Dirac" has no attribute "login"
    • renamed the patched Dirac client as DiracClient
    • copied Dirac().__aenter__() in DiracClient()
  • solved src/diracx/client/aio/_patch.py:51: error: Signature of "get_token" incompatible with supertype "AsyncTokenCredential" [override]
    • copied AsyncTokenCredential().close()/__aenter__()/__aexit__() in DiracAsyncTokenCredential()
    • added suggested signature to DiracAsyncTokenCredential().__aexit__() with ... as default values

It is just for info in case there are better solutions.

The issue related to DIRAC Integration tests is expected, I am going to make a PR once we are okay with the structure of this one.

Update: mypy now complains about the redefinition of the DiracBearerTokenCredentialPolicy()._token(), but I don't see how I could do this differently since I can't modify self._token.expires_on, which is a read-only attribute.
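
For context, the read-only behaviour can be reproduced with a stand-in. The sketch below assumes the token object behaves like a `typing.NamedTuple` with `token` and `expires_on` fields; the `AccessToken` class here is a local stand-in, not the azure.core one. Fields of a named tuple cannot be assigned, so the usual workaround is to build a new instance with `_replace` and rebind the attribute that holds it.

```python
from typing import NamedTuple


class AccessToken(NamedTuple):
    """Local stand-in for an immutable token object."""

    token: str
    expires_on: int


original = AccessToken("abc", 1_700_000_000)

# Assigning to a NamedTuple field fails at runtime.
try:
    original.expires_on = 0  # type: ignore[misc]
except AttributeError:
    mutated = False
else:
    mutated = True

# Instead, create a new tuple with the updated field.
updated = original._replace(expires_on=original.expires_on + 3600)
```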

@aldbr aldbr force-pushed the main_FEAT_autorestAuth branch 4 times, most recently from 1550268 to 5669464 Compare September 13, 2023 07:51
] # Add all objects you want publicly available to users at this package level


CREDENTIALS_PATH = Path.home() / ".cache" / "diracx" / "credentials.json"
Contributor

These settings probably should be in one of:

  • diracx.cli.__init__.py because the CLI is the primary user for that location
  • diracx.client.__init__.py because anybody using the client (cli, legacy dirac, tasks) may potentially use it
  • diracx.core.__init__.py because it really is such a crucial value that it could be here

Member

I vote for the last one (and maybe naming it DEFAULT_CREDENTIALS_PATH)

Contributor

I had a chat with @aldbr, and what could make even more sense is to put it in the Preferences.

Contributor Author

I put it in the Preferences

credentials.get("access_token"), credentials.get("expires_on")
)

async def close(self) -> None:
Contributor

Why do you need to redefine all the methods below?

Contributor Author

I added a docstring to explain, but basically because AsyncTokenCredential is a Python protocol and we need to provide an implementation of the methods.
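
As a generic illustration of that point: a `typing.Protocol` only declares method signatures, so a class that should satisfy it has to supply real bodies for each method. The names below (`AsyncCredentialLike`, `FileCredential`) are hypothetical and the signatures are simplified; they are not the azure.core definitions.

```python
import asyncio
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class AsyncCredentialLike(Protocol):
    """Declares the methods a credential must offer; no behaviour."""

    async def get_token(self, *scopes: str, **kwargs: Any) -> str: ...
    async def close(self) -> None: ...
    async def __aenter__(self) -> "AsyncCredentialLike": ...
    async def __aexit__(
        self, exc_type: Any = None, exc_value: Any = None, traceback: Any = None
    ) -> None: ...


class FileCredential:
    """Satisfies the protocol structurally, without inheriting from it."""

    def __init__(self, token: str) -> None:
        self._token = token
        self.closed = False

    async def get_token(self, *scopes: str, **kwargs: Any) -> str:
        return self._token

    async def close(self) -> None:
        self.closed = True

    async def __aenter__(self) -> "FileCredential":
        return self

    async def __aexit__(
        self, exc_type: Any = None, exc_value: Any = None, traceback: Any = None
    ) -> None:
        await self.close()


async def demo() -> str:
    async with FileCredential("abc") as credential:
        return await credential.get_token()
```

Type checkers such as mypy compare the implementing class against the protocol method by method, which is why each signature has to match.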

"""

def __init__(self, **kwargs: Any) -> None:
endpoint = get_diracx_preferences().url
Contributor

It should still be possible to overwrite this and the client id
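
One possible shape for this, as a sketch only: keep the preferences as the default source, but let explicit arguments take precedence. `Preferences`, `get_preferences`, `ClientConfig`, and the example values are hypothetical stand-ins, not the DiracX implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Preferences:
    """Hypothetical stand-in for the object returned by get_diracx_preferences()."""

    url: str = "https://diracx.example.invalid"
    client_id: str = "example-client-id"


def get_preferences() -> Preferences:
    return Preferences()


class ClientConfig:
    """Resolves endpoint and client id, preferring explicit arguments."""

    def __init__(
        self,
        endpoint: Optional[str] = None,
        client_id: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        preferences = get_preferences()
        # Fall back to the preferences only when no value is passed.
        self.endpoint = endpoint or preferences.url
        self.client_id = client_id or preferences.client_id
        self.extra = kwargs


default_config = ClientConfig()
custom_config = ClientConfig(endpoint="https://other.example.invalid", client_id="cli")
```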

@aldbr
Contributor Author

aldbr commented Sep 15, 2023

The issue with the integration tests is related to the normal (non-aio) Dirac client, which is used by DIRAC, and which is actually not implemented here.
The easiest option would be to almost duplicate the content of the aio Dirac client in diracx/client/_patch.py.
I can see if there is an easy way to avoid duplicating too much code, I will work on that asap.

@chrisburr
Member

To get the CI green again I've pushed a workaround to main, this commit should be reverted: 2c70a67

@aldbr aldbr force-pushed the main_FEAT_autorestAuth branch 3 times, most recently from ed8214a to 85ecd9b Compare September 18, 2023 08:21
@aldbr aldbr force-pushed the main_FEAT_autorestAuth branch 2 times, most recently from adf2704 to f731204 Compare September 18, 2023 12:02
@aldbr aldbr force-pushed the main_FEAT_autorestAuth branch from f731204 to 2740864 Compare September 18, 2023 12:09
@aldbr
Contributor Author

aldbr commented Sep 18, 2023

I'm adding a comment to explain the latest changes because it is becoming hard to review and I am sorry for that:

  • I added a sync DiracClient version in src/diracx/client/_patch.py. I created a few methods to factor out code used by both the sync and async clients.
  • I had to make another PR on DIRAC to take into account a few changes I've made here: [8.1] fix: use write_credentials() from diracx DIRAC#7205
    • While doing this, I took the opportunity to remove the state parameter from the TokenResponse pydantic model because it is not part of the OAuth2 RFC and it is currently not used. I had to regenerate the client to make mypy happy but it generated a large number of changes in the client files.
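
The idea of sharing code between the sync and async clients can be sketched as follows: logic that performs no I/O lives in a plain function, and each policy only differs in how it is called. `auth_headers`, `SyncPolicy`, and `AsyncPolicy` are hypothetical names used for illustration.

```python
import asyncio
from typing import Dict


def auth_headers(access_token: str) -> Dict[str, str]:
    """Pure helper shared by both clients: no I/O, so no async needed."""
    return {"Authorization": f"Bearer {access_token}"}


class SyncPolicy:
    def __init__(self, access_token: str) -> None:
        self._access_token = access_token

    def on_request(self, headers: Dict[str, str]) -> None:
        headers.update(auth_headers(self._access_token))


class AsyncPolicy:
    def __init__(self, access_token: str) -> None:
        self._access_token = access_token

    async def on_request(self, headers: Dict[str, str]) -> None:
        headers.update(auth_headers(self._access_token))


sync_headers: Dict[str, str] = {}
SyncPolicy("abc").on_request(sync_headers)

async_headers: Dict[str, str] = {}
asyncio.run(AsyncPolicy("abc").on_request(async_headers))
```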

@chaen chaen merged commit 390b18f into DIRACGrid:main Sep 18, 2023
7 checks passed