Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds an OAuthIntegration #52

Merged
merged 14 commits into from
Feb 27, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions src/posit/connect/client.py
Original file line number Diff line number Diff line change
@@ -1,15 +1,24 @@
from __future__ import annotations

import os
from requests import Response, Session
from typing import Optional

from . import hooks, urls

from .auth import Auth
from .config import Config
from .oauth import OAuthIntegration
from .users import Users


# Connect sets the value of the environment variable RSTUDIO_PRODUCT = CONNECT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭 it's 2024, we should start setting POSIT_PRODUCT in the env too, yeah?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the presence of CONNECT_SERVER instead and sidestep branding issues?

# when content is running on a Connect server. Use this var to determine if the
# client SDK was initialized from a piece of content running on a Connect server.
def is_local() -> bool:
return not os.getenv("RSTUDIO_PRODUCT") == "CONNECT"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is only being used src/posit/connect/external/databricks.py.

Should it be moved there?

This could also go in Config if there is reuse. But I'd rename it to something like run_mode, runtime_environment, etc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently yes it's only used by the databricks module. I could see this being useful elsewhere though. No preference from me on where this lives

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 Place it where it's used for now.



class Client:
def __init__(
self,
Expand Down Expand Up @@ -37,13 +46,20 @@ def __init__(

# Place to cache the server settings
self.server_settings = None
self._oauth = None

@property
def connect_version(self):
if self.server_settings is None:
self.server_settings = self.get("server_settings").json()
return self.server_settings["version"]

@property
def oauth(self) -> OAuthIntegration:
if self._oauth is None:
self._oauth = OAuthIntegration(config=self.config, session=self.session)
return self._oauth
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT there's nothing in OAuthIntegration that needs to be cached, so we just need to be able to access it here, as we do (now) for Users below.

Suggested change
if self._oauth is None:
self._oauth = OAuthIntegration(config=self.config, session=self.session)
return self._oauth
return OAuthIntegration(config=self.config, session=self.session)


@property
def users(self) -> Users:
return Users(config=self.config, session=self.session)
Expand Down
1 change: 0 additions & 1 deletion src/posit/connect/client_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@

from unittest.mock import MagicMock, patch


from .client import Client


Expand Down
Empty file.
60 changes: 60 additions & 0 deletions src/posit/connect/external/databricks.py
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does usage of this file look like? I think the implementation can be reduced, but I'm not convinced since it seems dependent on the consuming code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's our example from last weeks demo.

with Client() as connect_client:
credentials_provider = viewer_credentials_provider(
connect_client.oauth, content_identity=CONTENT_IDENTITY)
cfg = Config(host=DB_HOST_URL, credentials_provider=credentials_provider)

This usage has changed a little bit since then. In this PR we accept a client as an optional parameter so that the Content doesn't require a Connect server when running locally. The SDK client is initialized only when the content is running on Connect by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
import abc
from typing import Callable, Dict, Optional

from ..client import Client, is_local
from ..oauth import OAuthIntegration

HeaderFactory = Callable[[], Dict[str, str]]

# https://github.com/databricks/databricks-sdk-py/blob/v0.20.0/databricks/sdk/credentials_provider.py
# https://github.com/databricks/databricks-sql-python/blob/v3.1.0/src/databricks/sql/auth/authenticators.py
# In order to keep compatibility with the Databricks SDK
class CredentialsProvider(abc.ABC):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid adding their SDK as a dependency. I'll test and see if we can remove this abstract definition altogether.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave this for now. I think we will have a similar problem for our content types. It would be nice to provide helpers for each framework when obtaining the identity token from the headers but it will vary by content type.

"""CredentialsProvider is the protocol (call-side interface)
for authenticating requests to Databricks REST APIs"""

@abc.abstractmethod
def auth_type(self) -> str:
...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these abstract method's raise NotImplementedError()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


@abc.abstractmethod
def __call__(self, *args, **kwargs) -> HeaderFactory:
...


class PositOAuthIntegrationCredentialsProvider(CredentialsProvider):
def __init__(self, posit_oauth: OAuthIntegration, user_identity: str):
self.posit_oauth = posit_oauth
self.user_identity = user_identity

def auth_type(self) -> str:
return "posit-oauth-integration"

def __call__(self, *args, **kwargs) -> HeaderFactory:
def inner() -> Dict[str, str]:
access_token = self.posit_oauth.get_credentials(self.user_identity).json()['access_token']
return {"Authorization": f"Bearer {access_token}"}
return inner


def viewer_credentials_provider(client: Optional[Client], user_identity: Optional[str]) -> Optional[CredentialsProvider]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the value of user_identity ever something other than headers.get('Posit-Connect-Content-Identity')?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value is always contained in a header called Posit-Connect-Content-Identity. But the method for obtaining the header value varies by content type:

https://docs.posit.co/connect/user/shiny/#user-meta-data
https://docs.posit.co/connect/user/flask/#user-meta-data


# If the content is not running on Connect then viewer auth should
# fall back to the locally configured credentials hierarchy
if is_local():
return None

if client is None:
client = Client()


# If the user-identity-token wasn't provided and we're running on Connect then we raise an exception.
# user_identity is required to impersonate the viewer.
if user_identity is None:
raise Exception("The user-identity-token is required for viewer authentication.")
tdstein marked this conversation as resolved.
Show resolved Hide resolved

return PositOAuthIntegrationCredentialsProvider(client.oauth, user_identity)


def service_account_credentials_provider(client: Optional[Client]):
raise NotImplemented
Empty file.
35 changes: 35 additions & 0 deletions src/posit/connect/oauth.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
from __future__ import annotations

from requests import Response, Session
from typing import Optional

from . import urls
from .config import Config


class OAuthIntegration:

def __init__(
self, config: Config, session: Session
) -> None:
self.url = urls.append_path(config.url, "v1/oauth/integrations/credentials")
self.config = config
self.session = session


def get_credentials(self, user_identity: Optional[str]) -> Response:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to return Response or can it return the response body or something? I don't see this being used anywhere yet, so I can't tell.

I think we're trying to avoid having the request/response details leaking outside of the get/put/post/etc. methods.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for reference, here's the place it gets used:

access_token = self.posit_oauth.get_credentials(self.user_identity).json()['access_token']

So we could definitely make the return type more specific and not expose the response object.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I was using github's symbol lookup and it didn't find it, should have stuck with good ol' cmd-F.

Yeah in that case, I'd have this method return response.json()

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A defined type would be even better so that the consumer gets type hints.

@dataclass
class OAuthResponse:
    access_token: str

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the expectation that this endpoint might eventually return something other than an access_token, I like the idea of returning a defined type.


# craft a basic credential exchange request where the self.config.api_key owner
# is requesting their own credentials
data = dict()
data["grant_type"] = "urn:ietf:params:oauth:grant-type:token-exchange"
data["subject_token_type"] = "urn:posit:connect:api-key"
data["subject_token"] = self.config.api_key

# if this content is running on Connect, then it is allowed to request
# the content viewer's credentials
if user_identity:
data["subject_token_type"] = "urn:posit:connect:user-identity-token"
data["subject_token"] = user_identity

return self.session.post(self.url, json=data)
Empty file.
Loading