-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OAuth2 support for flyte-cli and SDK engine #23
Conversation
flytekit/clis/flyte_cli/main.py
Outdated
from flytekit.clis.helpers import construct_literal_map_from_variable_map as _construct_literal_map_from_variable_map, \ | ||
construct_literal_map_from_parameter_map as _construct_literal_map_from_parameter_map, \ | ||
parse_args_into_dict as _parse_args_into_dict, str2bool as _str2bool | ||
from flytekit.common import utils as _utils, launch_plan as _launch_plan_common | ||
from flytekit.common.core import identifier as _identifier | ||
from flytekit.common.types import helpers as _type_helpers | ||
from flytekit.common.utils import load_proto_from_file as _load_proto_from_file | ||
from flytekit.configuration import platform as _platform_config | ||
from flytekit.configuration import creds as _creds_config, platform as _platform_config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I think we should keep the underlying config objects regardless (because they'll be useful in pyflyte), but currently none of these files iirc are used in flyte-cli. That is, the flytekit.config
style files are never read by flyte-cli, because it's assumed that users of flyte-cli might not have any flyte repos checked out at all - they may only be an administrator. Auth has a fair amount to config though, it's pretty awkward to pass in everything as command line args. What does @matthewphsmith think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should definitely be env var based. You can use the configuration package though to make things easier since the objects are env var aware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think making it env var based makes it difficult to set-up for an end-user. I'd prefer to have a config that can be easily cloned and written, but I'm happy to move it to a separate directory if it makes sense organizationally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haytham's argument was that there is a .aws/config
and a .kube/config
and a .gitconfig
. Why can't there be a .flyte/config? I think it's cleaner than having to rely on five or six environment variables. What's the downside that you're worried about Matt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can definitely have .flyte/config
The only thing is then we need to provide config modification commands (but with click opening up the editor is simple)
Another option is to have .flyte/config
as an option otherwise explicitly specify all the args
flytekit/clis/auth/auth.py
Outdated
from http import client as _StatusCodes | ||
except ImportError: # Python 2 | ||
import httplib as _StatusCodes | ||
from BaseHTTPServer import HTTPServer as _HTTPServer, BaseHTTPRequestHandler as _BaseHTTPRequestHandler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's import the module. it makes refactoring easier and keeps the namespace cleaner.
import BaseHTTPServer as _base_http_server
Later...
_base_http_server.HTTPServer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same for urlparse below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but it makes it harder to jump into the definition of the underlying object. and the ide shows everything as yellow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm actually i can still jump around in my ide to the underlying object with refactoring
flytekit/configuration/creds.py
Outdated
|
||
from flytekit.configuration import common as _config_common | ||
|
||
DISCOVERY_ENDPOINT = _config_common.FlyteStringConfigurationEntry('credentials', 'discovery_endpoint', default=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are all these env vars going to be auto-set via the auth flow? or would a user ever specify any of this? If yes, can you let me know about a few of those cases to help me understand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are specific to the hosted Flyte platform - so they can't be set. For example the discovery endpoint is configured in admin. The native Okta app at Lyft we're testing with has a unique client id and a hardcoded, pre-configured redirect URI. The authorization metadata key is a departure we have to make from the standard protocol because Envoy actually behaves funkily if we use the default value.
flytekit/clis/flyte_cli/main.py
Outdated
if not _platform_config.AUTH.get(): | ||
# nothing to do | ||
return None | ||
access_token = _keyring.get_password(_metadata, _metadata_access_token) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is awesome!
I think before an approval we should either demonstrate the auth flow? |
""" | ||
This is the redirect uri registered with the app which handles authorization for a Flyte deployment. | ||
This is the callback uri registered with the app which handles authorization for a Flyte deployment. | ||
Please note the hardcoded port number. Ideally we would not do this, but some IDPs do not allow wildcards for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haytham suggested that we implement logic to either accept a wildcard port or a list of ports (i.e. if we register a bunch of hard-coded ports with the IDP.) That way we have some flexibility with choosing an available system port. I'm happy to save that for the next PR though and leave this comment in for now
…deleted, only using for debugging at the moment
flytekit/clients/raw.py
Outdated
|
||
|
||
def _handle_rpc_error(fn): | ||
@_try_three_times |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @matthewphsmith we introduced this to make sure we don't get stuck in a loop if authentication is persistently failing...but i'm not sure that this is the prettiest code 🙃 let us know if a better way stands out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the code's been refactored but i think the original comment still stands
…that if the host is specified in the user's ~/.flyte/config file, you don't need to specify it in the flyte-cli command
auth=true | ||
|
||
[credentials] | ||
discovery_endpoint=http://corp.idp.com/.well-known/oauth-authorization-server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the use case a bit as to why a user would specify the discovery and redirect endpoint? Wouldn’t that config be based server side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, this discovery endpoint should be pointed at flyte.corp.net
or something... and that should redirect to whatever IDP corp.net is using. Or it can return the values itself without redirecting, but the point is, you only hard-code the main flyte server. Thinking about it, maybe we should nix this @katrogan and just assume the extension off of the platform URL. That's probably the cleanest thing to do.
The redirect endpoint is configured in the IDP. It's a hardcoded fully-specified URL. This means that the callback url (including the port) that the local server that's stood up at has to match exactly what the IDP has. This information is not returned by the metadata/discovery endpoint. Why i'm not sure... it may be a security concern.
Ideally, the metadata endpoint would be less restrictive and allow us to return more things. Because of the limitations of the RFC, we also need to hardcode this in for example: https://github.com/lyft/flyteexamples/pull/97/files#diff-2e41be19539d0ba6fda0e5ce814a9123R22. This is the scope that will be requested by pyflyte when it makes the token request. Again, like the callback url, it has to match a hardcoded string set within the IDP, otherwise it'll fail. It doesn't really offer anything, but we have to have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: assuming the extension - I'm happy to do this if we can safely assume this is the major convention (which I guess sounds like it is?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm very interesting...well seems like you have a handle on it, so I defer to whatever you think is best though I am biased to less config wherever possible.
_utf_8 = 'utf-8' | ||
|
||
|
||
class FlyteAuthenticationException(_FlyteException): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIt: should we define this in flyte.exceptions.user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making a waterfall for how configuration settings are looked up from the environment variable
:return: | ||
""" | ||
auth_endpoints = _credentials_access.get_authorization_endpoints() | ||
token_endpoint = auth_endpoints.token_endpoint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guessing you'll have to do something here to support relative URLs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the token endpoint is "/token", the code doesn't work, I think
Any chance you can ask for password over stdin if password is required but not password is passed? Something like this: https://github.com/lyft/flytekit/pull/56/files#diff-39ff0a124da92894da1f07bd5ce13c67R46-R47 |
This change adds authentication support for flyte-cli and the pyflyte CLIs. # New authorization code Specifically this change introduces an **AuthorizationClient** which implements the [PKCE authorization flow](https://www.oauth.com/oauth2-servers/pkce/authorization-code-exchange/) for untrusted clients. This client handles requesting an initial access token, spinning up a callback server to receive the access token and using that to retrieve an authorization code. The client also handles refreshing expired authorization tokens. This change also includes a lightweight **DiscoveryClient** for retrieving authorization endpoint metadata defined in the [OAuth 2.0 Authorization Server Metadata](https://tools.ietf.org/id/draft-ietf-oauth-discovery-08.html) draft document. An authorization client singleton is lazily initialized for use by flyte-cli. # Pyflyte changes (basic auth) Requests an authorization token using a username and password. # Flyte-cli changes (standard auth) Requests an authorization token using the PKCE flow. # Raw client changes Wraps RPC calls to flyteadmin in a retry handler that initiates the appropriate authentication flow defined in the flytekit config in response to `HTTP 401 unauthorized` response codes.
This change adds authentication support for flyte-cli and the pyflyte CLIs.
New authorization code
Specifically this change introduces an AuthorizationClient which implements the PKCE authorization flow for untrusted clients. This client handles requesting an initial access token, spinning up a callback server to receive the access token and using that to retrieve an authorization code. The client also handles refreshing expired authorization tokens.
This change also includes a lightweight DiscoveryClient for retrieving authorization endpoint metadata defined in the OAuth 2.0 Authorization Server Metadata draft document.
An authorization client singleton is lazily initialized for use by flyte-cli.
Pyflyte changes (basic auth)
Requests an authorization token using a username and password.
Flyte-cli changes (standard auth)
Requests an authorization token using the PKCE flow.
Raw client changes
Wraps RPC calls to flyteadmin in a retry handler that initiates the appropriate authentication flow defined in the flytekit config in response to
HTTP 401 unauthorized
response codes.