Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove pbench-generate-token agent CLI functionality #3383

Merged
merged 4 commits into from
Apr 18, 2023

Conversation

npalaska
Copy link
Member

  • Since we moved away from the internal user management in Pbench-server, we can no longer use username and password to create new tokens.
  • The new functionality to generate long-lasting tokens will be available via a new server API calls and the dashboard. PR Generation of API key on pbench-server #3368

PBENCH-948

@npalaska npalaska self-assigned this Apr 13, 2023
@npalaska npalaska added the Agent label Apr 13, 2023
@npalaska npalaska added this to the v0.72 milestone Apr 13, 2023
Copy link
Member

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This came up in discussion this morning, and Peter was arguing against removing pbench-generate-token. While the normal way to get a Keycloak token is via the redirect tango, our functional tests are currently getting a token via Keycloak API. How practical this is in normal use, I don't know, given the bearer token lifetimes and refresh token handling: but do we want to reconsider this again??? 😕

contrib/containerized-pbench/pbench_demo Outdated Show resolved Hide resolved
@npalaska
Copy link
Member Author

This came up in discussion this morning, and Peter was arguing against removing pbench-generate-token. While the normal way to get a Keycloak token is via the redirect tango, our functional tests are currently getting a token via Keycloak API. How practical this is in normal use, I don't know, given the bearer token lifetimes and refresh token handling: but do we want to reconsider this again???

But to do that we will have to expose the Keycloak APIs so the agent can call them. Can we do that with the integration lab Keycloak instance? Also is the integration lab Keycloak instance allow the CLI-based username-password authentication flow?

@dbutenhof
Copy link
Member

This came up in discussion this morning, and Peter was arguing against removing pbench-generate-token. While the normal way to get a Keycloak token is via the redirect tango, our functional tests are currently getting a token via Keycloak API. How practical this is in normal use, I don't know, given the bearer token lifetimes and refresh token handling: but do we want to reconsider this again???

But to do that we will have to expose the Keycloak APIs so the agent can call them. Can we do that with the integration lab Keycloak instance? Also is the integration lab Keycloak instance allow the CLI-based username-password authentication flow?

You're the Keycloak expert here. The thread of the discussion was that the functional test integration shows it's possible. That doesn't necessarily mean it's feasible for general use, and if it depends on specific Keycloak configuration that might be a security issue, that's probably a problem.

@npalaska
Copy link
Member Author

npalaska commented Apr 13, 2023

You're the Keycloak expert here. The thread of the discussion was that the functional test integration shows it's possible. That doesn't necessarily mean it's feasible for general use, and if it depends on specific Keycloak configuration that might be a security issue, that's probably a problem.

Yeah I mean it really depends on how the Keycloak instance in the integration lab is going to get configured. If we decide for security purposes that no one can login using the username/password through CLI (disallowing the password grant-type) then no we can't use the CLI to generate tokens. However, that brings another point, we have discussed that we (and also the integration lab instance) won't provide user registration which means all the user authentication has to happen through some identity providers. In this case, we anyway can not use the username/password on the CLI to generate the token (because our Keycloak broker is not an identity provider). Right? What we do for the functional test is just a workaround because 1. we can not use the dashboard in the functional test 2. Functional test user identity is not created using some identity provider, it was created by the broker.

Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But to do that we will have to expose the Keycloak APIs so the agent can call them.

At least in theory, we can write a CLI tool which does anything which the Dashboard does. (It might not be pretty or easy, but it should be just a "simple matter of programming".) In order to be able to push a result, the Agent already has to know where the Pbench Server is, so it can just ask the Pbench Server where the Keycloak service is (so that we don't have to "expose" the Keycloak any more than that).

Also is the integration lab Keycloak instance allow the CLI-based username-password authentication flow?

Presumably, the new pbench-token-generate would call the Pbench Server, then call the Keycloak broker, then handle the redirect to the OIDC identity provider, supply the username and password to it in a secure way, then ride the redirect back and obtain the access token and the refresh token, and finally hit the Pbench Server for an API key which it would return. (Or, perhaps it can return the access token, if the user can get the job done before it expires.)

I'm not thrilled by this model, because I think that API keys will be created rarely -- but used repeately -- and that we won't need a CLI tool to do it. (I.e, the user will create their key once and then copy it to wherever they need it, so it won't matter that there is no CLI to create it -- using the Dashboard will be sufficient, and maybe even better, because they won't need to install the Agent in the place where they do the one-time key creation.

it really depends on how the Keycloak instance in the integration lab is going to get configured.

It shouldn't...pbench-token-generate shouldn't be dependent upon any particular Pbench deployment configuration....

What we do for the functional test is just a workaround because

  1. we can not use the dashboard in the functional test
  2. Functional test user identity is not created using some identity provider, it was created by the broker.

I wouldn't call what we do in the functional test a "workaround" so much as an "alternate solution". We couldn't use the Dashboard because the functionality that we require isn't there. Once we have API keys, I expect that we'll switch most of the functional testing to use those, and we'll get away from authentication altogether (and just concentrate on authorization...). OIDC has freed us from the whole credentials biz, even in the Dashboard, thanks to all the redirection stuff, so it's just not a concern for us, now.

All that said, I got a couple of small requests for this PR, below.

contrib/containerized-pbench/pbench_demo Outdated Show resolved Hide resolved
contrib/containerized-pbench/pbench_demo Outdated Show resolved Hide resolved
Comment on lines 34 to 36
if [[ $# == 1 ]]
then
token=$1
else
username=${1:-pbench_user}
password=${2:-pbench}
token=$(pbench-generate-token --username $username --password $password)
fi
# Get the Pbench authentication token
token_location=$1
token=$(< ${token_location})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that token_location is not used except for the line immediately following its definition, let's skip defining it.

Also, given that the token string is a required parameter and that script will behave in weird ways if it's omitted, you should add a check which exits with an error if it's not provided.

@npalaska
Copy link
Member Author

@webbnh I still do not understand how the user would be able to use the CLI to login when the identity provider is not the broker, it's either Redhat SSO or Google or something else. How to achieve these redirects without the presence of a browser and then securely insert user credentials on the identity provider login page? I guess theoretically, we can do it with curls or Python-requests but wouldn't it be messy?

@npalaska
Copy link
Member Author

Once we have API keys, I expect that we'll switch most of the functional testing to use those, and we'll get away from authentication altogether (and just concentrate on authorization...)

I doubt the path is gonna be that straight-forwards, with every functional test we create a new Keycloak container and a new database container. We create a brand new tester user and if we are storing any API keys they are all flushed before starting the functional tests. Unless we use some sort of a persistent storage volume for the API keys we'll have to do some experiments with this to see whether API key validation would work.

@dbutenhof
Copy link
Member

Once we have API keys, I expect that we'll switch most of the functional testing to use those, and we'll get away from authentication altogether (and just concentrate on authorization...).

Now with that I disagree strongly. Maybe, if API Key and Keycloak auth token validation were 100% identical, we could get away with that. They're not, and we need to test both. So, we'll need to add API Key testing (whether a full duplicate test set or a few selected cases, including of course the POST /api/v1/key itself) but we absolutely need to continue registering a Keycloak user and logging in.

That's not an issue, since I expect that our functional test environment will always be based off a local transient Keycloak ID provider that we completely control.

@webbnh
Copy link
Member

webbnh commented Apr 14, 2023

I still do not understand how the user would be able to use the CLI to login when the identity provider is not the broker, it's either Redhat SSO or Google or something else. How to achieve these redirects without the presence of a browser and then securely insert user credentials on the identity provider login page? I guess theoretically, we can do it with curls or Python-requests but wouldn't it be messy?

Fundamentally, what the browser does is not magic. It just makes HTTP requests and renders the results. (If the result is Javascript or similar, then we start getting closer to magic, but for static pages it's just call-and-response.) If the response is a 30x, then there's a protocol to follow, and we can do that in Python just like the browser does, and, as you note, tools like Curl and Python libraries like requests already known how to do this. So, getting to the ID provider's page should be very straightforward; what we do when we land there is more interesting. If it's a static page requesting "forms data", then that should be pretty straightforward to respond to in Python (and collecting the credentials from the user and relaying them to the ID provider isn't super hard, either); if the page is actually a script of some sort, well...I guess things get messy, but there are ways of dealing with this, too. Then we ride the redirects back, finish the Keycloak dance, and get our access token.

It's all doable. The question is, why would we do it? I maintain that an API key is, typically, something that the user will already have, so we don't need to invest a lot in letting them create one on the fly. Thus, I don't think we actually want to implement pbench-token-generate -- having a nice UI in the Dashboard for it will be good enough and probably better overall.

Once we have API keys, I expect that we'll switch most of the functional testing to use those, and we'll get away from authentication altogether (and just concentrate on authorization...)

I doubt the path is gonna be that straight-forwards

You raise excellent points. We'll have to decide whether we want to create a new API key for each functional test run, or whether we find some way of persisting one, or provide some testing-specific approach.

Now with that I disagree strongly. Maybe, if API Key and Keycloak auth token validation were 100% identical, we could get away with that. They're not, and we need to test both. So, we'll need to add API Key testing (whether a full duplicate test set or a few selected cases, including of course the POST /api/v1/key itself) but we absolutely need to continue registering a Keycloak user and logging in.

My understanding of our current approach is that authentication/authorization is largely separate from the execution of the individual APIs, that either an access token or an API key is acceptable, and so by the time we reach the interesting part of the code, it doesn't matter which was originally specified. This is not to say that we don't need to test them both in specific cases -- obviously the authentication and authorization functions need to be appropriately tested -- but for the rest of the code we only need to use whichever one is more convenient (hence my original phrasing, "most of the functional testing" [emphasis added]). (And, yes, of course we'll need to test POST /api/v1/key itself....)

That's not an issue, since I expect that our functional test environment will always be based off a local transient Keycloak ID provider that we completely control.

😌

@npalaska
Copy link
Member Author

Moving it to draft until we decide the final strategy on this topic.

@npalaska npalaska marked this pull request as draft April 14, 2023 18:31
@webbnh
Copy link
Member

webbnh commented Apr 17, 2023

Moving it to draft until we decide the final strategy on this topic.

I don't think the lack of a final strategy needs to gate this PR.

At the most trivial, if we were to continue to provide a CLI command which fetches an access token or an API key, I don't think it will be called pbench-generate-token (i.e., it will probably be called something more like pbench-token-generate), so we will be removing lib/pbench/cli/agent/commands/generate_token.py either way.

The paramount concern for this PR is that all the tests continue to pass. However, once you fix pbench/test/unit/agent/task/test_generate_token.py, then I think that will be the case.

So, I don't know of any reason why this PR cannot move forward. (It's true that we will need to devise a method for testing each of access tokens and API keys, but I don't think we have to solve that in this PR.)

@npalaska
Copy link
Member Author

So, I don't know of any reason why this PR cannot move forward. (It's true that we will need to devise a method for testing each of access tokens and API keys, but I don't think we have to solve that in this PR.)

Right, I just wanted some preliminary understanding of our thinking on how to move forward. At the very least the current functionality is anyway not useful since it relies upon the internal user notion which we have to remove anyway. And the new approach is drastically different and may need some experimentation before we implement it.

- Since we moved away from the internal user management in
  Pbench-server, we can no longer use username and password
  to create new tokens.
- The new functionality to generate long-lasting tokens
  will be available via new server API call and the dashboard.

PBENCH-948
@npalaska npalaska marked this pull request as ready for review April 17, 2023 17:01
@npalaska npalaska requested review from dbutenhof and webbnh April 17, 2023 17:01
dbutenhof
dbutenhof previously approved these changes Apr 17, 2023
Copy link
Member

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks odd that you fixed load-canned-data.sh which won't work anyway; but that's OK for now. 😆

@npalaska
Copy link
Member Author

It looks odd that you fixed load-canned-data.sh which won't work anyway; but that's OK for now.

Yeah, I think we should just drop this load-canned-data.sh file, I don't think anyone in the team is actually using it, right?

@webbnh
Copy link
Member

webbnh commented Apr 17, 2023

I think we should just drop this load-canned-data.sh file

Yes, I think its time has come and gone.

Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to go, once you fix the awkwardness in the demo script and remove load-canned-data.sh.

contrib/containerized-pbench/pbench_demo Outdated Show resolved Hide resolved
server/pbenchinacan/load-canned-data.sh Outdated Show resolved Hide resolved
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@npalaska npalaska merged commit 473463f into distributed-system-analysis:main Apr 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants