Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle JWTRealm Rotating JWKS and align docs #87609

Closed
wants to merge 14 commits into from

Conversation

nielsbox
Copy link

@nielsbox nielsbox commented Jun 13, 2022

Fixes: #87608

@elasticsearchmachine elasticsearchmachine added v8.4.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jun 13, 2022
@nielsbox nielsbox changed the title Handle JWTRealm Rotating JWKS and align docs [8.3] Handle JWTRealm Rotating JWKS and align docs Jun 13, 2022
@gwbrown gwbrown added the :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) label Jun 13, 2022
@elasticmachine elasticmachine added the Team:Security Meta label for security team label Jun 13, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-security (Team:Security)

@justincr-elastic justincr-elastic self-assigned this Jun 13, 2022
Copy link
Contributor

@justincr-elastic justincr-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello. Thank you for your code submission. I provided some feedback on your PR.

In additional to these changes, new unit tests would need to be created to verify the changes. It probably requires refactoring the JwtIssuer test helper class, and other test classes, to trigger a JWKSet reload in the in-memory HTTPS Server or on-disk JWKSet(s)/JWKs, so some tests can attempt validation before and after a reload.

@nielsbox nielsbox force-pushed the nielsbox-handle-rotate-jwks branch 2 times, most recently from d68e18f to 987400b Compare June 15, 2022 15:26
@nielsbox nielsbox force-pushed the nielsbox-handle-rotate-jwks branch from 987400b to 2364386 Compare June 15, 2022 15:28
@nielsbox nielsbox force-pushed the nielsbox-handle-rotate-jwks branch from 1b087cd to 9d22222 Compare June 15, 2022 16:47
@nielsbox
Copy link
Author

nielsbox commented Jun 15, 2022

@justincr-elastic I have addressed all PR Review Comments and run the (new) tests including the full gradle check(Skipping all tests that my Apple Silicon Mac couldn't handle cause of the AArch64 architecture).

@nielsbox nielsbox changed the title [8.3] Handle JWTRealm Rotating JWKS and align docs Handle JWTRealm Rotating JWKS and align docs Jun 16, 2022
Copy link
Contributor

@justincr-elastic justincr-elastic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a lot of comments. I did not get through all of the test code.

Are the new tests for success path when rotation goes OK? I skimmed the tests for now, but I don't think I noticed any negative path tests. For example, test that the realm can do in to bad state.

@nielsbox
Copy link
Author

nielsbox commented Jun 16, 2022

I added a lot of comments. I did not get through all of the test code.

Are the new tests for success path when rotation goes OK? I skimmed the tests for now, but I don't think I noticed any negative path tests. For example, test that the realm can do in to bad state.

Thanks for the review and helpful comments!

Indeed currently the tests only check for a successful rotation: First check if normal auth works, then rotate, then check if auth works using new jwkset, then retest using the first jwt token and expect that fails.

I will add extra test cases when: fetching the new pair fails (example a gateway error). Which means we should not revalidate and throw a failed validation. The second test case should be when we are able to fetch the new jwkset but the algorithm is wrong. We then should expect everything to fail in the jwtrealm.

@justincr-elastic
Copy link
Contributor

I thought of another use case to consider while going through your latest changes. If the realm gets into a bad state, does it need to stay in a bad state indefinitely until manual intervention?

Consider what would happen if a rotated JWKSet was misconfigured, and then fixed. The JWT realm would go into a bad state, but then it should be able to recover.

If the external JWKSet configuration is fixed, it is highly desirable that manual intervention should be required. The JWT reaLM should recover on its own are start working again.

@nielsbox
Copy link
Author

I thought of another use case to consider while going through your latest changes. If the realm gets into a bad state, does it need to stay in a bad state indefinitely until manual intervention?

Consider what would happen if a rotated JWKSet was misconfigured, and then fixed. The JWT realm would go into a bad state, but then it should be able to recover.

If the external JWKSet configuration is fixed, it is highly desirable that manual intervention should be required. The JWT reaLM should recover on its own are start working again.

Agreed, I will test that aswel! Tomorrow I should have some time to address the comments.

@nielsbox nielsbox force-pushed the nielsbox-handle-rotate-jwks branch 3 times, most recently from cdfb74c to e6b2302 Compare June 17, 2022 18:30
@nielsbox nielsbox force-pushed the nielsbox-handle-rotate-jwks branch from e6b2302 to 124f4a1 Compare June 17, 2022 18:34
@justincr-elastic
Copy link
Contributor

There are a lot of corner cases for sure. I will need to think through a lot of them when reviewing this PR.

I would like to keep the scope of this PR as narrow as possible (i.e. triggered reload if falling through that JWK verify loop). That will keep the PR size and time down.

@nielsbox
Copy link
Author

There are a lot of corner cases for sure. I will need to think through a lot of them when reviewing this PR.

I would like to keep the scope of this PR as narrow as possible (i.e. triggered reload if falling through that JWK verify loop). That will keep the PR size and time down.

Indeed, I did the same when writing this enhancement, seems quite complex but the tests really helped focus on the core problems.

I also wish to keep the scope smaller, for example removing the auto recoverability would remove some of those changes. Let me know what we shouldn't handle in this PR and I will move those changes to a separate PR.

@nielsbox nielsbox force-pushed the nielsbox-handle-rotate-jwks branch from f7eaa18 to 4cbae9b Compare June 17, 2022 20:05
@justincr-elastic
Copy link
Contributor

This PR is getting quite big. It is taking longer to do each review pass, and I haven't had a chance to dive into the tests or think through the corner cases. The exception for zero algs during validation is the kind thing where it would be easy to introduce problems.

At some point, this PR may get too big and it might make sense for me to merge your changes into my fork where I could continue working on the PR. Changes would still be visible, so you would be able to see progress and make comments. However, you seem motivated to iterate on this PR, so we can keep going and reevalutate as we go.

@nielsbox
Copy link
Author

This PR is getting quite big. It is taking longer to do each review pass, and I haven't had a chance to dive into the tests or think through the corner cases. The exception for zero algs during validation is the kind thing where it would be easy to introduce problems.

At some point, this PR may get too big and it might make sense for me to merge your changes into my fork where I could continue working on the PR. Changes would still be visible, so you would be able to see progress and make comments. However, you seem motivated to iterate on this PR, so we can keep going and reevalutate as we go.

I agree, but I don't see this PR changing a lot anymore, all functionality is here, the tests for the functionality that we want seem to run fine. Maybe we could review the tests first so we can touch on the cases that we wish to support. Since some test cases depend on the comments in your review (Like rotating a wrong JWKSet (after successfull retrieval). If we decide not to support that test case we can remove the corresponding code.

If you see fit that it would be much easier to work on it yourself, that is fine too but I do like the idea of contributing and getting this to master as the author :P. Either way this doesn't matter as long as this functionality gets to a new Elastic Search version. Since we do need this at the company I work at.

@justincr-elastic
Copy link
Contributor

There is a contributor tag in the PR merge when people collaborate on a PR, so you would get recognition.

@nielsbox
Copy link
Author

nielsbox commented Jun 23, 2022

There is a contributor tag in the PR merge when people collaborate on a PR, so you would get recognition.

Yeah that is not really a problem for me but thats good to hear, if you wish to take over you have my permission. I will try to review and comment whenever possible.

Is there a specific reason why you would like/need to work on this instead?

@justincr-elastic
Copy link
Contributor

@elasticmachine update branch

@justincr-elastic
Copy link
Contributor

My concern is digging into the tests. If I run them locally and I find issues, I would be reporting them back here and waiting for fixes. Doing that async is slow. There could be many iterations of that. It is more efficient to find, fix, and re-run locally to work through that phase of a PR. A workaround for that would be if both of us could push changes into a shared branch.

@justincr-elastic
Copy link
Contributor

I did a local pull. I found a few issues.

I think it will be easier and quicker if I pick it up from here. I will link from the new PR to this one when it is ready. You will be listed as a contributor, and the new PR will link to your PR.

@nielsbox
Copy link
Author

I did a local pull. I found a few issues.

I think it will be easier and quicker if I pick it up from here. I will link from the new PR to this one when it is ready. You will be listed as a contributor, and the new PR will link to your PR.

Alright no problem, looking forward to the changes you will make and what issues you found.

@justincr-elastic
Copy link
Contributor

PR #88023 has been submitted for review if you want to check it out. Thank you.

@nielsbox
Copy link
Author

PR #88023 has been submitted for review if you want to check it out. Thank you.

I have been following your progress in the background. Looks great! Will take a closer look soon.

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:06
@mark-vieira mark-vieira added v8.5.0 and removed v8.4.0 labels Jul 27, 2022
@nielsbox
Copy link
Author

nielsbox commented Aug 5, 2022

As this PR which was superseded by #88023, is now no longer needed I will close this :)

@nielsbox nielsbox closed this Aug 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team :Security/Authentication Logging in, Usernames/passwords, Realms (Native/LDAP/AD/SAML/PKI/etc) Team:Security Meta label for security team v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JWTRealm Auto Rotating JWKS
6 participants