Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OIDC Extension - JWKS not refreshed #12984

Closed
missourian55 opened this issue Oct 27, 2020 · 17 comments · Fixed by #14953
Closed

OIDC Extension - JWKS not refreshed #12984

missourian55 opened this issue Oct 27, 2020 · 17 comments · Fixed by #14953
Labels
area/oidc kind/bug Something isn't working
Milestone

Comments

@missourian55
Copy link

Describe the bug
JWK keys are not refreshed
quarkus.oidc.token.forced-jwk-refresh-interval=10M

Expected behavior
JWK keys should be refreshed at the configured intervals by communicating with JWKS endpoint

Actual behavior
JWK keys are nor refreshed with this configuration
quarkus.oidc.token.forced-jwk-refresh-interval=10M

To Reproduce

  1. Follow this guide for securing web-applications using OIDC https://quarkus.io/guides/security-openid-connect-web-authentication
  2. Configure this property to trigger a refresh every 10 minutes quarkus.oidc.token.forced-jwk-refresh-interval=10M

Configuration

quarkus.oidc.auth-server-url=${ISSUER_URL:https://keycloak.com/quarkus}
quarkus.oidc.token.forced-jwk-refresh-interval=2M
quarkus.oidc.client-id=frontend
quarkus.oidc.application-type=web-app
quarkus.http.auth.permission.authenticated.paths=/*
quarkus.http.auth.permission.authenticated.policy=authenticated

Screenshots
(If applicable, add screenshots to help explain your problem.)

Environment (please complete the following information):

  • Output of uname -a or ver: Linux x64 & Mac OS
  • Output of java -version: Java 11
  • GraalVM version (if different from Java): NA
  • Quarkus version or git rev: 1.9.0.Final
  • Build tool (ie. output of mvnw --version or gradlew --version): mvnw
@missourian55 missourian55 added the kind/bug Something isn't working label Oct 27, 2020
@sberyozkin
Copy link
Member

@missourian55 This property should be documented better, what it does is it prevents the DOS attack where an attacker manipulates kids causing a no match in the local JWK store. This is easy to do since even though the headers are also protected from the modification for it to be verified locally one has to pick up a local JWK key first.
The JWK refresh will only happen when a client token is coming with a non-matching kid, and once it has been done, it won't be repeated until this interval expires.
So it is not a bug.
Also, the refresh is asynchronous, and in the background a token introspection is attempted. This is OK with Keycloak but if your provider is not KC and it does not support the introspection endpoint then a user might get a single 401 if the async JWK refresh has not been fast enough... This might be tuned a bit further if it is what you are seeing ?

Otherwise I'll be happy enough to close this issue with the PR updating this property documentation.

@sberyozkin
Copy link
Member

sberyozkin commented Oct 27, 2020

@missourian55 Note as mentioned by Keycloak and other OIDC experts, the JWKs are changed in the production irregularly and also quite rarely, cc @stianst @pedroigor

@missourian55
Copy link
Author

@sberyozkin thank you for the insights and the on demand trigger of JWKS refresh on kid mismatch makes sense to me. Our enterprise use an IDP which supports token introspection and general recommendation is to refresh the JWKS every 24 hours asynchronously (paranoid and it changes may be twice a year). I use keycloak to learn Quarkus OIDC security.

Both jose4j and nimbus (used by spring security) both give an option to the users to control the cache settings of JWKS

/*jose4j*/
  @Singleton
    @Produces
    @Named("jwtVerifier")
    public JwtConsumer jwtConsumer(@Named("jwksURL") final String jwksURL) {

        final HttpsJwks httpsJkws = new HttpsJwks(jwksURL);
        httpsJkws.setDefaultCacheDuration(3600 * 24);
        httpsJkws.setRetainCacheOnErrorDuration(3600);
        final HttpsJwksVerificationKeyResolver httpsJwksKeyResolver = new HttpsJwksVerificationKeyResolver(httpsJkws);
        httpsJwksKeyResolver.setDisambiguateWithVerifySignature(true);

        return new JwtConsumerBuilder()
            .setRequireJwtId()
            .setExpectedAudience(false, "account")
            .setVerificationKeyResolver(httpsJwksKeyResolver)
            .build();

    }

What is point of introspection here instead of failing the incoming request?

@missourian55
Copy link
Author

So, say if a malicious user downloaded a sample access_token(get a token from here https://auth0.com/docs/tokens/access-tokens/use-access-tokens) and post a request to my quarkus service. I unnecessarily see two requests going to idp on the first attempt

/auth/realms/quarkus/protocol/openid-connect/certs
/auth/realms/quarkus/protocol/openid-connect/token/introspect

subsequent attempts making a introspection calls

/auth/realms/quarkus/protocol/openid-connect/token/introspect

Seems not right to me. Any way to fail early like - unable to validate the incoming token with configured/cached JWKS and return error immediately? (without making certs or introspect remote calls to IDP)

@pedroigor
Copy link
Contributor

@sberyozkin Looks like we should fail fast the incoming request as suggested by @missourian55.

@sberyozkin
Copy link
Member

sberyozkin commented Oct 28, 2020

Hi @missourian55 @pedroigor it has been discussed in depth (and I really mean it was discussed in depth :-) ), privately and publicly with @pmlopes.
So yes, during the very first request when no matching kid is available, there will be 2 requests going out to Keycloak. It is totally asynchronous, so while the JWKs are beng refreshed, no blocking is done and hence the introspection fallback is executed - this is why 2 requests can happen.

It is not a problem because the key rotation is not happening in production every 10 mins or so, so if it happens that 2 requests are sent in a long running application once per every few months or so then it is not a problem.

@missourian55, re your last comment. If Keycloak will return a refreshed key set with the matching keys, then no, there will be no subsequent introspection requests. But it does not in your case, so obviously, without the local matching key the introspection fallback is executed.
Also, re your point that Jose4J allows to ping the JWK endpoint. This is actually possible to do with quarkus-smallrye-jwt but with quarkus-oidc, after discussing it with @stianst, we decided not to do it - as I said, in production, the keys are not rotated every 10 mins but irregularly and rarely, so having a timer spawning a backround request every few mins is going to be redundant and costly for most of the real world applications. Hope it clarifies it, thanks

@pedroigor
Copy link
Contributor

I agree with you on the JWKs bits.

My only point is in regards to the real need to make those two requests if signature verification failed. It might open an attack surface that allows people to DoD not only the application but also the IdP ?

@sberyozkin
Copy link
Member

sberyozkin commented Oct 28, 2020

Hi @pedroigor the 2nd (introspection) request will only happen once every few months (or whenever the keys are rotated) for the JWT tokens when no matching JWK is available locally. The local key unavailability does not mean the token signature is invalid.

And it is not happening if the signature verification with the local matching JWK fails.
And this is where the mentioned property plays its part as well - if someone forced a wrong kid => JWK refresh call, then these calls won't be happening until the interval expires.

@sberyozkin
Copy link
Member

sberyozkin commented Oct 28, 2020

In general, when dealing with the opaque tokens, and we have users who have such tokens, the introspection call will always go ahead - and again here the token can be randomized - but this an orthogonal issue - we'd need to keep a count of requests per client_id within a given interval etc, something which is worth considering in the future

@missourian55
Copy link
Author

@sberyozkin @pedroigor
I appreciate you guys for looking into this, but below statement is not correct and that is not what I am observing in 1.9.0.Final OIDC extension

"2nd (introspection) request will only happen once every few months (or whenever the keys are rotated)"

As of right now that introspection call to IDP is happening on every request with a bad access token (a token with no matching kid) I agree with @pedroigor on DoD attack on not only the application but also the IDP

@sberyozkin
Copy link
Member

@missourian55 Please read carefully my earlier responses. When JWT with no matching kid comes in or when an opaque/binary token comes in, then the only way to verify this token is via the introspection response.

In case of JWT, once the JWK refresh request has completed, there will be no follow up introspection requests as long as the local JWK is available

@sberyozkin
Copy link
Member

sberyozkin commented Oct 29, 2020

@missourian55 @pedroigor Hi, I've thought more about it. So, while I don't think changing VertxOAuth2 to block the introspection requests is realistic given the earlier discussions or (IMHO) generally sound since we can have the users dealing with not only JWT but also opaque tokens (including the encrypted JWT tokens), what can be supported in Quarkus OIDC, where the users absolutely know they want the local JWK verification only, is to try to indirectly block the introspection during this forced JWK refresh interval - I'll experiment a bit later. The remote introspection is the reality for many OAutth2/OIDC applications so it does not constitute a risk in itself, the rate request limiting can also be enforced if required by the external tools, etc

@missourian55
Copy link
Author

@sberyozkin Thanks, seems like this will work.

@sberyozkin
Copy link
Member

sberyozkin commented Nov 9, 2020

@missourian55, @pedroigor I've spent some time on testing this workaround but since both the jwk refresh and introspection calls are totally async I could not block the introspection ones.
Hence I've started this discussion: https://groups.google.com/g/vertx/c/4GD7bq-uLMU
thanks

@sberyozkin
Copy link
Member

Of course we are dealing with a somewhat academic case here, we expect the users do a proper HTTPS, even MTLS, but nonetheless, if we can optionally block the remote introspection then it would be a good option to have

@sberyozkin
Copy link
Member

This issue will be revisited once we complete the migration to Vertx Web Client

@holomekc
Copy link
Contributor

holomekc commented Nov 10, 2023

Hi @sberyozkin
I guess we have the same issue. Every week our OIDC provider is rotating the keys. And every week we see one or more requests failing with 401.

Nov 5, 2023 @ 01:00:00.675 Token verification has failed: Unable to process JOSE object (cause: org.jose4j.lang.UnresolvableKeyException: JWK with kid 'e1b32555ca706cd86a07639ada541aa645cbc328' is not available): JsonWebSignature{"alg":"RS256","kid":"e1b32555ca706cd86a07639ada541aa645cbc328"}-><token>
Nov 5, 2023 @ 01:00:00.675 No matching JWK key is found, refreshing and repeating the verification
Nov 5, 2023 @ 01:00:00.690 Token verification has failed: Unable to process JOSE object (cause: org.jose4j.lang.UnresolvableKeyException: JWK with kid 'e1b32555ca706cd86a07639ada541aa645cbc328' is not available): JsonWebSignature{"alg":"RS256","kid":"e1b32555ca706cd86a07639ada541aa645cbc328"}-><token>
Nov 5, 2023 @ 01:00:00.690 Token verification has failed: Unable to process JOSE object (cause: org.jose4j.lang.UnresolvableKeyException: JWK with kid 'e1b32555ca706cd86a07639ada541aa645cbc328' is not available): JsonWebSignature{"alg":"RS256","kid":"e1b32555ca706cd86a07639ada541aa645cbc328"}-><token>
Nov 5, 2023 @ 01:00:00.690 No matching JWK key is found, refreshing and repeating the verification
Nov 5, 2023 @ 01:00:00.691 Local JWT token verification has failed, attempting the token introspection

You can see that one request fails where the fallback to token instrospection is used, which we do not want to use. I need to disable it in the config. So I did some tests with keycloak locally:

  1. Create a valid token
  2. Send request -> 200 OK
  3. Delete Key provider in Keycloak and create a new one
  4. Send multiple requests in parallel (We picked 10)
  5. 4-6/10 requests failed with 401

In our productive system we see 3-5 request fail every week.

The only way to prevent this is to configure quarkus.oidc.token.forced-jwk-refresh-interval=0S, which is not that great. The last message is from 2021. Are there any plans to adjust this?

Edit:
As far as the debugging shows, there is a racing condition:

  • some request pass the if block of OidcProvider#JsonWebKeyResolver#refresh
  • They will update the keys and verification is ok
  • Other request do not pass the if block, but already realized that the kid does not exist locally yet
  • Then we end up in the else block, where nothing is done and the verification fails with 401.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/oidc kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants