Performance issues with authrequest objects #1091
The TPR/CRD backend is explicitly documented as not being optimized for performance. How many login events are you handling?

Very few logins, but it seems like the authrequests are being created when k8s does an RBAC authorization?

@blakebarnett no, only when a user attempts to log in
So apparently it's creating authrequests every time our ELB health-check hits it with HTTPS (we changed it from HTTP because it was filling the logs with TLS handshake error messages). It seems strange that it creates one of these records even just hitting HTTPS:<nodeport>/healthz, but as soon as I changed it back to HTTP, the garbage collection has started cleaning them up. So, my question is, what's the proper way to health check Dex if it's terminating SSL? I noticed #753 was recommended to solve this, but we are using an ELB in front of Dex and would really like it to work like #682. Maybe it would make sense to expose a separate non-SSL port just for health checks, and maybe prometheus metrics? ;)
The /healthz endpoint is the correct way. We'd be open to exporting prometheus metrics on a separate port though.

What's weird is that the /healthz endpoint should be cleaning up the authrequests itself: https://github.com/coreos/dex/blob/f3c85e6936b064d2a7a6ef46fa4bb58d6e295051/server/handlers.go#L23-L51

Do you have any logs from your dex instance indicating errors during the health check?
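For reference, hitting /healthz directly and bypassing the ELB is a quick way to see whether the endpoint itself is returning errors. Below is a minimal sketch of such a probe in Go; the host, port, and plain-HTTP scheme are placeholders for whatever address your Dex service actually exposes, not anything prescribed by Dex.

```go
// Minimal /healthz probe over plain HTTP. The address below is a
// placeholder; substitute your Dex service host and listener port.
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://dex.example.internal:5556/healthz") // placeholder address
	if err != nil {
		fmt.Println("health check request failed:", err)
		return
	}
	defer resp.Body.Close()

	// Print the status and body so GC errors surfaced by the handler are visible.
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("status=%d body=%q\n", resp.StatusCode, string(body))
}
```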
Yeah, we're seeing TLS handshake errors after switching the health check to plain TCP, as expected, as well as GC errors that seem to be related (though now that we aren't using HTTPS for the health check, they are slowly disappearing).
@blakebarnett we're thinking of running Dex with the kubernetes storage backend as well. Would you mind sharing your opinion on it so far? How does it perform? How much storage does it take up? Roughly how many kubernetes users do you have? Thanks!
It's been working great for us aside from this issue, which seems to be specific to an external LB doing a non-TLS health-check.
We upgraded to Dex 2.7.1 and migrated to CRDs, but we are still seeing ~17.5k authrequests at all times. We tried changing the way the health check works (TLS/non-TLS, etc.) with no luck. We are on k8s 1.7.9 and seeing big spikes in etcd access latency for the authrequest objects.
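For anyone trying to verify counts like the ~17.5k above, one option is to list the authrequest custom resources directly. The sketch below uses client-go's dynamic client (a recent release, where List takes a context); the GroupVersionResource `dex.coreos.com/v1` `authrequests` and the `dex` namespace are assumptions about a typical CRD-backed install, so check `kubectl get crd` on your cluster for the exact names.

```go
// Count authrequest custom resources with client-go's dynamic client.
// The GroupVersionResource and namespace are assumptions; adjust them
// to match what `kubectl get crd` reports on your cluster.
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Assumed GVR and namespace for the Dex authrequest CRD.
	gvr := schema.GroupVersionResource{Group: "dex.coreos.com", Version: "v1", Resource: "authrequests"}

	list, err := client.Resource(gvr).Namespace("dex").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("authrequest objects: %d\n", len(list.Items))
}
```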
We're running Dex 2.4.1 with the LDAP connector and the kubernetes storage backend. Kubernetes has RBAC enabled and is using Dex via the OIDC authentication plugin.

We're seeing large numbers of authrequest objects being created even though the cluster is quite small with very little usage. Interestingly, 2 of our larger production clusters have roughly the same amount.

EDIT: All of the objects it returns are for the current day (expiration is working properly).

When Dex gets restarted, etcd latency spikes and we get alerts, which seems to be because there are so many authrequest objects. (Still running etcd v2, k8s v1.7.5)
Is this expected? Should we move to a different backend?
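One way to narrow down what is actually creating these objects (user logins versus load-balancer health checks) is to watch the authrequest resource and log each creation as it arrives, then compare the timestamps against your health-check interval. A rough sketch with client-go's dynamic client, with the same caveat that the GroupVersionResource and namespace are assumptions about a typical install:

```go
// Watch authrequest custom resources and log creations as they happen.
// The GroupVersionResource and namespace are assumptions; adjust them
// to match your cluster.
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	kubeconfig := filepath.Join(os.Getenv("HOME"), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	gvr := schema.GroupVersionResource{Group: "dex.coreos.com", Version: "v1", Resource: "authrequests"} // assumed GVR

	w, err := client.Resource(gvr).Namespace("dex").Watch(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	defer w.Stop()

	// Print each event with the object's creation timestamp.
	for ev := range w.ResultChan() {
		obj, ok := ev.Object.(*unstructured.Unstructured)
		if !ok {
			continue
		}
		fmt.Printf("%s %s created=%s\n", ev.Type, obj.GetName(), obj.GetCreationTimestamp())
	}
}
```

If creations arrive at a steady cadence that matches the health-check interval rather than in bursts around logins, that points at the load balancer rather than real users.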