ldap: add timeout and retry-backoff for ldap #51927
Codecov Report
Additional details and impacted files
@@              Coverage Diff               @@
##            master     #51927         +/- ##
=============================================
- Coverage  70.7589%  54.6341%  -16.1248%
=============================================
  Files         1477      1583       +106
  Lines       438569    605035    +166466
=============================================
+ Hits        310327    330556     +20229
- Misses      108827    251508    +142681
- Partials     19415     22971      +3556
Signed-off-by: Yang Keao <[email protected]>
52d0f54 to a8007e4
LGTM
// fail to bind to anonymous user, just release this connection and try to get a new one
impl.ldapConnectionPool.Put(nil)

retryCount++
if retryCount >= getConnectionMaxRetry {
	return nil, errors.Wrap(err, "fail to bind to anonymous user")
}
// Be careful that it's still holding the lock of the system variables, so it's not good to sleep here.
// TODO: refactor the `RWLock` to avoid the problem of holding the lock.
You can replace sync.RWMutex with syncutil.RWMutex, which can detect deadlocks in tests.
The current issue is not about deadlock (there's only one lock, and it is not recursive, so a deadlock is impossible here).
The problem is that a pending write lock (taken by rebuildSysVarCache) blocks all other read locks, which I didn't realize before 🤦 😢, and this makes the issue much more serious.
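The behavior described here is documented for Go's `sync.RWMutex`: once a writer is waiting in `Lock`, new `RLock` calls block until that writer has acquired and released the lock. A small standalone sketch of this, unrelated to TiDB's actual code, follows. The function name is made up for the example, and it relies on `TryRLock`, which needs Go 1.18 or later.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// readerBlockedByPendingWriter reports whether a new read lock is refused
// while a writer is queued behind an existing long-running reader.
func readerBlockedByPendingWriter() bool {
	var mu sync.RWMutex

	mu.RLock() // stands in for a reader stuck in a slow call

	writerDone := make(chan struct{})
	go func() {
		mu.Lock() // pending writer: waits for the reader above
		mu.Unlock()
		close(writerDone)
	}()

	blocked := false
	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) {
		if !mu.TryRLock() {
			blocked = true // the queued writer now shuts out new readers
			break
		}
		mu.RUnlock() // writer not queued yet, so try again shortly
		time.Sleep(time.Millisecond)
	}

	mu.RUnlock() // release the long-running reader so the writer can finish
	<-writerDone
	return blocked
}

func main() {
	fmt.Println("new readers blocked:", readerBlockedByPendingWriter())
}
```

The consequence is that one slow reader plus one waiting writer is enough to stall every other reader, even though no deadlock exists.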
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bb7133, CbcWestwolf.
/cherry-pick release-7.1-20240320-v7.1.3 |
Signed-off-by: ti-chi-bot <[email protected]>
@YangKeao: new pull request created to branch In response to this:
In response to a cherrypick label: new pull request created to branch |
Signed-off-by: ti-chi-bot <[email protected]>
What problem does this PR solve?
Issue Number: close #51883
This PR is a smaller version of #51912. We'll eventually get #51912 merged, but we need a smaller one that focuses on the timeout mechanism (which is much simpler than refactoring the locks).
If the LDAP connection is lost after the first handshake, the LDAP goroutine and function call will hang forever.
What changed and how does it work?
I have made two modifications in this PR:
Check List
Tests
Run
docker run --network host -it yangkeao/ldap-sasl-example:d2b324 /bin/bash
to get an environment with an LDAP server. Then run a TiDB server on port 4000.
Create the user, set up the variables, and prepare the CA:
sudo cp /proc/$(pidof mysqld)/root/etc/ssl/certs/example.crt /tmp/ca.crt
Then you can log in to TiDB as the yangkeao user:
Logging in works with authentication_ldap_simple_tls either enabled or disabled.
Use the following iptables command to drop all packets to the LDAP server:
Then a login without TLS will time out after 10 seconds:
Then a login with TLS will time out after 20 seconds:
This PR also fixes some minor issues, such as rebuilding the connection pool after the connection-related variables are reset, so that the pool does not hold wrong connections.
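The pool-rebuild idea can be sketched roughly as follows. This is an illustrative toy, not TiDB's implementation: the `pool` and `conn` types, `SetAddr`, and the example addresses are all hypothetical, and a single address string stands in for the connection-related variables.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a toy connection that remembers the address it was created for.
type conn struct {
	addr   string
	closed bool
}

func (c *conn) Close() { c.closed = true }

// pool is a hypothetical pool keyed on a single setting (the server address).
type pool struct {
	mu   sync.Mutex
	addr string
	idle []*conn
}

func newPool(addr string) *pool { return &pool{addr: addr} }

// SetAddr changes the setting and drops every idle connection, so that
// connections built for the old address are never handed out again.
func (p *pool) SetAddr(addr string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if addr == p.addr {
		return
	}
	for _, c := range p.idle {
		c.Close()
	}
	p.idle = nil
	p.addr = addr
}

// Get returns an idle connection if one exists, else a new one.
func (p *pool) Get() *conn {
	p.mu.Lock()
	defer p.mu.Unlock()
	if n := len(p.idle); n > 0 {
		c := p.idle[n-1]
		p.idle = p.idle[:n-1]
		return c
	}
	return &conn{addr: p.addr}
}

// Put returns a connection to the pool, closing it if it is stale.
func (p *pool) Put(c *conn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if c == nil {
		return
	}
	if c.addr != p.addr {
		c.Close() // created before the setting changed
		return
	}
	p.idle = append(p.idle, c)
}

func main() {
	p := newPool("ldap-a.example:389")
	c := p.Get()
	p.Put(c)
	p.SetAddr("ldap-b.example:389")
	fmt.Println(c.closed, p.Get().addr)
}
```

Checking the address again in `Put` covers connections that were checked out while the setting changed.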
Release note