remove 300ms startup delay and optimize fork_on_start #297
Merged
Conversation
Not sure why a retry loop using MAX_BIND_TRIES / BIND_TRIES_DELAY surrounds connect_to_af_unix(), since bind() is done elsewhere in the code, and connect_to_af_unix() already has its own retry loop inside the function, with its own set of defines for retries, i.e. MAX_CONNECT_TRIES / CONNECT_TRIES_DELAY, surrounding the connect(). No need to have two levels of retry; if it's not successful within connect_to_af_unix() it will never be.
When the first elinks session starts (whether fork_on_start or not), it tries to connect on the AF_UNIX socket, to see if an existing master is present. The current code implements a retry loop with linear-increase backoff (50ms *= attempts, 3 attempts, total delay 300ms) before concluding that the master is not present.

After forking for fork_on_start, we do have to wait for the new master to finish setup, so we can connect to it. This is the case where a retry loop might be useful. However, in the fork_on_start=0 case, there should be no reason to ever wait: if we get ECONNREFUSED or ENOENT, the master simply is not there, and we can go on about our business as the master. With fork_on_start=1, however, we race with the child (which will become the new master), which has to complete its socket/bind/listen sequence. Therefore we typically have to wait another 50ms (the first attempt's delay), for a total of 350ms delay.

In both cases, there does not seem to be a purpose to retrying the initial connection attempt. The conclusion after a connect() failure should be authoritative: there is no master, and we do not race with anyone. If we have done fork_on_start and have to wait, we can instead open a pipe before forking, and wait for the "new" master (its child) to send us a byte over the pipe. Thus we do not need to poll, but can simply block until we get the trigger indicating that AF_UNIX socket setup has completed.

This speeds up the first start of elinks by 300ms for fork_on_start=0, and between 300-350ms for fork_on_start=1 (or possibly more, if the machine is loaded and the child did not finish AF_UNIX setup in the first 50ms).
Thank you. Yes, it is faster now.
Thanks. FYI, here's timing of just modules. Most of the "extra" time is in module
In a previous issue it was noted that the first elinks session has a long startup delay of ~350ms, whereas slaves start instantly. A workaround is to use `dtach` for the first session and leave it headless, so all subsequent `elinks` are slaves. It was pointed out that `ui.sessions.fork_on_start` was already designed for this purpose. However, even in that case, the first-session delay still remains.

After instrumenting key sites in `main.c` and `interlink.c` following the call path `init()` -> `init_interlink()` -> `connect_to_af_unix()`, it was discovered that this delay is due to a fixed sequence of 300ms on start, attempting `connect()` to an existing master on the `AF_UNIX` socket, going to sleep, and retrying several times.

Timings for unmodified branch with `fork_on_start=0`:

We wait an extra 50ms for `fork_on_start=1`, because the master has to finish setting up the AF_UNIX socket and we start the retry loop again. However, we get most of the 50ms back because the fork overhead and module init happen in parallel in the child while we wait:

The attached patches remove the retry loop around the initial `connect()` attempt, and add a pipe handshake for the `fork_on_start=1` case, so it can retry the second `connect()` attempt with immediate success. This avoids the delay completely in the non-fork case, and minimizes the delay in the fork case.
Timings for `nowait-connect` branch with `fork_on_start=0`:

Timings for `nowait-connect` branch with `fork_on_start=1`:

More details in the commit messages.

Trace instrumentation diff against master.