Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove 300ms startup delay and optimize fork_on_start #297

Merged
merged 2 commits into from
Apr 13, 2024

Conversation

smemsh
Copy link
Contributor

@smemsh smemsh commented Apr 13, 2024

In a previous issue it was noted that the first elinks session has a long startup delay of ~350ms, whereas slaves start instantly. A workaround is to use dtach for the first session, and leave it headless, so all subsequent elinks are slaves.

It was pointed out that ui.sessions.fork_on_start already was designed for this purpose. However, even in that case, the first-session delay still remains.

After instrumenting key sites in main.c and interlink.c following the call path init() -> init_interlink() -> connect_to_af_unix() like so:

struct timeval tstart, tlast;

void
print_elapsed(const char *whence)
{
       struct timeval tnow, ttotal, tdelta;
       unsigned long fromstart, fromlast;

       gettimeofday(&tnow, NULL);

       timersub(&tnow, &tstart, &ttotal);
       timersub(&tnow, &tlast, &tdelta);
       fromstart = ttotal.tv_sec * 1e6 + ttotal.tv_usec;
       fromlast = tdelta.tv_sec * 1e6 + tdelta.tv_usec;

       fprintf(stderr, "%8d%10lu +%8lu %-s\n",
               getpid(), fromstart, fromlast, whence);

       memcpy(&tlast, &tnow, sizeof(struct timeval));
}

it was discovered that this delay is due to a fixed sequence of 300ms on start, attempting connect() to existing master on AF_UNIX socket, going to sleep and retrying several times.

Timings for unmodified branch with fork_on_start=0:

 3377146         1 +       1 init_start
 3377146      4099 +    4098 init_after_home
 3377146      4116 +      17 interlink_pre_af_connect
 3377146      4118 +       2 afconnect_before_getaddr
 3377146      4122 +       4 afconnect_after_getaddr
 3377146     54376 +   50254 afconnect_post_sleep1 No such file or directory
 3377146    154653 +  100277 afconnect_post_sleep2 No such file or directory
 3377146    304948 +  150295 afconnect_post_sleep3 No such file or directory
 3377146    304974 +      26 afconnect_after_loop
 3377146    304976 +       2 interlink_post_af_connect
 3377146    315086 +   10110 interlink_child_prebind
 3377146    315680 +     594 interlink_child_postbind
 3377146    315685 +       5 init_after_interlink
 3377146    349608 +   33923 init_after_modules
 3377146    349752 +     144 init_end

We wait an extra 50ms for fork_on_start=1, because the master has to finish seting up the AF_UNIX socket and start the retry loop again. However we get most of the 50ms back because the fork overhead, and module init, happen in parallel in the child while we wait:

 3377604         0 +       0 init_start
 3377604      4333 +    4333 init_after_home
 3377604      4349 +      16 interlink_pre_af_connect
 3377604      4351 +       2 afconnect_before_getaddr
 3377604      4355 +       4 afconnect_after_getaddr
 3377604     54584 +   50229 afconnect_post_sleep1 No such file or directory
 3377604    154787 +  100203 afconnect_post_sleep2 No such file or directory
 3377604    305093 +  150306 afconnect_post_sleep3 No such file or directory
 3377604    305115 +      22 afconnect_after_loop
 3377604    305117 +       2 interlink_post_af_connect
 3377604    306225 +    1108 interlink_prefork
 3377604    306554 +     329 interlink_parent_preconnect
 3377604    306570 +      16 afconnect_before_getaddr
 3377604    306586 +      16 afconnect_after_getaddr
 3377605    306664 +     439 interlink_child_prebind
 3377605    306810 +     146 interlink_child_postbind
 3377605    306813 +       3 init_after_interlink
 3377605    328902 +   22089 init_after_modules
 3377605    328933 +      31 init_end
 3377604    356732 +   50146 afconnect_post_sleep1 No such file or directory
 3377604    356774 +      42 interlink_parent_connect
 3377604    356857 +      83 init_end

The attached patches:

  1. eliminate this unnecessary delay for the first time: master is either there or it isn't, no need to try again.
  2. use a pipe-written byte delivered by the child (new master) to its waiting parent, once the listener is set up, in fork_on_start=1 case, so it can retry the second connect() attempt with immediate success.

This avoids the delay completely in non-fork case, and minimizes delay in fork case.

Timings for nowait-connect branch with fork_on_start=0:

 3399320         0 +       0 init_start
 3399320      4310 +    4310 init_after_home
 3399320      4324 +      14 interlink_pre_af_connect
 3399320      4351 +      27 afconnect_before_getaddr
 3399320      4355 +       4 afconnect_after_getaddr
 3399320      4374 +      19 afconnect_after_connect
 3399320      4376 +       2 interlink_post_af_connect
 3399320      5684 +    1308 interlink_child_prebind
 3399320      5824 +     140 interlink_child_postbind
 3399320      5830 +       6 init_after_interlink
 3399320      5832 +       2 init_before_modules
 3399320     37862 +   32030 init_after_modules
 3399320     38010 +     148 init_end

Timings for nowait-connect branch with fork_on_start=1:

 3399545         0 +       0 init_start
 3399545      4447 +    4447 init_after_home
 3399545      4463 +      16 interlink_pre_af_connect
 3399545      4465 +       2 afconnect_before_getaddr
 3399545      4469 +       4 afconnect_after_getaddr
 3399545      4491 +      22 afconnect_after_connect
 3399545      4493 +       2 interlink_post_af_connect
 3399546      5988 +    1495 interlink_child_prebind
 3399546      6179 +     191 interlink_child_postbind
 3399546      6183 +       4 init_after_interlink
 3399546      6185 +       2 init_before_modules
 3399545      6185 +    1692 afconnect_before_getaddr
 3399545      6237 +      52 afconnect_after_getaddr
 3399545      6249 +      12 interlink_parent_connected
 3399545      6297 +      48 init_end
 3399546     35605 +   29420 init_after_modules
 3399546     35638 +      33 init_end

More details in the commit messages.

Trace instrumentation diff against master.

smemsh added 2 commits April 12, 2024 20:05
Not sure why a retry loop using MAX_BIND_TRIES / BIND_TRIES_DELAY
surrounds connect_to_af_unix(), since bind() is done elsewhere in the
code, and connect_to_af_unix() already has its own retry loop inside the
function, with its own set of defines for retries, i.e. MAX_CONNECT_TRIES
/ CONNECT_TRIES_DELAY, surrounding the connect().  No need to have two
levels of retry; if it's not successful within connect_to_af_unix() it
will never be.
…tart

When the first elinks session starts (whether fork_on_start or not), it
tries to connect on AF_UNIX socket, to see if an existing master is
present.  The current code implements a retry loop with linear-increase
retry backoff, 50ms *= attempts, with 3 attempts, total delay 300ms,
before concluding master is not present.

After forking for fork_on_start, we do have to wait for the new master
to finish setup, so we can connect to it.  This is the case where retry
loop might be useful.  However in fork_on_start=0 case, there should be
no reason to ever wait: if we get ECONNREFUSED or ENOENT, the master
simply is not there, we can go on about our business as the master.

With fork_on_start=1 however, we will race with the child (that will
become the new master), which has to complete its socket/bind/listen
sequence.  Therefore we typically have to wait another 50ms (first
attempt delay), for a total of 350ms delay.

In both cases, there does not seem to be a purpose to the initial
connection attempt retry.  Conclusion after connect() failure should be
authoritative: there is no master.  We do not race with anyone.

If we have done fork_on_start and have to wait, we can instead open a
pipe before forking, and wait for the "new" master (its child) to send
us a byte over the pipe.  Thus, we do not need to poll, but can simply
block until we get the trigger, to indicate that AF_UNIX socket setup
has completed.

This speeds up the first start of elinks by 300ms for fork_on_start=0,
and between 300-350ms for fork_on_start=1 (or possibly more, if the
machine is loaded and child did not finish AF_UNIX setup in first 50ms).
@rkd77 rkd77 merged commit 9ae8dd7 into rkd77:master Apr 13, 2024
@rkd77
Copy link
Owner

rkd77 commented Apr 13, 2024

Thank you. Yes, it is faster now.

@smemsh smemsh deleted the nowait-connect branch April 14, 2024 04:45
@smemsh
Copy link
Contributor Author

smemsh commented Apr 14, 2024

Thanks. FYI, here's timing of just modules. Most of the "extra" time is in module Scripting (I have Python enabled), ie 21ms. Maybe good for some of that to be done lazily once it's actually used. But that's for another day...

 3241628      3591 +    3590 init_module_Document_before
 3241628      3762 +     171 init_module_Document_after
 3241628      3769 +       7 init_module_Keyboard Bindings_before
 3241628      3871 +     102 init_module_Keyboard Bindings_after
 3241628      3873 +       2 init_module_Terminal_before
 3241628      3875 +       2 init_module_Terminal_after
 3241629      6311 +       2 init_before_modules
 3241629      6318 +       7 init_module_Periodic Saving_before
 3241629      6339 +      21 init_module_Periodic Saving_after
 3241629      6344 +       5 init_module_Viewer_before
 3241629      6422 +      78 init_module_Viewer_after
 3241629      6426 +       4 init_module_gettext (ELinks)_before
 3241629      6432 +       6 init_module_gettext (ELinks)_after
 3241629      6436 +       4 init_module_Cascading Style Sheets_before
 3241629      6483 +      47 init_module_Cascading Style Sheets_after
 3241629      6487 +       4 init_module_Protocol_before
 3241629      7955 +    1468 init_module_Protocol_after
 3241629      7969 +      14 init_module_SSL_before
 3241629     11823 +    3854 init_module_SSL_after
 3241629     11837 +      14 init_module_MIME_before
 3241629     11848 +      11 init_module_MIME_after
 3241629     11850 +       2 init_module_LED indicators_before
 3241629     11856 +       6 init_module_LED indicators_after
 3241629     11858 +       2 init_module_Bookmarks_before
 3241629     11946 +      88 init_module_Bookmarks_after
 3241629     11949 +       3 init_module_Cookies_before
 3241629     12031 +      82 init_module_Cookies_after
 3241629     12034 +       3 init_module_Form History_before
 3241629     12036 +       2 init_module_Form History_after
 3241629     12039 +       3 init_module_Global History_before
 3241629     12889 +     850 init_module_Global History_after
 3241629     12894 +       5 init_module_Scripting_before
 3241629     34857 +   21963 init_module_Scripting_after
 3241629     34876 +      19 init_module_Exmode_before
 3241629     34896 +      20 init_module_Exmode_after
 3241629     34898 +       2 init_module_Goto URL History_before
 3241629     34911 +      13 init_module_Goto URL History_after
 3241629     34913 +       2 init_after_modules

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants