Replace roko.TryForever with exponential backoff for ~24 hours #2588

tessereth · 2024-01-16T01:47:51Z

Occasionally something goes wrong and the agent gets stuck retrying job log uploads or job finish calls forever (e.g. because the agent's access token is revoked). Forever is a really long time. Instead, just retry for ~24 hours. I also changed these from a constant interval to exponential backoff, because if a request fails, it's likely to either be fixed very quickly (network blip) or take a long time (incident), so there's no point retrying every second if it's already failed a hundred times. I also made both calls use the same strategy, with a 2 second initial interval to match other parts of the code.

Using the ExponentialSubsecond algorithm with a 2 second initial interval, this is what the retry intervals look like (h:mm:ss):

Retry attempt	Delay	Cumulative delay
0	0:00:02	0:00:02
1	0:00:03	0:00:05
2	0:00:05	0:00:10
3	0:00:08	0:00:19
4	0:00:13	0:00:32
5	0:00:22	0:00:54
6	0:00:35	0:01:28
7	0:00:56	0:02:24
8	0:01:29	0:03:53
9	0:02:24	0:06:17
10	0:03:51	0:10:08
11	0:06:12	0:16:20
12	0:09:58	0:26:18
13	0:16:02	0:42:20
14	0:25:47	1:08:07
15	0:41:27	1:49:35
16	1:06:40	2:56:15
17	1:47:12	4:43:27
18	2:52:24	7:35:51
19	4:37:14	12:13:05
20	7:25:50	19:38:55
21	11:56:56	31:35:51

moskyb

forever is a long time

[citation needed]

Replace roko.TryForever with exponential backoff for 24 hours

477a1e1

moskyb approved these changes Jan 16, 2024

tessereth merged commit 00c74ac into main Jan 16, 2024
1 check passed

tessereth deleted the remove-try-forever branch January 16, 2024 03:00

tessereth mentioned this pull request Jan 23, 2024

Bump version and CHANGELOG for v3.62.0 #2597

Merged
