-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UI: Log streaming bug fix medley #7820
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice collection of concise fixes for some difficult problems; I have a couple of comments on tests but nothing major.
@@ -255,5 +279,54 @@ module('Integration | Component | task log', function(hooks) { | |||
'Log request was later made to the server' | |||
); | |||
assert.ok(find('[data-test-connection-error]'), 'An error message is shown'); | |||
|
|||
await click('[data-test-connection-error-dismiss]'); | |||
// await settled(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this could be removed?
|
||
run.later(() => { | ||
click('[data-test-log-action="stdout"]'); | ||
run.later(run, run.cancelTimers, commonProps.interval * 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well use just interval
here too?
click('[data-test-log-action="stderr"]'); | ||
run.later(run, run.cancelTimers, commonProps.interval * 5); | ||
}, allowedConnectionTime / 2); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test setup!
'Server request for stderr' | ||
); | ||
|
||
assert.notOk(find('[data-test-connection-error]'), 'An error message is shown'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explanation should include “not” I think, but maybe it’s not necessary at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You caught me copy/pasting. The description probably isn't necessary, but I was following the convention of the file.
…onably Now options can be provided without also having to remember to pass credentials. This is convenient for abort controller signals.
Typically a failover means that the client can't be reached. However, if the client does eventually return after the timeout period, the log will stream indefinitely. This fixes that using an API that wasn't broadly available at the time this was first written.
…nnection error This would happen because a no connection error happens after the second request fails, but that's because it's assumed the second request is to a server node. However, if a user clicks stderr fast enough, the first and second requests are both to the client node. This changes the logic to check if the request is to the server before deeming log streaming a total failure.
30a12a9
to
9557475
Compare
Quick question – it this change is available in 0.11.2 ? Somehow I've started having "Cannot fetch logs" all over again after upgrading from 0.11.1 :( |
Yes, this is in 0.11.2. Since you're still seeing log funkiness, can you open a new issue with more details? |
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
This fixes a couple bugs @Gurpartap reported in #7770 as well as a couple more I found along the way.
Bug 1: Switching from stdout to stderr before the stdout request fails or times out causes an error on the stderr log stream despite successfully streaming the logs.
This was a fun race condition. Logs first attempt to connect to the client an allocation is running on before falling back on the server which should always work. When the client request fails, a flag is set to failover to the server. When that server failover flag is already set and another request fails, connecting to logs is deemed a total loss and the "Cannot fetch logs" error is shown.
If a user clicks stderr before stdout triggers the first failure flag then the stderr request will attempt to connect to the client. This causes both stdout and stderr to fail if the client can't be reached. The first failure flips the fallback flag and the second failure flips the total loss flag. However, when the fallback flag is flipped, it still causes a server request to be made, which is why log streaming continued to work.
The fix here was to capture the failover flag state at request time rather than response time. This way both racing client requests flip the same flag (or rather the first one flips it and the second is a no-op).
Bug 2: The "Cannot fetch log" error persists when switching between stdout/stderr
This one technically isn't a bug. If the error message is shown, it means no connection can be established. It only seems like a bug when Bug 1 occurs. However, I too dislike error messages that can't be dismissed, so I made it dismissable anyway.
Bug 3: Failing over to a server request after a client request takes longer than the client timeout threshold but still eventually responds causes two parallel stream requests, one which never closes.
This one was always known about but AbortControllers where still a relatively new thing when this was first written.
The task logger wisely has its own timeout for client requests to prevent a hanging request from triggering the fallback to making a server request. 9/10 (maybe even 99/100?) times this request will eventually timeout because the client can't be reached, but in the event the client does respond, that connection would remain open indefinitely eating up one of the precious few parallel connections a browser gets to a single host.
The fix here is to use an AbortController as mentioned. They now have broad support. The only browser we are concerned with that does not have them is IE11 which is ok in this instance since IE11 also doesn't support response streaming so it will use the poll logger anyway.
Bug 4: Clicking stdout/stderr when already on stdout/stderr stops the log streamer.
This one wasn't tricky by any stretch of the imagination. Just a small detail I overlooked. The fix was to return early if the mode being switched to was already the current mode.
Fixes #7770