Timeout error getting DevTools URL during browser launch #559
Comments
Related: #491?
On second thought, I think this issue is a duplicate of #491. That So I'll close this issue and we can track it in #491, since it also happens on Linux, though apparently much more rarely than on WSL2. If it's easily reproducible on WSL2, that would help us resolve it more quickly.
I'm reopening this issue, as the root cause is different from #491. While the errors are the same, #491 is caused by incorrectly handling the browser exiting with a non-0 exit code, either in general or in some way specific to Snap. In either case, it's a different root cause than this issue. While #491 is always reproducible, this issue happens very rarely on some iterations, and we've only seen it in the Cloud. Since we ruled out a race condition in #563, the only remaining explanation is that some environments really do take more than 30s to start the browser process. So it's likely that we can't do anything about it in the code, and may need to address this on the infra side. Let's leave this issue open until we decide.
💡❓In this case, one (previously discussed) idea is to use Device Farms on AWS: we would have a farm of pre-launched browsers, connect to one of them at the start of each test, and so avoid the long time it takes to launch a browser. To do that, we'll also need to work on #17. If using Device Farms turns out to be unfruitful, we might also want to evaluate/discuss pre-launched browsers without Device Farms: instead of launching a new browser for every test, we would connect to one of the available pre-launched instances.
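To make the pre-launched idea a bit more concrete, here is a minimal sketch in Go. It is not how the extension or the Cloud works today; `browserPool`, `newBrowserPool`, and `tryAcquire` are hypothetical names, and the WebSocket URLs are placeholders. It only shows the hand-out side: each test start takes one already-running browser's DevTools URL instead of launching a process.

```go
package main

import "fmt"

// browserPool holds the DevTools WebSocket URLs of browsers that were
// launched ahead of time. Hypothetical type, for illustration only.
type browserPool struct {
	endpoints chan string
}

// newBrowserPool fills the pool with the URLs of already-running browsers.
func newBrowserPool(urls ...string) *browserPool {
	p := &browserPool{endpoints: make(chan string, len(urls))}
	for _, u := range urls {
		p.endpoints <- u
	}
	return p
}

// tryAcquire hands out one pre-launched browser without blocking.
// It returns false when none is left, in which case the caller could
// fall back to launching a browser the usual way.
// Handing an instance back to the pool is deliberately left out: that
// would first need a reliable cleanup of the browser's data.
func (p *browserPool) tryAcquire() (string, bool) {
	select {
	case u := <-p.endpoints:
		return u, true
	default:
		return "", false
	}
}

func main() {
	// Placeholder URLs; real ones would come from the pre-launched browsers.
	pool := newBrowserPool("ws://10.0.0.1:9222/a", "ws://10.0.0.2:9222/b")
	for i := 0; i < 3; i++ {
		url, ok := pool.tryAcquire()
		fmt.Println(url, ok)
	}
}
```

A buffered channel keeps the sketch safe for concurrent test starts without an explicit lock.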
@inancgumus That architecture might address this issue, but it's still a long way off, and will not be part of the initial public beta release. Your second point is about instance reuse, which is a sensitive topic, as we must not allow data from previous test runs to be accessible to subsequent ones that use the same instance. Since we've had issues properly cleaning the data directory (#403, #484), instance reuse is currently disabled in the Cloud beta (i.e. new instances are created for each test run, and are terminated when the run ends). If we're going to reuse browsers between test runs, then it's even more critical that we handle this properly. What we might want to explore is launching the browser before we start k6, and then connecting to it when the test starts. There's a problem with this, though: So there are many open questions and issues we need to resolve before we can consider #17 as a solution to this. What I was referring to with addressing it on the infra side, is just using a different EC2 instance type. Instead of a general purpose one like
Thanks, @imiric; these are all good points. It seems like there are a lot of things we need to consider, research, and evaluate.
After #555, we no longer see this issue in the Cloud 🎉 So I'm closing this. That said, it's very unintuitive that #555 would be related to this issue, as during browser launch the event system hasn't been initialized yet, and no event handlers take part in this process... 😕 But maybe it impacted the launch indirectly? 🤷‍♂️ In any case, we can reopen this if it pops up again.
This happens very rarely on current main (db80f94), even on Cloud test runs. In some cases, while running the test script from #510, we see the following event logged:
The effect of this is that the iteration fails, which manifests as a pause in execution for the VU while it waits for the default 30s timeout (see related issue #502). There's no reason this particular timeout should be that long, so we can shorten it, but we should also fix the root cause, which seems to be a race condition between the process starting and us attaching the stdout listener to get the DevTools URL. We should also look into a more robust way of getting the URL that doesn't involve parsing stdout.
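As a rough illustration of the two changes proposed above (a shorter, explicit timeout, and a reader that is in place before the process starts), here is a minimal Go sketch. It is not the extension's actual code: `waitForDevToolsURL`, `urlFromOutput`, and `errDevToolsTimeout` are hypothetical names. It assumes the browser announces its endpoint with a line starting with `DevTools listening on `, as Chromium does, and the input here is simulated rather than read from a real process.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
	"time"
)

// devToolsPrefix is the start of the line announcing the DevTools endpoint.
const devToolsPrefix = "DevTools listening on "

var errDevToolsTimeout = errors.New("timed out waiting for DevTools URL")

// waitForDevToolsURL scans r line by line and returns the first announced
// WebSocket URL, or an error if the stream ends or the timeout expires.
// With os/exec, r would come from cmd.StdoutPipe() or cmd.StderrPipe(),
// which must be called before cmd.Start(), so no early output is missed.
func waitForDevToolsURL(r io.Reader, timeout time.Duration) (string, error) {
	type result struct {
		url string
		err error
	}
	// Buffered so the goroutine can still send after a timeout.
	done := make(chan result, 1)

	go func() {
		sc := bufio.NewScanner(r)
		for sc.Scan() {
			line := sc.Text()
			if strings.HasPrefix(line, devToolsPrefix) {
				url := strings.TrimSpace(strings.TrimPrefix(line, devToolsPrefix))
				done <- result{url: url}
				return
			}
		}
		err := sc.Err()
		if err == nil {
			err = io.ErrUnexpectedEOF // stream ended without the URL line
		}
		done <- result{err: err}
	}()

	select {
	case res := <-done:
		return res.url, res.err
	case <-time.After(timeout):
		// The goroutine ends once r is closed, e.g. when the process is killed.
		return "", errDevToolsTimeout
	}
}

// urlFromOutput runs the same logic over captured output, for demonstration.
func urlFromOutput(output string) (string, error) {
	return waitForDevToolsURL(strings.NewReader(output), time.Second)
}

func main() {
	out := "startup noise\nDevTools listening on ws://127.0.0.1:9222/devtools/browser/abc\n"
	url, err := urlFromOutput(out)
	fmt.Println(url, err)
}
```

This still parses process output, so it only addresses the timeout and ordering concerns, not the wish for a method that avoids parsing altogether.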