StopWorkspace is called prior to regular workspace transitioning to running state #12282

Closed
sagor999 opened this issue Aug 22, 2022 · 15 comments
Labels
meta: stale This issue/PR is stale and will be closed soon type: bug Something isn't working

Comments

@sagor999
Contributor

Bug description

StopWorkspace is called on a workspace that has not finished initializing yet, causing ws-daemon to fail content init 10 minutes later.
I'm not sure why it fails 10 minutes after StopWorkspace was called; most likely a timeout triggers somewhere.
Workspace logs:
https://cloudlogging.app.goo.gl/1qNcUN3hcr8GmPE16

Webapp logs:
https://cloudlogging.app.goo.gl/T344gi8ZHzDLLJNZ9

@geropl why is webapp calling StopWorkspace before the workspace has started up? I thought this was only implemented for prebuilds, but it happens on regular workspaces as well.

For the workspace team:
We need to be able to handle this gracefully. This is related to #11453, but that issue specifically addresses prebuild cancellation, not regular workspaces. The work done in #11453 might address this issue as well, but it will most likely need additional work.

Steps to reproduce

Not sure. Need input from the webapp team.

Workspace affected

No response

Expected behavior

No response

Example repository

No response

Anything else?

No response

@sagor999 sagor999 added the type: bug Something isn't working label Aug 22, 2022
@sagor999
Contributor Author

This happens because there is no way to signal ws-daemon to stop the InitWorkspace process.

@utam0k
Contributor

utam0k commented Aug 23, 2022

I saw the same error.
I am wondering why there is an error about /dst so long after the request for a stop.

@kylos101
Contributor

kylos101 commented Aug 23, 2022

@sagor999 how often is this happening? What is the impact to users if we don't fix?

In the meantime, I am going to assume it is infrequent, but I'll add it to the webapp inbox and mention Gero. This way, they can inspect the logs more leisurely using their flow.

If this is happening often and negatively impacting users, let us know. I'll leave it in our inbox for now until I hear back.

@kylos101
Contributor

@geropl adding this to the webapp inbox. Can we ask for webapp help in determining potential steps to reproduce?

@geropl
Member

geropl commented Aug 23, 2022

> how often is this happening? What is the impact to users if we don't fix?

> Can we ask for webapp help in determining potential steps to reproduce?

For prebuilds, this happens regularly now. Whenever a new commit is pushed, we cancel prior prebuilds on that branch. This should be reproducible in a preview env with a script that pushes to a repo already set up as a project.

In cases with a lot of traffic (main branches + long builds), this happens nearly every time. For some teams this means that a lot of their prebuilds fail with a weird error.

@kylos101
Contributor

@geropl thank you! That'll help us recreate, I think.

@sagor999 I am going to add this to breakdown for now, so we can socialize later this week.

@kylos101 kylos101 moved this to Breakdown in 🌌 Workspace Team Aug 23, 2022
@kylos101 kylos101 changed the title from "StopWorkspace is called prior to workspace transitioning to running state" to "StopWorkspace is called prior to regular workspace transitioning to running state" Aug 25, 2022
@kylos101 kylos101 removed the status in 🌌 Workspace Team Aug 25, 2022
@kylos101
Contributor

Removing this from scheduled groundwork for now, since we're not sure how often it is happening.

@geropl
Member

geropl commented Sep 6, 2022

Related PR: #12386

@sagor999 I think this problem might be resolved by that, right? 🤔

@kylos101
Contributor

kylos101 commented Sep 6, 2022

@geropl #12386 is for prebuilds, but this issue is for regular workspaces. Could you add additional logging on the webapp side, so that when stop is called, the reason for stopping is also logged? 🙏

@kylos101
Contributor

kylos101 commented Sep 8, 2022

Hey @geropl, can you help set our expectations on when this may be able to join other Scheduled webapp work? 🙏 As an FYI, we're blocked in #11852 (comment) until the corresponding logging is added.

@geropl
Member

geropl commented Sep 9, 2022

Hey @kylos101 , sorry for not being clear here. We'll have it at the top of the queue on Monday. 👍

@kylos101
Contributor

👋 @geropl , it looks like this is not currently scheduled, is that intentional? I ask because of this comment.

For background, we are blocked on two things:

  1. adding this issue (itself) to groundwork
  2. continuing the resolution for Unexpected error loading prebuild #11852

@geropl
Member

geropl commented Sep 15, 2022

@kylos101 Sorry for the back-and-forth. I was under the impression that logging why StopWorkspace is called (#12283) was what you asked for here, so I scheduled that.

Coming back to the original question:

> @geropl why is webapp calling StopWorkspace before the workspace has started up? I thought this was only implemented for prebuilds, but it happens on regular workspaces as well.

So far we don't give any guarantee that we won't call StopWorkspace in any given workspace state. And people do it regularly from the dashboard, if their workspaces take too long to start, or for a variety of other reasons.
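Since a stop can arrive in any state, the receiving side needs a defined reaction for each one. The sketch below is only an illustration of that idea: the phase names and the actions returned by `stopAction` are assumptions made for this example, not the behavior of ws-manager or ws-daemon.

```go
package main

import "fmt"

// phase lists workspace phases as assumed for this sketch.
type phase int

const (
	Pending phase = iota
	Creating
	Initializing
	Running
	Stopping
	Stopped
)

// stopAction returns what a stop request could mean in each phase.
// The actions are illustrative placeholders.
func stopAction(p phase) string {
	switch p {
	case Pending, Creating:
		return "abort start"
	case Initializing:
		return "cancel content init, then tear down"
	case Running:
		return "back up content, then tear down"
	case Stopping, Stopped:
		return "no-op"
	default:
		return "unknown phase"
	}
}

func main() {
	for _, p := range []phase{Pending, Creating, Initializing, Running, Stopping, Stopped} {
		fmt.Printf("phase %d: %s\n", p, stopAction(p))
	}
}
```

The point of spelling out every case is that the Initializing phase gets an explicit path instead of falling through to a timeout.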

Given our long history here: Does it make sense to sync this afternoon to make sure we're on the same page?

@kylos101
Contributor

WebApp involvement should be done now, because the server logs all stop workspace requests as of #12283.

@stale

stale bot commented Dec 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Dec 16, 2022
@stale stale bot closed this as completed Jun 12, 2023
@github-project-automation github-project-automation bot moved this to Awaiting Deployment in 🌌 Workspace Team Jun 12, 2023