Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enter: start default services before executing subcommand #257

Merged
merged 15 commits into from
Mar 21, 2024

Conversation

rebornplusplus
Copy link
Member

@rebornplusplus rebornplusplus commented Jul 7, 2023

The --run option in the enter command should ensure that the default services have all started before executing the subcommand, as specified in the help info.

This, however, does not seem to be the case as the current implementation only sends a request to the API to start the default services and immediately afterwards, executes the subcommand. This can have undesired behaviour. For example, the following fails:

$ pebble enter --run stop srv1
2023-06-01T09:52:52.516Z [pebble] Started daemon.
2023-06-01T09:52:52.546Z [pebble] POST /v1/services 29.059682ms 202
2023-06-01T09:52:52.546Z [pebble] Started default services with change 386.
2023-06-01T09:52:52.565Z [pebble] Service "srv1" starting: <cmd..>
2023-06-01T09:52:52.584Z [pebble] POST /v1/services 37.51026ms 202
error: cannot perform the following tasks:
- Stop service "srv1" (cannot stop service while starting)

It may also happen that the stop subcommand is executed, before the service hasn't even been initiated/started.

$ pebble enter --run stop srv99
2023-06-22T05:46:02.604Z [pebble] Started daemon.
2023-06-22T05:46:02.619Z [pebble] Service "srv55" starting: <cmd..>
2023-06-22T05:46:02.629Z [pebble] POST /v1/services 23.863368ms 202
2023-06-22T05:46:02.630Z [pebble] Started default services with change 9.
2023-06-22T05:46:02.637Z [pebble] Service "srv1" starting: <cmd..>
2023-06-22T05:46:02.646Z [pebble] POST /v1/services 14.761043ms 202
2023-06-22T11:46:02+06:00 INFO Service "srv99" has never been started.
...

This commit should ensure that the default services have all started, before executing the subcommand. It does so by ensuring the "change" made by the "autostart" request is ready, before running the subcommand. The status of the "change" can be "Error" once it's ready; the subcommand will still be executed after.

Add a new option ``hideProgress`` in ``waitMixin`` to hide
the progress, while waiting for changes to be ready.
@rebornplusplus rebornplusplus force-pushed the rocks-696/subcommand-waits branch 2 times, most recently from c9e980b to 151bffd Compare July 7, 2023 13:01
The --run option in the enter command should ensure that the default
services have all started before executing the subcommand, as
specified in the help info.

This, however, does not seem to be the case as the current
implementation only sends a request to the API to start the default
services and immediately afterwards, executes the subcommand.
This can have undesired behavior. For example, the following fails:

    $ pebble enter --run stop srv1
    2023-06-01T09:52:52.516Z [pebble] Started daemon.
    2023-06-01T09:52:52.546Z [pebble] POST /v1/services 29.059682ms 202
    2023-06-01T09:52:52.546Z [pebble] Started default services with change 386.
    2023-06-01T09:52:52.565Z [pebble] Service "srv1" starting: <cmd..>
    2023-06-01T09:52:52.584Z [pebble] POST /v1/services 37.51026ms 202
    error: cannot perform the following tasks:
    - Stop service "srv1" (cannot stop service while starting)

It may also happen that the stop subcommand is executed, before the
service hasn't even been initiated/started.

    $ pebble enter --run stop srv99
    2023-06-22T05:46:02.604Z [pebble] Started daemon.
    2023-06-22T05:46:02.619Z [pebble] Service "srv55" starting: <cmd..>
    2023-06-22T05:46:02.629Z [pebble] POST /v1/services 23.863368ms 202
    2023-06-22T05:46:02.630Z [pebble] Started default services with change 9.
    2023-06-22T05:46:02.637Z [pebble] Service "srv1" starting: <cmd..>
    2023-06-22T05:46:02.646Z [pebble] POST /v1/services 14.761043ms 202
    2023-06-22T11:46:02+06:00 INFO Service "srv99" has never been started.

This commit should ensure that the default services have all started,
before executing the subcommand. It does so by ensuring the "change"
made by the "autostart" request is ready, before running the
subcommand. The status of the "change" can be "Error" once it's ready;
the subcommand will still be executed after.
@rebornplusplus rebornplusplus marked this pull request as ready for review July 10, 2023 08:41
internals/cli/cmd_run.go Outdated Show resolved Hide resolved
According to the feedback from @virtustom, log the error messages from
``waitMixin.wait()`` if there are any.
@rebornplusplus rebornplusplus requested a review from woky July 10, 2023 10:56
@rebornplusplus
Copy link
Member Author

rebornplusplus commented Jul 10, 2023

@benhoyt @niemeyer -- will you please take a look at this PR? Thanks. :)

PS. The tests are failing randomly. I would appreciate it very much if any of you could re-trigger the workflow.

@niemeyer
Copy link
Contributor

PS. The tests are failing randomly. I would appreciate it very much if any of you could re-trigger the workflow.

I'd appreciate even more if the tests could be fixed so they don't fail randomly.

internals/cli/cmd_run.go Show resolved Hide resolved
@benhoyt
Copy link
Contributor

benhoyt commented Jul 11, 2023

PS. The tests are failing randomly. I would appreciate it very much if any of you could re-trigger the workflow.

I'd appreciate even more if the tests could be fixed so they don't fail randomly.

Agreed. However, note that Fred Lotter is working on that (these are the servstate failures). So I've manually re-run them for now.

internals/cli/cmd_run.go Outdated Show resolved Hide resolved
internals/cli/cmd_run.go Show resolved Hide resolved
Exit with non-zero error code if waiting for the "change" (of default
services start) to be ready returns any errors.
Copy link
Contributor

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment, but looks reasonable to me. Is it possible to add a test for this?

internals/cli/cmd_run.go Show resolved Hide resolved
This commit adds a new test which checks that default services
start before executing subcommand.
Go 1.14 does not support os.Readfile(), use ioutil.Readfile() instead.
Copy link
Contributor

@woky woky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Instead of writing errors directly to os.Stderr in rcmd.Run,
write them to cli.Stderr which eventually points to os.Stderr;
but useful for testing purposes.
Test if default service start errors are properly reported and
if pebble exits with non-zero error codes. This is only tested
for the following command, for now:

    pebble enter --run [enter-OPTIONS..] <subcommand> [args..]
@rebornplusplus rebornplusplus requested a review from woky July 20, 2023 04:31
Copy link
Collaborator

@cjdcordeiro cjdcordeiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for fixing this behavior @rebornplusplus. The overall behavioural changes to the enter startup sequence with --run LGTM!

My only comment is about this thread:

Somewhat related: #167 -- I wonder if pebble run should also wait for services to be started successfully before starting up (or if not waiting, exit if they don't start properly). It seems like it'd be better for that to be a hard error. But maybe that's a separate PR/discussion...

Concerning pebble run, I think it's better to exit and report if the services don't start properly. In enter, we need the wait because the subcommand which follows needs to be run after the services have started. That may not be the case for run though.

Also, I am not sure and please correct me if I am wrong, but it seemed to be that if a startup-enabled service failed to start with an error, the services that followed in the queue were not started as well. So I think it's best to report any errors while starting up and exit afterwards.

Afaiu and according to our discussions, 4098a71 makes the Pebble daemon exit if 1 service fails to start correctly. IMO this is NOT the right behaviour, and even if it was, it deserves some discussion. The rationale is that Pebble services can be completely independent (unless one comes after or before another), and thus the failure of one service should not jeopardize the execution of another. I suggest dropping the hard error from that commit (having extra logging is still nice though).

Moreover, --run is supposed to mimic pebble run's behaviour and thus this PR would introduce a discrepancy amongst the two (as alluded by @benhoyt in his comment above).

@benhoyt @rebornplusplus please let me know if you disagree. If so, we should probably have a chat about it. IMO the right approach is to actually address #167.

@benhoyt
Copy link
Contributor

benhoyt commented Jul 21, 2023

@cjdcordeiro Yep, that makes sense and I agree. We should at least have more discussion if we're going to change that behaviour, and (separately) address #167.

This reverts commit 1281c96
and reverts the hard-error behaviour introduced in
4098a71. Pebble should no longer,
at least not due to this commit, exit with non-zero error code if
waiting for the "change" (of default services start) to be ready
returns any errors.
@rebornplusplus
Copy link
Member Author

@benhoyt @rebornplusplus please let me know if you disagree. If so, we should probably have a chat about it. IMO the right approach is to actually address #167.

@cjdcordeiro Yep, that makes sense and I agree. We should at least have more discussion if we're going to change that behaviour, and (separately) address #167.

Sounds good, let's discuss that separately. Meanwhile, I have reverted related changes in c01f729.

Copy link
Collaborator

@cjdcordeiro cjdcordeiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes ;)

@rebornplusplus
Copy link
Member Author

Hiya, this has been idle for some time. I have updated the PR to be in-line with the master branch. Please take another look at your convenience. Thanks!

@benhoyt benhoyt added the 24.04 label Mar 13, 2024
internals/cli/cmd_run.go Outdated Show resolved Hide resolved
internals/cli/cmd_run.go Outdated Show resolved Hide resolved
internals/cli/wait.go Outdated Show resolved Hide resolved
internals/cli/cmd_enter_test.go Outdated Show resolved Hide resolved
internals/cli/cmd_enter_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more suggested simplification / improvement. Sorry for the back and forth -- let me know what you think.

internals/cli/cmd_run.go Outdated Show resolved Hide resolved
internals/cli/cmd_run.go Outdated Show resolved Hide resolved
internals/cli/cmd_enter_test.go Show resolved Hide resolved
Copy link
Contributor

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks. I've tested locally with both pebble run and pebble enter as well.

@benhoyt benhoyt merged commit 2f9fdc1 into canonical:master Mar 21, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants