Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid overriding higher priority status with WaitingStatus #71

Closed
wants to merge 1 commit into from

Conversation

carlcsaposs-canonical
Copy link
Contributor

@carlcsaposs-canonical carlcsaposs-canonical commented May 22, 2023

Depends on #66

Issue

_wait_until_mysql_router_ready could override a higher priority status (i.e. BlockedStatus) with WaitingStatus

Solution

When setting WaitingStatus in _wait_until_mysql_router_ready, consider other statuses—by using the set_status method

Base automatically changed from extra-user-role-status to main May 22, 2023 13:24
Retry every 5 seconds for up to 30 seconds.
"""
logger.debug("Waiting until MySQL Router is ready")
self._charm.set_status(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can safely ignore this
Should, for the sake of user visibility, instead of using set_status allow the handler set the status freely at the hook start and let set_status/determine_status at the hook's end?
My rational here is that, by using set_status at the start, we can potentially masquerade intermediate states due to the fixed priority in _determine_status, e.g. a maintenance triggered by tls setup will not be seen by the user case there's a blocked status in place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can potentially masquerade intermediate states

That was my intention, but I see your concern

My thought process was that a WaitingStatus should not override a BlockedStatus—if there's a BlockedStatus, a person needs to intervene and showing them a WaitingStatus says that a person doesn't need to intervene

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, what you described is the behavior of this charm before this PR. (set_status is already called at the end of the hook)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, so the questions boils to what's a better ui?
Other example*, imagine a long running action like backup/restore that will put the unit maintenance status - what it's best:

  • to disallow the task when the unit is blocked?
  • to hide the status change from the UI?
  • to allow a lower priority status be set while the task runs?

imo, I'll go to the last one

*even though this example don't applies to this charm, we should strive for consistent UI across charm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that example, if I saw maintenance status I would assume the charm would end in an active status once the long running action completes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when a hook gets invoked, the focus should be on the said hook

This requires the human operator to be aware of hook invocations—for hooks that are triggered by the human operator (e.g. juju run-action or juju relate or juju remove-relation) this makes sense—but I think this could be confusing for hooks not directly triggered by the human operator.

If at the beginning of the hook the unit is in BlockedStatus and if we know it will be in BlockedStatus at the end of the hook, should we be shifting the human operator's focus away from the blocking issue to a non-blocking progress message?


Other example*, imagine a long running action like backup/restore that will put the unit maintenance status

@paulomach Do you have an example of a blocked unit status that would be temporarily overriden by the maintenance status? I'm struggling to concretely imagine a real-world example where this would occur

Copy link

@Mehdi-Bendriss Mehdi-Bendriss May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires the human operator to be aware of hook invocations

I'm not sure I understand what this means or the difference between the 2. If on_install is invoked as part of the lifecycle, the user should see Installing ...... Once the hook logic finishes, recompute the unit state and reflect it accordingly (the end result can be a BlockedState, an ActiveState or whatever).

2 potential statuses: 1 on hook pre_execution detailing "what this hook is about to do", 1 after the hook logic completes "what is the current state of this unit / app, taking into consideration all components".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what this means or the difference between the 2.

I guess it's a different perspective on the UX

Putting myself in the shoes of a Juju user, I don't know/want to know what hooks are—I just want to know what action is required from me. From that perspective, seeing information about actions that I need to take is more important to me than seeing status updates about the charm. If the charm is healthy & I'm waiting, a status message is helpful—but if the charm is unhealthy, I want to know immediately & I don't want to have to watch juju status for X minutes to check if the charm is healthy.

If the user wants to know about the charm internals while there's a blocking action required from them, I think the debug-log might be a better place.

But I also see the value in showing the internals to the user so that if something goes wrong they have a better sense of where things went wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this with Carl yesterday, sharing my vision here:

UI: it is important to show users the reaction on the called action/command.

I expect to see all statuses: "waiting, maintenance=installing snap; error=failed to fetch stap, try 1; maintenance=installing snap, try 2; blocked=missing relation: database, backend-database" on command 'juju deploy mysql-router-k8s'.

As a user I do not want to see "blocked=missing relation: database, backend-database" immediately (which hides all informational statuses above for me).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very interesting. I guess the charm can be in multiple modes (installing, operating, running an action), and you kind of want a different status to surface in each mode. Perhaps you could see "installing" and "running an action" both as a long-running-operation mode, and your status calculation would be something like this (untested pseudocode)

def get_status(self):
    if charm.in_long_running_operation():
        # assume that long running operation will set status directly
        return ops.UnknownStatus()
    return this_components_status

That way if install was happening, it would set status directly with the sequence of status changes ("installing snap", blah, blah, blah), or if a long-running action was happening, same thing.

in_long_running_operation could be stored in stored state, or calculated, depending on the use case. Not entirely sure this is a good idea or if it would just add confusion.

Retry every 5 seconds for up to 30 seconds.
"""
logger.debug("Waiting until MySQL Router is ready")
self._charm.set_status(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am personally not in favor of "storing / keeping track" of a list of statuses.
I agree with @paulomach's last point, in my opinion: when a hook gets invoked, the focus should be on the said hook, which means if an operation expected to report a status is invoked, regardless of what it is => report it until the operation has completed, and only upon completion, reconcile statuses . i.e:

def _on_start(event: StartEvent):
    set_unit_status(WaitingStatus("Starting mysql router..."))
    ....
    
    if self.unit.is_leader():
        reconcile_app_status()  # check the health of the **application**, and report new status (incl. ActiveStatus if all checks pass) which may be overriden by a subsequent hook call
        
    reconcile_unit_status()  # check the health of the **unit**, and report new status (incl. ActiveStatus if all checks pass) which may be overriden by a subsequent hook call

The "priority" of statuses should be the job of the charm developer to determine in the reconcile_*_status method / function.

src/workload.py Show resolved Hide resolved
@carlcsaposs-canonical
Copy link
Contributor Author

Closed per #71 (comment)

@carlcsaposs-canonical carlcsaposs-canonical deleted the additional-statuses branch May 24, 2023 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants