Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add error handling for host allocation failures #1149

Merged

Conversation

FermiPerumal
Copy link
Contributor

Add logic to catch allocation errors while moving hosts
and fail gracefully

Add logic to catch allocation errors while moving hosts
and fail gracefully

Related HD ticket: http://rails.spimageworks.com/helpdesk/tickets/423339
Copy link
Collaborator

@bcipriano bcipriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change generally LGTM. Just a couple minor comments, mostly related to the warnings generated by the lint check.

self.cuebotCall(host.setAllocation,
"Set Allocation on %s Failed" % host.data.name,
allocation)
except Exception as e:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more specific exception type you can catch here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not. The gRPC call returns a Rendevous exception, hence I parse the exception details to check if it's an allocation error

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recently stumbled upon a similar grpc exception handling issue and this is a wide spread issue, most locations on the code are catching a general exception to handle grpc exceptions.

A better option is to catch grpc.RpcError and check the .details() to check the msg.

try:
  something_that_triggers_rendezvous()
except grpc.RpcError as rpc_error_call:  
  code = rpc_error_call.code() # (grpc.StatusCode.UNAVAILABLE, grpc.StatusCode.CANCELLED, ...)
  details = rpc_error_call.details()

See for exemple connectGrpcWithRetries on rqnetwork.py.

IMO this can be handled in another PR as it's a widespread implementation error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bcipriano waiting for your confirmation to merge this

cuegui/cuegui/MenuActions.py Outdated Show resolved Hide resolved
@FermiPerumal
Copy link
Contributor Author

Thank you for the approval. I don't seem to have write permissions so I am not able to merge the PR.

@bcipriano bcipriano merged commit de10306 into AcademySoftwareFoundation:master Jul 6, 2022
akim-ruslanov pushed a commit to akim-ruslanov/OpenCue that referenced this pull request Jul 6, 2022
akim-ruslanov pushed a commit to akim-ruslanov/OpenCue that referenced this pull request Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants