Problems using Configurable Retry Policy #21

asqui · 2019-10-01T11:02:58Z

Looking at the documentation on Configurable Retry Policy Support:

retry_count is documented as "Number of times a request is tried upon encountering a specific response code, such as 429." however I don't think the implementation respects this -- I believe it will make a total of retry_count attempts across any error type.
retry_delay and retry_max_delay documentation does not make it clear that these parameters control the full jitter algorithm. Also, they probably don't need to be integers.
In particular, retry_max_delay is documented as "The upper limit for the amount of time, in seconds, for which the retries will be attempted for." which is not respected, but would be useful to have. (i.e. I think you should add tenacity.stop_after_delay(configuration.retry_max_delay) to the Retrying(stop=...) argument, so that there is a way to limit the total amount of time spent retrying.)

Separately, it doesn't look like metrics are getting published during retries because opsgenie_sdk.rest.RESTClientObject.request() does not publish HTTP Metrics in the scenario where pool_manager.request() raises an exception. This is making it harder for me to debug what is going on.

The text was updated successfully, but these errors were encountered:

asqui · 2019-10-01T14:15:41Z

Additional observations:

The Configuration object has an undocumented retries argument which (I think) is unused, but comes up in intellisense auto-complete lists.
The back_off parameter is not prefixed with retry_ like all the rest of the parameters for retry logic. retry_back_off might be more intuitive (and easier to find in intellisense auto-complete lists)

asqui · 2019-10-01T15:22:24Z

It's not possible to disable retry logic by setting retry_enabled = False because this parameter is quietly being ignored due to a bug. The and below needs to be an & for this to work as intended.

opsgenie-python-sdk/opsgenie_sdk/api_client.py

Lines 84 to 86 in f2a2103

    
           retry=(tenacity.retry_if_result(self.is_retry_enabled) and 
        
                  ((tenacity.retry_if_exception_type(RetryableException)) | 
        
                   (tenacity.retry_if_exception_type(HTTPError)))))

This is needed because the retry logic in SDK doesn't work properly :-( (see opsgenie#21)

mhamzak008 · 2019-11-06T11:24:44Z

Hi, thanks a bunch for pointing out all of this and sorry for the late response.

I have changed the explanations of the retry_count and retry_delay in the documentation to better align with what they actually do.

The bug which was preventing the disablement of retrying will be fixed, by implementing a custom callback function for deciding whether the retry should stop or not, when #29 is released.
It looks like that Tenacity doesn't support and or & operator in the retry keyword argument because when we changed to & from and, the retrying was stopped but wasn't working with it enabled.

I will remove the unused retries parameter in the Configuration object in 17f99e2

I agree that the back_off parameter of the Configuration object should be renamed to retry_back_off, but doing so might introduce some breaking changes for some customers. So, I will be postponing it to the next major release perhaps.

And I am looking into why the metrics are not being published when pool_manager.request() raises an exception at the moment

mhamzak008 · 2019-11-06T13:26:04Z

Coming back to metrics' publishing, http metrics are published only after an http request is completed. However, to make debugging easier, I have added a change to publish sdk metrics whenever an exception is thrown while attempting to call the api in b382348

…ject now (see #21)

Fixes for #21

zfr · 2020-01-24T14:10:01Z

I'm closing the ticket now. Thanks for the collaborative work guys. If we need an another improvement later, let's create a new issue then.

mhamzak008 self-assigned this Nov 5, 2019

mhamzak008 added the bug label Nov 6, 2019

asqui added a commit to BATS/opsgenie-python-sdk that referenced this issue Nov 6, 2019

Fix retry_enabled=False config flag to correctly disable SDK retry logic

4ab04fa

This is needed because the retry logic in SDK doesn't work properly :-( (see opsgenie#21)

asqui added a commit to BATS/opsgenie-python-sdk that referenced this issue Nov 6, 2019

Fix retry_enabled=False config flag to correctly disable SDK retry logic

212d1cd

This is needed because the retry logic in SDK doesn't work properly :-( (see opsgenie#21)

asqui mentioned this issue Nov 6, 2019

Fix retry_enabled=False config flag to correctly disable SDK retry logic #28

Closed

mhamzak008 added the docs enhancement should update the documentation to better explain this label Nov 6, 2019

mhamzak008 added a commit that referenced this issue Nov 6, 2019

Bug fix: can now disable retrying of the sdk. (See #21)

3f155ee

mhamzak008 added a commit that referenced this issue Nov 6, 2019

bug fix: respecting retry_max_delay parameter of the Configuration ob…

ef71afe

…ject now (see #21)

zfr pushed a commit that referenced this issue Jan 24, 2020

Bug fix: can now disable retrying of the sdk. (See #21)

743f878

zfr pushed a commit that referenced this issue Jan 24, 2020

bug fix: respecting retry_max_delay parameter of the Configuration ob…

6b8f896

…ject now (see #21)

zfr added a commit that referenced this issue Jan 24, 2020

Merge pull request #29 from opsgenie/fixes-for-#21

28db689

Fixes for #21

zfr closed this as completed Jan 24, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Problems using Configurable Retry Policy #21

Problems using Configurable Retry Policy #21

asqui commented Oct 1, 2019 •

edited

Loading

asqui commented Oct 1, 2019

asqui commented Oct 1, 2019

mhamzak008 commented Nov 6, 2019 •

edited

Loading

mhamzak008 commented Nov 6, 2019 •

edited

Loading

zfr commented Jan 24, 2020

Problems using Configurable Retry Policy #21

Problems using Configurable Retry Policy #21

Comments

asqui commented Oct 1, 2019 • edited Loading

asqui commented Oct 1, 2019

asqui commented Oct 1, 2019

mhamzak008 commented Nov 6, 2019 • edited Loading

mhamzak008 commented Nov 6, 2019 • edited Loading

zfr commented Jan 24, 2020

asqui commented Oct 1, 2019 •

edited

Loading

mhamzak008 commented Nov 6, 2019 •

edited

Loading

mhamzak008 commented Nov 6, 2019 •

edited

Loading