[8.13](backport #38294) [Metricbeat][Azure] Azure monitor/storage: Fix 429 Too Many Requests error #38457

mergify · 2024-03-20T08:08:01Z

Proposed commit message

Please see the issue elastic/integrations#8575.

Metricsets affected by this change: storage and monitor.

This PR fixes the issue the following way:

We check if the error produced has the header "Retry After".
If this header is present then we use time.sleep for the seconds given by the header and try again after the time is up.
If the header is not present, then it is likely another error and we return.

This is also the retry suggestion given by Retry Usage Guidance from Azure.

For monitor metricset, the previous implementation also had a different problem affecting the number of calls: we were trying to get the metrics definition for every metric, without checking if we had done that already for the combination namespace + resource ID. So, for example, this configuration:

    - resource_query: "resourceType eq 'Microsoft.ContainerRegistry/registries'"
      metrics:
        - name: [ "TotalPullCount", "SuccessfulPullCount", "TotalPushCount", "SuccessfulPushCount","RunDuration", "AgentPoolCPUTime" ]
          namespace: "Microsoft.ContainerRegistry/registries"
          timegrain: "PT1M"
        - name: [ "StorageUsed" ]
          namespace: "Microsoft.ContainerRegistry/registries"
          timegrain: "PT1H"
          dimensions:
            - name: "Geolocation"
              value: "*"

Would lead to 2 calls for each resource, even though the namespace is the same for every metric group. This is also fixed in this PR.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Enable azure monitor metricset, like in this example configuration.

To reproduce the error you can change the code a bit to force it to make many requests. You can put the getMetricsDefinitionsWithRetry inside a for cycle for that.

I added some printing lines to the code. You should see an output like this:

[...]
>> Trying request  400
>> Trying request  401
>> Trying request  402
>> Metricbeat will sleep for 309 seconds and retry again.
>> Trying request  402  <---------- Same request as before
>> Trying request  403
[...]

Please notice that running this takes around 7 minutes to first encounter the error. After that time you have to wait for the sleep time, and only then you start making requests again.

Related issues

Closes elastic/integrations#8575.

This is an automatic backport of pull request #38294 done by [Mergify](https://mergify.com).

…error (#38294) * Fiz azure too many requests error Signed-off-by: constanca <[email protected]> * Add CHANGELOG.next.asciidoc entry Signed-off-by: constanca <[email protected]> * Run mage check Signed-off-by: constanca <[email protected]> * Get metrics 1 time for each namespace/id Signed-off-by: constanca <[email protected]> * Add error comparison Signed-off-by: constanca <[email protected]> * Move retry to monitor service Signed-off-by: constanca <[email protected]> * Remove GetMetricDefinitions without retry Signed-off-by: constanca <[email protected]> * fix storage unit tests Signed-off-by: constanca <[email protected]> * fix monitor unit tests Signed-off-by: constanca <[email protected]> * Remove unnecessary for cycle Signed-off-by: constanca <[email protected]> --------- Signed-off-by: constanca <[email protected]> (cherry picked from commit 10ff992)

botelastic · 2024-03-20T08:08:08Z

This pull request doesn't have a Team:<team> label.

elasticmachine · 2024-03-20T09:03:48Z

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS

Expand to view the summary

Build stats

Duration: 55 min 16 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.
/package : Generate the packages and run the E2E tests.
/beats-tester : Run the installation tests with beats-tester.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

mergify bot requested a review from a team as a code owner March 20, 2024 08:08

mergify bot added the backport label Mar 20, 2024

mergify bot assigned constanca-m Mar 20, 2024

botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 20, 2024

constanca-m merged commit 2516537 into 8.13 Mar 20, 2024
21 checks passed

constanca-m deleted the mergify/bp/8.13/pr-38294 branch March 20, 2024 09:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[8.13](backport #38294) [Metricbeat][Azure] Azure monitor/storage: Fix 429 Too Many Requests error #38457

[8.13](backport #38294) [Metricbeat][Azure] Azure monitor/storage: Fix 429 Too Many Requests error #38457

mergify bot commented Mar 20, 2024 •

edited by zube bot

Loading

botelastic bot commented Mar 20, 2024

elasticmachine commented Mar 20, 2024

Build stats

[8.13](backport #38294) [Metricbeat][Azure] Azure monitor/storage: Fix 429 Too Many Requests error #38457

[8.13](backport #38294) [Metricbeat][Azure] Azure monitor/storage: Fix 429 Too Many Requests error #38457

Conversation

mergify bot commented Mar 20, 2024 • edited by zube bot Loading

Proposed commit message

Checklist

How to test this PR locally

Related issues

botelastic bot commented Mar 20, 2024

elasticmachine commented Mar 20, 2024

💚 Build Succeeded

Build stats

❕ Flaky test report

🤖 GitHub comments

mergify bot commented Mar 20, 2024 •

edited by zube bot

Loading