[Fleet] Add escape-hatch flag for skipping upgrade rate limiting #176823
Labels: Team:Fleet (Team label for Observability Data Collection Fleet team)
kpollich added the Team:Fleet label Feb 13, 2024
Pinging @elastic/fleet (Team:Fleet)
juliaElastic added a commit that referenced this issue Feb 19, 2024

## Summary

Closes #176823

Added `skipRateLimitCheck` to be able to skip rate limiting on `upgrade` and `bulk_upgrade` API as an escape hatch.

To verify:
- enroll an agent 8.11.4 and upgrade to 8.12.0
- within 10m, try upgrade again with the API - the upgrade should fail
- verify that the upgrade works if using the `skipRateLimitCheck` flag

Example:

```
POST kbn:/api/fleet/agents/8b3c4f46-aedb-447f-8a9e-13fe313a3463/upgrade
{
  "version": "8.12.1"
}

// should return error
{
  "statusCode": 429,
  "error": "Too Many Requests",
  "message": "agent 8b3c4f46-aedb-447f-8a9e-13fe313a3463 was upgraded less than 10 minutes ago. Please wait 07m02s before trying again to ensure the upgrade will not be rolled back."
}

POST kbn:/api/fleet/agents/8b3c4f46-aedb-447f-8a9e-13fe313a3463/upgrade
{
  "version": "8.12.1",
  "skipRateLimitCheck":true
}

// should return status 200 and upgrade action successful - check with action_status API
GET kbn:/api/fleet/agents/action_status

// bulk API
POST kbn:/api/fleet/agents/bulk_upgrade
{
  "version":"8.12.0",
  "agents":["8b3c4f46-aedb-447f-8a9e-13fe313a3463"],
  "start_time":"2024-02-14T14:08:23.599Z"
}

// should return 200, and action_status should report failed status
GET kbn:/api/fleet/agents/action_status

Response:
  {
    "type": "UPGRADE",
    "status": "FAILED",
    "latestErrors": [
      {
        "agentId": "8b3c4f46-aedb-447f-8a9e-13fe313a3463",
        "error": "Agent 8b3c4f46-aedb-447f-8a9e-13fe313a3463 is not upgradeable: agent is already being upgraded.",
        "timestamp": "2024-02-14T14:36:47.749Z",
        "hostname": "agent1"
      }
    ]
  },

POST kbn:/api/fleet/agents/bulk_upgrade
{
  "version":"8.12.0",
  "agents":["8b3c4f46-aedb-447f-8a9e-13fe313a3463"],
  "start_time":"2024-02-14T14:08:23.599Z",
  "skipRateLimitCheck":true
}

// should return 200, and action itself complete too
GET kbn:/api/fleet/agents/action_status

  {
    "type": "UPGRADE",
    "status": "COMPLETE",
    "latestErrors": []
  },
```

Covered with API integration tests.

### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
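For callers outside the Kibana Dev Console, the single-agent request from the example could be assembled roughly as follows. This is a minimal sketch, not code from the PR: `buildUpgradeRequest` and the `UpgradeRequest` shape are hypothetical names introduced for illustration. Only the endpoint path and the `version` and `skipRateLimitCheck` body fields come from the description above; the `kbn-xsrf` header is the one Kibana expects on write requests made over plain HTTP.

```typescript
// Hypothetical helper: describes the HTTP request without sending it.
interface UpgradeRequest {
  method: "POST";
  path: string;
  headers: { [name: string]: string };
  body: string;
}

function buildUpgradeRequest(
  agentId: string,
  version: string,
  skipRateLimitCheck: boolean = false
): UpgradeRequest {
  // Only include the escape-hatch flag when it is explicitly requested,
  // so the default request matches the first example above.
  const payload: { version: string; skipRateLimitCheck?: boolean } = { version };
  if (skipRateLimitCheck) {
    payload.skipRateLimitCheck = true;
  }

  return {
    method: "POST",
    path: `/api/fleet/agents/${encodeURIComponent(agentId)}/upgrade`,
    headers: {
      "Content-Type": "application/json",
      "kbn-xsrf": "true", // required by Kibana for non-GET API calls
    },
    body: JSON.stringify(payload),
  };
}

// Usage: the second request from the example, with the rate limit skipped.
const upgradeRequest = buildUpgradeRequest(
  "8b3c4f46-aedb-447f-8a9e-13fe313a3463",
  "8.12.1",
  true
);
console.log(upgradeRequest.path, upgradeRequest.body);
```

Keeping the flag out of the payload by default means a caller has to opt in deliberately, which fits its role as an escape hatch rather than a routine option.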
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Feb 19, 2024
(cherry picked from commit 31517ef)
kibanamachine referenced this issue Feb 21, 2024
…#177157)

# Backport

This will backport the following commits from `main` to `8.13`:
- [[Fleet] added skipRateLimitCheck flag to upgrade API (#176923)](#176923)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Julia Bardi
fkanout pushed a commit to fkanout/kibana that referenced this issue Mar 4, 2024
Today, Fleet enforces a 10-minute rate limit after each attempt to upgrade a given agent: another upgrade cannot be attempted until those 10 minutes have elapsed. This caused issues in the 8.12.1 release, when a bug in Fleet Server (elastic/fleet-server#3263) caused agents to become inadvertently rate limited in perpetuity.
We should introduce a flag that allows users to opt out of rate limiting in extreme edge cases where it behaves unexpectedly. This flag should be documented alongside the existing `force` flag, which allows upgrades to be restarted regardless of their current state.
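The intended behavior can be sketched as a small check that is bypassed when the caller opts out. This is an illustrative sketch only, not Fleet's implementation: `checkUpgradeRateLimit`, `formatWait`, and `RateLimitResult` are hypothetical names, and the `skipRateLimitCheck` parameter stands in for whatever flag name is eventually chosen. The 10-minute window is the one described above.

```typescript
// Hypothetical sketch of a per-agent upgrade rate limit with an escape hatch.
const UPGRADE_RATE_LIMIT_MS = 10 * 60 * 1000; // 10-minute window

interface RateLimitResult {
  allowed: boolean;
  retryAfter?: string; // remaining wait, formatted like "07m02s"
}

// Format a duration in milliseconds as zero-padded minutes and seconds.
function formatWait(ms: number): string {
  const totalSeconds = Math.ceil(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return `${pad(minutes)}m${pad(seconds)}s`;
}

function checkUpgradeRateLimit(
  lastUpgradeAt: Date | undefined,
  now: Date,
  skipRateLimitCheck: boolean = false
): RateLimitResult {
  // Escape hatch: the caller explicitly opted out of rate limiting.
  // An agent with no recorded upgrade attempt is never limited.
  if (skipRateLimitCheck || lastUpgradeAt === undefined) {
    return { allowed: true };
  }

  const elapsedMs = now.getTime() - lastUpgradeAt.getTime();
  const remainingMs = UPGRADE_RATE_LIMIT_MS - elapsedMs;
  return remainingMs > 0
    ? { allowed: false, retryAfter: formatWait(remainingMs) }
    : { allowed: true };
}

// Usage: an upgrade attempted 2m58s after the previous one.
const previousAttempt = new Date("2024-02-14T14:00:00.000Z");
const retryTime = new Date("2024-02-14T14:02:58.000Z");
console.log(checkUpgradeRateLimit(previousAttempt, retryTime));
console.log(checkUpgradeRateLimit(previousAttempt, retryTime, true));
```

Because the flag short-circuits before any timestamp is read, it would still work if the stored upgrade timestamp were wrong or continually refreshed, which is the kind of failure the escape hatch is meant to cover.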