-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Response Ops][Alerting] Creating global service for fetching and caching rules settings #192404
Conversation
@@ -588,33 +591,36 @@ export class AlertingPlugin { | |||
}; | |||
|
|||
taskRunnerFactory.initialize({ | |||
logger, | |||
actionsConfigMap: getActionsConfigMap(this.config.rules.run.actions), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alphabetizing 🙈
Pinging @elastic/response-ops (Team:ResponseOps) |
rulesSettingsService: new RulesSettingsService({ | ||
getRulesSettingsClientWithRequest, | ||
isServerless: !!plugins.serverless, | ||
}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actual changes are to pass this new global rules setting service to the task runner factory and to remove getRulesSettingsClientWithRequest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Tested locally, works as expected!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, but feels like x-pack/plugins/alerting/server/rules_settings/rules_settings_service.test.ts
could use some additional tests. Also noted a potential design change to use just one map instead of three, within the service.
@@ -86,6 +87,8 @@ export default function createAlertsAsDataInstallResourcesTest({ getService }: F | |||
|
|||
it(`should write alert docs during rule execution with flapping.enabled: ${enableFlapping}`, async () => { | |||
await setFlappingSettings(enableFlapping); | |||
// wait so cache expires | |||
await setTimeoutAsync(60000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a minute??? Oh noes ...
I wonder ... could we write a new route for FTR that we could call to modify the value? Doesn't need to be in this PR, but it is running this test twice so we just added 2m to this FTR group, it pains my heart :-)
And there are some more below :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in cf77b09
|
||
export class RulesSettingsService { | ||
private defaultQueryDelaySettings: RulesSettingsQueryDelayProperties; | ||
private settingsLastUpdated: Map<string, number> = new Map(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if it would be better to have one map, that had the lastUpdated, queryDelaySettings, and flappingSettings in it, vs 3 maps. Seems like we'd have a better chance of keeping everything in sync over time. Probably easier to add new settings as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in f2aa813
this.flappingSettings.has(spaceId) && | ||
this.queryDelaySettings.has(spaceId) | ||
) { | ||
const lastUpdated = new Date(this.settingsLastUpdated.get(spaceId)!).getTime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
critical field! You can imagine this getting messed up and then we never refresh the cache, or we refresh it every time. Everything looks good here, more of a note for myself :-)
const currentFlappingSettings = this.flappingSettings.get(spaceId)!; | ||
const currentQueryDelaySettings = this.queryDelaySettings.get(spaceId)!; | ||
|
||
if (now - lastUpdated >= CACHE_INTERVAL_MS) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll want to parameterize this somehow, to let us do the FTRs without all the 1m timeouts ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in cf77b09
return await this.fetchSettings(request, spaceId, now); | ||
} catch (err) { | ||
// return cached settings on error | ||
return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we're logging errors on the inner calls, but seems kinda scary to not have anything here. Feels like maybe having a debug logs might good. The existing comments seem like good locations and text for those logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in a45e6a3
expect(rulesSettingsService.queryDelaySettings.get('default')).toEqual({ delay: 0 }); | ||
// @ts-ignore - accessing private variable | ||
expect(rulesSettingsService.flappingSettings.get('default')).toEqual({ | ||
enabled: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make this a constant, save some room.
Also, we don't seem to be testing any other values that this. Seems like we should have a test like the one where we test the changed query delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 60ce3e6
}); | ||
|
||
test('should use cached settings if cache has not expired', async () => { | ||
const rulesSettingsClient = rulesSettingsClientMock.create(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we set the values to non-default values, to make sure we end up getting the cached ones, instead of default? Seems appropriate for all the "cached" tests below ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 60ce3e6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for the changes!
@elasticmachine merge upstream |
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: cc @ymao1 |
…hing rules settings (elastic#192404) Resolves elastic#184321 and elastic#149884 ## Summary Created a `RulesSettingsService` that caches flapping and query delay settings per space for 1 minute so rules are not accessing the SO index with every execution. Also moved all classes related to the rules settings into the `rules_settings` folder so some of these changes are just renaming changes. ## To Verify 1. Add some logging to `x-pack/plugins/alerting/server/rules_settings/rules_settings_service.ts` to indicate when settings are being fetched and when they're returning from the cache. 2. Create and run some rules in different spaces to see that flapping and query delay settings are being returned from the cache when the cache has not expired. --------- Co-authored-by: Elastic Machine <[email protected]> (cherry picked from commit 7a0fe2e)
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions ?Please refer to the Backport tool documentation |
…nd caching rules settings (#192404) (#193011) # Backport This will backport the following commits from `main` to `8.x`: - [[Response Ops][Alerting] Creating global service for fetching and caching rules settings (#192404)](#192404) <!--- Backport version: 9.4.3 --> ### Questions ? Please refer to the [Backport tool documentation](https://github.com/sqren/backport) <!--BACKPORT [{"author":{"name":"Ying Mao","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-09-16T13:22:44Z","message":"[Response Ops][Alerting] Creating global service for fetching and caching rules settings (#192404)\n\nResolves #184321 and\r\nhttps://github.com//issues/149884\r\n\r\n## Summary\r\n\r\nCreated a `RulesSettingsService` that caches flapping and query delay\r\nsettings per space for 1 minute so rules are not accessing the SO index\r\nwith every execution.\r\n\r\nAlso moved all classes related to the rules settings into the\r\n`rules_settings` folder so some of these changes are just renaming\r\nchanges.\r\n\r\n## To Verify\r\n1. Add some logging to\r\n`x-pack/plugins/alerting/server/rules_settings/rules_settings_service.ts`\r\nto indicate when settings are being fetched and when they're returning\r\nfrom the cache.\r\n2. Create and run some rules in different spaces to see that flapping\r\nand query delay settings are being returned from the cache when the\r\ncache has not expired.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <[email protected]>","sha":"7a0fe2e83683cf38c537694daa88a3a75bfe3e57","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Feature:Alerting","release_note:skip","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"[Response Ops][Alerting] Creating global service for fetching and caching rules settings","number":192404,"url":"https://github.com/elastic/kibana/pull/192404","mergeCommit":{"message":"[Response Ops][Alerting] Creating global service for fetching and caching rules settings (#192404)\n\nResolves #184321 and\r\nhttps://github.com//issues/149884\r\n\r\n## Summary\r\n\r\nCreated a `RulesSettingsService` that caches flapping and query delay\r\nsettings per space for 1 minute so rules are not accessing the SO index\r\nwith every execution.\r\n\r\nAlso moved all classes related to the rules settings into the\r\n`rules_settings` folder so some of these changes are just renaming\r\nchanges.\r\n\r\n## To Verify\r\n1. Add some logging to\r\n`x-pack/plugins/alerting/server/rules_settings/rules_settings_service.ts`\r\nto indicate when settings are being fetched and when they're returning\r\nfrom the cache.\r\n2. Create and run some rules in different spaces to see that flapping\r\nand query delay settings are being returned from the cache when the\r\ncache has not expired.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <[email protected]>","sha":"7a0fe2e83683cf38c537694daa88a3a75bfe3e57"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/192404","number":192404,"mergeCommit":{"message":"[Response Ops][Alerting] Creating global service for fetching and caching rules settings (#192404)\n\nResolves #184321 and\r\nhttps://github.com//issues/149884\r\n\r\n## Summary\r\n\r\nCreated a `RulesSettingsService` that caches flapping and query delay\r\nsettings per space for 1 minute so rules are not accessing the SO index\r\nwith every execution.\r\n\r\nAlso moved all classes related to the rules settings into the\r\n`rules_settings` folder so some of these changes are just renaming\r\nchanges.\r\n\r\n## To Verify\r\n1. Add some logging to\r\n`x-pack/plugins/alerting/server/rules_settings/rules_settings_service.ts`\r\nto indicate when settings are being fetched and when they're returning\r\nfrom the cache.\r\n2. Create and run some rules in different spaces to see that flapping\r\nand query delay settings are being returned from the cache when the\r\ncache has not expired.\r\n\r\n---------\r\n\r\nCo-authored-by: Elastic Machine <[email protected]>","sha":"7a0fe2e83683cf38c537694daa88a3a75bfe3e57"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}] BACKPORT--> Co-authored-by: Ying Mao <[email protected]>
…tching and caching rules settings (elastic#192404) (elastic#193011)" This reverts commit 7939f22.
Resolves #184321 and #149884
Summary
Created a
RulesSettingsService
that caches flapping and query delay settings per space for 1 minute so rules are not accessing the SO index with every execution.Also moved all classes related to the rules settings into the
rules_settings
folder so some of these changes are just renaming changes.To Verify
x-pack/plugins/alerting/server/rules_settings/rules_settings_service.ts
to indicate when settings are being fetched and when they're returning from the cache.