-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Security Solution] Setting a large default rule interval #171814
base: main
Are you sure you want to change the base?
[Security Solution] Setting a large default rule interval #171814
Conversation
Hi @banderror, Please let me know your thoughts on this. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @VidhiRambhia, I reviewed the change and left a comment.
Pinging @elastic/security-detections-response (Team:Detections and Resp) |
Pinging @elastic/security-solution (Team: SecuritySolution) |
/ci |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the Threat Hunting team
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with the caveat I think with these changes it would be good to see a test case in x-pack/test/security_solution_api_integration/test_suites/detections_response/rules_management/rule_creation/trial_license_complete_tier/create_rules.ts
that removes the interval, creates a rule with no interval defined, and checks that it still defaults to 5m
.
@@ -1624,7 +1624,7 @@ export default ({ getService }: FtrProviderContext): void => { | |||
{ | |||
type: BulkActionEditTypeEnum.set_rule_actions, | |||
value: { | |||
throttle: '1h', | |||
throttle: '1d', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a preference here for mixing '1d'
values with '24h'
values? Should we standardize on one vs the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used 24h
everywhere, but here it had to be 1d
, because throttle has to be on of these
const ThrottleForBulkActions = z.enum(['rule', '1h', '1d', '7d']);
Thanks @dhurley14! Added in this commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
security-engineering-productivity changes LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates @nikitaindik! Rule management changes lgtm, just left a question
@@ -145,13 +145,17 @@ export default ({ getService }: FtrProviderContext): void => { | |||
const rule: ReturnType<typeof getSimpleRule> = { | |||
...getSimpleRule('rule-1'), | |||
throttle: '1h', // <-- throttle makes this a scheduled action | |||
interval: '50m', // has to be less than "throttle" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this have to be less than throttle here but equal in other places? Just because we're testing action logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, you're right. I have re-checked and it looks like we can use 24h
in both places.
…se-schedule-interval
@elasticmachine merge upstream |
@elasticmachine run elasticsearch-ci/docs |
/ci |
…kibana into VidhiRambhia_increase-schedule-interval
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this file we have mocks with interval == 5m
. Is this on purpose?
getUpdateNewTermsSchemaMock
- this function seems to be unused. Can we update the interval here to 24h?getCreateThreatMatchRulesSchemaMock
- this one is used, but only from unit tests. Can we update the interval here to 24h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Changed to 24h
in both functions in this commit
interval: '24h', | ||
from: 'now-6m', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interval
and from
fields are tightly coupled. In the app, the from
field will always be now - (interval + lookback)
. Here, the implied lookback was 1 minute before the change to 24h.
I'd suggest to update the from
field according to the new interval value in all the places where we're updating the interval in this PR.
interval: '24h', | |
from: 'now-6m', | |
interval: '24h', // or 1440 mins | |
from: 'now-1450m', // lookback of 10 mins in addition to the interval of 1440 mins |
We can keep the tests and mocks where from is a constant from the very past, where likely it was done on purpose to be able to read source events with static timestamps (e.g. from an es archive):
interval: '24h',
from: '1900-01-01T00:00:00.000Z',
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Updated in this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a general comment not related to this file. I can still see many instances of the 5-minute interval in test mocks, integration tests, and e2e tests.
Common and server-side mocks:
Line 367 in 557e61d
schedule: { interval: '5m' }, Line 520 in 557e61d
interval: '5m', kibana/x-pack/plugins/security_solution/server/lib/detection_engine/routes/__mocks__/utils.ts
Line 51 in 557e61d
interval: '5m', Line 237 in 557e61d
interval: '5m', Line 501 in 557e61d
interval: '5m', kibana/x-pack/plugins/security_solution/server/lib/detection_engine/rule_types/__mocks__/rule.ts
Line 28 in 557e61d
interval: '5m',
There are many interval: '5m'
instances in the security_solution_api_integration/test_suites/detections_response
folder.
There's also one in security_solution_cypress/cypress/e2e/detection_response
:
Finally, there's a lot of them in unit tests.
While these intervals might not be affecting those tests where we create enabled rules and run them, keeping hundreds of interval: '5m'
instances in the code seems off, unnecessary and potentially unsafe.
I'd be fine with merging this PR without further large-scale refactoring, if we create a new tech debt ticket for addressing this in the future. As part of this ticket, we might want to consolidate all our rule mocks under security_solution/common/api/detection_engine/model/rule_schema
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another general comment: did we run the changed tests in the flaky test runner?
buildkite test this |
…se-schedule-interval
… `getUpdateNewTermsSchemaMock`
…kibana into VidhiRambhia_increase-schedule-interval
Thanks for the feedback @banderror! I'll update "from" values as you suggested, add a new Flaky Test Runner run and will create a ticket for further refactoring. Changing the PR status back to Draft for now. |
@@ -11,5 +11,6 @@ export default function ({ loadTestFile }: FtrProviderContext) { | |||
loadTestFile(require.resolve('./create_rules')); | |||
loadTestFile(require.resolve('./create_new_terms')); | |||
loadTestFile(require.resolve('./preview_rules')); | |||
loadTestFile(require.resolve('./create_rules_bulk')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test file was not imported anywhere for some reason. Even though the tests in this file seem legit and not a duplicate of some other tests. So I imported it here.
/ci |
💔 Build FailedFailed CI StepsTest Failures
Metrics [docs]
History
To update your PR or re-run it, just comment with: |
Resolves: #153689
Summary
The issue: In many tests we use a short rule execution interval ('5m' or less). This leads to rules getting scheduled for execution many times during a test run. It unnecessarily hogs on CI resources. For most tests we just want a single execution.
This PR:
24h
as the defaultinterval
for all rule creation utils used in Cypress tests and in API integration tests24h
interval
for a few tests that need a shorter interval.