Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updates to pre-built Security ML jobs #146760

Merged
merged 15 commits into from
Dec 15, 2022

Conversation

ajosh0504
Copy link
Contributor

@ajosh0504 ajosh0504 commented Nov 30, 2022

Summary

This PR makes the following updates to our pre-built Security ML jobs:

  • Adds user-friendly names to our pre-built Anomaly Detection jobs. These will be displayed in the Anomalies tab on the new Entity Analytics page in the Security App instead of job IDs.
  • Fixes formatting
  • One job was missing the security job group which is required to display jobs in the Security App. Added that as well.
  • Changed the names of two modules: siem_cloudtrail -> security_cloudtrail and siem_packetbeat -> security_packetbeat. This should have happened a while ago per this issue.

Side effects

  • Any QA tests that reference the siem_cloudtrail and siem_packetbeat modules will need to be changed to reference the new modules instead
  • Any references to the siem modules in the Security App will need to be updated

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@ajosh0504 ajosh0504 added the release_note:skip Skip the PR/issue when compiling release notes label Nov 30, 2022
@ajosh0504
Copy link
Contributor Author

@machadoum Any thoughts on backporting? Or do we want to skip?

@sophiec20
Copy link
Contributor

Please keep on main - changes are too significant for backporting, as FF has passed.

@ajosh0504
Copy link
Contributor Author

@sophiec20 Yeah we didn't intend for this to go in in 8.6. Fair point about keeping it on main owing to the magnitude of changes made.

@droberts195
Copy link
Contributor

The display names part of the change looks fine to me.

There's one change in this PR that's not mentioned in the PR description. Two siem module IDs have been renamed: siem_packetbeat -> security_packetbeat and siem_cloudtrail -> security_cloudtrail. If that was intended to be part of this PR please mention it in the issue description.

This rename will have side effects in other areas. The ML QA tests will need to be changed to reference the new IDs and I guess those IDs might need updating somewhere in the Security app too? This isn't necessarily a problem, but needs to be clearly flagged up as a side effect that others will have to deal with.

  • One job was missing the security job group which is required to display jobs in the Security App. Added that as well.

That part is a bug fix, so could be backported to 8.6. (Just that one line change.)

@peteharverson
Copy link
Contributor

@ajosh0504 if you keep the changes to the module IDs in this PR, you'll need to adjust some of the API tests in x-pack/test/api_integration/apis/ml/modules which reference the old module IDs. Please reach out if you need help finding what to change. I think the 3 files that need some small edits are recognize_module.ts, setup_module.ts and get_module.ts.

@peteharverson peteharverson added the Feature:Anomaly Detection ML anomaly detection label Dec 1, 2022
@machadoum
Copy link
Member

machadoum commented Dec 1, 2022

Two siem module IDs have been renamed: siem_packetbeat -> security_packetbeat and siem_cloudtrail -> security_cloudtrail...
... I guess those IDs might need updating somewhere in the Security app

I searched for the IDs inside the security repo, and this is the only reference that I could find:

@ajosh0504 Could you please update it?

@ajosh0504 ajosh0504 changed the title Adding friendly names to all pre-built Security ML jobs Updates to pre-built Security ML jobs Dec 1, 2022
@ajosh0504
Copy link
Contributor Author

@droberts195 Thanks for catching that. I've updated the PR title, as well as the description to indicate that multiple changes are being made to the pre-built ML jobs in this PR. I've also called out that updates will need to be made in a few other places and I'll try to update as many tests as possible in this PR per recommendations from @peteharverson and @machadoum.

@ajosh0504 ajosh0504 requested a review from a team as a code owner December 1, 2022 18:46
@ajosh0504 ajosh0504 requested a review from jpdjere December 1, 2022 18:46
@ajosh0504
Copy link
Contributor Author

@peteharverson I updated the test files you mentioned above. While updating those, I realized that there's some other siem references in the tests owing to the folders here being named as siem_*. Is this sometime we want to modify in this PR as well?

Another thing I noted was references to moduleIDs containing versions (for example here), but those module IDs don't really exist here?

Copy link
Contributor

@jpdjere jpdjere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners review from Detection-Rules LGTM

@jpdjere
Copy link
Contributor

jpdjere commented Dec 2, 2022

@rylnd Could you double check if the renaming of the modules:

siem_cloudtrail -> security_cloudtrail
siem_packetbeat -> security_packetbeat

has any impact on your side?

We have them hardcoded on x-pack/plugins/security_solution/public/common/components/ml_popover/ml_modules.tsx and this PR modifies that, but these are compared in the useSecurityJobs hook against the modules fetched by the /api/ml/modules/get_module/{moduleId} API, which I'm not familiar with. Not sure if this means that we need to adjust anything serverside.

@ajosh0504 Can we wait until getting confirmation on this before merging?

@rylnd
Copy link
Contributor

rylnd commented Dec 3, 2022

Could you double check if the renaming of the modules has any impact on your side?

@jpdjere thanks for the ping; I'm not certain on what the exact behavior is going to be here currently, but I can share some context/thoughts:

  • Because ML Modules don't (currently) have a categorization field associated with them, we cannot programmatically filter for security modules in the same way that we can Jobs (which have a group: security field). That mlModules list is our allowlist for Security's jobs, so we'd need to add the "new" module names to that list in order for them to appear in our UI.
  • My main concern is around how those old jobs are going to behave with this rename. I believe we mainly use the job_id for the purposes of identification, and so existing jobs will continue to show as installed despite the module rename, but we should verify that before this is released.

@pheyos
Copy link
Member

pheyos commented Dec 5, 2022

Hi @ajosh0504,

I realized that there's some other siem references in the tests owing to the folders here being named as siem_*. Is this sometime we want to modify in this PR as well?

The esArchives have been named close to the module ids to make it easier for developers to recognize / find them, so it would make sense to rename the archives as part of this PR. Note, that this should also include updating the index name used by the data archive. Let me know if you need help with this or prefer to have this in a separate PR.

Another thing I noted was references to moduleIDs containing versions (for example here), but those module IDs don't really exist here?

Please note, that the directory names listed here are not necessarily the module ids. The actual module ids are defined in the manifest files, se e.g. here.

@ajosh0504
Copy link
Contributor Author

I've requested @pheyos to help with the esArchive naming update.

@rylnd I've added the new module names to the mlModules list as pointed out by @machadoum earlier. Also, since we aren't backporting these changes, I don't see why old jobs should be impacted. But yes, let me see if I can verify this.

@pheyos
Copy link
Member

pheyos commented Dec 13, 2022

@ajosh0504 the esArchive rename (including renaming of the contained indices) is done in 5e60cec.

@ajosh0504
Copy link
Contributor Author

@pheyos Seeing some failing tests after your commit, although looking at the logs, I'm not sure if the failures are related to your changes. Could you take a look?

@pheyos
Copy link
Member

pheyos commented Dec 13, 2022

@ajosh0504 yes, the failures seem unrelated to the archive rename. Might be worth to comment @elasticmachine merge upstream.

@ajosh0504
Copy link
Contributor Author

@elasticmachine merge upstream

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ML changes LGTM. Just left some minor comments for the security_app_display_name values.

I was able to test creation of the jobs inside ML for:

  • security_auth
  • security_cloudtrail
  • security_linux
  • security_windows

Jobs were created successfully - example from the cloudtrail jobs:

image

@ajosh0504
Copy link
Contributor Author

ajosh0504 commented Dec 14, 2022

@ryland I ran some tests locally to check the impact on existing ML jobs.

Steps

  • Changed the security_cloudtrail module to how to was before i.e. siem_cloudtrail and containing siem references in the job configs, manifest etc.
  • Enabled the Cloudtrail module in my local cluster. Ensured that I was seeing the old configurations- note that the created_by field in the screenshot says ml-module-siem-cloudtrail

Screen Shot 2022-12-14 at 2 49 17 PM

  • Changed the module id back to security_cloudtrail and removed other siem references.
  • Noticed errors saying the job ids and datafeeds already exist.

Screen Shot 2022-12-14 at 2 51 50 PM

This means that updating the module id and removing other siem references from the job configurations will not impact old jobs.

@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Security Solution Tests #1 / Alerts timeline Add a non-empty property to default timeline

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 10.1MB 10.1MB +32.0B
Unknown metric groups

ESLint disabled in files

id before after diff
osquery 1 2 +1

ESLint disabled line counts

id before after diff
enterpriseSearch 19 21 +2
fleet 61 67 +6
osquery 109 115 +6
securitySolution 445 451 +6
total +20

Total ESLint disabled count

id before after diff
enterpriseSearch 20 22 +2
fleet 70 76 +6
osquery 110 117 +7
securitySolution 521 527 +6
total +21

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@ajosh0504 ajosh0504 enabled auto-merge (squash) December 15, 2022 15:36
@ajosh0504 ajosh0504 merged commit 9e4cb9d into elastic:main Dec 15, 2022
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Dec 15, 2022
@spong
Copy link
Member

spong commented Dec 16, 2022

@ajosh0504 @rylnd -- just now seeing this PR, and was also curious about the upgrade flow and UI functionality within the Security app with this module name change so I did a little testing as well.

Similar to @ajosh0504's test above, I checked out the commit before this landed 578d643 where the module was still named siem_packetbeat, setup packetbeat, then installed the jobs via the Security ML Job Settings UI, and then was able to successfully enable them as well. I then stopped kibana, checked out this branch, and verified the Security ML Jobs Settings UI functioned without issue. The previous running job was still enabled, and it was still possible to stop/start it or any of the other jobs within the same module.

I was then curious if there would be any hiccups in the Security UI if needing to re-install any of the jobs, so I went to the ML UI, deleted one job (packetbeat_rare_urls), then went back to the Security UI and enabled that job and it successfully re-installed and enabled the job without issue/error 🎉. You can see in the image below the different created_by values as they differ between the module update:

So as you suspected @rylnd, since everything on the Security side is using job_id for identification (and those didn't change here), we're all good with regards to showing jobs and their relationship to any detection rules that may be active. Additionally (and where my concern was), the Security job management UI's still function for install/enable/disable when previous installs are present, and the Detection Rule Job selector has no issues displaying the jobs either.

Also, I went went ahead and created this enhancement for surfacing the new security_app_display_name field in our Security UI's. Thank you @ajosh0504 for adding this! 🙂

@ajosh0504
Copy link
Contributor Author

@spong Thank you so much for the additional tests you ran. Glad that my changes don't break things at y'all's end. :)

Also, yay enhancements!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Anomaly Detection ML anomaly detection :ml release_note:skip Skip the PR/issue when compiling release notes v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.