Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Security Solutions][Detection Engine] Adds a warning banner when the alerts data has not been migrated yet. #90258

Merged
merged 14 commits into from
Feb 17, 2021

Conversation

FrankHassanabad
Copy link
Contributor

@FrankHassanabad FrankHassanabad commented Feb 4, 2021

Summary

Adds a warning banner for when the alerting/signals data has not been migrated to the new structure. Although we are planning on supporting some backwards compatibility where the rules don't completely blow up, this support of backwards compatibility is going to be best effort and not have explicit tests and checks against backwards compatibility. Hence the reason we need to alert any users of the system when we can that they should have an administrator visit the detections page to start a migration.

From previous reasons why we don't migrate on startup of Kibana is that there are multiple instances running and it might be a worse situation so we migrate on page visit by an administrator to reduce chances of issues. In the future we might revisit this decision but for now this is what we have moved forward with.

If the user does not have sufficient privileges such as t1 analyst to see if they have should upgrade, no message is shown to those users.

This PR adds the following banner which is non-dismissible to:

  • Main detections page
  • Manage rules page
  • View/Edit rules page

Screen Shot 2021-02-03 at 4 16 00 PM

If other dismissible alerts are on the page then you will get a stacked effect until you dismiss those messages. Again, this message cannot be dismissed intentionally to let the user know that they should contact an administrator to update/upgrade the alerting/signal data:
Screen Shot 2021-02-03 at 5 41 57 PM

Other items of note:

  • Added ability to remove the button from the callouts
  • Consolidated in one area some types
  • Removed one part of the callout that has branching logic we never activate. We can re-add that later if we do have a need for it
  • e2e Cypress tests added to detect when the banner should be present
  • Backfilled unit tests for enzyme for some of the callout code

Manual testing:
Bump this number in your dev env:
https://github.com/elastic/kibana/blob/master/x-pack/plugins/security_solution/server/lib/detection_engine/routes/index/get_signals_template.ts#L11

Give yourself these permissions (or use one of the scripts for creating these roles):
Screen Shot 2021-02-05 at 1 49 02 PM

Visit the page.

Checklist

Delete any items that are not applicable to this PR.

@FrankHassanabad FrankHassanabad changed the title Warning banner [Security Solutions][Detection Engine] Adds a warning banner when the alerts data has not been migrated yet. Feb 4, 2021
@FrankHassanabad FrankHassanabad self-assigned this Feb 4, 2021
@FrankHassanabad FrankHassanabad added v8.0.0 v7.12.0 v7.11.1 release_note:skip Skip the PR/issue when compiling release notes labels Feb 4, 2021
@FrankHassanabad FrankHassanabad marked this pull request as ready for review February 5, 2021 20:52
@FrankHassanabad FrankHassanabad requested review from a team as code owners February 5, 2021 20:52
@FrankHassanabad FrankHassanabad added release_note:enhancement and removed release_note:skip Skip the PR/issue when compiling release notes labels Feb 5, 2021
Copy link
Contributor

@banderror banderror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I'm super sorry, didn't have enough time to carefully review this PR today. Posted some comments, would look at it closer tomorrow morning if it's still open. Pls don't wait for me if other people approve it.

Comment on lines 32 to 41
describe('Detections > Need Admin Callouts indicating an admin is needed to migrate the alert data set', () => {
const NEED_ADMIN_FOR_UPDATE_CALLOUT = 'need-admin-for-update-rules';
const ALERTS_CALLOUT = 'read-only-access-to-alerts';
const RULES_CALLOUT = 'read-only-access-to-rules';

before(() => {
cleanKibana();
loginAndWaitForPageWithoutDateRange(DETECTIONS_URL, ROLES.t1_analyst);
waitForAlertsIndexToBeCreated();
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I noticed it right that there's a bunch of very similar spec.ts files which test the same behaviour, with the only difference in the role of the user? Here it's ROLES.t1_analyst, in the other files all the other roles we have?

I'm thinking maybe it's possible to run describe() blocks in a for loop and pass these roles and other required parameters as "test cases"? Just like we can do

cases.forEach(c => {
  it(c.name, ...
});

maybe we could use the same trick with describe blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These behave differently for each role in most cases. For example in some roles we have to test for some callouts and in others we have to test for the non-existent.

E.x.

    it('We show three primary callouts', () => {
      waitForCallOutToBeShown(ALERTS_CALLOUT, 'primary');
      waitForCallOutToBeShown(RULES_CALLOUT, 'primary');
      waitForCallOutToBeShown(NEED_ADMIN_FOR_UPDATE_CALLOUT, 'primary');
    });

vs. in another test:

    it('We show 1 primary callout', () => {
      getCallOut(ALERTS_CALLOUT).should('not.exist');
      getCallOut(RULES_CALLOUT).should('not.exist');
      getCallOut(NEED_ADMIN_FOR_UPDATE_CALLOUT).should('not.exist');
    });

So each look similar but in the details of each they are very different from each other.

Comment on lines 39 to 41
const ALERTS_CALLOUT = 'read-only-access-to-alerts';
const RULES_CALLOUT = 'read-only-access-to-rules';
const NEED_ADMIN_FOR_UPDATE_CALLOUT = 'need-admin-for-update-rules';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, do you think this test should check NEED_ADMIN_FOR_UPDATE_CALLOUT as well? I'd check this one only in the newly added tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, because with this role I do not want the NEED_ADMIN_FOR_UPDATE_CALLOUT to end up showing per the user's role and the fact that this test will return index_mapping_outdated: false.

The other test that is the companion to this one is:

alerts_detection_callouts_readonly_need_admin.spec.ts

Where it returns index_mapping_outdated: true and then I test to ensure that it shows up next to each of these callouts.

Comment on lines 34 to 35
const ALERTS_CALLOUT = 'read-only-access-to-alerts';
const RULES_CALLOUT = 'read-only-access-to-rules';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could omit checking these guys in all tests which check the behaviour of NEED_ADMIN_FOR_UPDATE_CALLOUT?

Copy link
Contributor Author

@FrankHassanabad FrankHassanabad Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are missing tests for the callouts since so far what we have been testing is that callouts will appear when they're supposed to appear based on a lack of permissions. However, the other flip side is that we need tests based on when the user has sufficient permissions and some if not all the callouts should not appear.

So, without tests that are checking for the non-existence of callouts when users have sufficient privileges we could introduce bugs where callouts show up on pages when users have sufficient privileges and we wouldn't catch it. With these additional tests in place we should hopefully catch all if not most of the test cases where:

  • Users have such sufficient privileges that no callouts appear.
  • Users do not have sufficient privileges and 1 or more of the callouts should appear.
  • Repeat the same two tests but now with index_mapping being outdated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, without tests that are checking for the non-existence of callouts when users have sufficient privileges we could introduce bugs where callouts show up on pages when users have sufficient privileges and we wouldn't catch it

That's a great point, I agree! However after re-reading all my comments, and the answers, and checking the code I think I just failed to explain what I meant. Would you have some time to chat on that stuff specifically?

@FrankHassanabad FrankHassanabad requested a review from a team as a code owner February 9, 2021 03:03
Comment on lines 34 to 35
const ALERTS_CALLOUT = 'read-only-access-to-alerts';
const RULES_CALLOUT = 'read-only-access-to-rules';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, without tests that are checking for the non-existence of callouts when users have sufficient privileges we could introduce bugs where callouts show up on pages when users have sufficient privileges and we wouldn't catch it

That's a great point, I agree! However after re-reading all my comments, and the answers, and checking the code I think I just failed to explain what I meant. Would you have some time to chat on that stuff specifically?

@@ -0,0 +1,113 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding unit tests. 🚀


import { mount } from 'enzyme';
import React from 'react';
import { CallOut, CallOutMessage } from '.';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's maybe import from ./callout_types to avoid circular deps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can do that.

import { mount } from 'enzyme';
import React from 'react';
import { TestProviders } from '../../../../common/mock';
import { NeedAdminForUpdateRulesCallOut } from '.';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops again 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change this one to be ./index but usually for index.ts based files I just import from the folder directly vs something like ./index and not find that a big issue with circles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, here it's not a big deal. Np 👍

Comment on lines 21 to 30
/**
* Callout component that lets the user know that an administrator is needed for performing
* and auto-update of signals or not. For this component to render the user must:
* - Have the permissions to be able to read "signalIndexMappingOutdated" and that condition is "true"
* - Have the permissions to be able to read "hasIndexManage" and that condition is "false"
*
* If the user has the permissions to see that signalIndexMappingOutdated is true and that
* hasIndexManage is also true, then the user should be performing the update on the page which is
* why we do not show it for that condition.
*/
Copy link
Contributor

@banderror banderror Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for additional comments, again super useful thing.
Nit: not sure what do you mean here by "Have the permissions to be able to read" and "If the user has the permissions to see"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding these comments. Let me know if it clears things up:

 * Some users do not have sufficient privileges to be able to determine if "signalIndexMappingOutdated"
 * is outdated or not. Same could apply to "hasIndexManage". When users do not have enough permissions
 * to determine if "signalIndexMappingOutdated" is true or false, the permissions system returns a "null"
 * instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super clear, thank you

Comment on lines 34 to 42
const signalIndexMappingIsOutdated =
signalIndexMappingOutdated != null && signalIndexMappingOutdated;

const userHasIndexManage = hasIndexManage != null && hasIndexManage;

return (
<CallOutPersistentSwitcher
condition={signalIndexMappingIsOutdated && !userHasIndexManage}
message={needAdminForUpdateRulesMessage}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here signalIndexMappingOutdated != null and hasIndexManage != null means that they have been loaded and now are available for checking.

Considering that, I suspect that condition might be incorrect.

signalIndexMappingIsOutdated && !userHasIndexManage ===
signalIndexMappingIsOutdated && !(hasIndexManage != null && hasIndexManage) ===
signalIndexMappingIsOutdated && hasIndexManage == null && !hasIndexManage) where
hasIndexManage == null && !hasIndexManage gives true when we are in initial state,
which would lead to flickering.

I think this is the right one:

const signalIndexMappingIsOutdated =
    signalIndexMappingOutdated != null && signalIndexMappingOutdated;
const userDoesntHaveIndexManage = hasIndexManage != null && !hasIndexManage;

condition={signalIndexMappingIsOutdated && userDoesntHaveIndexManage}

Copy link
Contributor Author

@FrankHassanabad FrankHassanabad Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch here. I agree. Just want to point out one additional use case which is:

The null can be set when it's in an initial condition as well as when the user does not have enough privileges to even determine if signalIndexMappingOutdated is set to true or false so null is a overloaded in that it can stay null when the user does not know if they have enough privileges to determine if the signals index mapping is outdated as well as initial page load when it hasn't fetched anything yet. We might at some point want the user to know for sure rather than work around the limitation which might be trickier as we would end up using a system based user to determine if the user has the limitation or not.

But for now, that is the limitation and a side effect is that users without enough privileges won't show that they should contact an admin.

For hasIndexManage that could be the case maybe too where the user doesn't have enough privileges to determine if they do or do not have the ability to manage an index. I think this is more doubtful looking at the code on the backend but it might change later depending on how we create the permission from other permissions.

Regardless, I think what you're saying here is better and valid of avoiding having a flicker and still satisfies the use cases/limitations outlined above at the moment. I will add this line as suggested:

const userDoesntHaveIndexManage = hasIndexManage != null && !hasIndexManage;

Thanks for the hard look at this 👍

Copy link
Contributor

@banderror banderror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@banderror
Copy link
Contributor

One more thought I had. In this callout we say "auto-migrate", signals migration etc. Do you think it should be clear enough for the user what do we mean by that? Might sound scary IMHO :)

Do we have docs on signals migration? Maybe we could link to them and/or explain what this migration is about, how it works and why it's needed.
Could go in a separate PR.

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
securitySolution 2190 2192 +2

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
securitySolution 7.5MB 7.6MB +3.5KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@FrankHassanabad FrankHassanabad merged commit 8ce6ed4 into elastic:master Feb 17, 2021
@FrankHassanabad FrankHassanabad deleted the warning-banner branch February 17, 2021 03:58
FrankHassanabad added a commit that referenced this pull request Feb 17, 2021
… alerts data has not been migrated yet. (#90258) (#91602)

## Summary

Adds a warning banner for when the alerting/signals data has not been migrated to the new structure. Although we are planning on supporting some backwards compatibility where the rules don't completely blow up, this support of backwards compatibility is going to be best effort and not have explicit tests and checks against backwards compatibility. Hence the reason we need to alert any users of the system when we can that they should have an administrator visit the detections page to start a migration.

From previous reasons why we don't migrate on startup of Kibana is that there are multiple instances running and it might be a worse situation so we migrate on page visit by an administrator to reduce chances of issues. In the future we might revisit this decision but for now this is what we have moved forward with.

If the user does not have sufficient privileges such as t1 analyst to see if they have should upgrade, no message is shown to those users.

This PR adds the following banner which is non-dismissible to:
* Main detections page
* Manage rules page
* View/Edit rules page
<img width="2259" alt="Screen Shot 2021-02-03 at 4 16 00 PM" src="https://user-images.githubusercontent.com/1151048/106926989-eb2fb300-66ce-11eb-877c-1210357af108.png">

If other dismissible alerts are on the page then you will get a stacked effect until you dismiss those messages. Again, this message cannot be dismissed intentionally to let the user know that they should contact an administrator to update/upgrade the alerting/signal data:
<img width="1526" alt="Screen Shot 2021-02-03 at 5 41 57 PM" src="https://user-images.githubusercontent.com/1151048/106927465-6b561880-66cf-11eb-8c0f-dfdfa624c24b.png">

Other items of note:
* Added ability to remove the button from the callouts
* Consolidated in one area some types
* Removed one part of the callout that has branching logic we never activate. We can re-add that later if we do have a need for it
* e2e Cypress tests added to detect when the banner should be present
* Backfilled unit tests for enzyme for some of the callout code

Manual testing:
Bump this number in your dev env:
https://github.com/elastic/kibana/blob/master/x-pack/plugins/security_solution/server/lib/detection_engine/routes/index/get_signals_template.ts#L11

Give yourself these permissions (or use one of the scripts for creating these roles):
<img width="1243" alt="Screen Shot 2021-02-05 at 1 49 02 PM" src="https://user-images.githubusercontent.com/1151048/107087773-30301400-67b9-11eb-9ac9-0a67fafd8231.png">

Visit the page.

### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses sentence case text and includes [i18n support](https://github.com/elastic/kibana/blob/master/packages/kbn-i18n/README.md)
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
- [ ] Any UI touched in this PR is usable by keyboard only (learn more about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures (run axe in browser: [FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/), [Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] This renders correctly on smaller devices using a responsive layout. (You can test this [in your browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [x] This was checked for [cross-browser compatibility](https://www.elastic.co/support/matrix#matrix_browsers)
gmmorris added a commit to gmmorris/kibana that referenced this pull request Feb 17, 2021
* master: (157 commits)
  [DOCS] Adds machine learning to the security section of alerting (elastic#91501)
  [Uptime] Ping list step screenshot caption formatting (elastic#91403)
  [Vislib] Use timestamp on brush event instead of iso dates (elastic#91483)
  [Application Usage] Remove deprecated & unused legacy.appChanged API (elastic#91464)
  Migrate logstash, monitoring, url_drilldowns, xpack_legacy to ts projects (elastic#91194)
  [APM] Wrap Elasticsearch client errors (elastic#91125)
  [APM] Fix optimize-tsconfig script (elastic#91487)
  [Discover][docs] Add searchFieldsFromSource description (elastic#90980)
  Adds support for 'ip' data type (elastic#85087)
  [Detection Rules] Add updates from 7.11.2 rules (elastic#91553)
  [SECURITY SOLUTION] Eql in timeline (elastic#90816)
  [APM] Correlations Beta (elastic#86477) (elastic#89952)
  [Security Solutions][Detection Engine] Adds a warning banner when the alerts data has not been migrated yet. (elastic#90258)
  [Security Solution] [Timeline] Endpoint row renderers (2nd batch) (elastic#91446)
  skip flaky suite (elastic#91450)
  skip flaky suite (elastic#91592)
  [Security Solution][Endpoint][Admin] Endpoint Details UX Enhancements (elastic#90870)
  [ML] Add better UI support for runtime fields Transforms  (elastic#90363)
  [Security Solution] [Detections] Replace 'partial failure' with 'warning' for rule statuses (elastic#91167)
  [Security Solution][Detections] Adds Indicator path config for indicator match rules (elastic#91260)
  ...
@timroes timroes added the Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. label Mar 16, 2021
@elasticmachine
Copy link
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:enhancement Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. v7.12.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants