Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ResponseOps][Alerting] Decouple feature IDs from consumers #183756

Merged
merged 213 commits into from
Dec 3, 2024

Conversation

cnasikas
Copy link
Member

@cnasikas cnasikas commented May 17, 2024

Summary

This PR aims to decouple the feature IDs from the consumer attribute of rules and alerts.

Towards: #187202
Fixes: #181559
Fixes: #182435

Note

Unfortunately, I could not break the PR into smaller pieces. The APIs could not work anymore with feature IDs and had to convert them to use rule type IDs. Also, I took the chance and refactored crucial parts of the authorization class that in turn affected a lot of files. Most of the changes in the files are minimal and easy to review. The crucial changes are in the authorization class and some alerting APIs.

Architecture

Alerting RBAC model

The Kibana security uses Elasticsearch's application privileges. This way Kibana can represent and store its privilege models within Elasticsearch roles. To do that, Kibana security creates actions that are granted by a specific privilege. Alerting uses its own RBAC model and is built on top of the existing Kibana security model. The Alerting RBAC uses the rule_type_id and consumer attributes to define who owns the rule and the alerts procured by the rule. To connect the rule_type_id and consumer with the Kibana security actions the Alerting RBAC registers its custom actions. They are constructed as alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>. Because to authorizate a resource an action has to be generated and because the action needs a valid feature ID the value of the consumer should be a valid feature ID. For example, the alerting:siem.esqlRule/siem/rule/get action, means that a user with a role that grants this action can get a rule of type siem.esqlRule with consumer siem.

Problem statement

At the moment the consumer attribute should be a valid feature ID. Though this approach worked well so far it has its limitation. Specifically:

  • Rule types cannot support more than one consumer.
  • To associate old rules with a new feature ID required a migration on the rule's SOs and the alerts documents.
  • The API calls are feature ID-oriented and not rule-type-oriented.
  • The framework has to be aware of the values of the consumer attribute.
  • Feature IDs are tightly coupled with the alerting indices leading to bugs.
  • Legacy consumers that are not a valid feature anymore can cause bugs.
  • The framework has to be aware of legacy consumers to handle edge cases.
  • The framework has to be aware of specific consumers to handle edge cases.

Proposed solution

This PR aims to decouple the feature IDs from consumers. It achieves that a) by changing the way solutions configure the alerting privileges when registering a feature and b) by changing the alerting actions. The schema changes as:

// Old formatting
id: 'siem', <--- feature ID
alerting:['siem.queryRule']

// New formatting
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: 'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the feature ID in the old formatting

The new actions are constructed as alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>. For example alerting:rule-type-id/my-consumer/rule/get. The new action means that a user with a role that grants this action can get a rule of type rule-type with consumer my-consumer. Changing the action strings is not considered a breaking change as long as the user's permission works as before. In our case, this is true because the consumer will be the same as before (feature ID), and the alerting security actions will be the same. For example:

Old formatting

Schema:

id: 'logs', <--- feature ID
alerting:['.es-query'] <-- rule type ID

Generated action:

alerting:.es-query/logs/rule/get

New formatting

Schema:

id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <-- consumer same as the feature ID in the old formatting

Generated action:

alerting:.es-query/logs/rule/get <--- consumer is set as logs and the action is the same as before

In both formating the actions are the same thus breaking changes are avoided.

Alerting authorization class

The alerting plugin uses and exports the alerting authorization class (AlertingAuthorization). The class is responsible for handling all authorization actions related to rules and alerts. The class changed to handle the new actions as described in the above sections. A lot of methods were renamed, removed, and cleaned up, all method arguments converted to be an object, and the response signature of some methods changed. These changes affected various pieces of the code. The changes in this class are the most important in this PR especially the _getAuthorizedRuleTypesWithAuthorizedConsumers method which is the cornerstone of the alerting RBAC. Please review carefully.

Instantiation of the alerting authorization class

The AlertingAuthorizationClientFactory is used to create instances of the AlertingAuthorization class. The AlertingAuthorization class needs to perform async operations upon instantiation. Because JS, at the moment, does not support async instantiation of classes the AlertingAuthorization class was assigning Promise objects to variables that could be resolved later in other phases of the lifecycle of the class. To improve readability and make the lifecycle of the class clearer, I separated the construction of the class (initialization) from the bootstrap process. As a result, getting the AlertingAuthorization class or any client that depends on it (getRulesClient for example) is an async operation.

Filtering

A lot of routes use the authorization class to get the authorization filter (getFindAuthorizationFilter), a filter that, if applied, returns only the rule types and consumers the user is authorized to. The method that returns the filter was built in a way to also support filtering on top of the authorization filter thus coupling the authorized filter with router filtering. I believe these two operations should be decoupled and the filter method should return a filter that gives you all the authorized rule types. It is the responsibility of the consumer, router in our case, to apply extra filters on top of the authorization filter. For that reason, I made all the necessary changes to decouple them.

Legacy consumers & producer

A lot of rules and alerts have been created and are still being created from observability with the alerts consumer. When the Alerting RBAC encounters a rule or alert with alerts as a consumer it falls back to the producer of the rule type ID to construct the actions. For example if a rule with ruleTypeId: .es-query and consumer: alerts the alerting action will be constructed as alerting:.es-query/stackAlerts/rule/get where stackRules is the producer of the .es-query rule type. The producer is used to be used in alerting authorization but due to its complexity, it was deprecated and only used as a fallback for the alerts consumer. To avoid breaking changes all feature privileges that specify access to rule types add the alerts consumer when configuring their alerting privileges. By moving the alerts consumer to the registration of the feature we can stop relying on the producer. The producer is not used anymore in the authorization class. In the next PRs the producer will removed entirely.

Routes

The following changes were introduced to the alerting routes:

  • All related routes changed to be rule-type oriented and not feature ID oriented.
  • All related routes support the ruleTypeIds and the consumers parameters for filtering. In all routes, the filters are constructed as ruleTypeIds: ['foo'] AND consumers: ['bar'] AND authorizationFilter. Filtering by consumers is important. In o11y for example, we do not want to show ES rule types with the stackAlerts consumer even if the user has access to them.
  • The /internal/rac/alerts/_feature_ids route got deleted as it was not used anywhere in the codebase and it was internal.

All the changes in the routes are related to internal routes and no breaking changes are introduced.

Constants

I moved the o11y and stack rule type IDs to kbn-rule-data-utils and exported all security solution rule type IDs from kbn-securitysolution-rules. I am not a fan of having a centralized place for the rule type IDs. Ideally, consumers of the framework should specify keywords like observablility (category or subcategory) or even apm.* and the framework should know which rule type IDs to pick up. I think it is out of the scope of the PR, and at the moment it seems the most straightforward way to move forward. I will try to clean up as much as possible in further iterations. If you are interested in the upcoming work follow this issue #187202.

Other notable code changes

  • Change all instances of feature IDs to rule type IDs.
  • isSiemRuleType: This is a temporary helper function that is needed in places where we handle edge cases related to security solution rule types. Ideally, the framework should be agnostic to the rule types or consumers. The plan is to be removed entirely in further iterations.
  • Rename alerting PluginSetupContract and PluginStartContract to AlertingServerSetup and AlertingServerStart. This made me touch a lot of files but I could not resist.
  • filter_consumers was mistakenly exposed to a public API. It was undocumented.
  • Files or functions that were not used anywhere in the codebase got deleted.
  • Change the returned type of the list method of the RuleTypeRegistry from Set<RegistryRuleType> to Map<string, RegistryRuleType>.
  • Assertion of KueryNode in tests changed to an assertion of KQL using toKqlExpression.
  • Removal of useRuleAADFields as it is not used anywhere.

Testing

Caution

It is very important to test all the areas of the application where rules or alerts are being used directly or indirectly. Scenarios to consider:

  • The correct rules, alerts, and aggregations on top of them are being shown as expected as a superuser.
  • The correct rules, alerts, and aggregations on top of them are being shown as expected by a user with limited access to certain features.
  • The changes in this PR are backward compatible with the previous users' permissions.

Solutions

Please test and verify that:

  • All the rule types you own with all possible combinations of permissions both in ESS and in Serverless.
  • The consumers and rule types make sense when registering the features.
  • The consumers and rule types that are passed to the components are the intended ones.

ResponseOps

The most important changes are in the alerting authorization class, the search strategy, and the routes. Please test:

  • The rules we own with all possible combinations of permissions.
  • The stack alerts page and its solution filtering.
  • The categories filtering in the maintenance window UI.

Risks

Warning

The risks involved in this PR are related to privileges. Specifically:

  • Users with no privileges can access rules and alerts they do not have access to.
  • Users with privileges cannot access rules and alerts they have access to.

An excessive list of integration tests is in place to ensure that the above scenarios will not occur. In the case of a bug, we could a) release an energy release for serverless and b) backport the fix in ESS. Given that this PR is intended to be merged in 8.17 we have plenty of time to test and to minimize the chances of risks.

FQA

  • I noticed that a lot of routes support the filter parameter where we can pass an arbitrary KQL filter. Why we do not use this to filter by the rule type IDs and the consumers and instead we introduce new dedicated parameters?

The filter parameter should not be exposed in the first place. It assumes that the consumer of the API knows the underlying structure and implementation details of the persisted storage API (SavedObject client API). For example, a valid filter would be alerting.attributes.rule_type_id. In this filter the consumer should know a) the name of the SO b) the keyword attributes (storage implementation detail) and c) the name of the attribute as it is persisted in ES (snake case instead of camel case as it is returned by the APIs). As there is no abstraction layer between the SO and the API, it makes it very difficult to make changes in the persistent schema or the APIs. For all the above I decided to introduce new query parameters where the alerting framework has total control over it.

  • I noticed in the code a lot of instances where the consumer is used. Should not remove any logic around consumers?

This PR is a step forward making the framework as agnostic as possible. I had to keep the scope of the PR as contained as possible. We will get there. It needs time :).

  • I noticed a lot of hacks like checking if the rule type is siem. Should not remove the hacks?

This PR is a step forward making the framework as agnostic as possible. I had to keep the scope of the PR as contained as possible. We will get there. It needs time :).

  • I hate the "Role visibility" dropdown. Can we remove it?

I also do not like it. The goal is to remove it. Follow #189997.

@cnasikas cnasikas added the release_note:skip Skip the PR/issue when compiling release notes label May 17, 2024
@cnasikas cnasikas self-assigned this May 17, 2024
@cnasikas
Copy link
Member Author

/ci

@cnasikas cnasikas force-pushed the poc_decouple_consumers_feature_ids branch from 06d8499 to 3b7c89f Compare May 23, 2024 15:47
@cnasikas cnasikas changed the title [POC] Decouple feature IDs from consumres [POC] Decouple feature IDs from consumers May 28, 2024
@cnasikas cnasikas force-pushed the poc_decouple_consumers_feature_ids branch from f65ca2c to a2b73f9 Compare July 14, 2024 15:08
@cnasikas cnasikas force-pushed the poc_decouple_consumers_feature_ids branch from cefbb02 to 1ed4c79 Compare November 26, 2024 13:36
Copy link
Contributor

@consulthys consulthys left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGT Stack Monitoring

@cnasikas cnasikas removed request for a team November 26, 2024 18:41
];

export const apmAlertingRuleTypeIds: string[] = [...OBSERVABILITY_RULE_TYPE_IDS];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export const apmAlertingRuleTypeIds: string[] = [...OBSERVABILITY_RULE_TYPE_IDS];
export const apmAlertingRuleTypeIds: string[] = OBSERVABILITY_RULE_TYPE_IDS;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OBSERVABILITY_RULE_TYPE_IDS is used in a lot of places and I wanted to avoid someone mutating the original array. const protect us with assignments but it does not with push.

Copy link
Contributor

@cauemarcondes cauemarcondes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@ashokaditya ashokaditya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did code review only and LGTM. Thanks for taking on the massive changeset. 🙇🏼

@elasticmachine
Copy link
Contributor

elasticmachine commented Dec 3, 2024

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules leads to a faster build time

id before after diff
alerting 248 247 -1
cases 823 827 +4
observability 1074 1073 -1
securitySolution 6310 6308 -2
triggersActionsUi 862 860 -2
total -2

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

id before after diff
@kbn/alerts-ui-shared 305 295 -10
@kbn/rule-data-utils 135 140 +5
@kbn/securitysolution-rules 25 26 +1
alerting 848 839 -9
ml 63 64 +1
ruleRegistry 226 222 -4
uptime 1 0 -1
total -17

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
alerting 94.1KB 91.4KB -2.6KB
apm 3.4MB 3.4MB +84.0B
cases 492.7KB 492.8KB +80.0B
discover 812.9KB 813.3KB +400.0B
infra 1.8MB 1.8MB -358.0B
ml 4.7MB 4.7MB -217.0B
observability 472.8KB 472.0KB -830.0B
observabilityLogsExplorer 147.3KB 147.7KB +358.0B
securitySolution 14.6MB 14.6MB -68.0B
slo 848.0KB 848.3KB +275.0B
synthetics 1.1MB 1.1MB +404.0B
triggersActionsUi 1.7MB 1.7MB -359.0B
uptime 467.5KB 467.5KB +4.0B
total -2.8KB

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

id before after diff
features 2 3 +1

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
apm 38.1KB 38.6KB +531.0B
cases 159.5KB 159.6KB +146.0B
infra 56.9KB 58.1KB +1.2KB
ml 75.8KB 76.5KB +681.0B
observability 103.8KB 104.5KB +766.0B
observabilityShared 93.7KB 93.7KB +36.0B
slo 28.8KB 29.2KB +394.0B
synthetics 37.6KB 37.8KB +150.0B
triggersActionsUi 128.9KB 128.5KB -363.0B
uptime 22.9KB 22.9KB +1.0B
total +3.5KB
Unknown metric groups

API count

id before after diff
@kbn/alerts-grouping 31 32 +1
@kbn/alerts-ui-shared 321 311 -10
@kbn/rule-data-utils 138 152 +14
@kbn/securitysolution-rules 28 29 +1
alerting 880 872 -8
ml 148 149 +1
ruleRegistry 263 259 -4
total -5

ESLint disabled line counts

id before after diff
@kbn/test-suites-xpack 722 721 -1

Total ESLint disabled count

id before after diff
@kbn/test-suites-xpack 747 746 -1

History

cc @cnasikas

@cnasikas cnasikas merged commit a3496c9 into elastic:main Dec 3, 2024
10 checks passed
@cnasikas cnasikas deleted the poc_decouple_consumers_feature_ids branch December 3, 2024 10:21
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.x

https://github.com/elastic/kibana/actions/runs/12137618585

@kibanamachine
Copy link
Contributor

💔 All backports failed

Status Branch Result
8.x Backport failed because of merge conflicts

Manual backport

To create the backport manually run:

node scripts/backport --pr 183756

Questions ?

Please refer to the Backport tool documentation

@cnasikas
Copy link
Member Author

cnasikas commented Dec 3, 2024

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

cnasikas added a commit that referenced this pull request Dec 3, 2024
…83756) (#202667)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[ResponseOps][Alerting] Decouple feature IDs from consumers
(#183756)](#183756)

<!--- Backport version: 8.9.8 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Christos
Nasikas","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-12-03T10:21:53Z","message":"[ResponseOps][Alerting]
Decouple feature IDs from consumers (#183756)\n\n## Summary\r\n\r\nThis
PR aims to decouple the feature IDs from the `consumer` attribute\r\nof
rules and alerts.\r\n\r\nTowards:
https://github.com/elastic/kibana/issues/187202\r\nFixes:
https://github.com/elastic/kibana/issues/181559\r\nFixes:
https://github.com/elastic/kibana/issues/182435\r\n\r\n> [!NOTE] \r\n>
Unfortunately, I could not break the PR into smaller pieces. The
APIs\r\ncould not work anymore with feature IDs and had to convert them
to use\r\nrule type IDs. Also, I took the chance and refactored crucial
parts of\r\nthe authorization class that in turn affected a lot of
files. Most of\r\nthe changes in the files are minimal and easy to
review. The crucial\r\nchanges are in the authorization class and some
alerting APIs.\r\n\r\n## Architecture\r\n\r\n### Alerting RBAC
model\r\n\r\nThe Kibana security uses Elasticsearch's
[application\r\nprivileges](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-put-privileges.html#security-api-put-privileges).\r\nThis
way Kibana can represent and store its privilege models
within\r\nElasticsearch roles. To do that, Kibana security creates
actions that\r\nare granted by a specific privilege. Alerting uses its
own RBAC model\r\nand is built on top of the existing Kibana security
model. The Alerting\r\nRBAC uses the `rule_type_id` and `consumer`
attributes to define who\r\nowns the rule and the alerts procured by the
rule. To connect the\r\n`rule_type_id` and `consumer` with the Kibana
security actions the\r\nAlerting RBAC registers its custom actions. They
are constructed
as\r\n`alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>`.\r\nBecause
to authorizate a resource an action has to be generated and\r\nbecause
the action needs a valid feature ID the value of the
`consumer`\r\nshould be a valid feature ID. For example,
the\r\n`alerting:siem.esqlRule/siem/rule/get` action, means that a user
with a\r\nrole that grants this action can get a rule of type
`siem.esqlRule` with\r\nconsumer `siem`.\r\n\r\n### Problem
statement\r\n\r\nAt the moment the `consumer` attribute should be a
valid feature ID.\r\nThough this approach worked well so far it has its
limitation.\r\nSpecifically:\r\n\r\n- Rule types cannot support more
than one consumer.\r\n- To associate old rules with a new feature ID
required a migration on\r\nthe rule's SOs and the alerts documents.\r\n-
The API calls are feature ID-oriented and not rule-type-oriented.\r\n-
The framework has to be aware of the values of the
`consumer`\r\nattribute.\r\n- Feature IDs are tightly coupled with the
alerting indices leading
to\r\n[bugs](https://github.com/elastic/kibana/issues/179082).\r\n-
Legacy consumers that are not a valid feature anymore can
cause\r\n[bugs](https://github.com/elastic/kibana/issues/184595).\r\n-
The framework has to be aware of legacy consumers to handle
edge\r\ncases.\r\n- The framework has to be aware of specific consumers
to handle edge\r\ncases.\r\n\r\n### Proposed solution\r\n\r\nThis PR
aims to decouple the feature IDs from consumers. It achieves\r\nthat a)
by changing the way solutions configure the alerting privileges\r\nwhen
registering a feature and b) by changing the alerting actions.
The\r\nschema changes as:\r\n\r\n```\r\n// Old formatting\r\nid: 'siem',
<--- feature ID\r\nalerting:['siem.queryRule']\r\n\r\n// New
formatting\r\nid: 'siem', <--- feature ID\r\nalerting: [{ ruleTypeId:
'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the
feature ID in the old formatting\r\n```\r\n\r\nThe new actions are
constructed
as\r\n`alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>`.
For\r\nexample `alerting:rule-type-id/my-consumer/rule/get`. The new
action\r\nmeans that a user with a role that grants this action can get
a rule of\r\ntype `rule-type` with consumer `my-consumer`. Changing the
action\r\nstrings is not considered a breaking change as long as the
user's\r\npermission works as before. In our case, this is true because
the\r\nconsumer will be the same as before (feature ID), and the
alerting\r\nsecurity actions will be the same. For example:\r\n\r\n**Old
formatting**\r\n\r\nSchema:\r\n```\r\nid: 'logs', <--- feature
ID\r\nalerting:['.es-query'] <-- rule type ID\r\n```\r\n\r\nGenerated
action:\r\n\r\n```\r\nalerting:.es-query/logs/rule/get\r\n```\r\n\r\n**New
formatting**\r\n\r\nSchema:\r\n```\r\nid: 'siem', <--- feature
ID\r\nalerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <--
consumer same as the feature ID in the old
formatting\r\n```\r\n\r\nGenerated
action:\r\n\r\n```\r\nalerting:.es-query/logs/rule/get <--- consumer is
set as logs and the action is the same as before\r\n```\r\n\r\nIn both
formating the actions are the same thus breaking changes
are\r\navoided.\r\n\r\n### Alerting authorization class\r\nThe alerting
plugin uses and exports the alerting authorization
class\r\n(`AlertingAuthorization`). The class is responsible for
handling all\r\nauthorization actions related to rules and alerts. The
class changed to\r\nhandle the new actions as described in the above
sections. A lot of\r\nmethods were renamed, removed, and cleaned up, all
method arguments\r\nconverted to be an object, and the response
signature of some methods\r\nchanged. These changes affected various
pieces of the code. The changes\r\nin this class are the most important
in this PR especially
the\r\n`_getAuthorizedRuleTypesWithAuthorizedConsumers` method which is
the\r\ncornerstone of the alerting RBAC. Please review
carefully.\r\n\r\n### Instantiation of the alerting authorization
class\r\nThe `AlertingAuthorizationClientFactory` is used to create
instances of\r\nthe `AlertingAuthorization` class. The
`AlertingAuthorization` class\r\nneeds to perform async operations upon
instantiation. Because JS, at the\r\nmoment, does not support async
instantiation of classes the\r\n`AlertingAuthorization` class was
assigning `Promise` objects to\r\nvariables that could be resolved later
in other phases of the lifecycle\r\nof the class. To improve readability
and make the lifecycle of the class\r\nclearer, I separated the
construction of the class (initialization) from\r\nthe bootstrap
process. As a result, getting the `AlertingAuthorization`\r\nclass or
any client that depends on it (`getRulesClient` for example) is\r\nan
async operation.\r\n\r\n### Filtering\r\nA lot of routes use the
authorization class to get the authorization\r\nfilter
(`getFindAuthorizationFilter`), a filter that, if applied,\r\nreturns
only the rule types and consumers the user is authorized to.
The\r\nmethod that returns the filter was built in a way to also
support\r\nfiltering on top of the authorization filter thus coupling
the\r\nauthorized filter with router filtering. I believe these two
operations\r\nshould be decoupled and the filter method should return a
filter that\r\ngives you all the authorized rule types. It is the
responsibility of the\r\nconsumer, router in our case, to apply extra
filters on top of the\r\nauthorization filter. For that reason, I made
all the necessary changes\r\nto decouple them.\r\n\r\n### Legacy
consumers & producer\r\nA lot of rules and alerts have been created and
are still being created\r\nfrom observability with the `alerts`
consumer. When the Alerting RBAC\r\nencounters a rule or alert with
`alerts` as a consumer it falls back to\r\nthe `producer` of the rule
type ID to construct the actions. For example\r\nif a rule with
`ruleTypeId: .es-query` and `consumer: alerts` the\r\nalerting action
will be constructed as\r\n`alerting:.es-query/stackAlerts/rule/get`
where `stackRules` is the\r\nproducer of the `.es-query` rule type. The
`producer` is used to be used\r\nin alerting authorization but due to
its complexity, it was deprecated\r\nand only used as a fallback for the
`alerts` consumer. To avoid breaking\r\nchanges all feature privileges
that specify access to rule types add the\r\n`alerts` consumer when
configuring their alerting privileges. By moving\r\nthe `alerts`
consumer to the registration of the feature we can stop\r\nrelying on
the `producer`. The `producer` is not used anymore in
the\r\nauthorization class. In the next PRs the `producer` will
removed\r\nentirely.\r\n\r\n### Routes\r\nThe following changes were
introduced to the alerting routes:\r\n\r\n- All related routes changed
to be rule-type oriented and not feature ID\r\noriented.\r\n- All
related routes support the `ruleTypeIds` and the
`consumers`\r\nparameters for filtering. In all routes, the filters are
constructed as\r\n`ruleTypeIds: ['foo'] AND consumers: ['bar'] AND
authorizationFilter`.\r\nFiltering by consumers is important. In o11y
for example, we do not want\r\nto show ES rule types with the
`stackAlerts` consumer even if the user\r\nhas access to them.\r\n- The
`/internal/rac/alerts/_feature_ids` route got deleted as it was\r\nnot
used anywhere in the codebase and it was internal.\r\n\r\nAll the
changes in the routes are related to internal routes and no\r\nbreaking
changes are introduced.\r\n\r\n### Constants\r\nI moved the o11y and
stack rule type IDs to `kbn-rule-data-utils` and\r\nexported all
security solution rule type IDs from\r\n`kbn-securitysolution-rules`. I
am not a fan of having a centralized\r\nplace for the rule type IDs.
Ideally, consumers of the framework should\r\nspecify keywords like
`observablility` (category or subcategory) or even\r\n`apm.*` and the
framework should know which rule type IDs to pick up. I\r\nthink it is
out of the scope of the PR, and at the moment it seems the\r\nmost
straightforward way to move forward. I will try to clean up as
much\r\nas possible in further iterations. If you are interested in the
upcoming\r\nwork follow this issue
https://github.com/elastic/kibana/issues/187202.\r\n\r\n### Other
notable code changes\r\n- Change all instances of feature IDs to rule
type IDs.\r\n- `isSiemRuleType`: This is a temporary helper function
that is needed\r\nin places where we handle edge cases related to
security solution rule\r\ntypes. Ideally, the framework should be
agnostic to the rule types or\r\nconsumers. The plan is to be removed
entirely in further iterations.\r\n- Rename alerting
`PluginSetupContract` and `PluginStartContract`
to\r\n`AlertingServerSetup` and `AlertingServerStart`. This made me
touch a\r\nlot of files but I could not resist.\r\n- `filter_consumers`
was mistakenly exposed to a public API. It was\r\nundocumented.\r\n-
Files or functions that were not used anywhere in the codebase
got\r\ndeleted.\r\n- Change the returned type of the `list` method of
the\r\n`RuleTypeRegistry` from `Set<RegistryRuleType>` to
`Map<string,\r\nRegistryRuleType>`.\r\n- Assertion of `KueryNode` in
tests changed to an assertion of KQL using\r\n`toKqlExpression`.\r\n-
Removal of `useRuleAADFields` as it is not used anywhere.\r\n\r\n##
Testing\r\n\r\n> [!CAUTION]\r\n> It is very important to test all the
areas of the application where\r\nrules or alerts are being used
directly or indirectly. Scenarios to\r\nconsider:\r\n> - The correct
rules, alerts, and aggregations on top of them are being\r\nshown as
expected as a superuser.\r\n> - The correct rules, alerts, and
aggregations on top of them are being\r\nshown as expected by a user
with limited access to certain features.\r\n> - The changes in this PR
are backward compatible with the previous\r\nusers'
permissions.\r\n\r\n### Solutions\r\nPlease test and verify that:\r\n-
All the rule types you own with all possible combinations
of\r\npermissions both in ESS and in Serverless.\r\n- The consumers and
rule types make sense when registering the features.\r\n- The consumers
and rule types that are passed to the components are the\r\nintended
ones.\r\n\r\n### ResponseOps\r\nThe most important changes are in the
alerting authorization class, the\r\nsearch strategy, and the routes.
Please test:\r\n- The rules we own with all possible combinations of
permissions.\r\n- The stack alerts page and its solution filtering.\r\n-
The categories filtering in the maintenance window UI.\r\n\r\n##
Risks\r\n> [!WARNING]\r\n> The risks involved in this PR are related to
privileges. Specifically:\r\n> - Users with no privileges can access
rules and alerts they do not\r\nhave access to.\r\n> - Users with
privileges cannot access rules and alerts they have\r\naccess
to.\r\n>\r\n> An excessive list of integration tests is in place to
ensure that the\r\nabove scenarios will not occur. In the case of a bug,
we could a)\r\nrelease an energy release for serverless and b) backport
the fix in ESS.\r\nGiven that this PR is intended to be merged in 8.17
we have plenty of\r\ntime to test and to minimize the chances of
risks.\r\n\r\n## FQA\r\n\r\n- I noticed that a lot of routes support the
`filter` parameter where we\r\ncan pass an arbitrary KQL filter. Why we
do not use this to filter by\r\nthe rule type IDs and the consumers and
instead we introduce new\r\ndedicated parameters?\r\n\r\nThe `filter`
parameter should not be exposed in the first place. It\r\nassumes that
the consumer of the API knows the underlying structure
and\r\nimplementation details of the persisted storage API (SavedObject
client\r\nAPI). For example, a valid filter would
be\r\n`alerting.attributes.rule_type_id`. In this filter the consumer
should\r\nknow a) the name of the SO b) the keyword `attributes`
(storage\r\nimplementation detail) and c) the name of the attribute as
it is\r\npersisted in ES (snake case instead of camel case as it is
returned by\r\nthe APIs). As there is no abstraction layer between the
SO and the API,\r\nit makes it very difficult to make changes in the
persistent schema or\r\nthe APIs. For all the above I decided to
introduce new query parameters\r\nwhere the alerting framework has total
control over it.\r\n\r\n- I noticed in the code a lot of instances where
the consumer is used.\r\nShould not remove any logic around
consumers?\r\n\r\nThis PR is a step forward making the framework as
agnostic as possible.\r\nI had to keep the scope of the PR as contained
as possible. We will get\r\nthere. It needs time :).\r\n\r\n- I noticed
a lot of hacks like checking if the rule type is `siem`.\r\nShould not
remove the hacks?\r\n\r\nThis PR is a step forward making the framework
as agnostic as possible.\r\nI had to keep the scope of the PR as
contained as possible. We will get\r\nthere. It needs time :).\r\n\r\n-
I hate the \"Role visibility\" dropdown. Can we remove it?\r\n\r\nI also
do not like it. The goal is to remove it.
Follow\r\nhttps://github.com//issues/189997.\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<[email protected]>\r\nCo-authored-by:
Aleh Zasypkin <[email protected]>\r\nCo-authored-by: Paula
Borgonovi
<[email protected]>","sha":"a3496c9ca6d99fa301fd8ca73ad44178cdeb2955","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Feature:Alerting","release_note:skip","Team:ResponseOps","v9.0.0","backport:prev-minor","ci:cloud-deploy","ci:cloud-persist-deployment","ci:build-serverless-image","Team:Obs
AI
Assistant","ci:project-deploy-observability","Team:obs-ux-infra_services","Team:obs-ux-management","apm:review","v8.18.0"],"number":183756,"url":"https://github.com/elastic/kibana/pull/183756","mergeCommit":{"message":"[ResponseOps][Alerting]
Decouple feature IDs from consumers (#183756)\n\n## Summary\r\n\r\nThis
PR aims to decouple the feature IDs from the `consumer` attribute\r\nof
rules and alerts.\r\n\r\nTowards:
https://github.com/elastic/kibana/issues/187202\r\nFixes:
https://github.com/elastic/kibana/issues/181559\r\nFixes:
https://github.com/elastic/kibana/issues/182435\r\n\r\n> [!NOTE] \r\n>
Unfortunately, I could not break the PR into smaller pieces. The
APIs\r\ncould not work anymore with feature IDs and had to convert them
to use\r\nrule type IDs. Also, I took the chance and refactored crucial
parts of\r\nthe authorization class that in turn affected a lot of
files. Most of\r\nthe changes in the files are minimal and easy to
review. The crucial\r\nchanges are in the authorization class and some
alerting APIs.\r\n\r\n## Architecture\r\n\r\n### Alerting RBAC
model\r\n\r\nThe Kibana security uses Elasticsearch's
[application\r\nprivileges](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-put-privileges.html#security-api-put-privileges).\r\nThis
way Kibana can represent and store its privilege models
within\r\nElasticsearch roles. To do that, Kibana security creates
actions that\r\nare granted by a specific privilege. Alerting uses its
own RBAC model\r\nand is built on top of the existing Kibana security
model. The Alerting\r\nRBAC uses the `rule_type_id` and `consumer`
attributes to define who\r\nowns the rule and the alerts procured by the
rule. To connect the\r\n`rule_type_id` and `consumer` with the Kibana
security actions the\r\nAlerting RBAC registers its custom actions. They
are constructed
as\r\n`alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>`.\r\nBecause
to authorizate a resource an action has to be generated and\r\nbecause
the action needs a valid feature ID the value of the
`consumer`\r\nshould be a valid feature ID. For example,
the\r\n`alerting:siem.esqlRule/siem/rule/get` action, means that a user
with a\r\nrole that grants this action can get a rule of type
`siem.esqlRule` with\r\nconsumer `siem`.\r\n\r\n### Problem
statement\r\n\r\nAt the moment the `consumer` attribute should be a
valid feature ID.\r\nThough this approach worked well so far it has its
limitation.\r\nSpecifically:\r\n\r\n- Rule types cannot support more
than one consumer.\r\n- To associate old rules with a new feature ID
required a migration on\r\nthe rule's SOs and the alerts documents.\r\n-
The API calls are feature ID-oriented and not rule-type-oriented.\r\n-
The framework has to be aware of the values of the
`consumer`\r\nattribute.\r\n- Feature IDs are tightly coupled with the
alerting indices leading
to\r\n[bugs](https://github.com/elastic/kibana/issues/179082).\r\n-
Legacy consumers that are not a valid feature anymore can
cause\r\n[bugs](https://github.com/elastic/kibana/issues/184595).\r\n-
The framework has to be aware of legacy consumers to handle
edge\r\ncases.\r\n- The framework has to be aware of specific consumers
to handle edge\r\ncases.\r\n\r\n### Proposed solution\r\n\r\nThis PR
aims to decouple the feature IDs from consumers. It achieves\r\nthat a)
by changing the way solutions configure the alerting privileges\r\nwhen
registering a feature and b) by changing the alerting actions.
The\r\nschema changes as:\r\n\r\n```\r\n// Old formatting\r\nid: 'siem',
<--- feature ID\r\nalerting:['siem.queryRule']\r\n\r\n// New
formatting\r\nid: 'siem', <--- feature ID\r\nalerting: [{ ruleTypeId:
'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the
feature ID in the old formatting\r\n```\r\n\r\nThe new actions are
constructed
as\r\n`alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>`.
For\r\nexample `alerting:rule-type-id/my-consumer/rule/get`. The new
action\r\nmeans that a user with a role that grants this action can get
a rule of\r\ntype `rule-type` with consumer `my-consumer`. Changing the
action\r\nstrings is not considered a breaking change as long as the
user's\r\npermission works as before. In our case, this is true because
the\r\nconsumer will be the same as before (feature ID), and the
alerting\r\nsecurity actions will be the same. For example:\r\n\r\n**Old
formatting**\r\n\r\nSchema:\r\n```\r\nid: 'logs', <--- feature
ID\r\nalerting:['.es-query'] <-- rule type ID\r\n```\r\n\r\nGenerated
action:\r\n\r\n```\r\nalerting:.es-query/logs/rule/get\r\n```\r\n\r\n**New
formatting**\r\n\r\nSchema:\r\n```\r\nid: 'siem', <--- feature
ID\r\nalerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <--
consumer same as the feature ID in the old
formatting\r\n```\r\n\r\nGenerated
action:\r\n\r\n```\r\nalerting:.es-query/logs/rule/get <--- consumer is
set as logs and the action is the same as before\r\n```\r\n\r\nIn both
formating the actions are the same thus breaking changes
are\r\navoided.\r\n\r\n### Alerting authorization class\r\nThe alerting
plugin uses and exports the alerting authorization
class\r\n(`AlertingAuthorization`). The class is responsible for
handling all\r\nauthorization actions related to rules and alerts. The
class changed to\r\nhandle the new actions as described in the above
sections. A lot of\r\nmethods were renamed, removed, and cleaned up, all
method arguments\r\nconverted to be an object, and the response
signature of some methods\r\nchanged. These changes affected various
pieces of the code. The changes\r\nin this class are the most important
in this PR especially
the\r\n`_getAuthorizedRuleTypesWithAuthorizedConsumers` method which is
the\r\ncornerstone of the alerting RBAC. Please review
carefully.\r\n\r\n### Instantiation of the alerting authorization
class\r\nThe `AlertingAuthorizationClientFactory` is used to create
instances of\r\nthe `AlertingAuthorization` class. The
`AlertingAuthorization` class\r\nneeds to perform async operations upon
instantiation. Because JS, at the\r\nmoment, does not support async
instantiation of classes the\r\n`AlertingAuthorization` class was
assigning `Promise` objects to\r\nvariables that could be resolved later
in other phases of the lifecycle\r\nof the class. To improve readability
and make the lifecycle of the class\r\nclearer, I separated the
construction of the class (initialization) from\r\nthe bootstrap
process. As a result, getting the `AlertingAuthorization`\r\nclass or
any client that depends on it (`getRulesClient` for example) is\r\nan
async operation.\r\n\r\n### Filtering\r\nA lot of routes use the
authorization class to get the authorization\r\nfilter
(`getFindAuthorizationFilter`), a filter that, if applied,\r\nreturns
only the rule types and consumers the user is authorized to.
The\r\nmethod that returns the filter was built in a way to also
support\r\nfiltering on top of the authorization filter thus coupling
the\r\nauthorized filter with router filtering. I believe these two
operations\r\nshould be decoupled and the filter method should return a
filter that\r\ngives you all the authorized rule types. It is the
responsibility of the\r\nconsumer, router in our case, to apply extra
filters on top of the\r\nauthorization filter. For that reason, I made
all the necessary changes\r\nto decouple them.\r\n\r\n### Legacy
consumers & producer\r\nA lot of rules and alerts have been created and
are still being created\r\nfrom observability with the `alerts`
consumer. When the Alerting RBAC\r\nencounters a rule or alert with
`alerts` as a consumer it falls back to\r\nthe `producer` of the rule
type ID to construct the actions. For example\r\nif a rule with
`ruleTypeId: .es-query` and `consumer: alerts` the\r\nalerting action
will be constructed as\r\n`alerting:.es-query/stackAlerts/rule/get`
where `stackRules` is the\r\nproducer of the `.es-query` rule type. The
`producer` is used to be used\r\nin alerting authorization but due to
its complexity, it was deprecated\r\nand only used as a fallback for the
`alerts` consumer. To avoid breaking\r\nchanges all feature privileges
that specify access to rule types add the\r\n`alerts` consumer when
configuring their alerting privileges. By moving\r\nthe `alerts`
consumer to the registration of the feature we can stop\r\nrelying on
the `producer`. The `producer` is not used anymore in
the\r\nauthorization class. In the next PRs the `producer` will
removed\r\nentirely.\r\n\r\n### Routes\r\nThe following changes were
introduced to the alerting routes:\r\n\r\n- All related routes changed
to be rule-type oriented and not feature ID\r\noriented.\r\n- All
related routes support the `ruleTypeIds` and the
`consumers`\r\nparameters for filtering. In all routes, the filters are
constructed as\r\n`ruleTypeIds: ['foo'] AND consumers: ['bar'] AND
authorizationFilter`.\r\nFiltering by consumers is important. In o11y
for example, we do not want\r\nto show ES rule types with the
`stackAlerts` consumer even if the user\r\nhas access to them.\r\n- The
`/internal/rac/alerts/_feature_ids` route got deleted as it was\r\nnot
used anywhere in the codebase and it was internal.\r\n\r\nAll the
changes in the routes are related to internal routes and no\r\nbreaking
changes are introduced.\r\n\r\n### Constants\r\nI moved the o11y and
stack rule type IDs to `kbn-rule-data-utils` and\r\nexported all
security solution rule type IDs from\r\n`kbn-securitysolution-rules`. I
am not a fan of having a centralized\r\nplace for the rule type IDs.
Ideally, consumers of the framework should\r\nspecify keywords like
`observablility` (category or subcategory) or even\r\n`apm.*` and the
framework should know which rule type IDs to pick up. I\r\nthink it is
out of the scope of the PR, and at the moment it seems the\r\nmost
straightforward way to move forward. I will try to clean up as
much\r\nas possible in further iterations. If you are interested in the
upcoming\r\nwork follow this issue
https://github.com/elastic/kibana/issues/187202.\r\n\r\n### Other
notable code changes\r\n- Change all instances of feature IDs to rule
type IDs.\r\n- `isSiemRuleType`: This is a temporary helper function
that is needed\r\nin places where we handle edge cases related to
security solution rule\r\ntypes. Ideally, the framework should be
agnostic to the rule types or\r\nconsumers. The plan is to be removed
entirely in further iterations.\r\n- Rename alerting
`PluginSetupContract` and `PluginStartContract`
to\r\n`AlertingServerSetup` and `AlertingServerStart`. This made me
touch a\r\nlot of files but I could not resist.\r\n- `filter_consumers`
was mistakenly exposed to a public API. It was\r\nundocumented.\r\n-
Files or functions that were not used anywhere in the codebase
got\r\ndeleted.\r\n- Change the returned type of the `list` method of
the\r\n`RuleTypeRegistry` from `Set<RegistryRuleType>` to
`Map<string,\r\nRegistryRuleType>`.\r\n- Assertion of `KueryNode` in
tests changed to an assertion of KQL using\r\n`toKqlExpression`.\r\n-
Removal of `useRuleAADFields` as it is not used anywhere.\r\n\r\n##
Testing\r\n\r\n> [!CAUTION]\r\n> It is very important to test all the
areas of the application where\r\nrules or alerts are being used
directly or indirectly. Scenarios to\r\nconsider:\r\n> - The correct
rules, alerts, and aggregations on top of them are being\r\nshown as
expected as a superuser.\r\n> - The correct rules, alerts, and
aggregations on top of them are being\r\nshown as expected by a user
with limited access to certain features.\r\n> - The changes in this PR
are backward compatible with the previous\r\nusers'
permissions.\r\n\r\n### Solutions\r\nPlease test and verify that:\r\n-
All the rule types you own with all possible combinations
of\r\npermissions both in ESS and in Serverless.\r\n- The consumers and
rule types make sense when registering the features.\r\n- The consumers
and rule types that are passed to the components are the\r\nintended
ones.\r\n\r\n### ResponseOps\r\nThe most important changes are in the
alerting authorization class, the\r\nsearch strategy, and the routes.
Please test:\r\n- The rules we own with all possible combinations of
permissions.\r\n- The stack alerts page and its solution filtering.\r\n-
The categories filtering in the maintenance window UI.\r\n\r\n##
Risks\r\n> [!WARNING]\r\n> The risks involved in this PR are related to
privileges. Specifically:\r\n> - Users with no privileges can access
rules and alerts they do not\r\nhave access to.\r\n> - Users with
privileges cannot access rules and alerts they have\r\naccess
to.\r\n>\r\n> An excessive list of integration tests is in place to
ensure that the\r\nabove scenarios will not occur. In the case of a bug,
we could a)\r\nrelease an energy release for serverless and b) backport
the fix in ESS.\r\nGiven that this PR is intended to be merged in 8.17
we have plenty of\r\ntime to test and to minimize the chances of
risks.\r\n\r\n## FQA\r\n\r\n- I noticed that a lot of routes support the
`filter` parameter where we\r\ncan pass an arbitrary KQL filter. Why we
do not use this to filter by\r\nthe rule type IDs and the consumers and
instead we introduce new\r\ndedicated parameters?\r\n\r\nThe `filter`
parameter should not be exposed in the first place. It\r\nassumes that
the consumer of the API knows the underlying structure
and\r\nimplementation details of the persisted storage API (SavedObject
client\r\nAPI). For example, a valid filter would
be\r\n`alerting.attributes.rule_type_id`. In this filter the consumer
should\r\nknow a) the name of the SO b) the keyword `attributes`
(storage\r\nimplementation detail) and c) the name of the attribute as
it is\r\npersisted in ES (snake case instead of camel case as it is
returned by\r\nthe APIs). As there is no abstraction layer between the
SO and the API,\r\nit makes it very difficult to make changes in the
persistent schema or\r\nthe APIs. For all the above I decided to
introduce new query parameters\r\nwhere the alerting framework has total
control over it.\r\n\r\n- I noticed in the code a lot of instances where
the consumer is used.\r\nShould not remove any logic around
consumers?\r\n\r\nThis PR is a step forward making the framework as
agnostic as possible.\r\nI had to keep the scope of the PR as contained
as possible. We will get\r\nthere. It needs time :).\r\n\r\n- I noticed
a lot of hacks like checking if the rule type is `siem`.\r\nShould not
remove the hacks?\r\n\r\nThis PR is a step forward making the framework
as agnostic as possible.\r\nI had to keep the scope of the PR as
contained as possible. We will get\r\nthere. It needs time :).\r\n\r\n-
I hate the \"Role visibility\" dropdown. Can we remove it?\r\n\r\nI also
do not like it. The goal is to remove it.
Follow\r\nhttps://github.com//issues/189997.\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<[email protected]>\r\nCo-authored-by:
Aleh Zasypkin <[email protected]>\r\nCo-authored-by: Paula
Borgonovi
<[email protected]>","sha":"a3496c9ca6d99fa301fd8ca73ad44178cdeb2955"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","labelRegex":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/183756","number":183756,"mergeCommit":{"message":"[ResponseOps][Alerting]
Decouple feature IDs from consumers (#183756)\n\n## Summary\r\n\r\nThis
PR aims to decouple the feature IDs from the `consumer` attribute\r\nof
rules and alerts.\r\n\r\nTowards:
https://github.com/elastic/kibana/issues/187202\r\nFixes:
https://github.com/elastic/kibana/issues/181559\r\nFixes:
https://github.com/elastic/kibana/issues/182435\r\n\r\n> [!NOTE] \r\n>
Unfortunately, I could not break the PR into smaller pieces. The
APIs\r\ncould not work anymore with feature IDs and had to convert them
to use\r\nrule type IDs. Also, I took the chance and refactored crucial
parts of\r\nthe authorization class that in turn affected a lot of
files. Most of\r\nthe changes in the files are minimal and easy to
review. The crucial\r\nchanges are in the authorization class and some
alerting APIs.\r\n\r\n## Architecture\r\n\r\n### Alerting RBAC
model\r\n\r\nThe Kibana security uses Elasticsearch's
[application\r\nprivileges](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-put-privileges.html#security-api-put-privileges).\r\nThis
way Kibana can represent and store its privilege models
within\r\nElasticsearch roles. To do that, Kibana security creates
actions that\r\nare granted by a specific privilege. Alerting uses its
own RBAC model\r\nand is built on top of the existing Kibana security
model. The Alerting\r\nRBAC uses the `rule_type_id` and `consumer`
attributes to define who\r\nowns the rule and the alerts procured by the
rule. To connect the\r\n`rule_type_id` and `consumer` with the Kibana
security actions the\r\nAlerting RBAC registers its custom actions. They
are constructed
as\r\n`alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>`.\r\nBecause
to authorizate a resource an action has to be generated and\r\nbecause
the action needs a valid feature ID the value of the
`consumer`\r\nshould be a valid feature ID. For example,
the\r\n`alerting:siem.esqlRule/siem/rule/get` action, means that a user
with a\r\nrole that grants this action can get a rule of type
`siem.esqlRule` with\r\nconsumer `siem`.\r\n\r\n### Problem
statement\r\n\r\nAt the moment the `consumer` attribute should be a
valid feature ID.\r\nThough this approach worked well so far it has its
limitation.\r\nSpecifically:\r\n\r\n- Rule types cannot support more
than one consumer.\r\n- To associate old rules with a new feature ID
required a migration on\r\nthe rule's SOs and the alerts documents.\r\n-
The API calls are feature ID-oriented and not rule-type-oriented.\r\n-
The framework has to be aware of the values of the
`consumer`\r\nattribute.\r\n- Feature IDs are tightly coupled with the
alerting indices leading
to\r\n[bugs](https://github.com/elastic/kibana/issues/179082).\r\n-
Legacy consumers that are not a valid feature anymore can
cause\r\n[bugs](https://github.com/elastic/kibana/issues/184595).\r\n-
The framework has to be aware of legacy consumers to handle
edge\r\ncases.\r\n- The framework has to be aware of specific consumers
to handle edge\r\ncases.\r\n\r\n### Proposed solution\r\n\r\nThis PR
aims to decouple the feature IDs from consumers. It achieves\r\nthat a)
by changing the way solutions configure the alerting privileges\r\nwhen
registering a feature and b) by changing the alerting actions.
The\r\nschema changes as:\r\n\r\n```\r\n// Old formatting\r\nid: 'siem',
<--- feature ID\r\nalerting:['siem.queryRule']\r\n\r\n// New
formatting\r\nid: 'siem', <--- feature ID\r\nalerting: [{ ruleTypeId:
'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the
feature ID in the old formatting\r\n```\r\n\r\nThe new actions are
constructed
as\r\n`alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>`.
For\r\nexample `alerting:rule-type-id/my-consumer/rule/get`. The new
action\r\nmeans that a user with a role that grants this action can get
a rule of\r\ntype `rule-type` with consumer `my-consumer`. Changing the
action\r\nstrings is not considered a breaking change as long as the
user's\r\npermission works as before. In our case, this is true because
the\r\nconsumer will be the same as before (feature ID), and the
alerting\r\nsecurity actions will be the same. For example:\r\n\r\n**Old
formatting**\r\n\r\nSchema:\r\n```\r\nid: 'logs', <--- feature
ID\r\nalerting:['.es-query'] <-- rule type ID\r\n```\r\n\r\nGenerated
action:\r\n\r\n```\r\nalerting:.es-query/logs/rule/get\r\n```\r\n\r\n**New
formatting**\r\n\r\nSchema:\r\n```\r\nid: 'siem', <--- feature
ID\r\nalerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <--
consumer same as the feature ID in the old
formatting\r\n```\r\n\r\nGenerated
action:\r\n\r\n```\r\nalerting:.es-query/logs/rule/get <--- consumer is
set as logs and the action is the same as before\r\n```\r\n\r\nIn both
formating the actions are the same thus breaking changes
are\r\navoided.\r\n\r\n### Alerting authorization class\r\nThe alerting
plugin uses and exports the alerting authorization
class\r\n(`AlertingAuthorization`). The class is responsible for
handling all\r\nauthorization actions related to rules and alerts. The
class changed to\r\nhandle the new actions as described in the above
sections. A lot of\r\nmethods were renamed, removed, and cleaned up, all
method arguments\r\nconverted to be an object, and the response
signature of some methods\r\nchanged. These changes affected various
pieces of the code. The changes\r\nin this class are the most important
in this PR especially
the\r\n`_getAuthorizedRuleTypesWithAuthorizedConsumers` method which is
the\r\ncornerstone of the alerting RBAC. Please review
carefully.\r\n\r\n### Instantiation of the alerting authorization
class\r\nThe `AlertingAuthorizationClientFactory` is used to create
instances of\r\nthe `AlertingAuthorization` class. The
`AlertingAuthorization` class\r\nneeds to perform async operations upon
instantiation. Because JS, at the\r\nmoment, does not support async
instantiation of classes the\r\n`AlertingAuthorization` class was
assigning `Promise` objects to\r\nvariables that could be resolved later
in other phases of the lifecycle\r\nof the class. To improve readability
and make the lifecycle of the class\r\nclearer, I separated the
construction of the class (initialization) from\r\nthe bootstrap
process. As a result, getting the `AlertingAuthorization`\r\nclass or
any client that depends on it (`getRulesClient` for example) is\r\nan
async operation.\r\n\r\n### Filtering\r\nA lot of routes use the
authorization class to get the authorization\r\nfilter
(`getFindAuthorizationFilter`), a filter that, if applied,\r\nreturns
only the rule types and consumers the user is authorized to.
The\r\nmethod that returns the filter was built in a way to also
support\r\nfiltering on top of the authorization filter thus coupling
the\r\nauthorized filter with router filtering. I believe these two
operations\r\nshould be decoupled and the filter method should return a
filter that\r\ngives you all the authorized rule types. It is the
responsibility of the\r\nconsumer, router in our case, to apply extra
filters on top of the\r\nauthorization filter. For that reason, I made
all the necessary changes\r\nto decouple them.\r\n\r\n### Legacy
consumers & producer\r\nA lot of rules and alerts have been created and
are still being created\r\nfrom observability with the `alerts`
consumer. When the Alerting RBAC\r\nencounters a rule or alert with
`alerts` as a consumer it falls back to\r\nthe `producer` of the rule
type ID to construct the actions. For example\r\nif a rule with
`ruleTypeId: .es-query` and `consumer: alerts` the\r\nalerting action
will be constructed as\r\n`alerting:.es-query/stackAlerts/rule/get`
where `stackRules` is the\r\nproducer of the `.es-query` rule type. The
`producer` is used to be used\r\nin alerting authorization but due to
its complexity, it was deprecated\r\nand only used as a fallback for the
`alerts` consumer. To avoid breaking\r\nchanges all feature privileges
that specify access to rule types add the\r\n`alerts` consumer when
configuring their alerting privileges. By moving\r\nthe `alerts`
consumer to the registration of the feature we can stop\r\nrelying on
the `producer`. The `producer` is not used anymore in
the\r\nauthorization class. In the next PRs the `producer` will
removed\r\nentirely.\r\n\r\n### Routes\r\nThe following changes were
introduced to the alerting routes:\r\n\r\n- All related routes changed
to be rule-type oriented and not feature ID\r\noriented.\r\n- All
related routes support the `ruleTypeIds` and the
`consumers`\r\nparameters for filtering. In all routes, the filters are
constructed as\r\n`ruleTypeIds: ['foo'] AND consumers: ['bar'] AND
authorizationFilter`.\r\nFiltering by consumers is important. In o11y
for example, we do not want\r\nto show ES rule types with the
`stackAlerts` consumer even if the user\r\nhas access to them.\r\n- The
`/internal/rac/alerts/_feature_ids` route got deleted as it was\r\nnot
used anywhere in the codebase and it was internal.\r\n\r\nAll the
changes in the routes are related to internal routes and no\r\nbreaking
changes are introduced.\r\n\r\n### Constants\r\nI moved the o11y and
stack rule type IDs to `kbn-rule-data-utils` and\r\nexported all
security solution rule type IDs from\r\n`kbn-securitysolution-rules`. I
am not a fan of having a centralized\r\nplace for the rule type IDs.
Ideally, consumers of the framework should\r\nspecify keywords like
`observablility` (category or subcategory) or even\r\n`apm.*` and the
framework should know which rule type IDs to pick up. I\r\nthink it is
out of the scope of the PR, and at the moment it seems the\r\nmost
straightforward way to move forward. I will try to clean up as
much\r\nas possible in further iterations. If you are interested in the
upcoming\r\nwork follow this issue
https://github.com/elastic/kibana/issues/187202.\r\n\r\n### Other
notable code changes\r\n- Change all instances of feature IDs to rule
type IDs.\r\n- `isSiemRuleType`: This is a temporary helper function
that is needed\r\nin places where we handle edge cases related to
security solution rule\r\ntypes. Ideally, the framework should be
agnostic to the rule types or\r\nconsumers. The plan is to be removed
entirely in further iterations.\r\n- Rename alerting
`PluginSetupContract` and `PluginStartContract`
to\r\n`AlertingServerSetup` and `AlertingServerStart`. This made me
touch a\r\nlot of files but I could not resist.\r\n- `filter_consumers`
was mistakenly exposed to a public API. It was\r\nundocumented.\r\n-
Files or functions that were not used anywhere in the codebase
got\r\ndeleted.\r\n- Change the returned type of the `list` method of
the\r\n`RuleTypeRegistry` from `Set<RegistryRuleType>` to
`Map<string,\r\nRegistryRuleType>`.\r\n- Assertion of `KueryNode` in
tests changed to an assertion of KQL using\r\n`toKqlExpression`.\r\n-
Removal of `useRuleAADFields` as it is not used anywhere.\r\n\r\n##
Testing\r\n\r\n> [!CAUTION]\r\n> It is very important to test all the
areas of the application where\r\nrules or alerts are being used
directly or indirectly. Scenarios to\r\nconsider:\r\n> - The correct
rules, alerts, and aggregations on top of them are being\r\nshown as
expected as a superuser.\r\n> - The correct rules, alerts, and
aggregations on top of them are being\r\nshown as expected by a user
with limited access to certain features.\r\n> - The changes in this PR
are backward compatible with the previous\r\nusers'
permissions.\r\n\r\n### Solutions\r\nPlease test and verify that:\r\n-
All the rule types you own with all possible combinations
of\r\npermissions both in ESS and in Serverless.\r\n- The consumers and
rule types make sense when registering the features.\r\n- The consumers
and rule types that are passed to the components are the\r\nintended
ones.\r\n\r\n### ResponseOps\r\nThe most important changes are in the
alerting authorization class, the\r\nsearch strategy, and the routes.
Please test:\r\n- The rules we own with all possible combinations of
permissions.\r\n- The stack alerts page and its solution filtering.\r\n-
The categories filtering in the maintenance window UI.\r\n\r\n##
Risks\r\n> [!WARNING]\r\n> The risks involved in this PR are related to
privileges. Specifically:\r\n> - Users with no privileges can access
rules and alerts they do not\r\nhave access to.\r\n> - Users with
privileges cannot access rules and alerts they have\r\naccess
to.\r\n>\r\n> An excessive list of integration tests is in place to
ensure that the\r\nabove scenarios will not occur. In the case of a bug,
we could a)\r\nrelease an energy release for serverless and b) backport
the fix in ESS.\r\nGiven that this PR is intended to be merged in 8.17
we have plenty of\r\ntime to test and to minimize the chances of
risks.\r\n\r\n## FQA\r\n\r\n- I noticed that a lot of routes support the
`filter` parameter where we\r\ncan pass an arbitrary KQL filter. Why we
do not use this to filter by\r\nthe rule type IDs and the consumers and
instead we introduce new\r\ndedicated parameters?\r\n\r\nThe `filter`
parameter should not be exposed in the first place. It\r\nassumes that
the consumer of the API knows the underlying structure
and\r\nimplementation details of the persisted storage API (SavedObject
client\r\nAPI). For example, a valid filter would
be\r\n`alerting.attributes.rule_type_id`. In this filter the consumer
should\r\nknow a) the name of the SO b) the keyword `attributes`
(storage\r\nimplementation detail) and c) the name of the attribute as
it is\r\npersisted in ES (snake case instead of camel case as it is
returned by\r\nthe APIs). As there is no abstraction layer between the
SO and the API,\r\nit makes it very difficult to make changes in the
persistent schema or\r\nthe APIs. For all the above I decided to
introduce new query parameters\r\nwhere the alerting framework has total
control over it.\r\n\r\n- I noticed in the code a lot of instances where
the consumer is used.\r\nShould not remove any logic around
consumers?\r\n\r\nThis PR is a step forward making the framework as
agnostic as possible.\r\nI had to keep the scope of the PR as contained
as possible. We will get\r\nthere. It needs time :).\r\n\r\n- I noticed
a lot of hacks like checking if the rule type is `siem`.\r\nShould not
remove the hacks?\r\n\r\nThis PR is a step forward making the framework
as agnostic as possible.\r\nI had to keep the scope of the PR as
contained as possible. We will get\r\nthere. It needs time :).\r\n\r\n-
I hate the \"Role visibility\" dropdown. Can we remove it?\r\n\r\nI also
do not like it. The goal is to remove it.
Follow\r\nhttps://github.com//issues/189997.\r\n\r\n---------\r\n\r\nCo-authored-by:
kibanamachine
<[email protected]>\r\nCo-authored-by:
Aleh Zasypkin <[email protected]>\r\nCo-authored-by: Paula
Borgonovi
<[email protected]>","sha":"a3496c9ca6d99fa301fd8ca73ad44178cdeb2955"}},{"branch":"8.x","label":"v8.18.0","labelRegex":"^v8.18.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->
hop-dev pushed a commit to hop-dev/kibana that referenced this pull request Dec 5, 2024
…183756)

## Summary

This PR aims to decouple the feature IDs from the `consumer` attribute
of rules and alerts.

Towards: elastic#187202
Fixes: elastic#181559
Fixes: elastic#182435

> [!NOTE]  
> Unfortunately, I could not break the PR into smaller pieces. The APIs
could not work anymore with feature IDs and had to convert them to use
rule type IDs. Also, I took the chance and refactored crucial parts of
the authorization class that in turn affected a lot of files. Most of
the changes in the files are minimal and easy to review. The crucial
changes are in the authorization class and some alerting APIs.

## Architecture

### Alerting RBAC model

The Kibana security uses Elasticsearch's [application
privileges](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-put-privileges.html#security-api-put-privileges).
This way Kibana can represent and store its privilege models within
Elasticsearch roles. To do that, Kibana security creates actions that
are granted by a specific privilege. Alerting uses its own RBAC model
and is built on top of the existing Kibana security model. The Alerting
RBAC uses the `rule_type_id` and `consumer` attributes to define who
owns the rule and the alerts procured by the rule. To connect the
`rule_type_id` and `consumer` with the Kibana security actions the
Alerting RBAC registers its custom actions. They are constructed as
`alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>`.
Because to authorizate a resource an action has to be generated and
because the action needs a valid feature ID the value of the `consumer`
should be a valid feature ID. For example, the
`alerting:siem.esqlRule/siem/rule/get` action, means that a user with a
role that grants this action can get a rule of type `siem.esqlRule` with
consumer `siem`.

### Problem statement

At the moment the `consumer` attribute should be a valid feature ID.
Though this approach worked well so far it has its limitation.
Specifically:

- Rule types cannot support more than one consumer.
- To associate old rules with a new feature ID required a migration on
the rule's SOs and the alerts documents.
- The API calls are feature ID-oriented and not rule-type-oriented.
- The framework has to be aware of the values of the `consumer`
attribute.
- Feature IDs are tightly coupled with the alerting indices leading to
[bugs](elastic#179082).
- Legacy consumers that are not a valid feature anymore can cause
[bugs](elastic#184595).
- The framework has to be aware of legacy consumers to handle edge
cases.
- The framework has to be aware of specific consumers to handle edge
cases.

### Proposed solution

This PR aims to decouple the feature IDs from consumers. It achieves
that a) by changing the way solutions configure the alerting privileges
when registering a feature and b) by changing the alerting actions. The
schema changes as:

```
// Old formatting
id: 'siem', <--- feature ID
alerting:['siem.queryRule']

// New formatting
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: 'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the feature ID in the old formatting
```

The new actions are constructed as
`alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>`. For
example `alerting:rule-type-id/my-consumer/rule/get`. The new action
means that a user with a role that grants this action can get a rule of
type `rule-type` with consumer `my-consumer`. Changing the action
strings is not considered a breaking change as long as the user's
permission works as before. In our case, this is true because the
consumer will be the same as before (feature ID), and the alerting
security actions will be the same. For example:

**Old formatting**

Schema:
```
id: 'logs', <--- feature ID
alerting:['.es-query'] <-- rule type ID
```

Generated action:

```
alerting:.es-query/logs/rule/get
```

**New formatting**

Schema:
```
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <-- consumer same as the feature ID in the old formatting
```

Generated action:

```
alerting:.es-query/logs/rule/get <--- consumer is set as logs and the action is the same as before
```

In both formating the actions are the same thus breaking changes are
avoided.

### Alerting authorization class
The alerting plugin uses and exports the alerting authorization class
(`AlertingAuthorization`). The class is responsible for handling all
authorization actions related to rules and alerts. The class changed to
handle the new actions as described in the above sections. A lot of
methods were renamed, removed, and cleaned up, all method arguments
converted to be an object, and the response signature of some methods
changed. These changes affected various pieces of the code. The changes
in this class are the most important in this PR especially the
`_getAuthorizedRuleTypesWithAuthorizedConsumers` method which is the
cornerstone of the alerting RBAC. Please review carefully.

### Instantiation of the alerting authorization class
The `AlertingAuthorizationClientFactory` is used to create instances of
the `AlertingAuthorization` class. The `AlertingAuthorization` class
needs to perform async operations upon instantiation. Because JS, at the
moment, does not support async instantiation of classes the
`AlertingAuthorization` class was assigning `Promise` objects to
variables that could be resolved later in other phases of the lifecycle
of the class. To improve readability and make the lifecycle of the class
clearer, I separated the construction of the class (initialization) from
the bootstrap process. As a result, getting the `AlertingAuthorization`
class or any client that depends on it (`getRulesClient` for example) is
an async operation.

### Filtering
A lot of routes use the authorization class to get the authorization
filter (`getFindAuthorizationFilter`), a filter that, if applied,
returns only the rule types and consumers the user is authorized to. The
method that returns the filter was built in a way to also support
filtering on top of the authorization filter thus coupling the
authorized filter with router filtering. I believe these two operations
should be decoupled and the filter method should return a filter that
gives you all the authorized rule types. It is the responsibility of the
consumer, router in our case, to apply extra filters on top of the
authorization filter. For that reason, I made all the necessary changes
to decouple them.

### Legacy consumers & producer
A lot of rules and alerts have been created and are still being created
from observability with the `alerts` consumer. When the Alerting RBAC
encounters a rule or alert with `alerts` as a consumer it falls back to
the `producer` of the rule type ID to construct the actions. For example
if a rule with `ruleTypeId: .es-query` and `consumer: alerts` the
alerting action will be constructed as
`alerting:.es-query/stackAlerts/rule/get` where `stackRules` is the
producer of the `.es-query` rule type. The `producer` is used to be used
in alerting authorization but due to its complexity, it was deprecated
and only used as a fallback for the `alerts` consumer. To avoid breaking
changes all feature privileges that specify access to rule types add the
`alerts` consumer when configuring their alerting privileges. By moving
the `alerts` consumer to the registration of the feature we can stop
relying on the `producer`. The `producer` is not used anymore in the
authorization class. In the next PRs the `producer` will removed
entirely.

### Routes
The following changes were introduced to the alerting routes:

- All related routes changed to be rule-type oriented and not feature ID
oriented.
- All related routes support the `ruleTypeIds` and the `consumers`
parameters for filtering. In all routes, the filters are constructed as
`ruleTypeIds: ['foo'] AND consumers: ['bar'] AND authorizationFilter`.
Filtering by consumers is important. In o11y for example, we do not want
to show ES rule types with the `stackAlerts` consumer even if the user
has access to them.
- The `/internal/rac/alerts/_feature_ids` route got deleted as it was
not used anywhere in the codebase and it was internal.

All the changes in the routes are related to internal routes and no
breaking changes are introduced.

### Constants
I moved the o11y and stack rule type IDs to `kbn-rule-data-utils` and
exported all security solution rule type IDs from
`kbn-securitysolution-rules`. I am not a fan of having a centralized
place for the rule type IDs. Ideally, consumers of the framework should
specify keywords like `observablility` (category or subcategory) or even
`apm.*` and the framework should know which rule type IDs to pick up. I
think it is out of the scope of the PR, and at the moment it seems the
most straightforward way to move forward. I will try to clean up as much
as possible in further iterations. If you are interested in the upcoming
work follow this issue elastic#187202.

### Other notable code changes
- Change all instances of feature IDs to rule type IDs.
- `isSiemRuleType`: This is a temporary helper function that is needed
in places where we handle edge cases related to security solution rule
types. Ideally, the framework should be agnostic to the rule types or
consumers. The plan is to be removed entirely in further iterations.
- Rename alerting `PluginSetupContract` and `PluginStartContract` to
`AlertingServerSetup` and `AlertingServerStart`. This made me touch a
lot of files but I could not resist.
- `filter_consumers` was mistakenly exposed to a public API. It was
undocumented.
- Files or functions that were not used anywhere in the codebase got
deleted.
- Change the returned type of the `list` method of the
`RuleTypeRegistry` from `Set<RegistryRuleType>` to `Map<string,
RegistryRuleType>`.
- Assertion of `KueryNode` in tests changed to an assertion of KQL using
`toKqlExpression`.
- Removal of `useRuleAADFields` as it is not used anywhere.

## Testing

> [!CAUTION]
> It is very important to test all the areas of the application where
rules or alerts are being used directly or indirectly. Scenarios to
consider:
> - The correct rules, alerts, and aggregations on top of them are being
shown as expected as a superuser.
> - The correct rules, alerts, and aggregations on top of them are being
shown as expected by a user with limited access to certain features.
> - The changes in this PR are backward compatible with the previous
users' permissions.

### Solutions
Please test and verify that:
- All the rule types you own with all possible combinations of
permissions both in ESS and in Serverless.
- The consumers and rule types make sense when registering the features.
- The consumers and rule types that are passed to the components are the
intended ones.

### ResponseOps
The most important changes are in the alerting authorization class, the
search strategy, and the routes. Please test:
- The rules we own with all possible combinations of permissions.
- The stack alerts page and its solution filtering.
- The categories filtering in the maintenance window UI.

## Risks
> [!WARNING]
> The risks involved in this PR are related to privileges. Specifically:
> - Users with no privileges can access rules and alerts they do not
have access to.
> - Users with privileges cannot access rules and alerts they have
access to.
>
> An excessive list of integration tests is in place to ensure that the
above scenarios will not occur. In the case of a bug, we could a)
release an energy release for serverless and b) backport the fix in ESS.
Given that this PR is intended to be merged in 8.17 we have plenty of
time to test and to minimize the chances of risks.

## FQA

- I noticed that a lot of routes support the `filter` parameter where we
can pass an arbitrary KQL filter. Why we do not use this to filter by
the rule type IDs and the consumers and instead we introduce new
dedicated parameters?

The `filter` parameter should not be exposed in the first place. It
assumes that the consumer of the API knows the underlying structure and
implementation details of the persisted storage API (SavedObject client
API). For example, a valid filter would be
`alerting.attributes.rule_type_id`. In this filter the consumer should
know a) the name of the SO b) the keyword `attributes` (storage
implementation detail) and c) the name of the attribute as it is
persisted in ES (snake case instead of camel case as it is returned by
the APIs). As there is no abstraction layer between the SO and the API,
it makes it very difficult to make changes in the persistent schema or
the APIs. For all the above I decided to introduce new query parameters
where the alerting framework has total control over it.

- I noticed in the code a lot of instances where the consumer is used.
Should not remove any logic around consumers?

This PR is a step forward making the framework as agnostic as possible.
I had to keep the scope of the PR as contained as possible. We will get
there. It needs time :).

- I noticed a lot of hacks like checking if the rule type is `siem`.
Should not remove the hacks?

This PR is a step forward making the framework as agnostic as possible.
I had to keep the scope of the PR as contained as possible. We will get
there. It needs time :).

- I hate the "Role visibility" dropdown. Can we remove it?

I also do not like it. The goal is to remove it. Follow
elastic#189997.

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Aleh Zasypkin <[email protected]>
Co-authored-by: Paula Borgonovi <[email protected]>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 9, 2024
…183756)

## Summary

This PR aims to decouple the feature IDs from the `consumer` attribute
of rules and alerts.

Towards: elastic#187202
Fixes: elastic#181559
Fixes: elastic#182435

> [!NOTE]  
> Unfortunately, I could not break the PR into smaller pieces. The APIs
could not work anymore with feature IDs and had to convert them to use
rule type IDs. Also, I took the chance and refactored crucial parts of
the authorization class that in turn affected a lot of files. Most of
the changes in the files are minimal and easy to review. The crucial
changes are in the authorization class and some alerting APIs.

## Architecture

### Alerting RBAC model

The Kibana security uses Elasticsearch's [application
privileges](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-put-privileges.html#security-api-put-privileges).
This way Kibana can represent and store its privilege models within
Elasticsearch roles. To do that, Kibana security creates actions that
are granted by a specific privilege. Alerting uses its own RBAC model
and is built on top of the existing Kibana security model. The Alerting
RBAC uses the `rule_type_id` and `consumer` attributes to define who
owns the rule and the alerts procured by the rule. To connect the
`rule_type_id` and `consumer` with the Kibana security actions the
Alerting RBAC registers its custom actions. They are constructed as
`alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>`.
Because to authorizate a resource an action has to be generated and
because the action needs a valid feature ID the value of the `consumer`
should be a valid feature ID. For example, the
`alerting:siem.esqlRule/siem/rule/get` action, means that a user with a
role that grants this action can get a rule of type `siem.esqlRule` with
consumer `siem`.

### Problem statement

At the moment the `consumer` attribute should be a valid feature ID.
Though this approach worked well so far it has its limitation.
Specifically:

- Rule types cannot support more than one consumer.
- To associate old rules with a new feature ID required a migration on
the rule's SOs and the alerts documents.
- The API calls are feature ID-oriented and not rule-type-oriented.
- The framework has to be aware of the values of the `consumer`
attribute.
- Feature IDs are tightly coupled with the alerting indices leading to
[bugs](elastic#179082).
- Legacy consumers that are not a valid feature anymore can cause
[bugs](elastic#184595).
- The framework has to be aware of legacy consumers to handle edge
cases.
- The framework has to be aware of specific consumers to handle edge
cases.

### Proposed solution

This PR aims to decouple the feature IDs from consumers. It achieves
that a) by changing the way solutions configure the alerting privileges
when registering a feature and b) by changing the alerting actions. The
schema changes as:

```
// Old formatting
id: 'siem', <--- feature ID
alerting:['siem.queryRule']

// New formatting
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: 'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the feature ID in the old formatting
```

The new actions are constructed as
`alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>`. For
example `alerting:rule-type-id/my-consumer/rule/get`. The new action
means that a user with a role that grants this action can get a rule of
type `rule-type` with consumer `my-consumer`. Changing the action
strings is not considered a breaking change as long as the user's
permission works as before. In our case, this is true because the
consumer will be the same as before (feature ID), and the alerting
security actions will be the same. For example:

**Old formatting**

Schema:
```
id: 'logs', <--- feature ID
alerting:['.es-query'] <-- rule type ID
```

Generated action:

```
alerting:.es-query/logs/rule/get
```

**New formatting**

Schema:
```
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <-- consumer same as the feature ID in the old formatting
```

Generated action:

```
alerting:.es-query/logs/rule/get <--- consumer is set as logs and the action is the same as before
```

In both formating the actions are the same thus breaking changes are
avoided.

### Alerting authorization class
The alerting plugin uses and exports the alerting authorization class
(`AlertingAuthorization`). The class is responsible for handling all
authorization actions related to rules and alerts. The class changed to
handle the new actions as described in the above sections. A lot of
methods were renamed, removed, and cleaned up, all method arguments
converted to be an object, and the response signature of some methods
changed. These changes affected various pieces of the code. The changes
in this class are the most important in this PR especially the
`_getAuthorizedRuleTypesWithAuthorizedConsumers` method which is the
cornerstone of the alerting RBAC. Please review carefully.

### Instantiation of the alerting authorization class
The `AlertingAuthorizationClientFactory` is used to create instances of
the `AlertingAuthorization` class. The `AlertingAuthorization` class
needs to perform async operations upon instantiation. Because JS, at the
moment, does not support async instantiation of classes the
`AlertingAuthorization` class was assigning `Promise` objects to
variables that could be resolved later in other phases of the lifecycle
of the class. To improve readability and make the lifecycle of the class
clearer, I separated the construction of the class (initialization) from
the bootstrap process. As a result, getting the `AlertingAuthorization`
class or any client that depends on it (`getRulesClient` for example) is
an async operation.

### Filtering
A lot of routes use the authorization class to get the authorization
filter (`getFindAuthorizationFilter`), a filter that, if applied,
returns only the rule types and consumers the user is authorized to. The
method that returns the filter was built in a way to also support
filtering on top of the authorization filter thus coupling the
authorized filter with router filtering. I believe these two operations
should be decoupled and the filter method should return a filter that
gives you all the authorized rule types. It is the responsibility of the
consumer, router in our case, to apply extra filters on top of the
authorization filter. For that reason, I made all the necessary changes
to decouple them.

### Legacy consumers & producer
A lot of rules and alerts have been created and are still being created
from observability with the `alerts` consumer. When the Alerting RBAC
encounters a rule or alert with `alerts` as a consumer it falls back to
the `producer` of the rule type ID to construct the actions. For example
if a rule with `ruleTypeId: .es-query` and `consumer: alerts` the
alerting action will be constructed as
`alerting:.es-query/stackAlerts/rule/get` where `stackRules` is the
producer of the `.es-query` rule type. The `producer` is used to be used
in alerting authorization but due to its complexity, it was deprecated
and only used as a fallback for the `alerts` consumer. To avoid breaking
changes all feature privileges that specify access to rule types add the
`alerts` consumer when configuring their alerting privileges. By moving
the `alerts` consumer to the registration of the feature we can stop
relying on the `producer`. The `producer` is not used anymore in the
authorization class. In the next PRs the `producer` will removed
entirely.

### Routes
The following changes were introduced to the alerting routes:

- All related routes changed to be rule-type oriented and not feature ID
oriented.
- All related routes support the `ruleTypeIds` and the `consumers`
parameters for filtering. In all routes, the filters are constructed as
`ruleTypeIds: ['foo'] AND consumers: ['bar'] AND authorizationFilter`.
Filtering by consumers is important. In o11y for example, we do not want
to show ES rule types with the `stackAlerts` consumer even if the user
has access to them.
- The `/internal/rac/alerts/_feature_ids` route got deleted as it was
not used anywhere in the codebase and it was internal.

All the changes in the routes are related to internal routes and no
breaking changes are introduced.

### Constants
I moved the o11y and stack rule type IDs to `kbn-rule-data-utils` and
exported all security solution rule type IDs from
`kbn-securitysolution-rules`. I am not a fan of having a centralized
place for the rule type IDs. Ideally, consumers of the framework should
specify keywords like `observablility` (category or subcategory) or even
`apm.*` and the framework should know which rule type IDs to pick up. I
think it is out of the scope of the PR, and at the moment it seems the
most straightforward way to move forward. I will try to clean up as much
as possible in further iterations. If you are interested in the upcoming
work follow this issue elastic#187202.

### Other notable code changes
- Change all instances of feature IDs to rule type IDs.
- `isSiemRuleType`: This is a temporary helper function that is needed
in places where we handle edge cases related to security solution rule
types. Ideally, the framework should be agnostic to the rule types or
consumers. The plan is to be removed entirely in further iterations.
- Rename alerting `PluginSetupContract` and `PluginStartContract` to
`AlertingServerSetup` and `AlertingServerStart`. This made me touch a
lot of files but I could not resist.
- `filter_consumers` was mistakenly exposed to a public API. It was
undocumented.
- Files or functions that were not used anywhere in the codebase got
deleted.
- Change the returned type of the `list` method of the
`RuleTypeRegistry` from `Set<RegistryRuleType>` to `Map<string,
RegistryRuleType>`.
- Assertion of `KueryNode` in tests changed to an assertion of KQL using
`toKqlExpression`.
- Removal of `useRuleAADFields` as it is not used anywhere.

## Testing

> [!CAUTION]
> It is very important to test all the areas of the application where
rules or alerts are being used directly or indirectly. Scenarios to
consider:
> - The correct rules, alerts, and aggregations on top of them are being
shown as expected as a superuser.
> - The correct rules, alerts, and aggregations on top of them are being
shown as expected by a user with limited access to certain features.
> - The changes in this PR are backward compatible with the previous
users' permissions.

### Solutions
Please test and verify that:
- All the rule types you own with all possible combinations of
permissions both in ESS and in Serverless.
- The consumers and rule types make sense when registering the features.
- The consumers and rule types that are passed to the components are the
intended ones.

### ResponseOps
The most important changes are in the alerting authorization class, the
search strategy, and the routes. Please test:
- The rules we own with all possible combinations of permissions.
- The stack alerts page and its solution filtering.
- The categories filtering in the maintenance window UI.

## Risks
> [!WARNING]
> The risks involved in this PR are related to privileges. Specifically:
> - Users with no privileges can access rules and alerts they do not
have access to.
> - Users with privileges cannot access rules and alerts they have
access to.
>
> An excessive list of integration tests is in place to ensure that the
above scenarios will not occur. In the case of a bug, we could a)
release an energy release for serverless and b) backport the fix in ESS.
Given that this PR is intended to be merged in 8.17 we have plenty of
time to test and to minimize the chances of risks.

## FQA

- I noticed that a lot of routes support the `filter` parameter where we
can pass an arbitrary KQL filter. Why we do not use this to filter by
the rule type IDs and the consumers and instead we introduce new
dedicated parameters?

The `filter` parameter should not be exposed in the first place. It
assumes that the consumer of the API knows the underlying structure and
implementation details of the persisted storage API (SavedObject client
API). For example, a valid filter would be
`alerting.attributes.rule_type_id`. In this filter the consumer should
know a) the name of the SO b) the keyword `attributes` (storage
implementation detail) and c) the name of the attribute as it is
persisted in ES (snake case instead of camel case as it is returned by
the APIs). As there is no abstraction layer between the SO and the API,
it makes it very difficult to make changes in the persistent schema or
the APIs. For all the above I decided to introduce new query parameters
where the alerting framework has total control over it.

- I noticed in the code a lot of instances where the consumer is used.
Should not remove any logic around consumers?

This PR is a step forward making the framework as agnostic as possible.
I had to keep the scope of the PR as contained as possible. We will get
there. It needs time :).

- I noticed a lot of hacks like checking if the rule type is `siem`.
Should not remove the hacks?

This PR is a step forward making the framework as agnostic as possible.
I had to keep the scope of the PR as contained as possible. We will get
there. It needs time :).

- I hate the "Role visibility" dropdown. Can we remove it?

I also do not like it. The goal is to remove it. Follow
elastic#189997.

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Aleh Zasypkin <[email protected]>
Co-authored-by: Paula Borgonovi <[email protected]>
viduni94 added a commit that referenced this pull request Dec 11, 2024
## Summary

### Problem
With the merge of the PR #183756,
the alerts function has stopped working in the Obs AI Assistant, because
there has been a change to the query (when finding alerts)

### Solution
Revert the change made to the query.

### Checklist

- [x] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Dec 11, 2024
## Summary

### Problem
With the merge of the PR elastic#183756,
the alerts function has stopped working in the Obs AI Assistant, because
there has been a change to the query (when finding alerts)

### Solution
Revert the change made to the query.

### Checklist

- [x] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

(cherry picked from commit d9c1cd3)
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 12, 2024
…183756)

## Summary

This PR aims to decouple the feature IDs from the `consumer` attribute
of rules and alerts.

Towards: elastic#187202
Fixes: elastic#181559
Fixes: elastic#182435

> [!NOTE]  
> Unfortunately, I could not break the PR into smaller pieces. The APIs
could not work anymore with feature IDs and had to convert them to use
rule type IDs. Also, I took the chance and refactored crucial parts of
the authorization class that in turn affected a lot of files. Most of
the changes in the files are minimal and easy to review. The crucial
changes are in the authorization class and some alerting APIs.

## Architecture

### Alerting RBAC model

The Kibana security uses Elasticsearch's [application
privileges](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-put-privileges.html#security-api-put-privileges).
This way Kibana can represent and store its privilege models within
Elasticsearch roles. To do that, Kibana security creates actions that
are granted by a specific privilege. Alerting uses its own RBAC model
and is built on top of the existing Kibana security model. The Alerting
RBAC uses the `rule_type_id` and `consumer` attributes to define who
owns the rule and the alerts procured by the rule. To connect the
`rule_type_id` and `consumer` with the Kibana security actions the
Alerting RBAC registers its custom actions. They are constructed as
`alerting:<rule-type-id>/<feature-id>/<alerting-entity>/<operation>`.
Because to authorizate a resource an action has to be generated and
because the action needs a valid feature ID the value of the `consumer`
should be a valid feature ID. For example, the
`alerting:siem.esqlRule/siem/rule/get` action, means that a user with a
role that grants this action can get a rule of type `siem.esqlRule` with
consumer `siem`.

### Problem statement

At the moment the `consumer` attribute should be a valid feature ID.
Though this approach worked well so far it has its limitation.
Specifically:

- Rule types cannot support more than one consumer.
- To associate old rules with a new feature ID required a migration on
the rule's SOs and the alerts documents.
- The API calls are feature ID-oriented and not rule-type-oriented.
- The framework has to be aware of the values of the `consumer`
attribute.
- Feature IDs are tightly coupled with the alerting indices leading to
[bugs](elastic#179082).
- Legacy consumers that are not a valid feature anymore can cause
[bugs](elastic#184595).
- The framework has to be aware of legacy consumers to handle edge
cases.
- The framework has to be aware of specific consumers to handle edge
cases.

### Proposed solution

This PR aims to decouple the feature IDs from consumers. It achieves
that a) by changing the way solutions configure the alerting privileges
when registering a feature and b) by changing the alerting actions. The
schema changes as:

```
// Old formatting
id: 'siem', <--- feature ID
alerting:['siem.queryRule']

// New formatting
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: 'siem.queryRule', consumers: ['siem'] }] <-- consumer same as the feature ID in the old formatting
```

The new actions are constructed as
`alerting:<rule-type-id>/<consumer>/<alerting-entity>/<operation>`. For
example `alerting:rule-type-id/my-consumer/rule/get`. The new action
means that a user with a role that grants this action can get a rule of
type `rule-type` with consumer `my-consumer`. Changing the action
strings is not considered a breaking change as long as the user's
permission works as before. In our case, this is true because the
consumer will be the same as before (feature ID), and the alerting
security actions will be the same. For example:

**Old formatting**

Schema:
```
id: 'logs', <--- feature ID
alerting:['.es-query'] <-- rule type ID
```

Generated action:

```
alerting:.es-query/logs/rule/get
```

**New formatting**

Schema:
```
id: 'siem', <--- feature ID
alerting: [{ ruleTypeId: '.es-query', consumers: ['logs'] }] <-- consumer same as the feature ID in the old formatting
```

Generated action:

```
alerting:.es-query/logs/rule/get <--- consumer is set as logs and the action is the same as before
```

In both formating the actions are the same thus breaking changes are
avoided.

### Alerting authorization class
The alerting plugin uses and exports the alerting authorization class
(`AlertingAuthorization`). The class is responsible for handling all
authorization actions related to rules and alerts. The class changed to
handle the new actions as described in the above sections. A lot of
methods were renamed, removed, and cleaned up, all method arguments
converted to be an object, and the response signature of some methods
changed. These changes affected various pieces of the code. The changes
in this class are the most important in this PR especially the
`_getAuthorizedRuleTypesWithAuthorizedConsumers` method which is the
cornerstone of the alerting RBAC. Please review carefully.

### Instantiation of the alerting authorization class
The `AlertingAuthorizationClientFactory` is used to create instances of
the `AlertingAuthorization` class. The `AlertingAuthorization` class
needs to perform async operations upon instantiation. Because JS, at the
moment, does not support async instantiation of classes the
`AlertingAuthorization` class was assigning `Promise` objects to
variables that could be resolved later in other phases of the lifecycle
of the class. To improve readability and make the lifecycle of the class
clearer, I separated the construction of the class (initialization) from
the bootstrap process. As a result, getting the `AlertingAuthorization`
class or any client that depends on it (`getRulesClient` for example) is
an async operation.

### Filtering
A lot of routes use the authorization class to get the authorization
filter (`getFindAuthorizationFilter`), a filter that, if applied,
returns only the rule types and consumers the user is authorized to. The
method that returns the filter was built in a way to also support
filtering on top of the authorization filter thus coupling the
authorized filter with router filtering. I believe these two operations
should be decoupled and the filter method should return a filter that
gives you all the authorized rule types. It is the responsibility of the
consumer, router in our case, to apply extra filters on top of the
authorization filter. For that reason, I made all the necessary changes
to decouple them.

### Legacy consumers & producer
A lot of rules and alerts have been created and are still being created
from observability with the `alerts` consumer. When the Alerting RBAC
encounters a rule or alert with `alerts` as a consumer it falls back to
the `producer` of the rule type ID to construct the actions. For example
if a rule with `ruleTypeId: .es-query` and `consumer: alerts` the
alerting action will be constructed as
`alerting:.es-query/stackAlerts/rule/get` where `stackRules` is the
producer of the `.es-query` rule type. The `producer` is used to be used
in alerting authorization but due to its complexity, it was deprecated
and only used as a fallback for the `alerts` consumer. To avoid breaking
changes all feature privileges that specify access to rule types add the
`alerts` consumer when configuring their alerting privileges. By moving
the `alerts` consumer to the registration of the feature we can stop
relying on the `producer`. The `producer` is not used anymore in the
authorization class. In the next PRs the `producer` will removed
entirely.

### Routes
The following changes were introduced to the alerting routes:

- All related routes changed to be rule-type oriented and not feature ID
oriented.
- All related routes support the `ruleTypeIds` and the `consumers`
parameters for filtering. In all routes, the filters are constructed as
`ruleTypeIds: ['foo'] AND consumers: ['bar'] AND authorizationFilter`.
Filtering by consumers is important. In o11y for example, we do not want
to show ES rule types with the `stackAlerts` consumer even if the user
has access to them.
- The `/internal/rac/alerts/_feature_ids` route got deleted as it was
not used anywhere in the codebase and it was internal.

All the changes in the routes are related to internal routes and no
breaking changes are introduced.

### Constants
I moved the o11y and stack rule type IDs to `kbn-rule-data-utils` and
exported all security solution rule type IDs from
`kbn-securitysolution-rules`. I am not a fan of having a centralized
place for the rule type IDs. Ideally, consumers of the framework should
specify keywords like `observablility` (category or subcategory) or even
`apm.*` and the framework should know which rule type IDs to pick up. I
think it is out of the scope of the PR, and at the moment it seems the
most straightforward way to move forward. I will try to clean up as much
as possible in further iterations. If you are interested in the upcoming
work follow this issue elastic#187202.

### Other notable code changes
- Change all instances of feature IDs to rule type IDs.
- `isSiemRuleType`: This is a temporary helper function that is needed
in places where we handle edge cases related to security solution rule
types. Ideally, the framework should be agnostic to the rule types or
consumers. The plan is to be removed entirely in further iterations.
- Rename alerting `PluginSetupContract` and `PluginStartContract` to
`AlertingServerSetup` and `AlertingServerStart`. This made me touch a
lot of files but I could not resist.
- `filter_consumers` was mistakenly exposed to a public API. It was
undocumented.
- Files or functions that were not used anywhere in the codebase got
deleted.
- Change the returned type of the `list` method of the
`RuleTypeRegistry` from `Set<RegistryRuleType>` to `Map<string,
RegistryRuleType>`.
- Assertion of `KueryNode` in tests changed to an assertion of KQL using
`toKqlExpression`.
- Removal of `useRuleAADFields` as it is not used anywhere.

## Testing

> [!CAUTION]
> It is very important to test all the areas of the application where
rules or alerts are being used directly or indirectly. Scenarios to
consider:
> - The correct rules, alerts, and aggregations on top of them are being
shown as expected as a superuser.
> - The correct rules, alerts, and aggregations on top of them are being
shown as expected by a user with limited access to certain features.
> - The changes in this PR are backward compatible with the previous
users' permissions.

### Solutions
Please test and verify that:
- All the rule types you own with all possible combinations of
permissions both in ESS and in Serverless.
- The consumers and rule types make sense when registering the features.
- The consumers and rule types that are passed to the components are the
intended ones.

### ResponseOps
The most important changes are in the alerting authorization class, the
search strategy, and the routes. Please test:
- The rules we own with all possible combinations of permissions.
- The stack alerts page and its solution filtering.
- The categories filtering in the maintenance window UI.

## Risks
> [!WARNING]
> The risks involved in this PR are related to privileges. Specifically:
> - Users with no privileges can access rules and alerts they do not
have access to.
> - Users with privileges cannot access rules and alerts they have
access to.
>
> An excessive list of integration tests is in place to ensure that the
above scenarios will not occur. In the case of a bug, we could a)
release an energy release for serverless and b) backport the fix in ESS.
Given that this PR is intended to be merged in 8.17 we have plenty of
time to test and to minimize the chances of risks.

## FQA

- I noticed that a lot of routes support the `filter` parameter where we
can pass an arbitrary KQL filter. Why we do not use this to filter by
the rule type IDs and the consumers and instead we introduce new
dedicated parameters?

The `filter` parameter should not be exposed in the first place. It
assumes that the consumer of the API knows the underlying structure and
implementation details of the persisted storage API (SavedObject client
API). For example, a valid filter would be
`alerting.attributes.rule_type_id`. In this filter the consumer should
know a) the name of the SO b) the keyword `attributes` (storage
implementation detail) and c) the name of the attribute as it is
persisted in ES (snake case instead of camel case as it is returned by
the APIs). As there is no abstraction layer between the SO and the API,
it makes it very difficult to make changes in the persistent schema or
the APIs. For all the above I decided to introduce new query parameters
where the alerting framework has total control over it.

- I noticed in the code a lot of instances where the consumer is used.
Should not remove any logic around consumers?

This PR is a step forward making the framework as agnostic as possible.
I had to keep the scope of the PR as contained as possible. We will get
there. It needs time :).

- I noticed a lot of hacks like checking if the rule type is `siem`.
Should not remove the hacks?

This PR is a step forward making the framework as agnostic as possible.
I had to keep the scope of the PR as contained as possible. We will get
there. It needs time :).

- I hate the "Role visibility" dropdown. Can we remove it?

I also do not like it. The goal is to remove it. Follow
elastic#189997.

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Aleh Zasypkin <[email protected]>
Co-authored-by: Paula Borgonovi <[email protected]>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 12, 2024
## Summary

### Problem
With the merge of the PR elastic#183756,
the alerts function has stopped working in the Obs AI Assistant, because
there has been a change to the query (when finding alerts)

### Solution
Revert the change made to the query.

### Checklist

- [x] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
apm:review backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) ci:build-serverless-image ci:cloud-deploy Create or update a Cloud deployment ci:cloud-persist-deployment Persist cloud deployment indefinitely ci:project-deploy-observability Create an Observability project Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes Team:Obs AI Assistant Observability AI Assistant Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team Team:obs-ux-management Observability Management User Experience Team Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.18.0 v9.0.0
Projects
None yet