[Streams] Introducing the new Streams plugin #198713

Merged
merged 23 commits into from
Nov 13, 2024

Member

@simianhacker simianhacker commented Nov 1, 2024

Summary

This PR introduces the new experimental "Streams" plugin into the Kibana project. The Streams project aims to simplify workflows for dealing with messy logs in Elasticsearch. Our current offering is either extremely opinionated (integrations) or leaves users on their own with flexible but low-level Elasticsearch concepts such as index templates and component templates, which makes it challenging to configure everything correctly for good performance and to keep search speed and cost under control.

Scope of PR

  • Provides an API for the user to "enable" the streams framework which creates the "root" entity logs with all the backing Elasticsearch assets
  • Provides an API for the user to "fork" a stream
  • Provides an API for the user to "read" a stream and all of its Elasticsearch assets.
  • Provides an API for the user to upsert a stream (and implicitly child streams that are mentioned)
    • Part of this API is placing grok and dissect processing steps as well as adding fields to the mapping
  • Implements the Stream Naming Schema (SNS) which uses dots to express the index patterns and stream IDs. Example: logs.nginx.errors
  • The APIs will fully manage the index_template, component_template, and ingest_pipelines.
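
Because the naming schema encodes the hierarchy in the ID itself, parent and ancestor lookups can be done with plain string operations. The following is a minimal sketch of that idea; the helper names (`getParentId`, `getAncestors`, `isChildOf`) are hypothetical and not taken from the plugin source.

```typescript
// Hypothetical helpers illustrating the dot-separated naming schema.

/** Returns the parent stream ID, or undefined for a root stream such as "logs". */
function getParentId(id: string): string | undefined {
  const lastDot = id.lastIndexOf('.');
  return lastDot === -1 ? undefined : id.slice(0, lastDot);
}

/** Returns all ancestor IDs, ordered from the root down to the direct parent. */
function getAncestors(id: string): string[] {
  const parts = id.split('.');
  return parts.slice(0, -1).map((_, i) => parts.slice(0, i + 1).join('.'));
}

/** True if `child` sits directly below `parent` in the hierarchy. */
function isChildOf(parent: string, child: string): boolean {
  return getParentId(child) === parent;
}
```

With these helpers, `getAncestors('logs.nginx.errors')` yields `['logs', 'logs.nginx']`, and `isChildOf('logs', 'logs.nginx.errors')` is false because only direct children count.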

Out of scope

  • Integration tests (coming in a follow-up)

Reviewer Notes

  • I haven't implemented tests beyond a unit test for converting the filter conditions to Painless. I wanted to get a PR up so we can start iterating on the interface and functionality before we invest in testing.
  • You might need to add server.versioned.versionResolution: oldest to your config/kibana.dev.yaml to play with the requests below in the Kibana "Dev console".
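
To make the "filter conditions to Painless" conversion mentioned above more concrete, here is a rough sketch of what such a translation could look like. It is not the plugin's implementation: the type names, the `conditionToPainless` function, and the choice of a null-safe `ctx.a?.b` accessor are all assumptions, and only the `eq` operator and `or` grouping that appear in the example requests are handled.

```typescript
// Hypothetical types modelled on the request bodies in this description.
interface FilterCondition {
  field: string;
  operator: 'eq'; // the only operator shown in the examples
  value: string;
}
interface OrCondition {
  or: Condition[];
}
type Condition = FilterCondition | OrCondition;

/** Turns "log.level" into a null-safe Painless accessor: ctx.log?.level */
function toAccessor(field: string): string {
  return `ctx.${field.split('.').join('?.')}`;
}

/** Assumes nested (not dotted-key) documents and values without control characters. */
function conditionToPainless(condition: Condition): string {
  if ('or' in condition) {
    return `(${condition.or.map((c) => conditionToPainless(c)).join(' || ')})`;
  }
  // JSON.stringify yields a double-quoted literal with quotes and backslashes escaped.
  return `${toAccessor(condition.field)} == ${JSON.stringify(condition.value)}`;
}
```

For the first fork example, this sketch would produce `ctx.log?.logger == "nginx_proxy"`.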

Example API Calls

Enable the root stream (and set the mapping for the internal .streams index)

POST kbn:/api/streams/_enable

Read the root entity "logs"

GET kbn:/api/streams/logs

Fork the "root" entity "logs" and create "logs.nginx" based on a condition

POST kbn:/api/streams/logs/_fork
{
  "stream": {
    "id": "logs.nginx",
    "children": [],
    "processing": [],
    "fields": []
  },
  "condition": {
    "field": "log.logger",
    "operator": "eq",
    "value": "nginx_proxy"
  }
}

Fork the entity "logs.nginx" and create "logs.nginx.error" based on a condition

POST kbn:/api/streams/logs.nginx/_fork
{
  "stream": {
    "id": "logs.nginx.error",
    "children": [],
    "processing": [],
    "fields": []
  },
  "condition": {
    "or": [
      { "field": "log.level", "operator": "eq", "value": "error" },
      { "field": "log.level", "operator": "eq", "value": "ERROR" }
    ]
  }
}

Set some processing on a stream and map the generated field

PUT kbn:/api/streams/logs.nginx
{
  "children": [],
  "processing": [
    { "config": { "type": "grok", "patterns": ["^%{IP:ip} - -"], "field": "message" } }
  ],
  "fields": [
    { "name": "ip", "type": "ip" }
  ]
}
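
As a rough illustration of what the grok step in this request extracts, the sketch below mimics the pattern with a plain regular expression. It is a simplified stand-in rather than grok itself: it only handles IPv4, does not validate octet ranges, and assumes the two separators after the address are the ASCII hyphens found in common access log formats. The `extractIp` name is hypothetical.

```typescript
// Simplified stand-in for grok's IP pattern: IPv4 only, no octet range checks.
const IPV4 = String.raw`(?:\d{1,3}\.){3}\d{1,3}`;

// Anchored at the start of the message, followed by " - -" as in access logs.
const LINE_START = new RegExp(`^(${IPV4}) - -`);

/** Returns the leading client IP of a log line, or undefined if it does not match. */
function extractIp(message: string): string | undefined {
  const match = LINE_START.exec(message);
  return match ? match[1] : undefined;
}
```

The extracted value is what ends up in the `ip` field, which the `fields` entry then maps with type `ip`.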

Field definitions are checked for both descendants and ancestors for incompatibilities to ensure they stay additive.
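
One way to picture this check, as a sketch only: collect the field definitions of all ancestors and descendants and report any field that is declared with a different type. The `FieldDefinition` shape mirrors the `fields` entries in the example requests; the `findConflicts` function is hypothetical and not the plugin's code.

```typescript
// Mirrors the { "name": ..., "type": ... } entries used in the example requests.
interface FieldDefinition {
  name: string;
  type: string;
}

/**
 * Returns the names of fields in `own` that are also declared in `related`
 * (the combined ancestor and descendant fields) with a different type.
 */
function findConflicts(own: FieldDefinition[], related: FieldDefinition[]): string[] {
  const relatedTypes = new Map<string, string>(
    related.map((f): [string, string] => [f.name, f.type])
  );
  return own
    .filter((f) => relatedTypes.has(f.name) && relatedTypes.get(f.name) !== f.type)
    .map((f) => f.name);
}
```

An empty result means the new definitions are purely additive with respect to the rest of the hierarchy.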

If children are defined in the body of PUT /api/streams/<name>, the sub-streams are created implicitly. When a stream is created via PUT, it is also added to its parent with a routing condition that is never true (the condition can be edited subsequently).
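
The behaviour described here can be sketched with a small in-memory model. Everything in it is an assumption made for illustration: the `StreamDefinition` and `ChildRef` shapes, the `putStream` function, and the use of `condition: null` to stand for "never true" are not taken from the plugin.

```typescript
// Hypothetical in-memory model; `condition: null` stands in for "never true".
interface ChildRef {
  id: string;
  condition: object | null;
}
interface StreamDefinition {
  id: string;
  children: ChildRef[];
}

function putStream(store: Map<string, StreamDefinition>, stream: StreamDefinition): void {
  store.set(stream.id, stream);

  // Children named in the body are created implicitly if they do not exist yet.
  for (const child of stream.children) {
    if (!store.has(child.id)) {
      store.set(child.id, { id: child.id, children: [] });
    }
  }

  // Register the stream with its parent using a condition that never matches.
  const lastDot = stream.id.lastIndexOf('.');
  if (lastDot === -1) return; // root stream, no parent to update
  const parent = store.get(stream.id.slice(0, lastDot));
  if (parent && !parent.children.some((c) => c.id === stream.id)) {
    parent.children.push({ id: stream.id, condition: null });
  }
}
```

Starting with a never-matching condition means that creating a stream by itself does not reroute any documents until the routing is edited on the parent.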

POST /api/streams/_resync can be used to re-sync all streams from their metadata in case the Elasticsearch objects were broken by some external change - not sure whether we want to keep that.

Follow-ups

  • API integration tests
  • Check read permissions on data streams to determine whether a user is allowed to read certain streams

@simianhacker simianhacker added Team:Observability and v9.0.0 labels Nov 1, 2024
@flash1293
Contributor

flash1293 commented Nov 4, 2024

Great start!

@dgieselaar as discussed I think we should add the following things during the week and aim to get it merged by Friday:

  • Store all the information about a stream in the .streams document as a source of truth
  • Keep the routing conditions in the parent stream, not in the child
  • Do not return the underlying ES objects when reading a stream, but the source of truth from the .streams document

Everything else can probably happen on a separate PR (that we ideally have in by Friday as well):

  • Basic UI skeleton
  • Read all streams for listing
  • Idempotent "PUT" api for editing

@flash1293
Contributor

Next steps:

  • Create app plugin for the UI in a separate stacked PR
  • Do the API changes on this PR

@flash1293 flash1293 marked this pull request as ready for review November 12, 2024 08:21
@flash1293 flash1293 requested a review from a team as a code owner November 12, 2024 08:21
@elasticmachine
Contributor

Pinging @elastic/unified-observability (Team:Observability)

@flash1293 flash1293 added release_note:skip and backport:skip labels Nov 12, 2024
@flash1293
Contributor

@dgieselaar For your changes on the UI side, should we backport this to 8.x to avoid conflicts? Or is it OK to only keep on main?

@dgieselaar
Member

@flash1293 I will create a separate PR for the changes that impact other plugins, and I'll backport that to 8.x, but the streams app plugin only needs to go into 9.x

@flash1293 flash1293 requested a review from a team as a code owner November 12, 2024 09:38
Contributor

@jloleysens jloleysens left a comment

New plugin LGTM, reviewed new server APIs.

Comment on lines 24 to 26

  options: {
    access: 'public',
    security: {
Contributor

Availability is not documented yet, but in this case we can set this (and other APIs) to experimental

Suggested change

- options: {
-   access: 'public',
-   security: {
+ options: {
+   access: 'public',
+   availability: {
+     stability: 'experimental'
+   },
+   security: {

Contributor

Thanks, set to experimental

@flash1293
Contributor

Buildkite is having an outage, so I'm holding off on trying to get CI green on this for now

Member

@jbudz jbudz left a comment

packages/kbn-optimizer/limits.yml

query: {
bool: {
filter: {
prefix: {
Member Author

👍

@elasticmachine
Contributor

💚 Build Succeeded

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 4 | +4 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 12 | +12 |

Any counts in public APIs

Total count of every public API typed as any. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats any for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 7 | +7 |

Public APIs missing exports

Total count of every type that is part of your API that should be exported but is not. This will cause broken links in the API documentation system. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats exports for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 2 | +2 |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 1.6KB | +1.6KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 12 | +12 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 4 | +4 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| streams | - | 4 | +4 |

History

@flash1293 flash1293 merged commit b86dc81 into elastic:main Nov 13, 2024
38 checks passed
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Nov 18, 2024
Co-authored-by: Joe Reuter <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
dgieselaar added a commit that referenced this pull request Nov 25, 2024
Creates the Streams app plugin, which renders UI for managing streams
(see #198713).

Additional changes in this PR:

- The menus were updated to conditionally add a link to the Streams app.
The Streams plugin itself returns a status$ observable which signals if
Streams have been enabled. This value is used to conditionally render
the link in the various flavors of menus.
- There's a small change in the ES types to allow for ordered params in
ES|QL (vs named params)
- `@kbn/server-route-repository` was updated to be able to override
`access` (instead of only inferring it from the endpoint name).
Additionally, we now allow all route options by default.
- `@kbn/typed-react-router-config` now also exports a `useBreadcrumbs`.
This was copied over from the APM implementation.
- the signature of the `esql` method in
`ObservabilityElasticsearchClient` was updated to separate processing
options from options that are sent over to the _query endpoint.

---------

Co-authored-by: Chris Cowan <[email protected]>
Co-authored-by: Joe Reuter <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
@dgieselaar
Member

💚 All backports created successfully

Status Branch Result
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

dgieselaar pushed a commit to dgieselaar/kibana that referenced this pull request Nov 26, 2024
(cherry picked from commit b86dc81)

# Conflicts:
#	.github/CODEOWNERS
@kibanamachine kibanamachine mentioned this pull request Nov 26, 2024
1 task
dgieselaar added a commit that referenced this pull request Nov 26, 2024
# Backport

This will backport the following commits from `main` to `8.x`:
- [[Streams] Introducing the new Streams plugin
(#198713)](#198713)

<!--- Backport version: 7.3.2 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT {commits} BACKPORT-->

Co-authored-by: Chris Cowan <[email protected]>
paulinashakirova pushed a commit to paulinashakirova/kibana that referenced this pull request Nov 26, 2024
dgieselaar added a commit to dgieselaar/kibana that referenced this pull request Nov 27, 2024
(cherry picked from commit 63da770)

# Conflicts:
#	.github/CODEOWNERS
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 12, 2024
Labels
backport:skip This commit does not require backporting release_note:skip Skip the PR/issue when compiling release notes Team:Observability Team label for Observability Team (for things that are handled across all of observability) v9.0.0
7 participants