
[Change Proposal] Ability to add privileges for all data streams of a specific type #315

Closed
Tracked by #151898
felixbarny opened this issue Apr 7, 2022 · 19 comments · Fixed by #327
Assignees
Labels
discuss Issue needs discussion Team:Ecosystem Label for the Packages Ecosystem team

Comments

@felixbarny
Member

felixbarny commented Apr 7, 2022

Currently, the data stream permissions are specific to the dataset that an integration defines.
For example, when adding a custom log integration, you'll have to specify the data_stream.dataset, for example foo, and an API key will be generated with the permissions to send data to logs-foo-default.

This is limiting for integrations that define a data stream that routes events to other data streams.

Example: a log line like this gets sent to the logs-ecs_router-default data stream:

```json
{
  "message": "{\"@timestamp\":\"2022-04-01T12:09:12.375Z\", \"log.level\": \"INFO\", \"message\":\"With event.dataset\", \"data_stream.dataset\": \"foo\"}"
}
```

The default ingest pipeline for logs-ecs_router-default parses the JSON within the message and uses the data_stream.dataset from the log message to route the message to the logs-foo-default data stream (by overriding the _index field).
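That routing step can be sketched as an ingest pipeline. This is only an illustration of the mechanism, not the integration's actual pipeline; the exact processor list and field handling are assumptions:

```json
{
  "description": "Sketch: route ECS-JSON log lines by their embedded data_stream.dataset",
  "processors": [
    { "json": { "field": "message", "add_to_root": true } },
    { "dot_expander": { "field": "data_stream.dataset" } },
    { "set": { "field": "_index", "value": "logs-{{data_stream.dataset}}-default" } }
  ]
}
```

With this pipeline attached to `logs-ecs_router-default`, the example document above would be indexed into `logs-foo-default`.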

The issue is that this will lead to a security exception as the API key used to ingest the data only has permissions to ingest data to logs-ecs_router-default.

This is also an issue for the Azure springcloudlogs integration: all logs are always sent to the springcloudlogs data stream, even if the logs come from different applications and thus should ideally be routed to their own data streams. Other examples are CloudWatch, k8s logs, PCF logs, and httpjson.

This relates to the discussions about input-only packages but is an independent and decoupled task.

What I'm proposing is to add a flag to the package spec that behaves similarly to dataset_is_prefix:

```yaml
dataset_is_prefix:
  description: if true, the index pattern in the ES template will contain the dataset as a prefix only
  type: boolean
  default: false
```

But instead of just adding .* to the index permissions, the flag will allow access to all data streams of a given type, such as logs-*-*.
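As a package-spec sketch in the same style as the dataset_is_prefix entry above (the flag name here is hypothetical, chosen only to illustrate the shape of the proposal):

```yaml
# Hypothetical flag name; only the behavior is being proposed
dataset_is_dynamic:
  description: if true, the API key is granted write access to all data streams of the package's type, e.g. logs-*-*
  type: boolean
  default: false
```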

@ruflin @mtojek @joshdover

@ruflin
Member

ruflin commented Apr 11, 2022

An alternative approach is using default_permissions: true. There is a default set of permissions that gives access to logs-*-*, metrics-*-*, traces-*-*. It gives broader permissions than needed but is a catch-all solution for all input packages that might also contain metrics or traces. One of the basic assumptions I make here is that if someone deploys this, the Elastic Agent is deployed in a trusted environment.

@felixbarny
Member Author

An alternative approach is using default_permissions: true. There is a default set of permissions that gives access to logs-*-*, metrics-*-*, traces-*-*.

Is this an existing flag? Where can I learn more about it? Ideally, a logging-related input integration would only have access to logs-*-*.

@ruflin
Member

ruflin commented Apr 11, 2022

Is this an existing flag?

No, I just made it up :-) Alternatively, it could be default_permissions: ["logs", "metrics"] to make it an array that is empty by default.
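Sketched side by side (both variants are made up, per the comment above, and neither exists in the spec):

```yaml
# Variant 1: boolean catch-all
default_permissions: true                 # would grant logs-*-*, metrics-*-*, traces-*-*

# Variant 2: scoped array, empty by default
default_permissions: ["logs", "metrics"]  # would grant logs-*-* and metrics-*-* only
```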

@joshdover
Contributor

Thanks for highlighting this use case. We have a few discussions this week around ingest node routing (e.g. elastic/elasticsearch#63798), and I think I can provide a more helpful response here after we've had those.

@felixbarny
Member Author

@joshdover Did you have a chance to think about this again? This issue is a blocker for full event routing and also the lightweight event routing required for an ECS logging integration (elastic/integrations#2972).

@ruflin
Member

ruflin commented Apr 29, 2022

There is currently work ongoing around input packages: #325. I expect the packages that need this feature are normally input packages, as for integration packages the data streams are mostly predefined. Now I'm torn on whether there should be a config param in the package spec itself for this or whether Fleet should do magic around it. I'm leaning towards a package spec flag, as it would allow Fleet to know in advance that a certain package requires much broader permissions and also warn the user about it.

@joshdover
Contributor

joshdover commented Apr 29, 2022

The default ingest pipeline for logs-ecs_router-default parses the JSON within the message and uses the data_stream.dataset from the log message to route the message to the logs-foo-default data stream (by overriding the _index field).

The issue is that this will lead to a security exception as the API key used to ingest the data only has permissions to ingest data to logs-ecs_router-default.

This is also an issue for the Azure springcloudlogs integration: all logs are always sent to the springcloudlogs data stream, even if the logs come from different applications and thus should ideally be routed to their own data streams. Other examples are CloudWatch, k8s logs, PCF logs, and httpjson.

I'm wondering if instead, this should actually be solved with an index / data stream setting rather than with granting Agents explicit access to every data stream that an ingest pipeline may route to. This would allow us to further decouple the data collection from the ingest processing.

Otherwise, every time a routing ingest pipeline adds or removes a new destination data stream, we'll need to update the API keys for every agent sending data to the routing data stream. Generating API keys has non-trivial performance overhead and when we're talking about 100k+ agents, rolling out an API key change could take quite a bit of time and put a large load on the cluster. Due to this delay, we'd need to also change the order of operations to update all the Agent API keys first, then update the routing ingest pipeline. Otherwise, data could be dropped.

I'm imagining a setting like index.pipeline_routing_allowlist: ["logs-*"] that could be specified on the data stream's index template, which would modify the security behavior during ingest pipeline processing to avoid the security exception.
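As an index-template sketch (the index.pipeline_routing_allowlist setting does not exist today; it is exactly the setting being imagined here, and the template name and pattern are illustrative):

```json
{
  "index_patterns": ["logs-ecs_router-*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.pipeline_routing_allowlist": ["logs-*"]
    }
  }
}
```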

@felixbarny
Member Author

Otherwise, every time a routing ingest pipeline adds or removes a new destination data stream, we'll need to update the API keys

I don't think updates are required when the API keys have permissions for logs-*, as that's all it would ever need. Note that it wouldn't be possible to make the list of index names more restrictive so that it only contains patterns used by installed integrations, such as logs-nginx* when the Nginx integration is installed. That's because integrations such as the ECS logging integration, and possibly the Azure springcloudlogs integration, can route to any dataset, depending on the values in the log message.
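To illustrate why a single type-wide grant never needs updating, here is a simplified sketch of wildcard index-privilege matching. Elasticsearch's real privilege evaluation is more involved; `fnmatchcase` merely stands in for its pattern matching:

```python
from fnmatch import fnmatchcase

def api_key_allows(index_privileges, target_index):
    """Return True if any granted index pattern matches the target index."""
    return any(fnmatchcase(target_index, pattern) for pattern in index_privileges)

# An API key scoped only to the routing data stream fails for routed targets:
narrow = ["logs-ecs_router-default"]
assert not api_key_allows(narrow, "logs-foo-default")

# A type-wide grant covers any dataset/namespace the pipeline routes to,
# so the key never needs regenerating when routing destinations change:
wide = ["logs-*-*"]
assert api_key_allows(wide, "logs-foo-default")
assert api_key_allows(wide, "logs-nginx.access-production")
```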

I'm imagining a setting like index.pipeline_routing_allowlist: ["logs-*"] that could be specified on the data stream's index template which would modify the security behavior during the ingest pipeline processing to avoid the security exception.

I think that could work, too. However, it would conflate index-level settings with API key permissions, which makes it harder to reason about what you can do with a given API key. Not sure if the ES team would be happy with that.

Also, I don't see why that would be preferable over an integration being able to request permissions for logs-*. Having access to logs-router-default which has a setting like index.pipeline_routing_allowlist: ["logs-*"] would be synonymous to having a permission for logs-*.

@joshdover
Contributor

I don't think updates are required when the API keys have permissions for logs-* as that's all it would ever need.

Got it. If we don't expect to ever have finer-grained privileges for these types of packages, I guess going with a purely package-based approach is fine. An option like default_permissions would probably work, though I'd want to make the name a bit more explicit. I also don't think it should live next to dataset_is_prefix, since that flag changes more than just the permissions and this is a permissions-only feature.

What if we put this next to data_stream.elasticsearch.privileges.indices? I'm thinking a name like data_stream.elasticsearch.privileges.allow_routing, which would grant create privileges on logs-*-*, metrics-*-*, traces-*-*, synthetics-*-*.
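Sketched in a data stream manifest, next to the existing privileges block (allow_routing is only the name being proposed here, not part of the spec):

```yaml
elasticsearch:
  privileges:
    indices: [auto_configure, create_doc]  # existing per-data-stream privileges
    allow_routing: true                    # proposed: adds create on logs-*-*, metrics-*-*, traces-*-*, synthetics-*-*
```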

Felix are you comfortable opening a PR for the package-spec change? Once we have agreement on the spec change we can create the implementation tasks.

@felixbarny
Member Author

Here's the PR: #327

@ruflin ruflin assigned felixbarny and unassigned ruflin Jun 24, 2022
@felixbarny
Member Author

Hey @joshdover what's the progress on this? Is there any blocker?

@joshdover
Contributor

@felixbarny We still haven't made progress on elastic/kibana#115032 yet which is blocking this. It was a stretch goal for 8.4. @jen-huang @kpollich any progress made here?

@felixbarny
Member Author

Thanks for the update. Seems like it won't make it into 8.4 then. Could we add this as a non-stretch goal for 8.5?

@kpollich
Member

@felixbarny We still haven't made progress on elastic/kibana#115032 yet which is blocking this. It was a stretch goal for 8.4. @jen-huang @kpollich any progress made here?

We weren't able to get to this in 8.4. I'll add it to our 8.5 planning doc.

@axw
Member

axw commented Aug 16, 2022

APM Server would like to make use of this, or a variation on it, to satisfy elastic/kibana#134971

We wouldn't need to write to arbitrary datasets, only to arbitrary namespaces. Should I open a new issue for this?

@ruflin
Member

ruflin commented Sep 26, 2022

I would like to see us moving forward on this. My suggestion is to start with a simple solution that should work for apm-server but also input packages. The basic idea is that a package can opt in to the default permissions, meaning permissions for logs-*-*, metrics-*-*, traces-*-*. The assumption is that anyone running Elastic Agent with apm-server runs it in a trusted environment and we are comfortable with giving these permissions.

If this becomes a problem, we can still follow up with more fine-grained permissions. We should keep in mind that, AFAIK, standalone apm-server had more permissions in the past (including things like setting up templates).

There is a concern around the upgrade. Can we warn users if they add an integration with these permissions? Should we?

@joshdover
Contributor

+1 on a more generically useful name than allow_routing to allow for the other use cases, but I'd still prefer integrations declare which types they need access to.

There is a concern around the upgrade. Can we warn users if they add an integration with these permissions? Should we?

After our last discussions, I'm a little less concerned about expanding privileges for Agents as it hasn't historically been an issue for Beats users. We have advertised that agents will receive an API key with the minimum privileges required, but I haven't seen many user questions about this. We could consider some sort of explanation text in the integration policy upgrade flow which shows the privilege change that will happen. One problem with the APM package specifically is that this is upgraded during stack upgrade without user interaction, so there's not a chance to show this to the user. I'd argue we just defer on this problem.

On an execution note, we are starting work on the Kibana blocker to enable any variation of this feature: elastic/kibana#115032

@TheRiffRafi

I don't know if this will add to and help the discussion, but here it goes.

I just dealt with a user that is pulling Kubernetes metrics and log data through Elastic Agent. The agent monitors a cluster, but that cluster has different pods doing different work for different applications, and apparently different apps within the same pod. The user has no access to change how the agent is deployed, so they can only segment the data based on the fields they receive from the already installed agents.
One of those applications requires a longer data retention cycle than the others, but since the data comes from a single policy, they are unable to redirect some of the data to another data stream with a longer retention policy using an ingest pipeline (by setting the _index field).

Hacky workaround
In order to get around this, I had them add a custom log integration to the policy with the dataset and namespace used for the longer retention, pointing to a dummy log path.
Now the API key assigned to the policy has that data stream as part of its permissions, and the ingest pipeline redirect works.
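The redirect half of that workaround looks roughly like this set processor. The condition, field names, and target index are illustrative; the target must match the dataset/namespace declared in the dummy custom log integration:

```json
{
  "set": {
    "if": "ctx.kubernetes?.labels?.app == 'long-retention-app'",
    "field": "_index",
    "value": "logs-myapp.longretention-default"
  }
}
```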

Dropping here for considerations.

@felixbarny
Member Author

I've updated the package spec PR.

Now there are two dedicated flags: data_stream.elasticsearch.dynamic_dataset and data_stream.elasticsearch.dynamic_namespace. I think this is both more self-explanatory and more flexible than having a single flag. I agree that allow_routing is not a great name; separating the functionality into two different flags gets around that naming issue, too.
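In data stream manifest terms, the updated proposal reads roughly as follows (a sketch based on this thread; the exact spec wording is in the PR):

```yaml
elasticsearch:
  dynamic_dataset: true     # grant write access across datasets, e.g. logs-*-default
  dynamic_namespace: true   # grant write access across namespaces, e.g. logs-foo-*
```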
