Allow routing for input packages #5989

Closed
felixbarny opened this issue Apr 25, 2023 · 9 comments · Fixed by elastic/kibana#157897

@felixbarny
Member

felixbarny commented Apr 25, 2023

Allow routing to logs-*-* by default by changing the API key permissions.

These are the flags we can use to request the additional permissions from fleet: elastic/package-spec#327

@gsantoro
Contributor

I did some testing with nginx access logs. These are the steps I followed:

  1. I set up an Elastic cluster with Fleet and Elastic Agent running in Kubernetes, since I already have automation for this use case.
  2. I created a sample app running in Kubernetes that writes a file /var/log/nginx-logs on each of the Kubernetes nodes.
  3. I changed the nginx integration to add the following settings to the manifest packages/nginx/data_stream/access/manifest.yml:
elasticsearch.dynamic_dataset: true
elasticsearch.dynamic_namespace: true

I also added a changelog entry so that I could install the integration locally (via elastic-package and a local package registry) with a custom version, and made the required changes to the manifest to point to this new version.

  4. I installed the customized integration in Fleet and added it to my agent policy.
  5. I created the custom pipeline logs-nginx.access@custom, which is called at the end of the default pipeline logs-nginx.access-1.12.1 (where 1.12.1 is the new version of the integration):
{
    "description": "My optional pipeline description",
    "processors": [
        {
            "set": {
                "description": "My optional processor description",
                "field": "my-long-field",
                "value": 10
            }
        },
        {
            "reroute": {
                "dataset": "test",
                "namespace": "test"
            }
        }
    ]
}

Here the field set by the set processor is only there to verify that the custom pipeline has been called correctly.

  6. I created a data view over the data stream logs-test-test and checked the data in Discover.

Data is correctly rerouted from one data stream to the other. There are no failures due to permissions (like those in the linked issue, without the extra settings).
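The naming and permission check behind this test can be sketched in Python. This is only an illustration: `reroute_target` and `is_allowed` are hypothetical helper names, and the glob matching is a simplified stand-in for how index privilege patterns are evaluated. It assumes the `<type>-<dataset>-<namespace>` data stream naming scheme.

```python
from fnmatch import fnmatchcase
from typing import List


def reroute_target(ds_type: str, dataset: str, namespace: str) -> str:
    """Build a data stream name following the <type>-<dataset>-<namespace> scheme."""
    return f"{ds_type}-{dataset}-{namespace}"


def is_allowed(target: str, index_patterns: List[str]) -> bool:
    """Return True if any privilege pattern (treated as a simple glob) covers the target."""
    return any(fnmatchcase(target, pattern) for pattern in index_patterns)


# Target produced by the reroute processor settings used in the test above.
target = reroute_target("logs", "test", "test")

# Narrow permissions: only the integration's own data stream (assumed namespace "default").
narrow = is_allowed(target, ["logs-nginx.access-default"])

# Wide permissions, as requested through the two dynamic_* flags.
wide = is_allowed(target, ["logs-*-*"])

print(target, narrow, wide)
```

With the narrow pattern the rerouted document would be rejected, which matches the permission failures seen without the extra settings.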

@gsantoro
Contributor

A couple of questions for @felixbarny and @ruflin:

  1. I have tested this behaviour with the nginx integration and the data stream access_logs. Are we going to change a subset of all integrations to support rerouting? Do we have a list?
  2. Are we supporting rerouting in input packages, for example the log integration, which provides the Custom logs integration? This will require changing the package spec, similar to what @felixbarny already did for data streams here. Currently you can't set elasticsearch.dynamic_dataset or elasticsearch.dynamic_namespace in the manifest of an input package.
  3. Should I be aware of anything else? The description of this issue is quite short.

@gsantoro
Contributor

  4. Alternatively, would we make the change in Fleet to add this setting to all data streams?

@felixbarny
Member Author

  1. We don't have a list yet. Part of the scope of this issue is to come up with one. Generally, we'll want to widen API key permissions for all data streams that typically contain data from multiple services, such as syslog, k8s container logs, Kafka, udp, tcp, etc. Data streams that are already very specific, such as nginx.access, are not candidates for now.
  2. Oh, I didn't realize that this needs separate changes in the package spec. Yes, I think we'll want all input packages to support routing by default. Do we have an exhaustive list of all input packages?
  3. I hope this will be just as simple as setting the new flags on some data streams. I'm sure we'll encounter some dragons along the way.
  4. I don't think we should allow routing on all data streams. Rather, each data stream will opt in to allow routing. Otherwise, we wouldn't need these flags at all and would just always generate API keys for <type>-*-*.
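As a rough sketch of the opt-in idea, the two flags could be thought of as widening one segment each of the index pattern that the API key is generated for. The function name `api_key_index_pattern` is hypothetical, and the exact way Fleet derives the pattern is an assumption here, not something confirmed in this thread.

```python
def api_key_index_pattern(
    ds_type: str,
    dataset: str,
    namespace: str,
    dynamic_dataset: bool = False,
    dynamic_namespace: bool = False,
) -> str:
    """Return the index pattern an API key would cover for one data stream.

    Assumption: each flag replaces its own segment with a wildcard.
    """
    dataset_part = "*" if dynamic_dataset else dataset
    namespace_part = "*" if dynamic_namespace else namespace
    return f"{ds_type}-{dataset_part}-{namespace_part}"


# A specific data stream that does not opt in keeps narrow permissions.
specific = api_key_index_pattern("logs", "nginx.access", "default")

# A data stream that opts in with both flags gets <type>-*-*.
routed = api_key_index_pattern(
    "logs", "syslog", "default", dynamic_dataset=True, dynamic_namespace=True
)

print(specific)
print(routed)
```

Keeping the flags per data stream means that only data streams that mix data from multiple services end up with the wide pattern.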

@ruflin
Contributor

ruflin commented May 15, 2023

My assumption is that all input packages should have these broader permissions by default. Custom logs might be a partial exception, but even there users will likely specify a pattern and mix data together. Routing now happens centrally instead of at the edge, which should simplify configs.

@gsantoro
Contributor

A summary of my discussion with @ruflin, mostly for @felixbarny:

  1. We are going to add the two flags elasticsearch.dynamic_dataset: true and elasticsearch.dynamic_namespace: true to the defaults of each input type when the package is installed by Fleet. This way we don't have to manually change each input package, and we also cover input packages that don't exist yet.
  2. We are NOT going to change the package-spec to allow setting those two flags in an input package. If we need an input package to opt out in the future, we might add them to the package-spec.
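The decision above can be illustrated with a small Python sketch. It is not Fleet's implementation: `effective_flags` and `INPUT_PACKAGE_DEFAULTS` are hypothetical names, and the sketch only encodes the rule that input packages get both flags by default while integration data streams opt in through their manifest.

```python
from typing import Dict, Optional

# Defaults applied to every input package at install time (per the summary above).
INPUT_PACKAGE_DEFAULTS: Dict[str, bool] = {
    "elasticsearch.dynamic_dataset": True,
    "elasticsearch.dynamic_namespace": True,
}


def effective_flags(
    package_type: str, manifest_flags: Optional[Dict[str, bool]] = None
) -> Dict[str, bool]:
    """Resolve the routing flags for a package (hypothetical helper).

    Input packages always receive the defaults, since the package-spec does
    not let them set the flags. Other packages start with both flags off and
    opt in per data stream via their manifest.
    """
    flags = {key: False for key in INPUT_PACKAGE_DEFAULTS}
    if package_type == "input":
        flags.update(INPUT_PACKAGE_DEFAULTS)
    else:
        flags.update(manifest_flags or {})
    return flags


log_flags = effective_flags("input")
nginx_flags = effective_flags("integration")
print(log_flags)
print(nginx_flags)
```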

@gsantoro
Contributor

The current list of input packages (gathered by searching for type: input under the directory ./packages of the integrations repo) is the following:

  • cel
  • gcp_metrics
  • jolokia_input
  • journald
  • log
  • sql_input
  • statsd_input
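A search like this could be scripted roughly as follows. This is a sketch under assumptions: it expects each package to have a manifest.yml directly in its own folder under ./packages, and it matches a top-level `type: input` line with a regular expression rather than parsing the YAML. The name `find_input_packages` is hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Matches an unindented "type: input" line, i.e. the package-level type.
TYPE_INPUT = re.compile(r"^type:\s*input\s*$", re.MULTILINE)


def find_input_packages(packages_dir: str) -> List[str]:
    """Return the sorted names of packages whose manifest declares type: input."""
    root = Path(packages_dir)
    names = []
    for manifest in sorted(root.glob("*/manifest.yml")):
        if TYPE_INPUT.search(manifest.read_text(encoding="utf-8")):
            names.append(manifest.parent.name)
    return names


if __name__ == "__main__":
    # Run from the root of a checkout of the integrations repo.
    for name in find_input_packages("./packages"):
        print(name)
```

A quoted value such as `type: "input"` would not be matched by this pattern, so a YAML parser would be the more robust choice for a real audit.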

Going forward, I'm going to use the log input package for testing purposes.

@felixbarny
Member Author

Maybe we need a separate issue for this, but there are also non-input integrations that we should adapt to allow routing. See also point 1 in #5989 (comment).

@gsantoro could you create a follow-up issue for that?

@gsantoro
Contributor

A follow-up issue to make similar changes to data streams is at #6255.
