REROUTE processor doesn't work for 2nd and later instances of the same integration and drops documents for theese instances. #111828

SergeyDrachuk · 2024-08-13T09:32:07Z

Elasticsearch Version

8.15.0, Build: deb/1a77947f34deddb41af25e6f0ddb8e830159c179/2024-08-05T10:05:34.233336849Z, JVM: 22.0.1

Installed Plugins

No response

Java Version

bundled

OS Version

6.1.0-22-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.94-1 (2024-06-21) x86_64 GNU/Linux

Problem Description

Trying to split existing integration datastream ("System" integration v1.60.2) in the ingest pipeline logs-system.auth@custom
with "REROUTE" processor.

[
  {
    "reroute": {
      "dataset": [
        "{{data_stream.dataset}}"
      ],
      "namespace": [
        "1y.nbu"
      ]
    }
  }
]

The idea is to keep critical docs (logs) longer then usual docs (logs).

This is a working configuration but only for the first agent policy to which the integration was added.
For example docs routed to the index .ds-logs-system.auth-1y.nbu-2024.08.05-000004

If the integration will be added to other agent policies with different names, the documents from such integrations will be completely lost. If i remove the reroute processor from the ingest pipeline the docs stop getting lost and all ingest pipeline works as expected (System integration documents are saved to the datastream from the all agent policies).
This behavior was observed in previous elasticsearch versions as well.

Steps to Reproduce

Install System integration
Add "Reroute" processor in the ingest pipeline "logs-system.auth@custom":
[
{
"reroute": {
"dataset": [
"{{data_stream.dataset}}"
],
"namespace": [
"1y.nbu"
]
}
}
]
Add "System" integration to more then one agent policy name.
Check logs from hosts in the different agent policies.

Logs (if relevant)

No response

The text was updated successfully, but these errors were encountered:

gbanasiak · 2024-08-14T08:49:17Z

@SergeyDrachuk This is an Elasticsearch bug report but you're referring to Elastic Agent integrations, so it's not clear where the problem is, actually. Is ES misbehaving, or Elastic Agent integrations introduce an unexpected ES configuration? Can you provide reproduction steps without Elastic Agent involved, or at least check the state of ingest pipelines on each Elastic Agent policy interaction?

elasticsearchmachine · 2024-08-14T08:49:56Z

Pinging @elastic/es-data-management (Team:Data Management)

SergeyDrachuk · 2024-08-14T10:35:25Z

@gbanasiak
We expected the doc index .ds-logs-system.auth-default (assigned in first integration ingest pipeline logs-system.auth-1.60.2)
would change its name to something like .ds-logs-system.auth-1y.nbu-xxxx
after treatment in the next ingest pipeline "logs-system.auth@custom" with "Reroute" processor (described above).

But pipeline simulation shows the new name will be precisely
.ds-system.auth-1y.nbu

Data IN:

{
  "_id": "YxtLQIoBvMlmAQX6MVtg",
  "_index": ".ds-logs-system.auth-default-2023.01.04-000058",
  "_source": {
    "agent": {
      "name": "FLEET-MASTER-A",
      "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
      "ephemeral_id": "15b46ae1-d892-4d19-8e11-5b9106920953",
      "type": "filebeat",
      "version": "8.9.0"
    },
    "process": {
      "name": "sshd",
      "pid": 5011
    },
    "log": {
      "file": {
        "path": "/var/log/auth.log.1"
      },
      "offset": 0
    },
    "elastic_agent": {
      "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
      "version": "8.9.0",
      "snapshot": false
    },
    "message": "Received disconnect from 10.143.10.33 port 34680:11: disconnected by user",
    "tags": [
      "system-auth"
    ],
    "input": {
      "type": "log"
    },
    "@timestamp": "2023-08-22T12:53:11.546+03:00",
    "system": {
      "auth": {}
    },
    "ecs": {
      "version": "8.0.0"
    },
    "related": {
      "hosts": [
        "SIEM-FLEET-1"
      ]
    },
    "data_stream": {
      "namespace": "default",
      "type": "logs",
      "dataset": "system.auth"
    },
    "host": {
      "hostname": "SIEM-FLEET-1",
      "os": {
        "kernel": "6.1.0-11-amd64",
        "codename": "bookworm",
        "name": "Debian GNU/Linux",
        "type": "linux",
        "family": "debian",
        "version": "12 (bookworm)",
        "platform": "debian"
      },
      "containerized": false,
      "ip": [
        "10.143.10.38"
      ],
      "name": "fleet-master-a",
      "id": "0cb615e4592b487e87233f92553f13d0",
      "mac": [
        "00-50-56-9B-05-32"
      ],
      "architecture": "x86_64"
    },
    "event": {
      "agent_id_status": "verified",
      "ingested": "2023-08-29T07:56:24Z",
      "timezone": "+03:00",
      "kind": "event",
      "dataset": "system.auth"
    }
  }
}

Data out

{
  "_index": ".ds-system.auth-1y.nbu",
  "_id": "YxtLQIoBvMlmAQX6MVtg",
  "_version": "-3",
  "_source": {
    "agent": {
      "name": "FLEET-MASTER-A",
      "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
      "ephemeral_id": "15b46ae1-d892-4d19-8e11-5b9106920953",
      "type": "filebeat",
      "version": "8.9.0"
    },
    "process": {
      "name": "sshd",
      "pid": 5011
    },
    "log": {
      "offset": 0,
      "file": {
        "path": "/var/log/auth.log.1"
      }
    },
    "elastic_agent": {
      "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
      "version": "8.9.0",
      "snapshot": false
    },
    "message": "Received disconnect from 10.143.10.33 port 34680:11: disconnected by user",
    "tags": [
      "system-auth"
    ],
    "input": {
      "type": "log"
    },
    "@timestamp": "2023-08-22T12:53:11.546+03:00",
    "system": {
      "auth": {}
    },
    "ecs": {
      "version": "8.0.0"
    },
    "related": {
      "hosts": [
        "SIEM-FLEET-1"
      ]
    },
    "data_stream": {
      "namespace": "1y.nbu",
      "type": ".ds",
      "dataset": "system.auth"
    },
    "host": {
      "hostname": "SIEM-FLEET-1",
      "os": {
        "kernel": "6.1.0-11-amd64",
        "codename": "bookworm",
        "name": "Debian GNU/Linux",
        "type": "linux",
        "family": "debian",
        "version": "12 (bookworm)",
        "platform": "debian"
      },
      "containerized": false,
      "ip": [
        "10.143.10.38"
      ],
      "name": "fleet-master-a",
      "id": "0cb615e4592b487e87233f92553f13d0",
      "mac": [
        "00-50-56-9B-05-32"
      ],
      "architecture": "x86_64"
    },
    "event": {
      "agent_id_status": "verified",
      "ingested": "2023-08-29T07:56:24Z",
      "dataset": "system.auth",
      "timezone": "+03:00",
      "kind": "event"
    }
  },
  "_ingest": {
    "pipeline": "_simulate_pipeline",
    "timestamp": "2024-08-14T10:11:45.761Z"
  }
}

Does it work as expected?
Is a new datastream created?
How to determine the name of it?

gbanasiak · 2024-08-14T13:36:12Z

Thank you for additional data.

I have now realized the mechanism described in https://www.elastic.co/guide/en/elasticsearch/reference/8.15/ingest.html#pipelines-for-fleet-elastic-agent is used here. Specifically, Fleet-managed logs-system.auth-1.60.2 ingest pipeline calls logs-system.auth@custom pipeline, if it exists.

logs-system.auth-1.60.2 and other Fleet-managed pipelines contain the following pipeline processors at the end:

[..]
      {
        "pipeline": {
          "name": "global@custom",
          "ignore_missing_pipeline": true,
          "description": "[Fleet] Global pipeline for all data streams"
        }
      },
      {
        "pipeline": {
          "name": "logs@custom",
          "ignore_missing_pipeline": true,
          "description": "[Fleet] Pipeline for all data streams of type `logs`"
        }
      },
      {
        "pipeline": {
          "name": "logs-system.integration@custom",
          "ignore_missing_pipeline": true,
          "description": "[Fleet] Pipeline for all data streams of type `logs` defined by the `system` integration"
        }
      },
      {
        "pipeline": {
          "name": "logs-system.auth@custom", <--- HERE
          "ignore_missing_pipeline": true,
          "description": "[Fleet] Pipeline for the `system.auth` dataset"
        }
      }
[..]

Going back to last update it looks like ingest pipeline simulation does not produce the expected output as type is not preserved, and index name is incorrect:

[..]
  "_index": ".ds-system.auth-1y.nbu",
[..]
    "data_stream": {
      "namespace": "1y.nbu",
      "type": ".ds", <--- HERE
      "dataset": "system.auth"
    },

Reproduction steps in 8.15.0:

PUT _ingest/pipeline/logs-system.auth@custom
{
  "processors": [
    {
      "reroute": {
        "dataset": [
          "{{data_stream.dataset}}"
        ],
        "namespace": [
          "test"
        ]
      }
    }
  ]
}

POST /_ingest/pipeline/logs-system.auth@custom/_simulate
{
  "docs": [
    {
      "_id": "YxtLQIoBvMlmAQX6MVtg",
      "_index": ".ds-logs-system.auth-default-2023.01.04-000058",
      "_source": {
        "agent": {
          "name": "FLEET-MASTER-A",
          "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
          "ephemeral_id": "15b46ae1-d892-4d19-8e11-5b9106920953",
          "type": "filebeat",
          "version": "8.9.0"
        },
        "process": {
          "name": "sshd",
          "pid": 5011
        },
        "log": {
          "file": {
            "path": "/var/log/auth.log.1"
          },
          "offset": 0
        },
        "elastic_agent": {
          "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
          "version": "8.9.0",
          "snapshot": false
        },
        "message": "Received disconnect from 10.143.10.33 port 34680:11: disconnected by user",
        "tags": [
          "system-auth"
        ],
        "input": {
          "type": "log"
        },
        "@timestamp": "2023-08-22T12:53:11.546+03:00",
        "system": {
          "auth": {}
        },
        "ecs": {
          "version": "8.0.0"
        },
        "related": {
          "hosts": [
            "SIEM-FLEET-1"
          ]
        },
        "data_stream": {
          "namespace": "default",
          "type": "logs",
          "dataset": "system.auth"
        },
        "host": {
          "hostname": "SIEM-FLEET-1",
          "os": {
            "kernel": "6.1.0-11-amd64",
            "codename": "bookworm",
            "name": "Debian GNU/Linux",
            "type": "linux",
            "family": "debian",
            "version": "12 (bookworm)",
            "platform": "debian"
          },
          "containerized": false,
          "ip": [
            "10.143.10.38"
          ],
          "name": "fleet-master-a",
          "id": "0cb615e4592b487e87233f92553f13d0",
          "mac": [
            "00-50-56-9B-05-32"
          ],
          "architecture": "x86_64"
        },
        "event": {
          "agent_id_status": "verified",
          "ingested": "2023-08-29T07:56:24Z",
          "timezone": "+03:00",
          "kind": "event",
          "dataset": "system.auth"
        }
      }
    }
  ]
}

# output
[..]
        "_index": ".ds-system.auth-test",
[..]
          "data_stream": {
            "namespace": "test",
            "type": ".ds",
            "dataset": "system.auth"
          },
[..]

But this might only be simulation thing as the following test does work correctly:

PUT logs-system.auth-default/_bulk?pipeline=logs-system.auth@custom
{"create":{}}
{"agent":{"name":"FLEET-MASTER-A","id":"42d17a41-bb3b-402f-b687-5fca2426fbae","ephemeral_id":"15b46ae1-d892-4d19-8e11-5b9106920953","type":"filebeat","version":"8.9.0"},"process":{"name":"sshd","pid":5011},"log":{"file":{"path":"/var/log/auth.log.1"},"offset":0},"elastic_agent":{"id":"42d17a41-bb3b-402f-b687-5fca2426fbae","version":"8.9.0","snapshot":false},"message":"Received disconnect from 10.143.10.33 port 34680:11: disconnected by user","tags":["system-auth"],"input":{"type":"log"},"@timestamp":"2023-08-22T12:53:11.546+03:00","system":{"auth":{}},"ecs":{"version":"8.0.0"},"related":{"hosts":["SIEM-FLEET-1"]},"data_stream":{"namespace":"default","type":"logs","dataset":"system.auth"},"host":{"hostname":"SIEM-FLEET-1","os":{"kernel":"6.1.0-11-amd64","codename":"bookworm","name":"Debian GNU/Linux","type":"linux","family":"debian","version":"12 (bookworm)","platform":"debian"},"containerized":false,"ip":["10.143.10.38"],"name":"fleet-master-a","id":"0cb615e4592b487e87233f92553f13d0","mac":["00-50-56-9B-05-32"],"architecture":"x86_64"},"event":{"agent_id_status":"verified","ingested":"2023-08-29T07:56:24Z","timezone":"+03:00","kind":"event","dataset":"system.auth"}}

# result
{
  "errors": false,
  "took": 24087811880,
  "ingest_took": 7,
  "items": [
    {
      "create": {
        "_index": ".ds-logs-system.auth-test-2024.08.14-000001", <---- HERE
        "_id": "_PsPUZEBsRUWctdqcdH3",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 1,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 1,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}

However, this test with the payload specified above does not:

PUT logs-system.auth-default/_bulk
[..]

But that's due to a failure earlier in the logs-system.auth-1.60.2 pipeline: "Processor conditional with tag grok-message-header in pipeline logs-system.auth-1.60.2 failed with message: Provided Grok expressions do not match field value: [Received disconnect from 10.143.10.33 port 34680:11: disconnected by user]". When pipeline encountered an error it moved to on_failure section bypassing custom pipeline processors. This Elasticsearch behavior is expected for this definition of the pipeline I think. Perhaps pipeline should be made more robust, but that's an integration issue.

I'll let @elastic/es-data-management comment further. At this point based on the data provided I think there's a concern around ingest pipeline simulation behavior. I think we should adjust the issue title.

masseyke · 2024-08-14T17:11:25Z

What is the expected output from the POST /_ingest/pipeline/logs-system.auth@custom/_simulate request? I assume it would be this, right?

[..]
        "_index": ".ds-logs-system.auth-test-2023.01.04-000058",
[..]
          "data_stream": {
            "namespace": "test",
            "type": "logs",
            "dataset": "system.auth"
          },
[..]

I think the problem though is not with the simulate code, but with the fact that the reroute processor assumes that if you don't give it a destination then you have given a data stream (using the data stream naming convention) as the _index field. So it's parsing .ds-logs-system.auth-test-2023.01.04-000058 as if that were a data stream name (it's a backing index behind a data stream though). Maybe we just need to add some validation to the reroute processor to blow up if the index is not a data stream?

gbanasiak · 2024-08-14T19:16:34Z

I see, so the correct way of using the simulate endpoint in this context would be:

POST /_ingest/pipeline/logs-system.auth@custom/_simulate
{
  "docs": [
    {
      [..]
      "_index": "logs-system.auth-default",
      [..]
    }
  ]
}

Maybe we just need to add some validation to the reroute processor to blow up if the index is not a data stream?

According to documentation this should already be happening?

When the destination option is not set, this processor is in a data stream mode. Note that in this mode, the reroute processor can only be used on data streams that follow the data stream naming scheme. Trying to use this processor on a data stream with a non-compliant name will raise an exception.

Also, exception or rather a pipeline failure? I would expect a failure, so it can be ignored with ignore_failure and handled through on_failure?

SergeyDrachuk · 2024-08-15T16:21:00Z

Behind the simulation tests what we expect that Reroute processor provide the possibility to assign a different index template (for example with pattern logs-system.auth-test-) and indexes like .ds-logs-system.auth-test- would have different ILM (keep time) than .ds-logs-system.auth-default-* (some index mask overlapping exists, i'm still not sure this will work).
The solution with Reroute processor (to keep docs different time from any an integration datastream) come from elasticsearch team.

gbanasiak · 2024-09-02T12:58:10Z

@SergeyDrachuk

Behind the simulation tests what we expect that Reroute processor provide the possibility to assign a different index template (for example with pattern logs-system.auth-test-) and indexes like .ds-logs-system.auth-test- would have different ILM (keep time) than .ds-logs-system.auth-default-* (some index mask overlapping exists, i'm still not sure this will work).

Reroute processor can alter the destination data stream, and destination data stream settings (such as ILM policy and thus data retention) depend on a matching index template. If you follow https://www.elastic.co/guide/en/fleet/8.15/data-streams-ilm-tutorial.html you will be able to differentiate data retention setting by duplicating and modifying the default logs-system.auth index template behavior.

When it comes to this issue I still don't see where the problem is, actually. The _simulate endpoint behavior is meaningless from ingest perspective. As discussed earlier, the proper way to exercise simulate would be as follows:

POST /_ingest/pipeline/logs-system.auth-1.60.2/_simulate
{
  "docs": [
    {
      "_id": "YxtLQIoBvMlmAQX6MVtg",
      "_index": "logs-system.auth-default", <--- data stream name, not index name
      "_source": {
        "agent": {
          "name": "FLEET-MASTER-A",
          "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
          "ephemeral_id": "15b46ae1-d892-4d19-8e11-5b9106920953",
          "type": "filebeat",
          "version": "8.9.0"
        },
        "process": {
          "name": "sshd",
          "pid": 5011
        },
        "log": {
          "file": {
            "path": "/var/log/auth.log.1"
          },
          "offset": 0
        },
        "elastic_agent": {
          "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
          "version": "8.9.0",
          "snapshot": false
        },
        "message": "Received disconnect from 10.143.10.33 port 34680:11: disconnected by user",
        "tags": [
          "system-auth"
        ],
        "input": {
          "type": "log"
        },
        "@timestamp": "2023-08-22T12:53:11.546+03:00",
        "system": {
          "auth": {}
        },
        "ecs": {
          "version": "8.0.0"
        },
        "related": {
          "hosts": [
            "SIEM-FLEET-1"
          ]
        },
        "data_stream": {
          "namespace": "default",
          "type": "logs",
          "dataset": "system.auth"
        },
        "host": {
          "hostname": "SIEM-FLEET-1",
          "os": {
            "kernel": "6.1.0-11-amd64",
            "codename": "bookworm",
            "name": "Debian GNU/Linux",
            "type": "linux",
            "family": "debian",
            "version": "12 (bookworm)",
            "platform": "debian"
          },
          "containerized": false,
          "ip": [
            "10.143.10.38"
          ],
          "name": "fleet-master-a",
          "id": "0cb615e4592b487e87233f92553f13d0",
          "mac": [
            "00-50-56-9B-05-32"
          ],
          "architecture": "x86_64"
        },
        "event": {
          "agent_id_status": "verified",
          "ingested": "2023-08-29T07:56:24Z",
          "timezone": "+03:00",
          "kind": "event",
          "dataset": "system.auth"
        }
      }
    }
  ]
}

What I get in return shows that the reroute did not happen which is expected because pipeline ran into Processor conditional with tag grok-message-header in pipeline logs-system.auth-1.60.2 failed with message: Provided Grok expressions do not match field value: [Received disconnect from 10.143.10.33 port 34680:11: disconnected by user] error.

{
  "docs": [
    {
      "doc": {
        "_index": "logs-system.auth-default", <--- unchanged data stream
        "_version": "-3",
        "_id": "YxtLQIoBvMlmAQX6MVtg",
        "_source": {
          "agent": {
            "name": "FLEET-MASTER-A",
            "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
            "ephemeral_id": "15b46ae1-d892-4d19-8e11-5b9106920953",
            "type": "filebeat",
            "version": "8.9.0"
          },
          "process": {
            "name": "sshd",
            "pid": 5011
          },
          "log": {
            "offset": 0,
            "file": {
              "path": "/var/log/auth.log.1"
            }
          },
          "elastic_agent": {
            "id": "42d17a41-bb3b-402f-b687-5fca2426fbae",
            "version": "8.9.0",
            "snapshot": false
          },
          "error": {
            "message": "Processor conditional with tag grok-message-header in pipeline logs-system.auth-1.60.2 failed with message: Provided Grok expressions do not match field value: [Received disconnect from 10.143.10.33 port 34680:11: disconnected by user]"
          },
          "tags": [
            "system-auth"
          ],
          "input": {
            "type": "log"
          },
          "@timestamp": "2023-08-22T12:53:11.546+03:00",
          "system": {
            "auth": {}
          },
          "ecs": {
            "version": "8.0.0"
          },
          "related": {
            "hosts": [
              "SIEM-FLEET-1"
            ]
          },
          "data_stream": {
            "namespace": "default",
            "type": "logs",
            "dataset": "system.auth"
          },
          "host": {
            "hostname": "SIEM-FLEET-1",
            "os": {
              "kernel": "6.1.0-11-amd64",
              "codename": "bookworm",
              "name": "Debian GNU/Linux",
              "type": "linux",
              "family": "debian",
              "version": "12 (bookworm)",
              "platform": "debian"
            },
            "containerized": false,
            "ip": [
              "10.143.10.38"
            ],
            "name": "fleet-master-a",
            "id": "0cb615e4592b487e87233f92553f13d0",
            "mac": [
              "00-50-56-9B-05-32"
            ],
            "architecture": "x86_64"
          },
          "event": {
            "agent_id_status": "verified",
            "ingested": "2023-08-29T07:56:24Z",
            "original": "Received disconnect from 10.143.10.33 port 34680:11: disconnected by user",
            "timezone": "+03:00",
            "kind": "pipeline_error",
            "dataset": "system.auth"
          }
        },
        "_ingest": {
          "timestamp": "2024-09-02T12:33:38.130868589Z"
        }
      }
    }
  ]
}

ES behavior is correct here. To have a different result, the logs-system.auth-1.60.2 pipeline definition would have to be different, but that's not an ES issue but rather https://github.com/elastic/integrations issue.

I don't see how the behavior you described initially could happen, i.e.:

This is a working configuration but only for the first agent policy to which the integration was added.
If the integration will be added to other agent policies with different names, the documents from such integrations will be completely lost. If i remove the reroute processor from the ingest pipeline the docs stop getting lost and all ingest pipeline works as expected (System integration documents are saved to the datastream from the all agent policies).

Is ES configuration changing as you go through integration policy reconfiguration? We need some reproduction steps at ES level. Can you provide these?

SergeyDrachuk · 2024-09-03T06:50:28Z

@gbanasiak

Is ES configuration changing as you go through integration policy reconfiguration?

I don't think an ES config changes were occasionally had impact on the Reroute processor behaviour. We tried to add/remove Reroute processor in a different days several times. The behaviour is the same.

We need some reproduction steps at ES level. Can you provide these?

Sure. What should i do?

I think the way we tested simulation is not 100% right. The doc we tried to put to ingest pipeline simulation was taken after it processed the same pipiline and stored to an output datastream. We did it in such a way bc i don't know how to intercept (catch) the raw doc at point between integration and ingest pipeline. Should we need to add to the integration output config an "empty" ingest pipiline and take raw doc for the simulation or there is an another more easy way?

gbanasiak · 2024-09-03T07:21:48Z

To get the original document on input to ES as formed by the agent, you can enable debug logs in a single agent. Leave it running for a while to make sure documents were ingested, and then collect diagnostics using elastic-agent diagnostics command. The produced archive will include logs of a respective Beat (among other things). Debug logs are very verbose, so don't keep this log level for too long. The ones I'm interested in include a JSON payload sent to ES (of every single event/document).

Empty ingest pipeline you suggested should also work, but debug logs might be easier.

SergeyDrachuk · 2024-09-04T14:09:40Z

elastic-agent-event-log-20240904-144.zip

@gbanasiak
I can't find in the diagnostics archive folder elastic-agent-8.15.0-25075f\events\ most of the following events from the /var/log/auth.log

gbanasiak · 2024-09-05T08:55:39Z

Thank you for sharing the logs. If you grep for system.auth you can find interesting bits.

First the bit we were looking for - ingested event. The agent log looks like this (after formatting):

{
    "log.level": "debug",
    "@timestamp": "2024-09-04T16:55:10.755+0300",
    "message": "Publish event: {\n  \"@timestamp\": \"2024-09-04T13:55:01.000Z\",\n  \"@metadata\": {\n    \"beat\": \"filebeat\",\n    \"type\": \"_doc\",\n    \"version\": \"8.15.0\",\n    \"raw_index\": \"logs-system.auth-default\",\n    \"input_id\": \"logfile-system-04b6e2f3-72bd-4e2e-a0a3-d8cb413195b9\",\n    \"stream_id\": \"logfile-system.auth-04b6e2f3-72bd-4e2e-a0a3-d8cb413195b9\"\n  },\n  \"event\": {\n    \"dataset\": \"system.auth\",\n    \"timezone\": \"+03:00\",\n    \"original\": \"Sep  4 16:55:01 SIEM-ARCHIVE CRON[3925397]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)\"\n  },\n  \"agent\": {\n    \"ephemeral_id\": \"91adbb8f-f4ab-4135-b972-38e30446c9a3\",\n    \"id\": \"814646e8-37b9-4d52-992f-40bf2e9eb209\",\n    \"name\": \"SIEM-ARCHIVE\",\n    \"type\": \"filebeat\",\n    \"version\": \"8.15.0\"\n  },\n  \"ecs\": {\n    \"version\": \"8.0.0\"\n  },\n  \"log\": {\n    \"offset\": 189281,\n    \"file\": {\n      \"path\": \"/var/log/auth.log\"\n    },\n    \"syslog\": {\n      \"appname\": \"CRON\",\n      \"procid\": \"3925397\",\n      \"hostname\": \"SIEM-ARCHIVE\"\n    }\n  },\n  \"message\": \"pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)\",\n  \"tags\": [\n    \"system-auth\"\n  ],\n  \"input\": {\n    \"type\": \"log\"\n  },\n  \"data_stream\": {\n    \"namespace\": \"default\",\n    \"type\": \"logs\",\n    \"dataset\": \"system.auth\"\n  },\n  \"elastic_agent\": {\n    \"id\": \"814646e8-37b9-4d52-992f-40bf2e9eb209\",\n    \"snapshot\": false,\n    \"version\": \"8.15.0\"\n  },\n  \"host\": {\n    \"containerized\": false,\n    \"ip\": [\n      \"10.143.10.33\"\n    ],\n    \"mac\": [\n      \"00-0C-29-88-2D-37\"\n    ],\n    \"name\": \"siem-archive\",\n    \"hostname\": \"siem-archive\",\n    \"architecture\": \"x86_64\",\n    \"os\": {\n      \"version\": \"11 (bullseye)\",\n      \"family\": \"debian\",\n      \"name\": \"Debian GNU/Linux\",\n      \"kernel\": \"5.10.0-28-amd64\",\n      \"codename\": \"bullseye\",\n      \"type\": \"linux\",\n      \"platform\": \"debian\"\n    },\n    \"id\": \"79782a8b868f45708e81fb1949743325\"\n  }\n}",
    "component": {
        "binary": "filebeat",
        "dataset": "elastic_agent.filebeat",
        "id": "log-default",
        "type": "log"
    },
    "log": {
        "source": "log-default"
    },
    "log.logger": "processors",
    "log.origin": {
        "file.line": 215,
        "file.name": "processing/processors.go",
        "function": "github.com/elastic/beats/v7/libbeat/publisher/processing.debugPrintProcessor.func1"
    },
    "service.name": "filebeat",
    "log.type": "event",
    "ecs.version": "1.6.0",
    "ecs.version": "1.6.0"
}

The event extracted from the message field is as follows:

{
    "@timestamp": "2024-09-04T13:55:01.000Z",
    "@metadata": {
        "beat": "filebeat",
        "type": "_doc",
        "version": "8.15.0",
        "raw_index": "logs-system.auth-default",
        "input_id": "logfile-system-04b6e2f3-72bd-4e2e-a0a3-d8cb413195b9",
        "stream_id": "logfile-system.auth-04b6e2f3-72bd-4e2e-a0a3-d8cb413195b9"
    },
    "event": {
        "dataset": "system.auth",
        "timezone": "+03:00",
        "original": "Sep  4 16:55:01 SIEM-ARCHIVE CRON[3925397]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)"
    },
    "agent": {
        "ephemeral_id": "91adbb8f-f4ab-4135-b972-38e30446c9a3",
        "id": "814646e8-37b9-4d52-992f-40bf2e9eb209",
        "name": "SIEM-ARCHIVE",
        "type": "filebeat",
        "version": "8.15.0"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "log": {
        "offset": 189281,
        "file": {
            "path": "/var/log/auth.log"
        },
        "syslog": {
            "appname": "CRON",
            "procid": "3925397",
            "hostname": "SIEM-ARCHIVE"
        }
    },
    "message": "pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)",
    "tags": [
        "system-auth"
    ],
    "input": {
        "type": "log"
    },
    "data_stream": {
        "namespace": "default",
        "type": "logs",
        "dataset": "system.auth"
    },
    "elastic_agent": {
        "id": "814646e8-37b9-4d52-992f-40bf2e9eb209",
        "snapshot": false,
        "version": "8.15.0"
    },
    "host": {
        "containerized": false,
        "ip": [
            "10.143.10.33"
        ],
        "mac": [
            "00-0C-29-88-2D-37"
        ],
        "name": "siem-archive",
        "hostname": "siem-archive",
        "architecture": "x86_64",
        "os": {
            "version": "11 (bullseye)",
            "family": "debian",
            "name": "Debian GNU/Linux",
            "kernel": "5.10.0-28-amd64",
            "codename": "bullseye",
            "type": "linux",
            "platform": "debian"
        },
        "id": "79782a8b868f45708e81fb1949743325"
    }
}

The ingest simulation for this event with your setup works fine, i.e. document lands in non-default datastream.

POST /_ingest/pipeline/logs-system.auth-1.60.2/_simulate
{
  "docs": [
    {
      "_index": "logs-system.auth-default",
      "_source": {
        [..]
      }
    }
  ]
}

response:
{
  "docs": [
    {
      "doc": {
        "_index": "logs-system.auth-test", <--- HERE
[..]
          "data_stream": {
            "namespace": "test", <--- HERE
            "type": "logs",
            "dataset": "system.auth"
          },
[..]
      }
    }
  ]
}

Agent logs includes this interesting warning though: Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Meta:null, Fields:null, Private:interface {}(nil), TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:mapstr.M(nil)}, EncodedEvent:(*elasticsearch.encodedEvent)(0xc002f69680)} (status=403): {\"type\":\"security_exception\",\"reason\":\"action [indices:data/write/bulk[s]] is unauthorized for API key id [FBzaMo4B7V8SpG9PYflL] of user [elastic/fleet-server] on indices [logs-system.auth-1y.nbu], this action is granted by the index privileges [create_doc,create,delete,index,write,all]\"}, dropping event!

{
    "type": "security_exception",
    "reason": "action [indices:data/write/bulk[s]] is unauthorized for API key id [FBzaMo4B7V8SpG9PYflL] of user [elastic/fleet-server] on indices [logs-system.auth-1y.nbu], this action is granted by the index privileges [create_doc,create,delete,index,write,all]"
}

This means that API key the agent has is not authorized to write into logs-system.auth-1y.nbu datastream which would explain why documents were completely missing. I don't know why this works initially and then stops working (as per your original description), but that's not an Elasticsearch issue, but Fleet/Elastic Agent setup issue. I would recommend opening a new topic in https://discuss.elastic.co/c/elastic-stack/elastic-agent or contacting Elastic Support.

I've raised #112543 to document the problem discussed earlier, and I'm closing this one. Feel free to open a new issue if evidence for ES bug is found.

immon · 2024-09-11T15:00:08Z

I'm leaving a note here for others who'd like to use reroute.
The root cause of the issue lies in the insufficient permissions associated with the API key generated by the Fleet Server. Specifically, the API key doesn't permit writing to the logs-system.auth-1y.nbu index.

As a temporary solution, you can add the Custom Logs integration to the same Agent policy. This will grant the auto_configure and create_doc privileges to all logs-*-* indices.

SergeyDrachuk added >bug needs:triage Requires assignment of a team area label labels Aug 13, 2024

gbanasiak added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Aug 14, 2024

elasticsearchmachine added the Team:Data Management Meta label for data/management team label Aug 14, 2024

elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Aug 14, 2024

gbanasiak mentioned this issue Sep 5, 2024

Validate index name in data stream mode of reroute processor #112543

Open

gbanasiak closed this as completed Sep 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

REROUTE processor doesn't work for 2nd and later instances of the same integration and drops documents for theese instances. #111828

REROUTE processor doesn't work for 2nd and later instances of the same integration and drops documents for theese instances. #111828

SergeyDrachuk commented Aug 13, 2024 •

edited

Loading

gbanasiak commented Aug 14, 2024

elasticsearchmachine commented Aug 14, 2024

SergeyDrachuk commented Aug 14, 2024 •

edited

Loading

gbanasiak commented Aug 14, 2024

masseyke commented Aug 14, 2024

gbanasiak commented Aug 14, 2024

SergeyDrachuk commented Aug 15, 2024

gbanasiak commented Sep 2, 2024

SergeyDrachuk commented Sep 3, 2024 •

edited

Loading

gbanasiak commented Sep 3, 2024

SergeyDrachuk commented Sep 4, 2024

gbanasiak commented Sep 5, 2024

immon commented Sep 11, 2024

REROUTE processor doesn't work for 2nd and later instances of the same integration and drops documents for theese instances. #111828

REROUTE processor doesn't work for 2nd and later instances of the same integration and drops documents for theese instances. #111828

Comments

SergeyDrachuk commented Aug 13, 2024 • edited Loading

Elasticsearch Version

Installed Plugins

Java Version

OS Version

Problem Description

Steps to Reproduce

Logs (if relevant)

gbanasiak commented Aug 14, 2024

elasticsearchmachine commented Aug 14, 2024

SergeyDrachuk commented Aug 14, 2024 • edited Loading

gbanasiak commented Aug 14, 2024

masseyke commented Aug 14, 2024

gbanasiak commented Aug 14, 2024

SergeyDrachuk commented Aug 15, 2024

gbanasiak commented Sep 2, 2024

SergeyDrachuk commented Sep 3, 2024 • edited Loading

gbanasiak commented Sep 3, 2024

SergeyDrachuk commented Sep 4, 2024

gbanasiak commented Sep 5, 2024

immon commented Sep 11, 2024

SergeyDrachuk commented Aug 13, 2024 •

edited

Loading

SergeyDrachuk commented Aug 14, 2024 •

edited

Loading

SergeyDrachuk commented Sep 3, 2024 •

edited

Loading