Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ingest Manager] Moved from stream to dataset #18967

Merged
merged 6 commits into from
Jun 9, 2020

Conversation

michalpristas
Copy link
Contributor

@michalpristas michalpristas commented Jun 4, 2020

What does this PR do?

Changes injected processor to events so it enriches them with dataset.* instead of stream.*

Why is it important?

elastic/package-registry#491

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

As draft until we are ready and in sync to merge

Fixes: elastic/package-registry#491
Fixes: #18826

cc @ruflin

@michalpristas michalpristas self-assigned this Jun 4, 2020
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Jun 4, 2020
@michalpristas michalpristas changed the title moved from stream to dataset [Ingest Manager] Moved from stream to dataset Jun 4, 2020
@elasticmachine
Copy link
Collaborator

elasticmachine commented Jun 4, 2020

💚 Build Succeeded

Pipeline View Test View Changes Artifacts preview

Expand to view the summary

Build stats

  • Build Cause: [Pull request #18967 updated]

  • Start Time: 2020-06-09T10:47:01.659+0000

  • Duration: 32 min 32 sec

Test stats 🧪

Test Results
Failed 0
Passed 508
Skipped 127
Total 635

Steps errors

Expand to view the steps failures

  • Name: Report to Codecov
    • Description: curl -sSLo codecov https://codecov.io/bash for i in auditbeat filebeat heartbeat libbeat metricbeat packetbeat winlogbeat journalbeat do FILE="${i}/build/coverage/full.cov" if [ -f "${FILE}" ]; then bash codecov -f "${FILE}" fi done

    • Duration: 1 min 27 sec

    • Start Time: 2020-06-09T11:13:01.316+0000

    • log

@@ -8,7 +8,7 @@ filebeat:
index: logs-generic-default
processors:
- add_fields:
target: "stream"
target: "dataset"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a small thing missing here: dataset.dataset need to be changed for dataset.name, this will create these fields:

  1. dataset.type
  2. dataset.name
  3. dataset.namespace.

@michalpristas @ruflin The field in the agent configuration will also need to changes? Meaning we will need to adjust the index name generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in configuration i would expect this

datasources:
  - namespace: testing
    use_output: default
    inputs:
    - type: docker/metrics
      streams:
        - metricset: status
          dataset: docker.status

to become this

datasources:
  - namespace: testing
    use_output: default
    inputs:
    - type: docker/metrics
      datasets:
        - metricset: status
          name: docker.status

is that correct?
but this feels weird as you can have

datasources:
  - namespace: testing
    use_output: default
    inputs:
    - type: docker/metrics
      datasets:
        - metricset: status
          name: docker.status
        - metricset: cpu
          name: docker.status

i dont recall if we agreed to handle configuration change as a separate pack of PRs or the same with this one @ruflin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalpristas Please handle config changes as a separate PR as it also has effects in Kibana. I plan to open an issue for this. These config changes we should directly in combination with the potential removal of datasources.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ingest-management (Team:Ingest Management)

@@ -8,7 +8,7 @@ filebeat:
index: logs-generic-default
processors:
- add_fields:
target: "stream"
target: "dataset"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this and it does not work as expected. The result is:

          "dataset" : {
            "type" : "metrics",
            "dataset" : "system.cpu",
            "namespace" : "default"
          },

But it should be

          "dataset" : {
            "type" : "metrics",
            "name" : "system.cpu",
            "namespace" : "default"
          },

I think the reason is on line 14 where it still states dataset instead of name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right

@ruflin
Copy link
Contributor

ruflin commented Jun 9, 2020

@michalpristas I pushed a fix with the requested change above. Lets see what CI thinks.

@ruflin
Copy link
Contributor

ruflin commented Jun 9, 2020

@michalpristas Seems like this does not apply yet to agent monitoring data. Where needs this change to happen?

@ruflin
Copy link
Contributor

ruflin commented Jun 9, 2020

I filed #19070 as a follow up as it is really hard to validate these changes without manual testing.

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally with elastic/kibana#68322 and works as expected.

@michalpristas michalpristas merged commit d7d5634 into elastic:master Jun 9, 2020
@ph ph added the needs_backport PR is waiting to be backported to other branches. label Jun 9, 2020
michalpristas added a commit to michalpristas/beats that referenced this pull request Jun 10, 2020
[Ingest Manager] Moved from stream to dataset (elastic#18967)
michalpristas added a commit that referenced this pull request Jun 12, 2020
[Ingest Manager] Moved from stream to dataset (#18967)
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement needs_backport PR is waiting to be backported to other branches. review
Projects
None yet
4 participants