Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new transform): New merge transform #1504

Merged
merged 42 commits into from
Feb 5, 2020
Merged

Conversation

MOZGIII
Copy link
Contributor

@MOZGIII MOZGIII commented Jan 9, 2020

Closes #1488.

@MOZGIII MOZGIII force-pushed the merge-transform branch 3 times, most recently from a9dc7ff to 9147835 Compare January 10, 2020 12:48
@MOZGIII MOZGIII marked this pull request as ready for review January 10, 2020 20:05
@MOZGIII MOZGIII requested a review from LucioFranco as a code owner January 10, 2020 20:05
@MOZGIII
Copy link
Contributor Author

MOZGIII commented Jan 11, 2020

The feature is incomplete, however, the approach is settled, and overall it's ready for review. What remains to be added is:

  • built-in merge transform for journald;
  • built-in merge transform for kubernetes (to match docker);
  • a transform for regexp-based partial event marker implementations (a simple one, for advanced cases user can already use lua transform);
  • support for multiple partial event queues in merge transform to enable use in service deployment scenarios.

@binarylogic binarylogic changed the title feat(new transform): Merge transform feat(new transform): New merge transform Jan 11, 2020
@binarylogic binarylogic removed the request for review from LucioFranco January 11, 2020 16:00
@binarylogic
Copy link
Contributor

Nice work! Do you plan to do those remaining items in this PR or in separate PRs?

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Jan 11, 2020

I'd move some of the pending things from the checklist to the separate PRs:

  • journald - I'm currently looking into it, and it might be a bit more complicated than I anticipated originally;
  • a custom regexp-based partial events marker- transform - it's unclear wherher this is even needed, since it might be just a couple line of lua;
  • kubernetes - it might be non-trivial too, I'd submit a separate PR for it; I wanted to take care of it as part of this PR, but now I think it can be clearly split since the implementation is very different there.

I'd like to take care of the last point - multiple partial event queues - as part of this one.

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Jan 11, 2020

Btw, I think it's time to team up for test harness setup.

@binarylogic
Copy link
Contributor

Sounds good. Also, regarding the journald work, we are making some changes there via #1473. I want to make sure that does not conflict with your work.

Let's schedule time next week to dive into the test harness. I'll ping you.

@binarylogic
Copy link
Contributor

@MOZGIII mind resolving the conflicts? Then hopefully we can get this merged 🚀

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

@binarylogic sure, will do shortly. I'd like to have feedback from @lukesteensen on #1504 (comment) before the merge.
Also, another pass on the documentation is required, I can do that too.

@lukesteensen
Copy link
Member

Yep, I think if we can get this green on CI then we should be in a good place to merge.

@binarylogic
Copy link
Contributor

Nice! Let's make the final changes in #1504 (comment) and then we can merge @MOZGIII .

It's not as optimized at the previous variant, however it's way easier
to understand and maintain.

Signed-off-by: MOZGIII <[email protected]>
@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

There are some merge conflicts that are in the .meta/transforms/concat.toml. The changes to that have initially arrived to this branch from 1421277#diff-08a736386e72041abbb1de9fb9000216
I'm not sure how relevant are they, and what would be a reasonable resolution in the context iof this PR at this point.
Same applies to website/metadata.js.

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

Ideally, we should alter that commit (1421277) to only include the changes relevant to this PR.

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

Ended up with a merge resolution that negates the irrelevant changes, not sure how correct that is.

Signed-off-by: MOZGIII <[email protected]>
@binarylogic
Copy link
Contributor

That's fine. I'll review to make sure nothing is off.

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

Thanks @binarylogic! I've tried to solve the check-blog, and pretty much succeeded, but it seems there's an ongoing work there - so I'll just leave it to you. Sorry if I messed something up already.

@binarylogic
Copy link
Contributor

I'll do a final pass on this, get the checks passing, and then merge.

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

Just one sec, I'll update the docs. I'm only supposed to update the text at .meta dir manually, correct?

@binarylogic
Copy link
Contributor

Yep, modify everything in the /.meta folder then run make generate.

- Pull the latest field descriptions from the code for merge transform
- Update the field names at merge transform and docker source
- Update schema to allow `merge` at the `function_category`
- Remove `guides` and `resources` from merge transform - not sure what
  those are about
- Fix some minor nitpicks

Signed-off-by: MOZGIII <[email protected]>
Signed-off-by: MOZGIII <[email protected]>
@MOZGIII
Copy link
Contributor Author

MOZGIII commented Feb 4, 2020

Done! So, one concern I had is with 687e281#diff-cc6d35117dc2abae06d7b7f10c958363L6-L9 - the guides and resources field, I don't know where those come from, and the schema doesn't allow them. That's why I though there's an ongoing process of some kind there.

Signed-off-by: binarylogic <[email protected]>
@binarylogic binarylogic merged commit c285303 into master Feb 5, 2020
@binarylogic binarylogic deleted the merge-transform branch February 5, 2020 05:14

Where possible, Vector will handle event merging at the source level. For
example, the [`file`][docs.sources.file] contains a `message_start_indicator`
option and the [`docker`][docs.sources.file] contains an `auto_partial_merge`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@binarylogic here is a wrong link, should be [`docker`][docs.sources.docker]

@alexgavrisco
Copy link
Contributor

Please forgive my ignorance in case it's been already answered, I couldn't find the answer in this PR or other issues.
@MOZGIII @binarylogic I've encountered the same issue with kubernetes source. And it doesn't seem that I can achieve it using the merge filter since I'm not in control of splitting messages (kubernetes is). Do you have any pointers for this issue?

@MOZGIII
Copy link
Contributor Author

MOZGIII commented Mar 24, 2020

@Alexx-G thanks for reaching out! You're looking for this: #1718
I'll be working on it this week!

@alexgavrisco
Copy link
Contributor

Yep, this is exactly what I'm looking for. Can't believe I missed.
Thanks a lot! Subscribed for updates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New merge transform
5 participants