Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/stanza] remove all usage of Multiline from the fileconsumer package #14947

Closed

Conversation

atingchen
Copy link
Contributor

Description:

remove all usage of Multiline from the fileconsumer package
Link to tracking Issue:
#14766
Testing:
unit test
Documentation:

@atingchen atingchen requested a review from a team October 14, 2022 10:23
@atingchen atingchen requested a review from djaglowski as a code owner October 14, 2022 10:23
pkg/stanza/fileconsumer/config.go Outdated Show resolved Hide resolved
pkg/stanza/fileconsumer/splitter_factory.go Outdated Show resolved Hide resolved
@atingchen atingchen force-pushed the feature/remove-multiline-config branch 2 times, most recently from 52c0230 to fc363eb Compare October 22, 2022 06:00
@@ -57,11 +58,12 @@ type Config struct {
FingerprintSize helper.ByteSize `mapstructure:"fingerprint_size,omitempty"`
MaxLogSize helper.ByteSize `mapstructure:"max_log_size,omitempty"`
MaxConcurrentFiles int `mapstructure:"max_concurrent_files,omitempty"`
Splitter helper.SplitterConfig `mapstructure:",squash,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also take MultilineConfig out of SplitterConfig?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You suggested removing it from the fileconsumer package earlier in #14605, so I removed it here~

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm just trying to make sure we end up in the correct end state.

pkg/stanza/fileconsumer/splitter_factory.go Outdated Show resolved Hide resolved
}

// Build will build a file input operator from the supplied configuration
func (c Config) Build(logger *zap.SugaredLogger, emit EmitFunc) (*Manager, error) {
func (c Config) Build(logger *zap.SugaredLogger, emit EmitFunc, opts ...FactoryOption) (*Manager, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for the delayed review - I'm finally having time to look at this again.

This new parameter type is a bit confusing because the FactoryOption is really an option for the splitter factory, but this isn't necessarily clear. I had to look back at your original proposal here to remember why we are making these changes. The original goal was to be able to pass in a splitFunc, which I think makes a lot of sense. However, we have realized there are some parts that are needed outside of the splitFunc as well, such as encoding.

Rather than pass in options, I think we should take a step back and clarify the problems with our original plan. Then we can try to solve those problems.

What specific problems do we have if we pass in splitFunc directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to review this code.
In fact, the main purpose of this code is to remove the parameter MultilineConfig from the package fileconsumer as you mentioned.
Of course, we can also directly pass in the splitFunc parameter. We also need to consider what to do when the user uses the MultilineConfig parameter and passes in a custom splitFunc at the same time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, the main purpose of this code is to remove the parameter MultilineConfig from the package fileconsumer as you mentioned.

Sure, I just want to make sure we are solving the problem you initially proposed in #14593.

We also need to consider what to do when the user uses the MultilineConfig parameter and passes in a custom splitFunc at the same time.

I don't know that we need to worry about this case. I think if it is desired to user MultilineConfig, then that is used to generate the SplitFunc in some way. Then passing in the SplitFunc works regardless of whether or not Multiline is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the user passes in the SplitFunc, whenever the MultilineConfig is set, the SplitFunc passed in by the user is used. Will this confuse users? Is it possible to consider throwing an error?

Regarding whether to remove from the fileconsumer package, we can open an issue for discussion later. And all we need to do now is to go back to the original purpose and let fileconsumer support custom SplitFunc.

What do you think about this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the user passes in the SplitFunc, whenever the MultilineConfig is set, the SplitFunc passed in by the user is used. Will this confuse users? Is it possible to consider throwing an error?

Sorry, but this I don't think it should work this way, even temporarily. This would be confusing. We should use MultilineConfig to generate the splitFunc, then pass it in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, it will confuse users.
I think we should throw an error when user not only customize splitFunc but also set MultilineConfig.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the case here is that we did not remove the multilineConfig from the fileconsumer, while allowing the user to pass in a custom SplitFunc when calling the Build method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reconsider your opinion, the Build method passes in an Options on the spliterFactory that the responsibility is not clear.
If we add a new method BuildWithSplitter, this method can directly pass in a user-defined splitFunc. What do you think of this way? @djaglowski

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes more sense to me. I think we should try it. Thanks for finding another solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also not affect the existing interface. I will open a new PR to implement this.

@atingchen atingchen force-pushed the feature/remove-multiline-config branch from 69bdabd to 9ff504b Compare October 31, 2022 14:47
@atingchen atingchen closed this Nov 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants