-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(parsers.influx): New influx line protocol via feature flag #10749
feat(parsers.influx): New influx line protocol via feature flag #10749
Conversation
654ef28
to
ff0d893
Compare
This introduces a new parser option to allow users to choose between the upstream (newer, more memory efficient and faster) influx line protocol parser or the built-in, included influx line protocol.
5fb6ed2
to
08d78ec
Compare
@reimda would you be willing to give the last commit a quick review? The big changes are:
I am looking at changes to influxdb_v1_listener, but so far do not feel my changes are appropriate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
review of 08d78ec
|
||
if err != influx.EOF && err != nil { | ||
if err != influx.EOF && err != influx_upstream.ErrEOF && err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a little strange and maybe unsafe to handle errors from either parser in the same place without any type assertion on the error. I think in the logs we're also going to want to know which parser generated the error. Maybe handle the specific errors caused by each parser right after each Parse finishes and only handle common errors like the badRequest after the if/else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
influx.EOF and influx.ErrEOF are the same thing, an errors.New("EOF")
. We are only handling types of error in this if statement so I'm not sure I follow the concern about type assertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's store errors.New("EOF) as a variable in this package and compare err to that variable here.
* Keep influx_upstream under influx * Add and update READMEs for influx parsers
08d78ec
to
ce01646
Compare
require.NoError(t, err) | ||
require.NoError(t, resp.Body.Close()) | ||
require.EqualValues(t, 204, resp.StatusCode) | ||
for _, parser := range []string{"internal", "upstream"} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about this pattern. It reuses the listener for both parsers and always runs in a specific order of parsers. It's best to test as close to real use as possible so this isn't ideal. I think it would be better to pull the initialization into the parser for loop to use a new listener for each parser.
Doing it this way has some testing usability drawbacks too. If there is a failure we won't know which parser was involved. Also it doesn't allow us to run all tests of just one of the parser types. Using golang subtests would fix both those problems. See https://go.dev/blog/subtests#table-driven-tests-using-subtests
Maybe these improvements aren't needed, but they were on my mind so I thought I'd share them with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - the reason I did not do this initially was since the expected input and outputs are the same and the parser is a run-time check it did not seem to make sense. We both agree that it is a best practice, however slightly not sure it is a perfect fit here, but made the change anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with the Run func.
Could we define the testCases slice in one place, maybe global, so it's not repeated in each function?
plugins/parsers/influx/README.md
Outdated
## Influx parser type to use. Users can choose between 'internal' and | ||
## 'upstream'. The internal parser is what Telegraf has historically used. | ||
## While the upstream parser involved a large re-write to make it more | ||
## memory efficient and performant. | ||
## influx_parser_version = "internal" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the same shorter text here too.
* update parser readme to be inline with listeners * global EOF error check * consolidate the test cases for both listener tests
Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip. 📦 Click here to get additional PR build artifactsArtifact URLs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
@oplehto Do you by chance have before/after performance numbers of using the new parser compared to the old one? |
Add the ability to use the upstream Influx Line Protocol parser with the new, zero-allocation with the existing internal parser. Users can choose to use the new 'upstream' parser with the
influx_parser_type
config option or with theparser_type
config option with theinfluxdb_v2_listener
.Moves time
influx.TimeFunc
to a common file for use by both parsers.Blocked by influxdata/line-protocol#50
Previous PR #9685
Resolves #9474
Authored-by: Alex Krantz [email protected]