
feat(parsers.influx): New influx line protocol via feature flag #10749

Merged: 5 commits into influxdata:master from feat/influx-line-protocol-flag on Mar 10, 2022

Conversation

powersj
Contributor

@powersj powersj commented Feb 28, 2022

Add the ability to use the new, zero-allocation upstream Influx Line Protocol parser alongside the existing internal parser. Users can select the new 'upstream' parser with the influx_parser_type config option, or with the parser_type config option on influxdb_v2_listener.

Moves influx.TimeFunc to a common file for use by both parsers.

Blocked by influxdata/line-protocol#50
Previous PR #9685
Resolves #9474
Authored-by: Alex Krantz

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Feb 28, 2022
@powersj powersj force-pushed the feat/influx-line-protocol-flag branch from 654ef28 to ff0d893 Compare March 1, 2022 15:06
This introduces a new parser option to allow users to choose between the
upstream (newer, more memory efficient and faster) influx line protocol
parser or the built-in, included influx line protocol.
@powersj powersj force-pushed the feat/influx-line-protocol-flag branch from 5fb6ed2 to 08d78ec Compare March 3, 2022 18:28
Contributor Author

powersj commented Mar 3, 2022

@reimda would you be willing to give the last commit a quick review? The big changes are:

  • Moving TimeFunc to a common location to be used by both parsers
  • add parser_type config option to influxdb_v2_listener + tests
  • Update README for line protocol parsers

I am looking at changes to influxdb_v1_listener, but so far do not feel my changes are appropriate

Contributor

@reimda reimda left a comment

review of 08d78ec

plugins/parsers/influx/parser.go (outdated, resolved)

- if err != influx.EOF && err != nil {
+ if err != influx.EOF && err != influx_upstream.ErrEOF && err != nil {
Contributor

It feels a little strange and maybe unsafe to handle errors from either parser in the same place without any type assertion on the error. I think in the logs we're also going to want to know which parser generated the error. Maybe handle the specific errors caused by each parser right after each Parse finishes and only handle common errors like the badRequest after the if/else?

Contributor Author

influx.EOF and influx_upstream.ErrEOF are the same thing: an errors.New("EOF"). We are only handling types of error in this if statement, so I'm not sure I follow the concern about type assertions.

Contributor

Let's store errors.New("EOF") as a variable in this package and compare err to that variable here.

plugins/inputs/influxdb_v2_listener/README.md (outdated, resolved)
plugins/parsers/influx_upstream/README.md (outdated, resolved)
plugins/parsers/registry.go (outdated, resolved)
* Keep influx_upstream under influx
* Add and update READMEs for influx parsers
@powersj powersj force-pushed the feat/influx-line-protocol-flag branch from 08d78ec to ce01646 Compare March 4, 2022 15:51
require.NoError(t, err)
require.NoError(t, resp.Body.Close())
require.EqualValues(t, 204, resp.StatusCode)
for _, parser := range []string{"internal", "upstream"} {
Contributor

I've been thinking about this pattern. It reuses the listener for both parsers and always runs in a specific order of parsers. It's best to test as close to real use as possible so this isn't ideal. I think it would be better to pull the initialization into the parser for loop to use a new listener for each parser.

Doing it this way has some testing usability drawbacks too. If there is a failure we won't know which parser was involved. Also it doesn't allow us to run all tests of just one of the parser types. Using golang subtests would fix both those problems. See https://go.dev/blog/subtests#table-driven-tests-using-subtests

Maybe these improvements aren't needed, but they were on my mind so I thought I'd share them with you.
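The suggestion above could look roughly like the following sketch. The `listener` type and `newListener` helper are hypothetical stand-ins for the real plugin setup, which is not shown in this conversation; only `t.Run` and the two parser names come from the discussion. Each subtest builds its own listener, and the subtest name records which parser was in use when something fails.

```go
package main

import (
	"fmt"
	"testing"
)

// listener is a hypothetical stand-in for the influxdb_v2_listener plugin.
type listener struct {
	parserType string
	started    bool
}

// newListener builds and starts a fresh listener for one parser type.
func newListener(parserType string) (*listener, error) {
	switch parserType {
	case "internal", "upstream":
		return &listener{parserType: parserType, started: true}, nil
	default:
		return nil, fmt.Errorf("unknown parser type %q", parserType)
	}
}

// TestWriteWithEachParser would normally live in a _test.go file.
// Initialization happens inside the loop, so no state is shared
// between the two parser runs.
func TestWriteWithEachParser(t *testing.T) {
	for _, parserType := range []string{"internal", "upstream"} {
		t.Run(parserType, func(t *testing.T) {
			l, err := newListener(parserType)
			if err != nil {
				t.Fatalf("starting listener: %v", err)
			}
			if !l.started {
				t.Fatalf("listener with %s parser did not start", parserType)
			}
		})
	}
}

func main() {
	// Outside of `go test`, just show that each parser type gets
	// its own listener instance.
	for _, parserType := range []string{"internal", "upstream"} {
		l, err := newListener(parserType)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(l.parserType, l.started)
	}
}
```

With subtests in place, a single parser can be exercised on its own, for example with `go test -run 'TestWriteWithEachParser/upstream'`.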

Contributor Author

Done. I did not do this initially because the expected inputs and outputs are the same and the parser is a run-time check, so it did not seem to make sense. We both agree that it is a best practice; I am not entirely sure it is a perfect fit here, but I made the change anyway.

Contributor

Looks good with the Run func.

Could we define the testCases slice in one place, maybe global, so it's not repeated in each function?

@powersj powersj marked this pull request as ready for review March 7, 2022 17:20
@powersj powersj added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Mar 8, 2022
Comment on lines 19 to 23
## Influx parser type to use. Users can choose between 'internal' and
## 'upstream'. The internal parser is what Telegraf has historically used.
## While the upstream parser involved a large re-write to make it more
## memory efficient and performant.
## influx_parser_version = "internal"
Contributor

Let's use the same shorter text here too.

* update parser readme to be inline with listeners
* global EOF error check
* consolidate the test cases for both listener tests
@telegraf-tiger
Contributor

telegraf-tiger bot commented Mar 8, 2022

Contributor

@reimda reimda left a comment

Looks good to me!

@reimda reimda changed the title feat: new influx line protocol via feature flag feat(parsers.influx): New influx line protocol via feature flag Mar 10, 2022
@reimda reimda merged commit 40ed7fb into influxdata:master Mar 10, 2022
@sjwang90
Contributor

@oplehto Do you by chance have before/after performance numbers of using the new parser compared to the old one?

Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Development

Successfully merging this pull request may close these issues.

Support the new Influx Line Protocol Parser
4 participants