Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird behavior of json_v2 parser with multiple input lines and lots of decimal places #13805

Closed
agiilit-admin opened this issue Aug 21, 2023 · 4 comments · Fixed by #14344
Closed
Labels
area/json json and json_v2 parser/serialiser related bug unexpected problem or unintended behavior plugin/parser 1. Request for new parser plugins 2. Issues/PRs that are related to parser plugins

Comments

@agiilit-admin
Copy link
Contributor

Relevant telegraf.conf

[[inputs.tail]]
  files = ["./test-input.txt"]
  from_beginning = true
  data_format = "json_v2"

  [[inputs.tail.json_v2]]
    [[inputs.tail.json_v2.object]]
      path = "{fields}"

    [[inputs.tail.json_v2.object.field]]
      path = "fields.kwh_import"
      rename = "o1"

    [[inputs.tail.json_v2.object.field]]
      path = "fields.kwh_export"
      rename = "o2"

    [[inputs.tail.json_v2.object.field]]
      path = "fields.freq"
      rename = "o3"


[[outputs.file]]
  files = ["./test-output.txt"]

Logs from Telegraf

$ telegraf --debug --config test-config.conf
2023-08-21T16:47:49Z I! Loading config: test-config.conf
2023-08-21T16:47:49Z I! Starting Telegraf 1.27.3
2023-08-21T16:47:49Z I! Available plugins: 237 inputs, 9 aggregators, 28 processors, 23 parsers, 59 outputs, 4 secret-stores
2023-08-21T16:47:49Z I! Loaded inputs: tail
2023-08-21T16:47:49Z I! Loaded aggregators:
2023-08-21T16:47:49Z I! Loaded processors:
2023-08-21T16:47:49Z I! Loaded secretstores:
2023-08-21T16:47:49Z I! Loaded outputs: file
2023-08-21T16:47:49Z I! Tags enabled: host=ai-2
2023-08-21T16:47:49Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ai-2", Flush Interval:10s
2023-08-21T16:47:49Z D! [agent] Initializing plugins
2023-08-21T16:47:49Z D! [agent] Connecting outputs
2023-08-21T16:47:49Z D! [agent] Attempting connection to [outputs.file]
2023-08-21T16:47:49Z D! [agent] Successfully connected to outputs.file
2023-08-21T16:47:49Z D! [agent] Starting service inputs
2023-08-21T16:47:49Z D! [inputs.tail] Tail added for "test-input.txt"
2023-08-21T16:47:59Z D! [outputs.file] Wrote batch of 3 metrics in 89.699µs
2023-08-21T16:47:59Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-08-21T16:48:05Z D! [agent] Stopping service inputs
2023-08-21T16:48:05Z D! [inputs.tail] Tail removed for "test-input.txt"
2023-08-21T16:48:05Z D! [agent] Input channel closed
2023-08-21T16:48:05Z I! [agent] Hang on, flushing any cached metrics before shutdown
2023-08-21T16:48:05Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2023-08-21T16:48:05Z I! [agent] Stopping running outputs
2023-08-21T16:48:05Z D! [agent] Stopped Successfully

System info

Telegraf 1.27.3, Debian 12

Docker

No response

Steps to reproduce

  1. Prepare input file ./test-input.txt as follows:
{"fields":{"freq":50.02000045776367,"kwh_export":0,"kwh_import":568.5}}
{"fields":{"freq":50.0099983215332,"kwh_export":0,"kwh_import":568.5}}
{"fields":{"freq":50,"kwh_export":0,"kwh_import":568.5}}
  1. Run telegraf --debug --config test-config.conf
  2. Observe ./test-output.txt

Expected behavior

Each line of output should contain fields o1, o2, o3:

tail,host=ai-2,path=test-input.txt o3=50.02000045776367,o2=0,o1=568.5 1692636469843257713
tail,host=ai-2,path=test-input.txt o3=50.0099983215332,o2=0,o1=568.5 1692636469843312200
tail,host=ai-2,path=test-input.txt o3=50,o2=0,o1=568.5 1692636469843327461

Actual behavior

Last line of output is missing field o1:

tail,host=ai-2,path=test-input.txt o3=50.02000045776367,o2=0,o1=568.5 1692636469843257713
tail,host=ai-2,path=test-input.txt o3=50.0099983215332,o2=0,o1=568.5 1692636469843312200
tail,host=ai-2,path=test-input.txt o3=50,o2=0 1692636469843327461

Additional info

Number of decimals in freq field seems to have significance. If input is changed to:

{"fields":{"freq":50.02,"kwh_export":0,"kwh_import":568.5}}
{"fields":{"freq":50.01,"kwh_export":0,"kwh_import":568.5}}
{"fields":{"freq":50,"kwh_export":0,"kwh_import":568.5}}

The output is as expected.

Also, there need to be multiple lines before "freq":50 for the anomaly to occur.

Another workaround is to increase decimals for the last freq:

{"fields":{"freq":50.02000045776367,"kwh_export":0,"kwh_import":568.5}}
{"fields":{"freq":50.0099983215332,"kwh_export":0,"kwh_import":568.5}}
{"fields":{"freq":50.00,"kwh_export":0,"kwh_import":568.5}}

..producing:

tail,host=ai-2,path=test-input.txt o3=50.02000045776367,o2=0,o1=568.5 1692638034737406013
tail,host=ai-2,path=test-input.txt o3=50.0099983215332,o2=0,o1=568.5 1692638034737459133
tail,host=ai-2,path=test-input.txt o3=50,o2=0,o1=568.5 1692638034737474051

The anomaly was first observed with AMQP, but it seems to be reproducible with the tail input plugin.

@agiilit-admin agiilit-admin added the bug unexpected problem or unintended behavior label Aug 21, 2023
@powersj
Copy link
Contributor

powersj commented Aug 21, 2023

The anomaly was first observed with AMQP, but it seems to be reproducible with the tail input plugin.

This is not an issue with any input plugin, rather with the JSON v2 parser itself. The first version of telegraf I was able to reproduce with was v1.21.0, which introduce the object field/tags in #9449.

Adding some debugging it appears that the parser is currently adding the kwh_export/o2 value twice. First, correctly as 0 and again as the kwh_import/o1 568.5 value. The parent node index that is used is somehow wrong. That results in the wrong rename getting used and the o2 value getting updated again and no o1 value.

I believe this is all related to the indexing and lookups that happen, somewhere long the way they get off because of the previous values. Specifically, the call to p.existsInpathResults(parentIndex) looks up the wrong index and returns o2 value. Those index values come from combineObject() here

@powersj powersj added plugin/parser 1. Request for new parser plugins 2. Issues/PRs that are related to parser plugins area/json json and json_v2 parser/serialiser related labels Aug 21, 2023
@agiilit-admin
Copy link
Contributor Author

Related: there have been incidents where simultaneously with errors in fields also tag values exctracted from the JSON have ended up with wrong tag names, but I am unable to repeat the behavior at the moment. The config involved both object field and object tag. Btw, there were two name objects in the JSON document, one at the top level and one as tags.name.

That problem (with tags) went away when replacing

[[inputs.tail.json_v2.object]]
  path = "@this"

with

[[inputs.tail.json_v2.object]]
  path = "{fields,tags}"

@agiilit-admin
Copy link
Contributor Author

If an extra "name":"foo" is added to each JSON document:

{"fields":{"freq":50.02000045776367,"kwh_export":0,"kwh_import":568.5},"name":"foo"}
{"fields":{"freq":50.0099983215332,"kwh_export":0,"kwh_import":568.5},"name":"foo"}
{"fields":{"freq":50,"kwh_export":0,"kwh_import":568.5},"name":"foo"}

and included with json_v2.object path:

[[inputs.tail.json_v2.object]]
  path = "{fields,name}"

it ends up as kwh_import/o1 value in last line of output:

tail,host=ai-2,path=test-input.txt o3=50.02000045776367,o2=0,o1=568.5 1692690843010670721
tail,host=ai-2,path=test-input.txt o3=50.0099983215332,o2=0,o1=568.5 1692690843010756312
tail,host=ai-2,path=test-input.txt o3=50,o2=0,o1="foo" 1692690843010788352

@agiilit-admin
Copy link
Contributor Author

The problem here seems to be that parse results accumulate in p.subPathResults over multiple calls of parseCriticalPath(). When length of JSON elements vary a lot (as in the provided examples), probability of erroneous index match increases. Sooner or later existsInpathResults() starts misbehaving, and produces the results above.

Setting p.subPathResults = nil in parseCriticalPath() makes this particular issue go away. Having no experience in telegraf internals (or go, for that matter), I have no suggestions on the correct way to fix it.

Could not replicate above behaviour with json_v2 tests with multiple inputs. Have seen it on amqp_consumer, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/json json and json_v2 parser/serialiser related bug unexpected problem or unintended behavior plugin/parser 1. Request for new parser plugins 2. Issues/PRs that are related to parser plugins
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants