-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parse_ints
config in json parser to support parsing int or float properly
#33699
Conversation
} | ||
|
||
type JsoniterConfig struct { | ||
IndentionStep int `mapstructure:"indention_step"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to expose all these options to the users?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, my second thought on this is at least at this moment we should only expose options we needed, to not bind to the json iterator so tight.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently json_parser
uses jsoniter.ConfigFastest
which has 2 more options enabled
var ConfigFastest = [Config](https://pkg.go.dev/github.com/json-iterator/go#Config){
EscapeHTML: [false](https://pkg.go.dev/builtin#false),
MarshalFloatWith6Digits: [true](https://pkg.go.dev/builtin#true),
ObjectFieldMustBeSimpleString: [true](https://pkg.go.dev/builtin#true),
}.Froze()
to add the use_number
config, it looks like we have to create a new jsoniter.Config
object probably to have the same config so users are not surprised, I think it makes sense to allow user to be able to turn off MarshalFloatWith6Digits
as well. any thoughts or comments on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree we should only expose what is needed, and we should consider that perhaps some day we need to switch json parsing libraries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wrong about MarshalFloatWith6Digits
, it doesn't affect unmarshal, only marshal, which is not related to the parser. so for now I only added config for UseNumber
.
@@ -36,5 +38,47 @@ func (p *Parser) parse(value any) (any, error) { | |||
default: | |||
return nil, fmt.Errorf("type %T cannot be parsed as JSON", value) | |||
} | |||
|
|||
if p.useNumber { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing the conversion explicitly here makes me wonder what is the actual reason for defining the UseNumber
setting in the json
object's config at first place 🤔 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure either. Looking at the encoding/json it looks like not as these many options as json iterator, which makes more sense to me to only expose use_number
option for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what's the point of enabling the UseNumber
of the jsoniter.Config
? Is this required for some reason? If so I suggest we document this at https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33699/files#diff-158189e84f05b177451492225bc83c2c23fa140cffdd4c5f9bf7db6ada8edc3aR54.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment, please also check this test that when UseNumber is false, regardless the data is int or float, they will be parsed as float64
. https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/33699/files#diff-f811622a59595f3ded728a9213b5e08d0c5fbe91c4cfff23b25d037f84e1f953R142-R155
49d68f6
to
e3b3b43
Compare
use_number
config in json parser to support decode into int or float properly
@djaglowski @ChrsMark @crobert-1 could you please review again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that we are switching the json library already: #33785
I suggest we wait until the switch is resolved and see what the possible implementation would look like.
# Conflicts: # pkg/stanza/operator/parser/json/config.go # pkg/stanza/operator/parser/json/parser.go
updated code for the new json library, since the overhead of enabling
|
Thank's @newly12.
Would that make sense to mention this in the docs so as users know that enabling this option comes with a price? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a small documentation suggestion. Thanks.
| `if` | | An [expression](../types/expression.md) that, when set, will be evaluated to determine whether this operator should be used for the given entry. This allows you to do easy conditional parsing without branching logic with routers. | | ||
| `timestamp` | `nil` | An optional [timestamp](../types/timestamp.md) block which will parse a timestamp field before passing the entry to the output operator. | | ||
| `severity` | `nil` | An optional [severity](../types/severity.md) block which will parse a severity field before passing the entry to the output operator. | | ||
| `use_number` | `false` | Numbers like `int` and `float` are parsed as `float64` by default, when `use_number` is enabled, numbers are parsed as `json.Number` and then converted to `int64` or `float64` based on the value. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `use_number` | `false` | Numbers like `int` and `float` are parsed as `float64` by default, when `use_number` is enabled, numbers are parsed as `json.Number` and then converted to `int64` or `float64` based on the value. | | |
| `use_number` | `false` | Numbers like `int` and `float` are parsed as `float64` by default. When `use_number` is enabled, numbers are parsed as `json.Number` and then converted to `int64` or `float64` based on the value. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated.
Done, PTAL. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thank's! Not sure if we could avoid this performance overhead, but since that's behind the specific setting I think we can land it and evaluate its adoption.
@djaglowski please take a look as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, with one open question:
Should we use a different name for the parameter than use_number
? To me the name seems to communicate an implementation detail, rather than the effect that it has. I would think something like parse_ints
might be more informative. That said, I think the documentation is very clear so I don't think this is required.
I checked fastjson code a little bit but looks like not much can be done apart from current approach, given limited options/interface exposed by the library. |
make sense to me, it is more informative, I've changed the config name to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just make sure to update the PR's title as well ;)
use_number
config in json parser to support decode into int or float properlyparse_ints
config in json parser to support parsing int or float properly
Thanks @newly12 |
Description:
expose json iterator config in json parser
Link to tracking Issue:
Fixes #33696
Testing:
added.
Documentation:
updated