Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Dots in Key names with underscores #79

Merged
merged 6 commits into from
Oct 19, 2020

Conversation

zhonghui12
Copy link
Contributor

*Issue aws/amazon-kinesis-firehose-for-fluent-bit#46

Description of changes:
Replace Dots in Key names with underscores

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Comment on lines 212 to 222
record := map[interface{}]interface{}{
"testkey": []byte("test value"),
"kubernetes": map[interface{}]interface{}{
"app": []byte("test app label"),
"app.kubernetes.io/name": []byte("test key with dots"),
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this test case because the kubernetes bit is a real world example.

Let's make it a bit more complicated though, just to be super safe. Add another top level key that has dots and make the value of that key be a map with some keys with dots too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Actually I have tested these cases with the dummy input by fluent bit. Will cover them in our unit test.

README.md Outdated
@@ -24,6 +24,7 @@ If you think you’ve found a potential security issue, please do not post it in
* `experimental_concurrency_retries`: Specify a limit to the number of retries concurrent goroutines will attempt. By default `4` retries will be attempted before records are dropped.
* `aggregation`: Setting `aggregation` to `true` will enable KPL aggregation of records sent to Kinesis. This feature isn't compatible with the `partition_key` feature. See the KPL aggregation section below for more details.
* `compression`: Setting `compression` to `zlib` will enable zlib compression of each record. By default this feature is disabled and records are not compressed.
* `replace_dots`: If you set replace_dots as true, all dots in key names will be replaced with underscores.
Copy link
Contributor

@PettitWesley PettitWesley Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I just thought of something... this feature is too limited. You can just take dots and replace them with underscores. But what if you don't want underscores? What if you have a key like example.key and a key example_key- with this feature when the first one has its dot replaced it will then conflict with the second key.

So instead, let's have the value of the replace_dots be a string which will be what you replace the dots with.

So for example you can do:

[OUTPUT]
    Name            kinesis
    Match           *
    region          us-west-2
    stream          my-kinesis-stream-name
    aggregation     true
    append_newline  true
    replace_dots    -

This would replace dots with a -. Basically let the user decide what is the replacement. If they want underscores, they can specify it.

In the code, if replace_dots is empty then that means the feature is disabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. Will implement the option like this.

README.md Outdated
@@ -24,6 +24,7 @@ If you think you’ve found a potential security issue, please do not post it in
* `experimental_concurrency_retries`: Specify a limit to the number of retries concurrent goroutines will attempt. By default `4` retries will be attempted before records are dropped.
* `aggregation`: Setting `aggregation` to `true` will enable KPL aggregation of records sent to Kinesis. This feature isn't compatible with the `partition_key` feature. See the KPL aggregation section below for more details.
* `compression`: Setting `compression` to `zlib` will enable zlib compression of each record. By default this feature is disabled and records are not compressed.
* `replace_dots`: A string which will be what you replace the dots in the key name with. By default, it will be empty which means the feature is disabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, change it to:

Replace dot characters in key names with the value of this option. For example, if you add replace_dots _ in your config then all occurrences of . will be replaced with an underscore. By default, dots will not be replaced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rephrasing it

Comment on lines 110 to 112
if replaceDots == "" {
logrus.Infof("[kinesis %d] No replacement provided. Do not need to replace dots with other symbols.", pluginID)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already print the value of this config option in the plugin initialization, remove this info statement, its not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Will remove this statement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants