-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for include_fields and drop_fields #1120
Conversation
b783763
to
628b636
Compare
3d604a5
to
c7f76ec
Compare
c7f76ec
to
4864c98
Compare
4864c98
to
ce66ac7
Compare
|
||
} | ||
|
||
// deepEqual compars two MapStr assuming they only have primitive types as values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/compars/compares/
8d8c425
to
f8a5bfa
Compare
Instead of ignoring it, can it just fail? That way if the list ever grows in the future, then a user that upgrades won't be surprised to find their rules no longer apply as they did before. |
d80287c
to
f00812f
Compare
@pickypg The list of mandatory fields is defined in the code, so the user can configure only the extra fields that he wants to include or drop. I agree with you that the sentence is confusing, so I updated the description. Thanks for noticing. |
@@ -84,6 +85,7 @@ type BeatConfig struct { | |||
Output map[string]*ucfg.Config | |||
Logging logp.Logging | |||
Shipper publisher.ShipperConfig | |||
Filter []filter.FilterConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we don't introduce a new top level config here as we are in the process to get rid of it. This leads to problems when we want for example run multiple beats as one binary. We should only have one common name space which is beat.
Alternatively this could also be added to every single beat config to have it "local". Means it would be filebeat.filter
or packetbeat.filer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also do this change in a separate PR once it's merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we merge it like that we probably also going to keep it that way as also tests below and config changes for topbeat were already made for it. I suggest to currently add it under shipper
which should be beat
, same as we have the fields etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think shipper
or beat
(to be honest I think beat is too generic, maybe host?) should be the section that includes only the configuration options for that host like geoip configuration, max_queue, tags and shouldn't include fields
and filter
as these are not a configuration of the host and more of an enhancement of the output data.
I suggest to leave it like this for now and change it in a separate PR if we decide t o structure differently the configuration file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#1165 will allow to place it under beat.filter as we do we fields and tags.
cf8d908
to
c022ec2
Compare
assert "http.response_headers" in objs[0] | ||
|
||
# generic filtering fails to delete the htp.response_headers.transfer-encoding | ||
assert "transfer-encoding" not in objs[0]["http.response_headers"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the comment above this statement. Does it really fail or it's more like "check if filtering fails to delete.."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, exactly
I think mandatory should really only be |
@tsg Agree with that these should be the required libbeat fields. We could add a possibility in the future that beats can also reserve some fields as required. |
LGTM apart of the comments above. Nice work! Longer term we should work towards an Event type that's not |
769408a
to
bb0fd33
Compare
11de95e
to
d9358f0
Compare
All the comments were addressed. It's ready for review. |
I changed the action from |
@elastic/beats Any other thoughts on the naming of |
I renamed drop_all_fields_except with include_fields. Now it's ready for the final review. |
6b0e39c
to
39c6d0c
Compare
Rebased to master, it should be green soon. |
39c6d0c
to
0f78a7d
Compare
d747b3e
to
2d78255
Compare
Finally all green now |
2d78255
to
1bea4fb
Compare
include_fields - removes all the fields from the event except the configured fields drop_fields - removes only the configured fields Both actions receive fields as argument that can be of type map or a value. For example: "proc", "proc.cpu.total_p". In case a map is passed as an argument, all the fields inside the map are considered. Each event MUST contain the fields ["@timestamp","type"] and they cannot be removed from the event. When multiple filtering rules are defined, we start with a copy of the event and apply the filtering rules one by to the event. event -> filter rule 1 -> event 1 -> filter rule 2 -> event 2 ... where event is the initial event, event1 is the event resulted after applying "filter rule1" and it's considered input for the "filter rule 2" and so on. Problem encountered: The generic filtering expects to receive an event with known types, meaning that it contains MapStr and other primitive types and not structures. In case the event contains an unknown structure for libbeat (it's defined inside the Beat), then we do a marshal & unmarshal to conver it to a MapStr. Configuration: If include_fields=None or not defined, then all the fields are exported If include_fields=[], then only the mandatory fields are exported If drop_fields=[], then all the fields are exported The default values are: include_fields=None drop_fields=[]
1bea4fb
to
09ca0f4
Compare
Add support for include_fields and drop_fields
This PR adds two configuration options:
include_fields
anddrop_fields
.There is a list of mandatory fields like:
@timestamp
,type
that cannot be removed and they are always available in the exported fields. With the two configuration options you can configure the additional fields that will be exported or not exported. By default, all fields are exported.Here are the options:
include_fields=[]
-> exports only the mandatory fieldsinclude_fields is not defined
(commented) -> export all fields (the default option)drop_fields=[]
-> export all fields (the default option)As fields, they accept a map like
proc
or a value likeproc.cpu.total_p
.In case there is a field to include that is not available in the event, then just ignore the field.
When multiple filtering rules are defined, we start with a copy of the event and apply the filtering rules one by to the
event.
where event is the initial event, event1 is the event resulted after applying "filter rule1" and it's considered input for the "filter rule 2" and so on.