-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Docker module cleanup #2665
Docker module cleanup #2665
Conversation
ruflin
commented
Oct 3, 2016
•
edited
Loading
edited
- Make default period 10s, same as for all other beats
- Remove unnecessary code comments
- Make convertContainerPort local to container MetricSet
- Only add labels to container MetricSet if not empty
- Apply some code conventions
- Convert local @timestamp to timestamp to prevent confusion
- Replace rx / tx with in / out in network metricset to be consistent with system network metricset
- Add interface name as field instead of having it as a dynamic element. This makes it easier to filter.
- If not containers are running, just no events are returned
- Add data.json for all metricsets
- Introduce ToMapStr for Container data
- Move socket to Container MapStr. Is Socket info really needed?
- Remove unnecessary nesting levels as namespace already given by metricset
- Have one docker client per metricset instead of having one global client
@douaejeouit It would be nice if you could have a look at this PR and especially have a look at the // TODO I added. Would be good to get your thoughts on these. |
9a4abde
to
4cdcb27
Compare
|
||
event := common.MapStr{ | ||
"@timestamp": time.Now(), | ||
"timestamp": time.Now(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand this, does it mean we'll have both @timestamp
and timestamp
? Seems unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsg Agree. Before one was overwriting the other (or at least it tried and didn't as it was not common.Time). With the above implementation each event can have its separate timestamp instead of one for all events of this batch. But as the heavy work happens when the info from the docker api is fetched and the timestamp is generated afterwards, I suggest to just remove it.
Perhaps @douaejeouit can share some insights here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tsg The @ timestamp here shouldn't be time.now() ! The idea is to get the time generate by Docker daemon while sending the events requested ( here the containers event), this is what I did for all the other metricsets.
I will be on it tomorrow morning! There is some smalll changes to do, otherwise, we can simply keep it as it is for the containers event! What do you think guys ? @ruflin @tsg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@douaejeouit I saw this part, but we are talking here about a milli second difference in time. I don't think it is worth to do this and have the logic because of this different from all other modules / metricsets where the timestamp represents when the event fetching started. The nice part of this is that timestamp + rrt
is when the even creation ended. With the above implementation it is not 100% clear to me what the timestamp will exactly represent.
@douaejeouit Perhaps you can share some insights on what the timestamp exactly is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin Ok!
But .. what do you mean by having one dockerclient per metricset instead of having one global client? For me the docker client instance should be created only once. Could you please clarify this point?
Concerning timestamp, as I said before, it's overwritten and it represents the time when the stats were read from the deamon. But since it's only a matter of a milli seconds difference in time, I agree with you! We can change the logic 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous implementation didn't allow to have two different docker host machines as only one global client was used. In general MetricSets are designed to no share information with each other. So in case one MetricSet has some issues, it doesn't affect the others. This should also be true for the client connection, means every MetricSet should have its own instance of the client.
The downside of this is when using all docker module MetricSets, 5 connections are open to the Docker host instead of one. Currently I would say this is acceptable, but in case this becomes a problem we will tackle it. There are options like making the Module more powerful so multiple MetricSets in one Module, meaning if in the config multiple metricsets are defined in a module, this module could also be used to share some information. This will potentially require some refactoring as it exceeds the current capability of the interface methods of the module interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok,
Indeed, I made the choice to have one client for failing to 5 simultaneous connections if all the metricsets are used.. anyway, it's Ok as it is not problematic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before it was not only 1 client for all 5 metricsets, but one client for all docker modules (one docker module can be used multiple times in the config) which broke the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right .. I can see now! Thanks
fbb350a
to
896d480
Compare
LGTM. |
* Make default period 10s, same as for all other beats * Remove unnecessary code comments * Make convertContainerPort local to container MetricSet * Only add labels to container MetricSet if not empty * Apply some code conventions * Convert local @timestamp to timestamp to prevent confusion * Replace rx / tx with in / out in network metricset to be consistent with system network metricset * Add interface name as field instead of having it as a dynamic element. This makes it easier to filter. * If not containers are running, just no events are returned * Add data.json for all metricsets * Introduce ToMapStr for Container data * Move socket to Container MapStr. Is Socket info really needed? * Remove unnecessary nesting levels as namespace already given by metricset * Have one docker client per metricset instead of having one global client
896d480
to
83e7c61
Compare
@douaejeouit We merged this PR, but lets continue this discussion about the timestamp. Happy to readd it if needed. Please also have a look at the PR to check if I changed some code that should not be changed. |
@@ -73,6 +74,8 @@ func TestCPUService_UsageInUsermode(t *testing.T) { | |||
// THEN | |||
assert.Equal(t, float64(0.50), result) | |||
} | |||
|
|||
/* TODO: uncomment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin Should I comment it? This function needs to be updated in order to match with last changes you made. I'll do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please. Thanks
"id": c.Id, | ||
"name": c.Name, | ||
// TODO: Is this really needed | ||
"socket": GetSocket(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an extra information. We can ignore it if you judge it isn't needed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to understand the use case behind it. It see that people will search for id and name, but will the socket information be of any use? I think this is useful in the info
metricset, but it is required in the other metricsets. I'm hesitant to remove it as I'm sure there was a good reason to add it which I'm just not aware of yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main objectif of having it as a field is to have an clear idea about the conf and to check it as well. Espacially in case we have two docker modules running and each one listen on different socket. Since we can bind docker to another host or port using a tcp connection, it's always good to know if it listen on TCP or a Unix socket or both, that's why it's present n all the metricsets!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking in case a socket is used, if we should copy over the socket information to metricset.host
info instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, it sounds good for me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a task to the list here: #2629
i := 0 | ||
var keys []string | ||
for key := range labels { | ||
keys = append(keys, key) | ||
} | ||
sort.Strings(keys) | ||
for _, k := range keys { | ||
// Replace all . in the labels by _ | ||
// TODO: WHY? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was the naming used in dockbeat Project, where it has been decided to format the labels output . I didn't want it to be different, that's why I left it at it is. If it's not necessery. We can leave it as it's generated by docker's API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was first thinking . were replaced by _ because in ES <2.4 . can't be used in field names. But it seems to be the names where it is replaced? Perhaps there are some disadvantages in Kibana with the . ? Would be interesting to get some more background info on this decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it was due to some ES & Kibana issues in some previous version. Apparently, Kibana couldn't distinguish between fields whose names contains a . and those who are nested under a different objects! So as a temporary solution, we replaced the feild names containing ( . ) by ( _ ) to avoid any potential problem. By the way, I'm wondering if the bug was fixed. If so, we can ignore this function and keep our field names without any modification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't aware, that the above was and issue for values. I would expect it to be an issue for keys, but also values? Were the values defined as text or keywords? That could have been the issue, that if it was not defined as keywors that the tokenizer was applied? But this can be solved by applying the right mapping.
We should test the behaviour here, as in general I think it would be nice to keep the name of the labels the same as the user set it, as this is what he will probably look for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not mistaken, it concerns only keys !
+1 for not changing the name of labels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a task to the list here: #2629
Guys, with my previous contribution #3024 about labels and dots, is this still necessary? |
@rikatz There were quite a lot of changes and discussions in this PR. Which part are you referring to? |
@rikatz I checked the checkboxes in the issue as done. |