New field "time" #77

Closed
vbohata opened this issue Aug 10, 2018 · 20 comments

@vbohata

vbohata commented Aug 10, 2018

For at least IIS, Apache2, and IIS DHCP logs I need a time field. event.duration alone is not enough because it does not describe what took the time.

I propose to create "time" (or some other named) fields like this:

  • time.taken_seconds
  • time.taken_milliseconds
  • time.duration_seconds
  • time.duration_milliseconds

So for IIS and Apache2 logs I could use http.response.time.taken_milliseconds, for DHCP I could use dhcp.lease.time.duration_seconds, ...

@ruflin
Contributor

ruflin commented Aug 13, 2018

Thanks for opening this request. As usual I'm a bit hesitant to add metrics which are not very generic to ECS as there are just too many different metrics out there.

Can you describe in a bit more detail what taken and duration are in the above?

I'm hoping we would not need seconds and milliseconds but only unit: elastic/elasticsearch#31244

@vbohata
Author

vbohata commented Aug 13, 2018

Taken would describe the number of time units the action took to process. Duration is more generic, as in a lease duration: for how long a lease has been assigned.

If a single unit is used, this general "time" field will probably not be needed. From the linked issue I could not tell whether there will be a new duration field type in Elasticsearch (as mentioned in this issue) or just an ingest filter that converts everything to, for example, ms.

@webmat
Contributor

webmat commented Aug 14, 2018

We should not add multiple fields for different scales of a numeric value (millis vs sec). Rather, we should store the most precise information we have in a numeric field, then have our application reading from it convert the value to whatever makes most sense (human readable, seconds, hours) as needed.

For example, in Kibana index patterns, you can specify a field formatter that would specify "these values are milliseconds, and I want to display them human readable". If you have another custom application reading from ES, it should also incorporate some knowledge of how to interpret the data.

Here's an example of how Kibana deals with this

Your issue brings up a good point, however. For a general schema like ECS, we can't know for a given duration field, what kind of resolution the source will have. If we arbitrarily decide that "all duration fields" should be stored as milliseconds, we will force systems with finer resolutions to discard the trailing information, whether it's micro or nano seconds, which is not great.
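
As a rough illustration of the "store the most precise value, convert when reading" idea above, here is a minimal Python sketch. It assumes durations are stored as integer nanoseconds; the helper name `humanize_ns` and the unit labels are hypothetical and not part of ECS or Kibana.

```python
# Assumption: the stored value is an integer count of nanoseconds.
# The unit table and helper name are illustrative only.
UNITS = [
    ("h", 3_600_000_000_000),
    ("min", 60_000_000_000),
    ("s", 1_000_000_000),
    ("ms", 1_000_000),
    ("us", 1_000),
    ("ns", 1),
]


def humanize_ns(duration_ns: int) -> str:
    """Render a nanosecond count in the largest unit that keeps the value >= 1."""
    for label, factor in UNITS:
        if abs(duration_ns) >= factor:
            # Three significant digits is an arbitrary display choice.
            return f"{duration_ns / factor:.3g} {label}"
    return "0 ns"


print(humanize_ns(1_500_000))      # a 1.5 millisecond duration
print(humanize_ns(3_000_000_000))  # a 3 second duration
```

The stored value never changes; only the reader decides how to display it, which is the same division of labour a Kibana field formatter provides.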

@ruflin
Contributor

ruflin commented Aug 15, 2018

For now we should define the unit that "duration" should be. Long term I'm hoping for the duration type so it can be ingested in different units.

@vbohata
Author

vbohata commented Aug 19, 2018

My duration ranges for different data sources are very large. For example, for http response time it is OK to be in milliseconds. For TTL values in DNS etc., the long data type is not enough for some kind of unified value in micro/nanoseconds.
Maybe it would be worth trying something like host.timezone.offset.sec (already in ECS).
So there would be http.request.response_time.ms, dns.ttl.sec, extreme_precision_field.ns, ... Kibana could handle this automatically ...

@ruflin
Contributor

ruflin commented Aug 20, 2018

What are the largest values you expect and in which data source (as an example)?

@vbohata
Author

vbohata commented Aug 20, 2018

My mistake, it seems to fit even for the maximum possible TTL in DNS records converted from seconds to nanoseconds. So is everything supposed to be stored in ns? Most logged time values are limited to a 32-bit number, so a 64-bit long is a waste of space.
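
The range check mentioned above can be reproduced with a few lines of Python. This sketch assumes the DNS TTL upper bound of 2^31 - 1 seconds (as specified in RFC 2181) and a signed 64-bit `long`; the variable names are illustrative.

```python
# Assumptions: DNS TTL is capped at 2**31 - 1 seconds (RFC 2181),
# and the target field is a signed 64-bit long.
MAX_DNS_TTL_S = 2**31 - 1
NS_PER_S = 10**9
LONG_MAX = 2**63 - 1

# Largest possible TTL expressed in nanoseconds.
max_ttl_ns = MAX_DNS_TTL_S * NS_PER_S
fits_in_long = max_ttl_ns <= LONG_MAX

# Whole years (365-day) that a long can hold when counting nanoseconds.
max_years = LONG_MAX // NS_PER_S // 86_400 // 365

print(max_ttl_ns, fits_in_long, max_years)
```

So the largest TTL in nanoseconds still fits in a long, and a long of nanoseconds covers durations of roughly 292 years.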

@vbohata
Author

vbohata commented Aug 20, 2018

Btw, what is the current usage of event.duration? Could I use it for "execution time" strings extracted from messages, or is it dedicated to the time taken to generate the event (and how is that measured)?

@ruflin
Contributor

ruflin commented Aug 21, 2018

In ES it does not make much of a difference whether you use integer or long, as long as the values are small.

I think event.duration can be used for various use cases. And you describe one above.

@vbohata
Author

vbohata commented Aug 24, 2018

I have now found some cases where I also need to store durations as a keyword (via a multi-field). I need it for splitting in ML for DNS TTL, DHCP lease time, ... values which are usually 1800, 3600, ... seconds. Here it does not make sense to store them in nanoseconds. What about using a suffix .sec, .msec, .usec, .nsec for each time/duration value (user defined, for the precision the user needs), and once ES is able to handle time durations, just copying the field value to some general .period or .interval suffix?

So for now for example:

  • event.duration.nsec
  • dhcp.lease_duration.sec

After interval support in ES copy it (for backward compatibility):

  • event.duration.nsec -> event.duration.period
  • dhcp.lease_duration.sec -> dhcp.lease_duration.period

Pros: no need to set a format for each time value in Kibana, easy for users to understand, easy to write in the Kibana search bar, better visibility during searching (seeing or typing 3 seconds in seconds is much easier and less typo-prone than the same value in nanoseconds) ...

@ruflin
Contributor

ruflin commented Aug 24, 2018

I worry a lot about field explosion in ECS here. ECS should have 1 unit until it's supported differently by Elasticsearch.

@vbohata
Author

vbohata commented Aug 24, 2018

I understand the reason, but this makes filtering these values in Kibana much worse. I just tested it: values can be nicely translated to human units automatically in Discover or in visualisations. Timelion does not support it, so I had to add a division manually there. Also, for each filter in Discover I have to enter the original value, so nanoseconds. For "clicked" filters Kibana shows the value as wanted, but editing is still in nanoseconds, so the filter is edited in nanoseconds while it displays the value as formatted in Kibana (very confusing for big numbers). In the Lucene search bar, users also have to enter original values and Kibana shows original values in that bar, which is logical but even more confusing: users will see the same value in different scales at the same time.

@vbohata
Author

vbohata commented Aug 24, 2018

For some fields it does not make sense to store them in anything other than seconds, like host.timezone.offset.sec, which is part of ECS. So maybe other fields which are always in seconds, like dhcp.lease_duration.sec (as an example, I don't know if it will be part of ECS), should have a ".sec" suffix. For other values like event.duration it makes sense to store them in nanoseconds, but it should be clear to users in which unit the field is stored (before applying Kibana display modifications). So what about adding .sec to all fields for which any other unit is useless, and using nanoseconds with a .nsec suffix for the other fields ... so event.duration.nsec?

@ruflin
Contributor

ruflin commented Aug 27, 2018

I actually have second thoughts about host.timezone.offset.sec and we should potentially rename it to host.timezone.offset. It would be part of the ECS docs to state which unit each field is in. And I have some ideas around how we could make Elasticsearch and Kibana aware of it, but it's not in there yet and ECS should also work with older versions.

Units are definitely something we still need to put more thought into.

@webmat webmat mentioned this issue Sep 18, 2018
26 tasks
@ruflin ruflin mentioned this issue Oct 31, 2018
22 tasks
@dagguh

dagguh commented May 30, 2019

ISO 8601 offers a standard syntax for "time" or rather "duration": https://en.wikipedia.org/wiki/ISO_8601#Durations
There's no need to pick a time unit, it's unit-agnostic. WRT precision, it allows you to easily truncate to the desired precision.
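
To make the ISO 8601 suggestion concrete, here is a minimal Python sketch that converts a duration string into integer nanoseconds. It is a simplified parser under stated assumptions, not a full ISO 8601 implementation: it accepts only weeks, days, hours, minutes, and (optionally fractional) seconds, and rejects years and months because their length is calendar-dependent. The function name is hypothetical.

```python
import re
from decimal import Decimal

# Simplified grammar: P[nW][nD][T[nH][nM][n(.n)S]]; years/months are not supported.
_DURATION_RE = re.compile(
    r"^P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def iso8601_duration_to_ns(text: str) -> int:
    """Convert a restricted ISO 8601 duration string to whole nanoseconds."""
    match = _DURATION_RE.match(text)
    # Reject non-matches and strings with no components ("P", "PT", "P1DT").
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {k: Decimal(v) for k, v in match.groupdict(default="0").items()}
    seconds = (
        parts["weeks"] * 604_800
        + parts["days"] * 86_400
        + parts["hours"] * 3_600
        + parts["minutes"] * 60
        + parts["seconds"]
    )
    # Anything finer than one nanosecond is truncated.
    return int(seconds * 1_000_000_000)


print(iso8601_duration_to_ns("PT1H"))    # a DHCP-style one hour lease
print(iso8601_duration_to_ns("PT0.5S"))  # a half-second response time
```

A string form like this is unit-agnostic on the wire, but range queries and aggregations would still need a numeric value such as the one returned here.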

@ghost

ghost commented Oct 22, 2019

+1 for an obvious place to put duration value for http requests.

@mabre69uk

Any update on this? Is it planned to move to GA soon?

@djptek
Contributor

djptek commented May 19, 2021

Hi @mabre69uk, http duration is always going to depend on the point between the request being made and the response being received at which the duration is measured: browser, cluster, application, server, etc. Nanosecond measurement (also mentioned above) will, as of today, likely be counterproductive, since even servers on the same network have internal offsets at that granularity, which will cause aggregations to fail because the buckets are no longer aligned.

The Issue requestor cited IIS, Apache2, IIS DHCP: Do you have a specific use case in mind?

@lukeelmers
Member

FWIW, in Kibana we are currently using a custom field for http response duration, and would be interested in using http.response.duration if it became an official part of the spec.

In our case the duration is a server-side calculation in milliseconds: the difference between the time a request is received by the http server and the time we've prepared & sent the response. We use this to understand how long it is taking our server to process specific requests.
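
A server-side measurement of this kind could be sketched as follows in Python. This is only an illustration, not Kibana's implementation: `timed_response` and the handler are hypothetical, and `http.response.duration` is the proposed field name from this thread rather than an official ECS field.

```python
import time


def timed_response(handler, request):
    """Call a request handler and report how long it took, in milliseconds."""
    # A monotonic clock is used so wall-clock adjustments cannot skew the result.
    started_ns = time.monotonic_ns()
    response = handler(request)
    elapsed_ms = (time.monotonic_ns() - started_ns) / 1_000_000
    # Hypothetical field name; the unit (ms) would have to be documented.
    event = {"http.response.duration": elapsed_ms}
    return response, event


# Usage with a stand-in handler that upper-cases the request body.
response, event = timed_response(lambda body: body.upper(), "ok")
print(response, event)
```

Measuring between "request received" and "response prepared" on the server gives processing time only; network transfer and client rendering are not included.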

@kgeller kgeller added the discuss label Jun 1, 2021
@kgeller
Contributor

kgeller commented Jun 2, 2021

I am seeing two main requests from this issue, both of which get described in more recent issues.

I am going to close this issue out in favor of continuing the conversation in those places. But of course, feel free to re-open if needed, or create new spin-off issues as well.

@kgeller kgeller closed this as completed Jun 2, 2021