-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP]beats @timestamp support nanoseconds #9818
Conversation
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
@ruflin I squash commits. But I don't know which method should i do, if you know, please tell me. Thanks. |
libbeat/outputs/elasticsearch/enc.go
Outdated
@@ -89,8 +89,8 @@ func (b *jsonEncoder) resetState() { | |||
|
|||
b.folder, err = gotype.NewIterator(visitor, | |||
gotype.Folders( | |||
codec.MakeTimestampEncoder(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change of |
@urso Thanks for you reply. Now Should i update these testing expected output? Or make In addition, i will add more tests in |
ParseTime returns a timestamp of type
As the encoding formatter is more similar to traditional date-time formatting packages as can be found in other languages (like Java) I think this use case can be better handled by extending the 'S' formatter. E.g. checking javas DateTimeFormat docs, it says 'n' is nano (similar to your implementation), but 'S' is a fraction. Playing with the How about changing the behavior of
If a timestamp parsed with ParseTime includes only milliseconds, then we will print the timestamp with millisecond accuracy when changing the default pattern for the formatter to use Following the trail of fractions, we could also introduce Dealing with time can be tricky at times :( As the switch to nanoseconds is kind of a breaking change we will have to check if the tests fail for a good reason or not and adapt them. |
@urso Let me confirm that i understanding is correct. If wrong, hope that you can point out. Two ways that implement timestamp support millis and nanoseconds, or microseconds:
If use If we want to auto omitting the trailing zeros, it means that beats store data to elasticsearch timestamp maybe different. Is it will cause other unknown question? |
From playing with
Neither I think we can make the semantics of I was thinking to also add support to
e.g. a timestamp with nanos:
Rethinking it (again) we can just make |
We update Like code below: case 'S': // fraction of second
b.nanoOfSecond(tokLen)
case 'n': // nano second
if len(tokLen) < 9 {
tokLen = 9
}
b.nanoOfSecond(tokLen)
case 'f': // fraction of second
b.fractionOfSecond(tokLen) I have a question, that we use default timestamp formatter in |
Oh I see you didn't push new code. Can you also push changes to the PR, so I have a chance to comment on the change? If unsure you can create a PR in your own repo against your development branch so we can discuss over there.
LGTM. maybe change
I think I'd prefer
I think there is no other code depending on the date format changes. The formatter is only used for printing, but I'd love to have parsing support as well, so we could allow users to configure custom timestamp formats. |
@urso I will push new code according to the discussion above soon.
I think this change maybe send a new PR to do it. |
Cool :) |
Hi, @urso I have a question when i implement
Now i use the second method to implement. If you have other method or idea, i hope you can tell me. Thanks. |
libbeat/common/dtfmt/elems.go
Outdated
@@ -50,6 +50,14 @@ type paddedNumber struct { | |||
signed bool | |||
} | |||
|
|||
type paddedBigNumber struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@urso, i add a new elements. It works, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this structure if we introduce (or extend) the opcodes to also stores a div
up to 32bit. paddedNumber
would need to choose the right opcode based on the size of div. The we wouldn't need to cast to float
or use math.Pow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wants to stores a div up to 32bit, i think we should change prog
structure, is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to change the structure. For example see opCopyLong (it uses a 16bit length counter). Putting a 32bit value into prog can easily be done via binary.LittleEndian.PutUint32
.
Checking all usages of paddedNumber we could actually change all uses of paddedNumber to report the exponent for 10^exp (like you did for divLen).
type paddedNumber struct {
ft fieldType
divExp int // <- replace div with exponent for computing 10^exp
minDigits, maxDigits int
signed bool
}
func (n paddedNumber) compile() (prog, error) {
if n.divExp == 0 {
...
}
return makeProg(opExtNumPadded, byte(n.ft), byte(n.divExp), byte(n.maxDigits))
}
And use a table instead of math.Pow.
This changes smeantics of opExtNumPadded, but I didn't find any place where we need the actual divisor, so that's fine:
var pow10Table [10]int
func init() {
x := 1
for i := range pow10Table {
pow10Table[i] = x
x *= 10
}
}
...
case opExtNumPadded:
ft, divExp, digits := fieldType(p.p[i]), int(p.p[i+1]), int(p.p[i+2])
div := pow10Table[divExp]
i += 3
v, err := getIntField(ft, ctx, t)
if err != nil {
return bytes, err
}
bytes = appendPadded(bytes, v/div, digits)
What's nice about this is that we don't need to add any new cases to prog. We can also remove opNumPadded now, because a divExp
value of 0 will divide by 1 anyways.
I think this later approach is more close to the implementation you provided, but also unifies all fraction handling in the package to use the pow10 based approach of yours (minus the float operations).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove div
and add divExp
, the paddednumber
only support 10^exp, and i'm not sure whether it affect the appendDecimalValue
function, so i add a paddedBigNumber
structure before.
I'm thinking change the compile()
function like below when div
too large, it use binary.PutVarint
save the int to byte and binary.Varint
to convert byte to int now. But i'm not sure this way whether correct.
My way maybe not good, what do you think?
func (n paddedNumber) compile() (prog, error) {
if n.div == 0 {
return makeProg(opNumPadded, byte(n.ft), byte(n.maxDigits))
}
buf := make([]byte, 4)
bvar := binary.PutVarint(buf, int64(div))
bufLen := len(buf[:n])
newBuf := make([]byte, 0, bufLen)
newBuf = append(newBuf, opExtNumPadded, byte(n.ft), byte(bufLen)))
newBuf = append(newBuf, bvar ...)
newBuf = append(newBuf, byte(n.maxDigits))
return makeProg(newBuf...)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we remove div and add divExp , the paddednumber only support 10^exp, and i'm not sure whether it affect the appendDecimalValue function, so i add a paddedBigNumber structure before.
The appendDecimalValue
method always sets div
to 0. The only call that sets the div
is appendExtDecimal
. All uses of appendExtDecimal
have a divisor that is dividable by 10. This is why I did propose this approach.
Your approach with PutVarint will also work. Using Varint will add some decoding overhead whenever we want to print the data. this is why I opted for using a 32bit value by using binary.LittleEndian.PutUint32
.
What's the status of this PR? Would be great to have this feature. |
@urso : are you aware of any other attempts to get nanoseconds in filebeat? because it sounds like this second attempt to get nanoseconds to (file)beats is lost in limbo again :) I was so excited to upgrade our ELK stack to 7.5.1 to then discover that filebeat does not support nanoseconds... (e.g: type:container aka docker logs) |
@ruflin: are you aware of any other attempts to get nanoseconds in filebeat? |
@TomGudman Thanks for the ping. AFAIK there wasn't more progress on this but @urso is the one that will know here. |
c.enableClock() | ||
c.enableMillis() | ||
case ftNanoOfSecond: | ||
c.enableNano() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ftMillisX should be removed from the enum if they are not used anymore.
- Rename beat.timezone to event.timezone. {pull}9458[9458] | ||
- Use _doc as document type. {pull}9056[9056]{pull}9573[9573] | ||
- Update to Golang 1.11.3. {pull}9560[9560] | ||
- Add @timestamp support nanoseconds. {pull}9818[9818] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the git 3-way-merge did mess with the Changelog :)
I did skim code for other potential merge failures, but it looks all good.
I squash commits and push it. I remove the code related about @urso You can review this pr, and if something missing or wrong, hope you point out. |
Jenkins, test this. |
The CI failures are definitely related to this change. There are some failing unit tests in libbeat + filebeat json parsing (includes a timestamp) also breaks:
|
Code for encoding looks good. Some outstanding tasks:
|
Timestamp parse use const time layout, https://github.com/elastic/beats/blob/master/libbeat/common/datetime.go#L59 . The tests use time.Parse(), should we support milliseconds or nanoseconds input when use Parse function? |
I'd say both formats. |
I want to change |
@urso I fix some fail tests, send a new commit about parse milliseconds and nanoseconds format. |
|
||
for i := 0; i < 3; i++ { | ||
trailZero += "000" | ||
tsLayout = fmt.Sprintf("%s.%sZ", TsLayout, trailZero) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt.Sprintf
is a rather expensive operation on the critical path. Given that we have only 3 patterns better create a table with supported layouts. This allows devs to even add more layout types in the future for probing.
I did restart some tests on travis, but there are still a many that are failing due to the timestamp changes. Even some unit tests in libbeat it seems. |
I use |
Right now your tests are run on travis (not sure you can see the list of checks at the bottom of this page). Following the travis link the last failed CI run is here: https://travis-ci.org/elastic/beats/builds/636201972?utm_medium=notification&utm_source=github_status In order to run a more complete set of tests use Filebeat system tests error:
libbeat logstash output integration test errors:
and
Metricbeat unit test failure:
And packetbeat system test failures:
|
OK, i will retry in my pc use |
@pytimer almost there! Good progress! |
@urso Sorry, i'm on the holiday until next month, i commit code to github will be interrupted due to bad network, so this work maybe i can't continue during this time, can you continue to do it? Thanks. |
Will do. Thanks for the contribution. |
@pytimer fiy; I merged this PR into a feature branch ( I think you have finished the hard part, now we have to adapt the tests. Thanks for your efforts. |
来自这个pr:elastic#9818 但未被合并进master
Is this issue related to that: https://discuss.elastic.co/t/filebeat-timestamp-processsor-doesnt-parse-microseconds-part/248591 ? |
* beats: add @timestamp support nanoseconds * datetime support milli and nano seconds format
This PR adds nanoseconds support to beats. Because this pr update libbeat dtfmt module, so it maybe influence all beats.
This PR relate to #7576.