-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory usage of http parser #6680
Conversation
A simple benchmark that processes a 10MB request.
It would not catch the case where the functionality to extract form-urlencoded parameters is broken.
Tests were missing for the following features: - include_body_for - send_request - send_response
return true, true | ||
} | ||
|
||
func (p *parser) shouldIncludeInBody(contenttype []byte) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
receiver name p should be consistent with previous receiver name parser for parser
f2d8fd5
to
4d7ffef
Compare
packetbeat/protos/http/http.go
Outdated
@@ -274,7 +267,12 @@ func (http *httpPlugin) doParse( | |||
conn.streams[dir] = st | |||
} else { | |||
// concatenate bytes | |||
if len(st.data)+len(pkt.Payload) > http.maxMessageSize { | |||
totalLength := uint64(len(st.data)) + uint64(len(pkt.Payload)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why change code to cast to uint64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Originally an int64 variable was involved in that calculation, which forced the cast (msg.size). After a cleanup I didn't realize the cast was redundant.
packetbeat/protos/http/http.go
Outdated
@@ -514,6 +512,15 @@ func (http *httpPlugin) newTransaction(requ, resp *message) beat.Event { | |||
} | |||
} | |||
|
|||
func (http *httpPlugin) getRaw(m *message) []byte { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe rename this to buildRawMessage
or makeRawMessage
. Quite some allocation and copying is happening here.
Also consider this returning a string. When being used, the result is casted to a string. This requires another allocation + copy.
go1.10 is adding a StringBuilder. This would be a good replacement. If you don't want to wait, you could use reflect + pointer casts (this is what StringBuilder does internally):
func (m *message) makeRaw() string {
sz := len(m.rawHeaders)
if m.sendBody {
sz += len(m.body)
}
result := make([]byte, 0, sz) // allocate enough space once, don't grow slice
result = append(result, m.rawHeaders...)
if m.sendBody {
result = append(result, m.body...)
}
// create string from allocated array
hdr := (*reflect.SliceHeader)(unsafe.Pointer(&result))
return *(*string)(unsafe.Pointer(&reflect.StringHeader{
Data: hdr.Data,
Len: hdr.Len,
}))
}
packetbeat/protos/http/http.go
Outdated
return | ||
} | ||
authHeaderStartX += len(constCRLF) | ||
authHeaderEndX := limit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove the header/bodyOffset
in favour of bytes.Index? They still seem to be usefull when post processing messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal of headerOffset was a mistake. bodyOffset is unnecessary as the headers and body are now stored in different slices.
m.contentLength += (len(s.data) - s.parseOffset) | ||
s.parseOffset = len(s.data) | ||
if m.saveBody { | ||
m.body = append(m.body, s.data...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a chance len(body) > http header content length
after this append? E.g. another HTTP message starting right after the current one due to HTTP keep-alive/pipelining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in this particular append, as it is only called in the case where the length is not known and connection is close
or http/1.0 is used, which rules out keep-alive/pipelining.
It's a many small, but subtle changes. Most notable is the removal of parseOffset in favour of accessing s.data directly. Can you add a short comment to the PR description why it's removed and how this changes parsing (I guess the s.data slice is always advanced by number of consumed bytes). |
if m.sendBody { | ||
result = append(result, m.body...) | ||
} | ||
// TODO: (go1.10) Use strings.Builder to avoid allocation/copying |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a followup github issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Maybe add a changelog.
a5f287d
to
2e4342d
Compare
This patch updates the http protocol parser in packetbeat to not buffer the full request/response. Body is only buffered when neccesary, i.e. the `include_body_for` option specifies that the body should be included in the event, or a request body of type form-urlencoded is sent. Fixes elastic#6679
2e4342d
to
fc771f2
Compare
This patch updates the http protocol parser in packetbeat to not buffer the full request/response.
Body is only buffered when neccessary, i.e. the
include_body_for
option specifies that the body should be included in the event, or a request body of type form-urlencoded is sent.Fixes #6679
Until now, the http parser buffered the whole request/response in
stream.data
, including the body if present, and usedmessage.parseOffset
to mark what has already been parsed. New packet's payload got appended tostream.data
.Now, the http parser discards parsed data from
stream.data
, saving the headers inmessage.rawHeaders
and the body (when required) inmessage.body
. Thus makingmessage.parseOffset
unnecessary.