-
Notifications
You must be signed in to change notification settings - Fork 61
Simple HTTP cleanups and followup to #72 #76
Conversation
There is only one case left, which probably should not be an error at all.
Also eliminate an allocation when parsing the header.
cc @raphjaph does this do what you want? |
This PR does not recover the pre-#72 performance. I'm investigating. |
Honestly I'm not sure what to make of this. It looks like for some reason Bitcoin Core takes ~10ms to reply to
where that It has nothing to do with the mutex on our end, or any sort of buffering, as near I can tell. I have tried it with the tightened mutex scope, with socket pools of various sizes, and it seems like it's quick to reply to the first RPC call on a connection, and slow to reply to every one after that. Although there is some noise so it's hard to be certain. |
Ok, I can get most of the performance back by both:
I don't know why the latter is necessary when it wasn't under the old "create a new connection every time" model, and I don't know why I need to make both changes when only the latter should provide any visible change under normal circumstances. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple minor questions but apart from that looks good to me.
src/simple_http.rs
Outdated
let http_response = get_line(&mut reader, request_deadline)?; | ||
if http_response.len() < 12 || !http_response.starts_with("HTTP/1.1 ") { | ||
return Err(Error::HttpParseError); | ||
let http_response = get_line(&mut reader, Instant::now() + self.timeout)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason not to use request_deadline
here? If there is one can be have a code comment please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I meant to change request_deadline
everywhere, and it's a rebasing mistake that it hasn't been changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:)
src/simple_http.rs
Outdated
}); | ||
}, | ||
Some(n) => { | ||
let mut buffer = Vec::with_capacity(INITIAL_RESP_ALLOC as usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the buffer have capacity n
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think so. I went back and forth a couple times on this, but now that we bound n
I think we should use it.
struct TcpStream; | ||
|
||
#[cfg(fuzzing)] | ||
mod impls { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason to put the implementation of the fuzzed TcpStream into its own module?
I'm new to Rust so forgive me if this is a silly question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that the #[cfg(fuzzing)]
is applied to the whole module scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, that makes sense.
The TcpStream
is declared outside of the impls
mod so it doesn't have have to be imported explicitly in the simple_http
mod.
Looks good to me. I don't 100% understand how the fuzzing works but have never seen that so it was cool to see that in the wild. |
Thanks @raphjaph ! Unfortunately I am going to rebase this and change a few things (hopefully not a huge code diff). So I may need you to review again. |
When we create/destroy BufReaders we potentially lose data on the socket. Avoid doing this by keeping the BufReader permanently.
@tcharding I have pushed 4 new commits which address your nits and get us some further perf improvements. Weirdly this is still not as fast as the old "new connection for every request" logic, even though it's way more intelligent about buffering, but we're within a factor 2 now, and we're able to completely pin my CPU (the old code was 50/50 Core vs icboc, this code is 30/70, so the implication is that we're doing more processing and that's why we're slower ... but why?) so I'm okay with it. |
I think the performance hit is caused by serde-rs/json#160 (comment) ... but I don't have a good fix because reading from the socket into memory prior to starting parsing is actually slower in this case. Edit the prior code did this, but line-by-line and doing utf8-decoding on each line before reading more. I suspect that it simply accidentally was pumping various buffers at the right resonance (along with the frequent socket connections triggering the kernel to wake up bitcoind at just the right time), and there is no principled or reproducable way to get that performance back. I think we should just leave this be until somebody complains, then maybe we need to write our own json deserializer or something. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 2380d90
ACK 2380d90 |
Tightens the mutex lifetime to improve performance. Basically we read all the data into a buffer and then unlock the mutex ASAP. We could be a bit more memory-efficient by using
serde_json::from_reader
to parse directly from the socket, but (a) that would make it harder to enforce the Content-length header; and (b) it'd hold the mutex for longer than necessary.This commit also splits out the
HttpParseError
into several variants, which unfortunately makes this a breaking change. I think I can move that commit into its own PR if we want to get a minor rev out, but I don't think we care too much about that since this crate is pretty far to the edge of most dependency trees.Adds a fuzz harness but doesn't integrate it into CI. Fixes a couple bugs I found while fuzzing locally.