Non blocking matchers & matching timeout #72
Can't wait to review this! Please hold tight while I get Caddy 2.6 released. Should be this week if all goes well. |
Ok, now I'm working through my backlog from the 2.6 release, so, hang tight 😁 |
Just played a bit more with this. The tests are now fixed and simple configs should already work. |
@ydylla Thanks for working on this! I'm still backlogged and have been taking a little time for my mental health lately but I will be back to this as soon as I can :) I sympathize with the complexity of this... I hope we can eventually solve these in an efficient way. Really appreciate your contributions 💯 |
I finally had a chance to look at this! This is really impressive, and overall I think I like where this is going. Bear with me as I have some questions though, as I haven't looked at this code in quite a while! So don't mind my questions -- they are not criticisms -- I am just trying to understand this as well as you do.
It now has a prefetch function which tries to load all data a client sends during connection setup. It does this in chunks of 1024 bytes up to 8 KiB and stops on the first short read.
What's the advantage of this over simply reading bytes as the matchers need them (as we currently do)? ... [several minutes later] ... thinking on it, is it because we might as well read all the bytes (with a cap) the client sends to establish the connection, since reading all the bytes is what the server has to do anyways? Might as well do it all at once, I guess?
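For readers following along, the prefetch behaviour quoted above can be sketched roughly as below. This is not the PR's code: `prefetch`, `chunkReader`, and the two constants are hypothetical names, and only the 1024-byte chunk size, the 8 KiB cap, and the stop-on-short-read rule come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

const (
	prefetchChunkSize = 1024     // bytes requested per Read (from the PR description)
	maxPrefetchSize   = 8 * 1024 // cap on prefetched data (from the PR description)
)

// chunkReader is a hypothetical stand-in for a client connection: each Read
// returns at most one queued chunk, much like a socket returns whatever has
// arrived so far.
type chunkReader struct{ chunks [][]byte }

func (c *chunkReader) Read(p []byte) (int, error) {
	if len(c.chunks) == 0 {
		return 0, io.EOF
	}
	n := copy(p, c.chunks[0])
	if n < len(c.chunks[0]) {
		c.chunks[0] = c.chunks[0][n:] // keep the unread tail of this chunk
	} else {
		c.chunks = c.chunks[1:]
	}
	return n, nil
}

// prefetch reads in fixed-size chunks until the cap is reached or a Read
// returns fewer bytes than requested (a short read), which is taken as a
// sign that the client has nothing more to send right now.
func prefetch(r io.Reader) ([]byte, error) {
	buf := make([]byte, 0, maxPrefetchSize)
	for len(buf) < maxPrefetchSize {
		n, err := r.Read(buf[len(buf) : len(buf)+prefetchChunkSize])
		buf = buf[:len(buf)+n]
		if err != nil {
			if errors.Is(err, io.EOF) && len(buf) > 0 {
				return buf, nil // the client closed after sending some data
			}
			return buf, err
		}
		if n < prefetchChunkSize {
			break // short read: stop prefetching
		}
	}
	return buf, nil
}

func main() {
	conn := &chunkReader{chunks: [][]byte{
		make([]byte, 1024), make([]byte, 300), make([]byte, 500),
	}}
	data, err := prefetch(conn)
	fmt.Println(len(data), err) // 1324 <nil>
}
```

In this sketch the 500-byte chunk is never read, because the 300-byte read already counts as short. A real connection may behave differently depending on how the client and the network split the data.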
During matching, an ErrConsumedAllPrefetchedBytes is returned if a matcher requests more data than is currently available. If the routing code sees this error, it is ignored and the next matcher is tried.
Ok, I think I get this part. The client only sends so many bytes at the beginning of a connection (within a deadline) and those are what the matcher has to work with, period.
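A minimal sketch of that routing rule, under assumptions: only the error name ErrConsumedAllPrefetchedBytes comes from the PR. The `prefetched` type, its `peek` method, the `matcher` struct, and `firstMatch` are hypothetical and exist only to illustrate "skip this matcher and try the next one".

```go
package main

import (
	"errors"
	"fmt"
)

// ErrConsumedAllPrefetchedBytes mirrors the error named in the PR description.
var ErrConsumedAllPrefetchedBytes = errors.New("consumed all prefetched bytes")

// prefetched is a hypothetical view over the bytes gathered during prefetch.
type prefetched struct{ data []byte }

// peek returns the first n bytes, or the sentinel error when fewer than n
// bytes were prefetched.
func (p *prefetched) peek(n int) ([]byte, error) {
	if n > len(p.data) {
		return nil, ErrConsumedAllPrefetchedBytes
	}
	return p.data[:n], nil
}

// matcher is a hypothetical named matching function.
type matcher struct {
	name  string
	match func(p *prefetched) (bool, error)
}

// firstMatch tries the matchers in order. A matcher that runs out of
// prefetched bytes is skipped; any other error aborts matching.
func firstMatch(p *prefetched, matchers []matcher) (string, error) {
	for _, m := range matchers {
		ok, err := m.match(p)
		if errors.Is(err, ErrConsumedAllPrefetchedBytes) {
			continue // not enough data for this matcher, try the next one
		}
		if err != nil {
			return "", err
		}
		if ok {
			return m.name, nil
		}
	}
	return "", nil // nothing matched
}

// exampleMatchers returns two illustrative matchers: one that wants more
// data than a short client hello provides, and one that checks a 4-byte prefix.
func exampleMatchers() []matcher {
	return []matcher{
		{name: "needs-16-bytes", match: func(p *prefetched) (bool, error) {
			_, err := p.peek(16)
			return err == nil, err
		}},
		{name: "ssh", match: func(p *prefetched) (bool, error) {
			b, err := p.peek(4)
			return err == nil && string(b) == "SSH-", err
		}},
	}
}

func main() {
	p := &prefetched{data: []byte("SSH-2.0-x")} // only 9 bytes available
	name, err := firstMatch(p, exampleMatchers())
	fmt.Println(name, err) // ssh <nil>
}
```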
The only matcher that does not play well with this is the http matcher because http.ReadRequest forces us to use a bufio.Reader.
It's a bit hacky and can probably be improved.
That does sound a bit complicated. Let's collaborate on this and see if we can come up with something simpler.
The matching timeout is implemented by setting SetReadDeadline before each matcher and can be configured per route.
I wonder if this should be set between each Read()? I think usually it is conventional to enforce read timeouts by extending the deadline after each read.
I am not sure if the nested timeout config is necessary. Maybe move it up and rename it to matcher_timeout?
I like where it is for now, let's see how people use it.
Should the max prefetch/matching byte size be configurable?
Almost certainly, yes. Not a showstopper but I think that will be a good idea.
Thanks @ydylla -- hopefully you haven't given up on me yet with how long I'm taking 🙃
Thanks for the feedback, I will try to answer your questions.
The main reason for the chunking is the detection of short reads. If we ask
Regarding http.ReadRequest and bufio.Reader: yes, a way to parse HTTP requests without the need for bufio would be nice. We already have all the bytes in memory, so it's unnecessary.
The current timeout was only intended as a total matching timeout, not a read timeout. If no route could be matched in x seconds, it's very likely not a legitimate client, so the server should close the connection. After a route is selected
Regarding the max prefetch size: I was mostly unsure how to access server-level config from the connection. |
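To make the "total matching timeout" idea concrete, here is a rough sketch. It assumes the deadline is computed once and re-applied with SetReadDeadline before each matcher, so the budget covers the whole matching phase rather than being extended per read. `matchWithTimeout`, `connMatcher`, `readOneByte`, `silentClientDemo`, and `timedOut` are hypothetical names, not the PR's API.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// connMatcher is a hypothetical matcher that reads directly from the connection.
type connMatcher func(conn net.Conn) (bool, error)

// matchWithTimeout gives all matchers together a single time budget. It
// returns the index of the first matcher that matched, or -1.
func matchWithTimeout(conn net.Conn, timeout time.Duration, matchers []connMatcher) (int, error) {
	deadline := time.Now().Add(timeout) // fixed once: a total, not a per-read, timeout
	defer conn.SetReadDeadline(time.Time{}) // clear the deadline once matching is over
	for i, m := range matchers {
		if err := conn.SetReadDeadline(deadline); err != nil {
			return -1, err
		}
		ok, err := m(conn)
		if err != nil {
			return -1, err
		}
		if ok {
			return i, nil
		}
	}
	return -1, nil
}

// readOneByte is a hypothetical matcher that blocks until the client sends a byte.
func readOneByte(conn net.Conn) (bool, error) {
	_, err := conn.Read(make([]byte, 1))
	return err == nil, err
}

// timedOut reports whether err was caused by an expired deadline.
func timedOut(err error) bool {
	return errors.Is(err, os.ErrDeadlineExceeded)
}

// silentClientDemo connects a client that never sends anything and returns
// the error matching ends with.
func silentClientDemo() error {
	server, client := net.Pipe()
	defer server.Close()
	defer client.Close()
	_, err := matchWithTimeout(server, 50*time.Millisecond, []connMatcher{readOneByte})
	return err
}

func main() {
	fmt.Println(timedOut(silentClientDemo())) // true
}
```

net.Pipe is used here only because it supports deadlines and keeps the example self-contained. On a timeout the caller would close the connection, as described above.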
This is looking good I think. Did you mean to leave it in draft state? |
Thanks. Yes this is still a draft because http2 matching is not working in all cases. It depends on how the client sends the data. See my last comment:
I also did not find the time to test it with real web browser traffic, which I planned to do before undrafting it. |
Gotcha. That is tricky. I will let you know if I have any ideas! |
@ydylla Any interest in finishing this up? Let me know if I can help. |
Hi @ydylla -- I might actually merge this sooner rather than later, and see if we can figure out the HTTP/2 part in a separate PR, or if you want we can finish up the HTTP/2 tweaks in this one. Up to you; I know you're very busy. (I am too!) If I don't hear from you next time I circle back to this I'll probably just go forward with the merge. 😃 |
@mholt Sorry, apparently I forgot to reply to your previous message 😅
I will experiment with this within the next couple of days and report back to you. |
Hi,
this is my best attempt to solve the blocking matchers problem when clients do not send enough data (also discussed in #68).
In the end it is a full rewrite of the layer4 connection. It now has a prefetch function which tries to load all data a client sends during connection setup. It does this in chunks of 1024 bytes up to 8 KiB and stops on the first short read.
During matching, an ErrConsumedAllPrefetchedBytes is returned if a matcher requests more data than is currently available. If the routing code sees this error, it is ignored and the next matcher is tried.
The only matcher that does not play well with this is the http matcher, because http.ReadRequest forces us to use a bufio.Reader. That's why I also added a MatchingBytes function, which allows a matcher to get a view of all available bytes. The http matcher uses this to pre-check whether the data looks like HTTP before calling http.ReadRequest, and also to configure the buffer size for the bufio.Reader so it does not produce the ErrConsumedAllPrefetchedBytes error.
It's a bit hacky and can probably be improved.
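The http matcher workaround might look something like the sketch below. It assumes the matcher already holds the slice that MatchingBytes would return. `looksLikeHTTP`, `matchHTTP`, and the method list are hypothetical, and the pre-check is deliberately crude. Unlike the PR, this version parses an in-memory copy instead of the connection itself, so the bufio.Reader can never ask for more than is available; the buffer sizing is kept only to mirror the description.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"net/http"
)

// httpMethods lists request-line prefixes used for the cheap pre-check.
// The exact set is an assumption for illustration.
var httpMethods = [][]byte{
	[]byte(http.MethodGet + " "),
	[]byte(http.MethodHead + " "),
	[]byte(http.MethodPost + " "),
	[]byte(http.MethodPut + " "),
	[]byte(http.MethodDelete + " "),
	[]byte(http.MethodOptions + " "),
}

// looksLikeHTTP reports whether data starts with a known method followed by a space.
func looksLikeHTTP(data []byte) bool {
	for _, m := range httpMethods {
		if bytes.HasPrefix(data, m) {
			return true
		}
	}
	return false
}

// matchHTTP parses the prefetched bytes as an HTTP/1.x request. The
// bufio.Reader is sized to the available data and reads from an in-memory
// reader, so parsing works only with bytes that were already prefetched.
func matchHTTP(data []byte) (*http.Request, bool) {
	if !looksLikeHTTP(data) {
		return nil, false
	}
	br := bufio.NewReaderSize(bytes.NewReader(data), len(data))
	req, err := http.ReadRequest(br)
	if err != nil {
		return nil, false // incomplete or malformed request
	}
	return req, true
}

func main() {
	req, ok := matchHTTP([]byte("GET /health HTTP/1.1\r\nHost: example.com\r\n\r\n"))
	fmt.Println(ok, req.Method, req.Host) // true GET example.com
}
```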
The matching timeout is implemented by setting SetReadDeadline before each matcher and can be configured per route.
There are still many things to do, this is just so you can take a peek.
I am not sure if the nested timeout config is necessary. Maybe move it up and rename it to matcher_timeout?
net.Pipe behaves very differently from a real connection: it does not return short reads, so the prefetching breaks.