
Fixes and Logging #71

Merged
merged 3 commits into master from fixes on Nov 29, 2022

Conversation

@ydylla (Collaborator) commented Sep 9, 2022

Hi,
my experiments mentioned in #68 revealed some bugs. I also added some more logging, which makes debugging configs a lot easier.

The first bug (at least I think it is one 😄) is that when a matcher returns an error, the processing is not stopped. Instead, the next matcher is tried, and if it matches, its handlers are executed even though the connection is broken.

For example, with this config

matcher-bug.json
{
  "admin": {
    "disabled": true
  },
  "logging": {
    "logs": {
      "default": {"level":"DEBUG", "encoder": {"format":"console"}}
    }
  },
  "apps": {
    "layer4": {
      "servers": {
        "matcher": {
          "listen": ["0.0.0.0:8888"],
          "routes": [
            {
              "match": [
                {
                  "tls": {}
                }
              ],
              "handle": [
                {
                  "handler": "echo"
                }
              ]
            },
            {
              "match": [
                {
                  "ip": {"ranges": ["0.0.0.0/0"]}
                }
              ],
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [{"dial": ["127.0.0.1:9999"]}]
                }
              ]
            }
          ]
        }
      }
    }
  }
}

and this curl command, curl -x socks5://127.0.0.1:8888 https://example.org (press Ctrl-C or wait 5 minutes for the timeout), you will get this confusing log:

1.6627550161737518e+09  info    serving initial configuration
1.662755029922435e+09   error   layer4  matching connection     {"error": "unexpected EOF"}
1.6627550319487917e+09  debug   layer4.handlers.proxy   dial upstream   {"remote": "10.0.0.2:43900", "upstream": "127.0.0.1:9999", "error": "dial tcp 127.0.0.1:9999: connectex: Es konnte keine Verbindung hergestellt werden, da der Zielcomputer die Verbindung verweigerte."}
1.662755031949321e+09   error   layer4  handling connection     {"error": "dial tcp 127.0.0.1:9999: connectex: Es konnte keine Verbindung hergestellt werden, da der Zielcomputer die Verbindung verweigerte."}
1.662755031949321e+09   debug   layer4  connection stats        {"remote": "10.0.0.2:43900", "read": 4, "written": 0, "duration": 6.8051304}

With the fixes and additional logging, it looks like this:

1.6627555421950946e+09  info    serving initial configuration
1.6627555549489841e+09  debug   layer4  matching        {"remote": "10.0.0.2:35720", "error": "unexpected EOF", "matcher": "layer4.matchers.tls", "matched": false}
1.6627555549492755e+09  error   layer4  matching connection     {"remote": "10.0.0.2:35720", "error": "unexpected EOF"}
1.6627555549492755e+09  debug   layer4  connection stats        {"remote": "10.0.0.2:35720", "read": 4, "written": 0, "duration": 3.2630775}

Only one matcher is tried and no handlers are executed.

The second bug causes Read() to return 0 bytes and no error on its first call after record() (which I also mentioned here). It's probably not forbidden for an io.Reader to do that, but it's confusing. I wrote a connection test that covers this situation, TestConnection_RecordAndRewind.
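
For context, here is a minimal, repo-independent sketch (not code from this PR) of a reader whose first Read returns (0, nil), something the io.Reader contract permits but callers rarely handle well:

package main

import (
	"bytes"
	"fmt"
)

// zeroOnFirstRead mimics the confusing behavior described above: its first
// Read returns (0, nil) even though data is buffered, and only later calls
// return the data. Callers that treat a zero-byte read as "no data" or as
// end-of-stream can easily misbehave, which is why the fix avoids it.
type zeroOnFirstRead struct {
	calledOnce bool
	buf        *bytes.Reader
}

func (r *zeroOnFirstRead) Read(p []byte) (int, error) {
	if !r.calledOnce {
		r.calledOnce = true
		return 0, nil // legal per io.Reader, but surprising for callers
	}
	return r.buf.Read(p)
}

func main() {
	r := &zeroOnFirstRead{buf: bytes.NewReader([]byte("hello"))}
	b := make([]byte, 8)

	n, err := r.Read(b)
	fmt.Println(n, err) // 0 <nil>, looks like nothing happened

	n, err = r.Read(b)
	fmt.Println(n, err, string(b[:n])) // 5 <nil> hello
}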

The additional remote IP address logging is mainly intended as a kind of connection ID; with it, it is possible to correlate multiple log lines on busy servers.

Besides that, I also have a commit which adds a per-route configurable matching timeout, but it needs more testing and is probably better discussed in its own PR.

I am also still experimenting with how to prevent matchers from blocking when there is not enough data for them, but I would like to see these fixes merged first, so I can work on top of them. A short-read detection in the layer4 connection looks promising. One problem is the http matcher: since it uses a buffered reader, it will always cause a short read from the viewpoint of the connection.
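
To illustrate the http matcher problem, here is a small, repo-independent sketch (not this project's code): a bufio.Reader requests up to its whole buffer from the underlying reader even when the caller only wants a few bytes, so a wrapped connection always observes a short read:

package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countingReader records how many bytes the bufio.Reader asked for on the
// last call, standing in for the layer4 connection underneath the matcher.
type countingReader struct {
	r         *strings.Reader
	lastAsked int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.lastAsked = len(p)
	return c.r.Read(p)
}

func main() {
	under := &countingReader{r: strings.NewReader("GET / HTTP/1.1\r\nHost: example.org\r\n\r\n")}
	br := bufio.NewReaderSize(under, 4096)

	b := make([]byte, 3)
	n, _ := br.Read(b)

	// The caller only wanted 3 bytes, but bufio requested a full buffer from
	// the underlying reader, so from the connection's point of view this was
	// a (large) short read.
	fmt.Println(n, under.lastAsked) // 3 4096
}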

@ydylla (Collaborator, Author) commented Sep 10, 2022

During testing with a real config I noticed that my 0-bytes fix was flawed, so I force-pushed a better one.

@mholt (Owner) left a comment

Again, just did a brief review for now, but it's looking pretty good so far.

I wonder if we can use the MatcherSet without having to pass in a logger, though. Hmm.

layer4/matchers.go (outdated, resolved)
layer4/matchers.go (outdated, resolved)
layer4/connection.go (outdated, resolved)
@ydylla (Collaborator, Author) commented Sep 12, 2022

I wonder if we can use the MatcherSet without having to pass in a logger

Would you prefer it if layer4.Connection had a public logger? I already did this in one of my experiment branches.
Or did you mean an additional nil check before logger usage? Logging which matchers are executed is one of the main objectives of this PR 😄

@mholt (Owner) commented Sep 13, 2022

Would you prefer it if layer4.Connection had a public logger?

@ydylla Maybe that's a good idea! Can we try it?

@ydylla (Collaborator, Author) commented Sep 13, 2022

Maybe that's a good idea! Can we try it?

Sure, I will change it tomorrow.

@ydylla (Collaborator, Author) commented Sep 14, 2022

@mholt I have added a logger to the layer4 connection and also removed the caddy.Module requirement for ConnMatcher.
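
As a rough sketch of what this enables (the struct layout and names below are assumptions for illustration, not the repo's actual code), a matcher can now log through the connection itself:

package layer4example

import (
	"net"

	"go.uber.org/zap"
)

// Connection sketches a layer4 connection carrying an exported logger, so
// matchers and handlers can log without being provisioned as Caddy modules.
type Connection struct {
	net.Conn
	Logger *zap.Logger
}

// exampleMatcher shows a matcher-style function using the connection's logger
// to report which matcher ran and for which remote address.
func exampleMatcher(cx *Connection) (bool, error) {
	cx.Logger.Debug("matching",
		zap.String("remote", cx.RemoteAddr().String()),
		zap.String("matcher", "example"))
	return true, nil
}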

@@ -127,7 +127,8 @@ func wrapRoute(route *Route, logger *zap.Logger) Middleware {
 	// route must match at least one of the matcher sets
 	matched, err := route.matcherSets.AnyMatch(cx)
 	if err != nil {
-		logger.Error("matching connection", zap.Error(err))
+		logger.Error("matching connection", zap.String("remote", cx.RemoteAddr().String()), zap.Error(err))
+		return nil // return nil so the error does not get logged again
@mholt (Owner):

Where would it be logged again? I don't see anywhere else we use the err value.

@ydylla (Collaborator, Author):

It would be logged again in server.go#L109

@mholt (Owner):

Interesting, are you seeing that in your logs? How does err get returned from here?

@ydylla (Collaborator, Author):

The return statement is completely new to fix this:

The first bug (at least I think it is one 😄) is that when a matcher returns an error, the processing is not stopped. Instead, the next matcher is tried, and if it matches, its handlers are executed even though the connection is broken.

You can see example logs of the processing not stopping in the initial post. When using return err instead, one would see the double logging.

@mholt (Owner):

I follow so far, except I don't see where we are using return err anywhere. We always return nextCopy.Handle(cx) instead.

(Sorry if I'm being a bit slow to understand.)

@ydylla (Collaborator, Author):

I think we are talking past each other :)

I included the comment to signify we are returning nil on purpose, because otherwise it may look like a bug.

You are right, we mostly return nextCopy.Handle(cx), except in exactly this case, because we want to stop the execution of the chain. But we already logged (i.e. handled) the error, so it would be wrong to also return it (as is often done in if err != nil blocks): the last return value from the chain eventually surfaces in server.go#L106 and would thus get logged in server.go#L109, which we do not want because we already logged it.
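
As a minimal sketch of that path (the function shape below is an assumption, not the actual server.go), returning the error from wrapRoute would make it surface in the server's handler loop and be logged a second time:

package main

import (
	"errors"

	"go.uber.org/zap"
)

// handleConn sketches how the server logs whatever error the compiled route
// chain returns. If wrapRoute returned the matcher error instead of nil, that
// same error would surface here and be logged again as "handling connection".
func handleConn(logger *zap.Logger, handle func() error) {
	if err := handle(); err != nil {
		logger.Error("handling connection", zap.Error(err))
	}
}

func main() {
	logger := zap.NewExample()

	// Chain already logged its matcher error and returned nil: nothing more
	// is logged here.
	handleConn(logger, func() error { return nil })

	// Chain returned the error instead: it gets logged a second time,
	// producing the duplicate "matching connection" + "handling connection" pair.
	handleConn(logger, func() error { return errors.New("unexpected EOF") })
}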

@ydylla (Collaborator, Author):

Hi @mholt, did you find the time to think about this?

@mholt (Owner):

@ydylla I'm so sorry for the delay on this PR. (It's not fair to you, that's for sure.) I was traveling this last week and also taking a little break from sprinting on Caddy stuff after the 2.6 release (to avoid burnout), and now I'm way behind on notifications. 😅 I'll be catching up soon!

If we can find another maintainer who is willing to review and approve this and other PRs, that might be a good idea so it's not always blocked by me. I feel bad about that! I might look around and see who would be willing to accept the invitation. This plugin just isn't nearly as active as the main Caddy project so it has fewer candidates.

@ydylla (Collaborator, Author):

No problem at all, I understand that you are very busy. Don't feel bad, you are doing good work with Caddy.
But you have to endure me pinging you from time to time 😄

I just used the delay to clean up the commits and fix the failing tests in the l4socks module.

@mholt (Owner) commented Sep 15, 2022

(Sorry, still working through my backlog. I'm trying to get Caddy 2.6 ready to release. Your PRs are the next on my list. Thanks for your patience!)

@ydylla (Collaborator, Author) commented Oct 16, 2022

Hi @mholt,
I am still interested in seeing this merged. What is missing to move this forward?

@mholt (Owner) left a comment

@ydylla I suppose you have tested this more than I will be able to at the moment. I'm willing to merge it in!

Fire away when ready.

(Last night I asked in your other PR why a separate Logger was added for each connection. I guess that is being introduced here, which I now remember. Is having a separate logger for each connection particularly useful?)

Thanks for all you do :)

@ydylla (Collaborator, Author) commented Nov 21, 2022

@mholt Thanks, I will merge it when the logger stuff is settled.
Yes #72 is based on this branch.

Is having a separate logger for each connection particularly useful?

Not sure what you mean by that. Each connection uses the same logger; it's the one from the server that gets passed into WrapConnection.
If you meant what the use case is: I used it mostly for the debug logging in the Match function. With it, it is possible to see at which matcher layer4 breaks, which was impossible before. My configs often have routes with multiple matcher+handler combos that should match, for example a proxy protocol matcher & handler followed by tls followed by http, so each "layer" gets unwrapped one after another.
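
As an illustration, here is a rough sketch of such a layered route list (the tls handler and exact options are assumptions, not a tested config): the first route matches and terminates TLS, and the second route matches the decrypted HTTP and proxies it.

{
  "listen": ["0.0.0.0:8443"],
  "routes": [
    {
      "match": [{"tls": {}}],
      "handle": [{"handler": "tls"}]
    },
    {
      "match": [{"http": []}],
      "handle": [
        {"handler": "proxy", "upstreams": [{"dial": ["127.0.0.1:8080"]}]}
      ]
    }
  ]
}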

@mholt (Owner) left a comment

Thanks for this. Sorry it took me so long. Let's give it a try and see how it goes! I may ask for your help if anything comes up.

Again, really appreciate it :D

@mholt merged commit 1c1f62d into master on Nov 29, 2022
@mholt deleted the fixes branch on November 29, 2022 at 18:14
@ydylla (Collaborator, Author) commented Nov 29, 2022

@mholt Thanks for the merge.
But since you also merged #78, layer4 will not build right now. Here is the diff to fix it:

diff --git a/layer4/listener.go b/layer4/listener.go
index 5001d6b..49a6d9f 100644
--- a/layer4/listener.go
+++ b/layer4/listener.go
@@ -119,7 +119,7 @@ func (l *listener) handle(conn net.Conn) {
 	buf.Reset()
 	defer bufPool.Put(buf)
 
-	cx := WrapConnection(conn, buf)
+	cx := WrapConnection(conn, buf, l.logger)
 	cx.Context = context.WithValue(cx.Context, listenerCtxKey, l)
 
 	start := time.Now()
