Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High-level networking interface #73

Closed
njsmith opened this issue Mar 7, 2017 · 11 comments
Closed

High-level networking interface #73

njsmith opened this issue Mar 7, 2017 · 11 comments

Comments

@njsmith
Copy link
Member

njsmith commented Mar 7, 2017

We should have something that's just a smidge higher level than trio.socket. This is something that could go into an external library, but:

  • I'd like to do TLS at this layer (so it's not bound to socket objects specifically), and in 2017 it really feels like TLS should be an included battery
  • Getting started with a simple echo server shouldn't require boilerplate or third-party extensions
  • I'm feeling the need for a basic stream abstraction in downstream libraries, and there are interoperability benefits to defining that in trio itself

The goal wouldn't be to compete with like, all of netty or twisted or something, but just to provide some minimal extensible abstractions for endpoints, connecting, listening. Maybe UDP too, though that can wait.

Things that should be easy:

  • Connecting to a given hostname+port, with ipv4/ipv6 taken care of (ideally with happy-eyeballs)
  • ...same as above, but with TLS enabled
  • Making a simple server that listens on both ipv4 and ipv6
  • Making a simple server that listens on a unix domain port

Maybe:

  • Plugging in some code to handle systemd socket activation
  • Plugging in some code to do something sensible with the haproxy PROXY protocol

A source of inspiration is twisted's "endpoints": tutorial, list of client endpoints, list of server endpoints, a much fancier server endpoint. I'm not a big fan of the plugins + string parsing thing, just because of the whole injecting-arbitrary-code-into-people's-processes thing that plugins do ... though I see the advantage when it comes to exposing this stuff as part of a public interface and wanting it to be extensible there.

We probably want:

  • A basic Stream API. There's a sketch in trio._streams. This is useful in its own right; the idea would be to implement TLS as a stream wrapper e.g., and we can use a SendStream and RecvStream for subprocess pipes, etc.

  • Some sort of connect and listen factories for streams.

Lots more TBD; for now this is a placeholder to think more about.

Related: #9 (tls support), #13 (more ergonomic server quick-start), #14 (batched accept), #72 (socket option defaults – some might make more sense to set as part of the high-level interface)

@njsmith
Copy link
Member Author

njsmith commented Mar 10, 2017

When it comes to unix domain socket handling, some references:

Summary: regular AF_UNIX sockets have an annoying issue where the filesystem entry is automatically created when you bind() the socket, but it's not automatically deleted when you close the listener, AND if you try to bind() on an existing file, then it fails (regardless of whether any socket is still listening). (Linux's "abstract namespace" sockets don't have this property; they're automatically cleaned up.) So one common strategy is to unconditionally unlink any socket file before trying to bind to an address. This is what tornado and asyncio do.

Twisted is a little fancier: they also have the option to use a pid-based lockfile to track whether the socket is currently in use. By default, they do not do this (everything defaults to wantPID=0), except that if you use the string-based endpoint constructor (!) then they do (issue, and see twisted.internet.endpoints._parseUNIX. The only code that I can find via codesearch tools that sets wantPID=True explicitly is "deluge" and the twisted test suite, but I guess the string-parser thing means that they do get used.

If we do want locking then it might makes sense to use fcntl or flock or something instead of PIDs... dunno. These are a bit messy, but I think PID lockfiles are worse. Need to do some more research on permissions. (On Linux you can actually use abstract namespace sockets as a reliable local lock, because bind fails if someone has already bound that name and bindings are cleaned up when the socket is closed. But probably permissions issues mean this won't work.)

Twisted also has the option to check for the the lockfile when connecting to a AF_UNIX socket. I'm not sure what the point of this is – if the socket is stale and gone, then won't connect fail anyway?

NB: for zero-downtime upgrades you might want to blow away an existing socket with an active listener and replace it. To facilitate this use case, blowing away listening sockets should be done as bind+rename, not unlink+bind. This can be the one-and-only way, I think.

In any case, you want to unlink the socket when closing it when possible – after verifying that the file hasn't been replaced under you! (i.e. check fs/inode information and compare to the still-open socket). But unfortunately this has an inherent race, since there's no way to do the stat and the unlink atomically! Maybe this is worth doing some kind of locking, like taking an fcntl lock on some sidecar file just while we're creating or cleaning up a socket? I guess the algorithm here would be:

To acquire the lock:

  1. open FOO.lock, creating if necessary
  2. acquire an flock on the given file (flock is the more-sensible less-portable BSD lock, but I think it's portable enough for us and fcntl locks are just awful)
  3. check that the file we got the lock on is the same file called FOO.lock on disk; if not goto 1

To release a lock:

  1. unlink FOO.lock
  2. close the file/release the lock (order here is important)

To close a socket:

  1. acquire the lock as per above
  2. check that the socket matches the socket file on disk
  3. if so, unlink the socket file
  4. release the lock as per above

To bind the socket:

  1. bind/listen to a temporary name
  2. acquire the lock as per above
  3. rename the socket over the real name
  4. release the lock as per above

To acquire an flock: the trick is to make it cancellable. There's no way to set a timeout, and empirically, if a thread is blocked in flock on Linux then closing the fd does not wake it up. So the only mechanism (and eh, it's a fine one, so whatever) is to do:

SLEEP_TIME = 0.01
async def acquire_flock(fd):
    while True:
        try:
            await fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            await trio.sleep(SLEEP_TIME)
        else:
            return

This is essentially a busy-wait, but that's OK, we never hold the lock for more than a few tens of milliseconds at worst, so sleeping for 10 ms in between checks is probably fine.

Permissions: apparently some-but-not-all systems enforce file permissions on unix domain sockets. Linux does. (Do any others we care about? probably not many.) Tornado defaults to 0o600, Twisted defaults to 0o666, and asyncio just accepts the system default. In any case, sockets are always created according to the umask, and then if you want something else you have to fchmod afterwards, so there's a small race condition but this doesn't create a race condition so long as you do things in the order bindfchmodlistenrename. If you care then you're better off controlling directory permissions, I guess. In any case, I guess our lock file can be 0o600 or whatever.

Biggest question: How on earth do we test all this.

@brettcannon
Copy link
Contributor

A survey of various HTTP libraries to tease out the commonality among their request and response APIs: https://drive.google.com/open?id=1HKHvyeEnB_y4rvLEOraE0JNCcpkVCTJ_jVdKs8KFXYw

@njsmith
Copy link
Member Author

njsmith commented Apr 23, 2017

Twisted's code for calling accept also contains a significant amount of hard-earned wisdom.

libuv has a very cute trick for handling EMFILE/ENFILE: https://github.com/libuv/libuv/blob/26daa99e2c93db2692933e8d141e44661a444e68/src/unix/stream.c#L470-L480
I believe the idea is: EMFILE/ENFILE means you can't call accept, so normally new connections will just stall sitting in the accept queue until they time out, plus eventually the queue overflows which most TCP stacks respond to by discarding ACKs so that the peer keeps trying to finish the connection – so basically it's a bad experience, not only are we not able to handle these connections, but we're not even telling them that, we're just letting them hang. So it's better to quickly close them all so at least they get notification quickly that things are messed up.

@njsmith
Copy link
Member Author

njsmith commented May 2, 2017

[moving this comment from #8 because it's more relevant here]

Poorly organized thoughts on server API:

Listener is an AsyncResource with start (returns an arbitrary value, does setup, like __init__ but async) and accept (returns an arbitrary value that is usually a Stream, all we actually care about is that it has a forceful_close method). start_server starts the listener, spawns a task to loop on accept, and returns (start_return_value, task_object). The former is to let you get back information from the setup code (example: what port are you actually listening on?), and the latter is b/c fancy supervisors will need to know it, to know which thing needs restarting when a task dies. (Or I guess the return value could be some information on the listener object, and we could pass back the listener object itself directly? Or return a Server object with a .listener attribute and stuff like close methods?)

Maybe start_server should take a mandatory timeout argument (can be inf if you want), use graceful_close on the passed in strings, and pass in a cancel scope to every callback along with the stream? Because if you need to have a timeout on network code, and you need that timeout to encompass the graceful_close call, then every connection callback has to start with a double-indent boilerplate with open_cancel_scope: async with stream: ..., and who wants that?

It should also optionally let you set the listener nursery and the connection nursery separately.

Implementations: TCPListener, UnixListener (I guess this is ambiguous between STREAM and DATAGRAM sockets... maybe Accepter or StreamListener would be better?), SSLListener (wraps another listener), probably eventually HAProxyListener (need some way to pass back peername etc. on the stream – maybe SocketStream should have methods that return (type, family, proto, python-style sockaddr)?), AcmeListener, ...

Convenience functions: start_tcp_server (with optional SSL), start_unix_server (with optional SSL?)

@njsmith
Copy link
Member Author

njsmith commented Jul 26, 2017

On further thought, I'm wondering if perhaps we should not try to handle every weird error that can happen in accept. Specifically:

There are some errors that are "normal" – mostly around cases where something goes wrong on the new connection before we accept it. So e.g. the Twisted and Tornado comments say BSD can give ECONNABORTED when the client does a handshake and then sends FIN or RST before we call accept, which is... ok? whatever? we can just ignore this, or possibly return a fake stream that errors out, but probably simpler to just ignore it. It's not a problem, anyway. The Linux accept manpage also claims that any error you can get on a connection, you can also get here, so like ENETDOWN and similar nonsense. That's fine, whatever, same issue as above. (And this is apparently not common, since neither Twisted nor Tornado even attempts to handle it...)

Then there are errors that indicate a system problem, like EMFILE or ENOBUFS. Twisted handles this by logging something and then carrying on. This has two problems: (a) I'd rather not bind trio itself to any particular logging solution, and (b) if there are no fd's, then the pending connections are probably doomed, but by just leaving them sitting in the accept queue the clients get poor QoS. Libuv handles this using a sneaky trick: it keeps an allocated-but-unused fd around, and if it gets EMFILE, it closes that fd to hopefully put the process below the fd limit, so it can call accept, and then it calls accept a bunch of times and immediately closes all the sockets it gets, to clear out the accept queue and avoid leaving clients hanging until they timeout. Of course this is complicated and fragile (what if another thread opens an fd while we're doing it? what if we can't re-allocate the 'spare' fd at the end, do we need to remember to do that later?).

The alternative is: if we hit EMFILE or similar, crash the listener and let the supervisor restart it! So this means closing the listen socket, which is... pretty reasonable, actually; it immediately and reliably signals to all waiting clients to go away, like the libuv trick without the trickiness. And if any new connection attempts come in while we're restarting, then well, we're breaking under the load, rejecting their connections is pretty reasonable. Similarly, if restarting the listener fails because we can't allocate an fd for the listening socket, then oh well, if we're that short on fds then we can't call accept anyway so what good is a listening socket. And by shoving the problem off onto the supervisor, then logging becomes their problem.

@njsmith
Copy link
Member Author

njsmith commented Aug 11, 2017

Okay, more refined thoughts on the issue about how to handle resource exhaustion errors (EMFILE etc.) in servers:

For the Listener interface, I'm going to let these errors escape. This allows the caller to respond appropriately, and that's definitely more sensible than hard-coding the handling inside accept itself.

For our spawn_tcp_server or whatever we end up calling it... I'm still unsure :-(. "Just crash" is problematic when we don't actually have any existing supervisors or experience with them. And if you're binding to a random ephemeral port, then restarting is very much a state-changing operation, unless you have some way to pass the port number across restarts.

@njsmith
Copy link
Member Author

njsmith commented Aug 15, 2017

Reviewed some erlang libraries to see how they handle capacity errors:

Also, I'm not 100% sure, but I think mochiweb and yaws actually open the accept socket outside of the acceptor process, so crashing the process just causes a new process to be spawned using the same socket.

@brettcannon
Copy link
Contributor

Is there a library named "actually"? Or did you mean to say "actually opens" instead of "and actually open"? Or is there a word missing?

@njsmith
Copy link
Member Author

njsmith commented Aug 15, 2017

Whoops, I meant mochiweb and yaws, the two libraries that do crash their acceptor process. Thanks for the catch.

@brettcannon
Copy link
Contributor

@njsmith I went ahead and edited your comment to add yaws.

@njsmith
Copy link
Member Author

njsmith commented Aug 22, 2017

I ended up copying the "log and wait 100 ms" thing that the erlang servers do, we'll see how it goes. It's just using the stdlib logging, on the theory that well, it kinda sucks, but as methods of dumping to stderr go, at least it's reconfigurable and good logging packages will have some easy way to capture it.

Follow-up bugs:

@njsmith njsmith closed this as completed Aug 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants