-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support proxying to a server process via a Unix socket #337
Conversation
Thanks for submitting your first pull request! You are awesome! 🤗 |
hello there, great idea ! I tried using this with code-server: https://github.com/coder/code-server which already have a I can verify with socat that code-server is running on the socket but the request is not going through the proxy or there is some timing issue ? this is my j-s-p config:
I guess it wont be useful without websockets working, but I still expected to get a initial page to show up, maybe I am doing something wrong ? |
This is a fantastic new feature! |
The branch |
@jhgoebbert could that error be related to connecting a websocket? Sadly, this branch doesn't support websockets yet. I think it would need either a change in Tornado (tornadoweb/tornado#3172 ), or switching the HTTP client machinery for j-s-p to some other library, such as aiohttp. @rvalieris I'm not sure what went wrong there, though @jhgoebbert seems to have got past that. In general, that 'did not start ... in time' message means that it launched the process and then tried to get an HTTP response from it, for up to a second each try and up to 5 seconds total, but that didn't work. |
Just to mention, I am still interested in pushing this forwards, if anyone more involved with JSP thinks it's worthwhile. The biggest open question I see is what to do about websocket support - as it stands, websockets don't work when proxying to a Unix socket. There are 2½ options:
|
I think this would be an awesome feature @takluyver ! Supporting websockets is very important for the apps that I'm most interested in, but maybe others would be okay with living without ws for now. @yuvipanda mentioned wanting to switch to aiohttp. Reading the tea leaves it seems like jupyter server may eventually move off of tornado anyways. |
Keep in mind that I applied the changes mentioned above ( https://github.com/jupyterhub/jupyter-server-proxy/pull/337/files/885243ac9f1f21ca4869876ecfba0286f486328f ) to run the example. |
Ah, sorry, I missed your replies at the time. My tornado PR has now been merged (though it has yet to be released), so for now I've updated this PR to proxy websockets with tornado. This part will need the next version of Tornado to come out, presumably version 6.3. @jhgoebbert would you have time to test again, with this branch and tornado installed from master? I'd prefer any conversion to aiohttp to be a separate PR if possible - I think it's easier to review those changes independently than combined. But I'm happy to tackle it in this PR if you ask me to. I imagine it might depend on how soon we can expect a tornado release, and I've asked the maintainer about that. |
@takluyver I think a conversion to aiohttp should be in a separate PR as well. Tornado's two most recent minor releases were 7 and 8 months after the previous, so if that trend holds then it will be 5-6 months until the next one. |
Thanks! Then 🤞 this should be ready to try out. The combination of a websocket connection and a Unix socket backend will need Tornado 6.3, but websockets+TCP and regular HTTP requests + Unix sockets should work with current versions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting PR! As a novice with unix sockets I struggle to review this PR from a technical standpoint still and need to learn more to review it fully - but here is a partial review's suggested changes.
- Naming of config/variables
unix
is too vague for configuration I think. I suggest the configuration is calledunix_socket
. Being explicit is more relevant for those like me that knows less and can't as easily guess how and what things does. I think for consistency, variable names should be named like that as well. - Documentation entry
There is documentation in server-process that should be updated with all available configuration options available. - Basic configuration validity check
This option should not be used alongsideport
right, because it would be nonsensical? It would be good to take some action related to this. I'm thinking one could enforce it by erroring, or providing a warning, or documenting it. I think warning + documenting it can make most sense as I think erroring can break things a bit too much.
That makes sense to me, thanks.
Good point, will do.
I've been thinking that it's easy for packages to specify I assume in most cases, such projects would already let JSP select a TCP port at random, rather than specifying a fixed port. But I might not complain if both are specified, even though they can't both be used at the same time. |
Thanks @takluyver for working on this!! Review feedback
|
Thanks @takluyver and @consideRatio ! I vote in favor of using a separate parameter for |
@jhgoebbert Instead of saving the password to a temp file as you are doing here, you could as well write it to I agree that with unix sockets, it will be even more secure, but I think it is still doable with |
@takluyver if you find time to work this, I'll make sure to find time to review this quickly going onwards! |
Do you prefer me to resolve conflicts by rebasing (clean history) or by merging master in (accurate history)? |
Hi @takluyver!! Either would be acceptable but my preference at this point is a rebase! |
I quite like the idea of having a single property to handle both tcp ports and unix sockets. This is similar to how you set the docker host ( Would you support this if we renamed |
OK, with the new changes:
I've rearranged some of the code in
ZMQ also uses a similar addressing scheme, although it calls Unix sockets |
Maybe, but I don't understand things well enough to form a clear opinion. I trust your judgement though @manics! I've invested quite a bit of effort into understanding this feature already, but still feel quite lost. @manics could you work with @takluyver towards resolving this PR without me? I have a growing backlog of things I'd like to contribute with in the jupyterhub org, and reviewing something I don't understand well takes quite a bit of time and effort for me. General review point
|
Sorry to hear that. If it would help, I'm happy to try writing a brief summary of the goal - as I see it - and how I've tried to implement it. But I'll respect your decision to focus on other things if you prefer. |
@takluyver that would be great to have, I think it would be suitable to put in an updated PR description! I'll probably end up personally benefiting from it as well when reviewing other work related to this in the feature. I think a key piece of the complexity for me has been a lack of understanding of the assumptions of what is to be accomodated. I for example assumed that you would need to specify a unix socket path explicitly rather than allow for a temporary path be generated. I think now, thinking about what I don't undertand its one key piece - that we need a configuration API to support the wish to use unix sockets without specifying it in the server process configuration. Another key piece is the coupling with
A big 👍 to providing a summary of the goals to accomplish with the provided implementation, the more I think about it, the more I understand that a common understanding of such goals is crucial to review and think about the implementation. Oh I see you have now updated the description, its great!
Is this part of the traitlet configuration's help string already? If not, its probably worth putting there as well.
A key piece of understanding that I didn't caught onto quickly was that the server process definition's Thank you soo much for your thorough work on this @takluyver and helping me understand these details better! |
Thanks @consideRatio !
Gotcha. The temporary paths are an easy way to have a unique socket per process, and they help to ensure that only the relevant user can connect to them (this will normally be the case anyway, but it's easier to be sure with a temp folder).
Sorry, I don't follow? This does add an option to the server process config, and it will only use Unix sockets if you specify that option. I've focused on supplying the config via entry points, but it should work the same if you use the config system.
This is still an open question for me. The config comes from Python code, it could check its platform before setting
I don't know what most applications do. 🙂 I've been thinking primarily about writing new (small) applications, where obviously I can make the argument parsing work however I want.
Good news, you can already supply a callable! |
Ahem I had missed that help string entirely. I've updated it now. |
In our scenario the user is already authenticated and authorized on a multi-user system as he is logged in through |
Do let me know if there's anything else you want me to do here. I believe I've addressed all the feedback given so far. I'm still open to ideas whether Tornado 6.3 is now on the horizon, but it looks like it will still be a little way away. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@takluyver I'm still struggling with review here, and we are very low on capacity so I'm trying to help out anyhow.
This is the things I'm considering:
- Why is a new class
NamedLocalProxyHandler
introduced?
Is it relevant for this feature, or is it refactoring work you think makes sense but is mostly unrelated? Concretely, if you think it should be part of this PR still, it would be helpful to let the class docstrings clarify what the class purpose is compared to its super- and sub- classes.
I've historically found it hard to understand the separate responsibilities of_Proxy
,SuperviseAndProxyHandler
, new in PR:NamedLocalProxyHandler
,LocalProxyHandler
, and what motivated having many separate classes. - The term "named proxy" is unclear to me. If it refers to "a configured server process" or similar, I'd like to stick with the in-repo common terminology of configured server process to proxy to.
Right, good point. I had trouble making sense of the terminology and code structure following #339. The docs refer to 'server processes' ('Processes that are supervised and proxied are called servers.'). The corresponding code is in My attempt to represent this in the code was for I'm open to other names, e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @takluyver for your thorough work on this, I learned a lot about this project reviewing this!
I'm looking into if I can quickly setup a test for this feature as well before merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, I'll go for a merge here at this point!
Thank you @takluyver for your effort into this!!! ❤️ 🎉 🌻
Thanks @consideRatio ! I've also just seen that Tornado 6.3 has just been released, which was the missing piece to allow this to work with websockets (it already worked with simple HTTP requests). The pieces are coming together! 🎉 |
Wow, 6.3 looks like it has some massive quality of life improvements:
My kingdom for HTTP/2, ASGI, and native |
Updated description
Unix sockets are an alternative to TCP sockets for local servers: the server listens (and the client connects) on a filesystem path, rather than a numeric port. The big advantage is that, with the right filesystem permissions, the OS can prevent other users (except root) from connecting to a Unix socket, so it's more secure on a multi-user system. See #321 for more details.
This PR adds support for named proxies (those defined in config) to forward to a Unix socket rather than a TCP socket.
unix_socket=True
in server process config, JSP will create a temporary directory, and fill the new{unix_socket}
command template argument with a path inside there, where the server can create a socket. This is equivalent to choosing a random TCP port, and requires that JSP launches the server process itself.unix_socket='/path/to/some.socket'
instead, you are telling JSP that the application will listen at that path. This is equivalent to specifyingport=4321
, and works whether or not JSP launches the process (with or withoutcommand
set).This works already for regular HTTP requests. Forwarding websockets over a Unix socket will work once Tornado 6.3 is released. If this is a sticking point, we could switch the client code to aiohttp
I have a corresponding branch of my
hello_jupyter_proxy
repo to test this with: https://github.com/takluyver/hello_jupyter_proxy/tree/unix-sockOriginal, outdated description
This is fairly rough, but if the config for a server process includes
'unix': True
and no port number, j-s-p will create a new temp folder and give the server process a path inside that instead of a TCP port number. It then expects the process to create and bind a Unix socket at the given path, and will connect to that to forward requests.On a shared host, this means that only the user whose server this is can connect to the socket, whereas anyone with access to the system can connect to a localhost TCP socket. It also avoids the race condition where the parent selects an unused TCP port, but something else binds that port before the child process can.
For now, I've left websockets out of this, because Tornado's websocket client doesn't seem to easily support passing in a resolver object like its HTTPClient does. I think this can be useful even without websocket support, and I'd hope that we could either add to Tornado or find an alternative like aiohttp to fill that in later.
Fixes #321