ssh community cannot work normally with the rsocket #8

Open
ling0329 opened this issue Feb 13, 2019 · 7 comments

Comments

@ling0329

I have been using the open-source FreeFlow project to test application performance, and I noticed that you have tested rsocket with FreeFlow. However, when I tried to test a big data application, ssh communication did not work normally with rsocket. The prompt message was as follows:
[image: ssh]
Have you ever run into a similar problem? I would appreciate any advice on how to solve it, or even just a few names of people we should talk to. Thank you very much!

@bobzhuyb
Contributor

Why do you want to run ssh with rsocket? rsocket has compatibility issues with many applications. For example, rsocket does not support epoll(), so any application that uses epoll() won't work. It's possible that ssh uses something that rsocket does not support.

Anyway, if you really want to run ssh with rsocket, and you are sure it can run, you need to check whether the ssh server itself is started with rsocket. The ssh server is usually started as a Linux service and may not carry the correct environment variables (such as LD_PRELOAD).
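One way to make that check concrete, sketched in Python under the assumption of a Linux host with `/proc` mounted: read the environment the running ssh server was actually started with from `/proc/<pid>/environ`. The helper names `parse_environ` and `preload_of` are made up for this illustration, and reading another user's process normally requires root.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_environ(blob: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE records found in /proc/<pid>/environ."""
    env = {}
    for record in blob.split(b"\0"):
        if b"=" not in record:
            continue  # skip empty or malformed records
        key, _, value = record.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def preload_of(pid: int) -> Optional[str]:
    """Return the LD_PRELOAD value the process was started with, or None.

    Assumes Linux; /proc/<pid>/environ holds the environment as it was
    when the process was executed.
    """
    blob = Path(f"/proc/{pid}/environ").read_bytes()
    return parse_environ(blob).get("LD_PRELOAD")
```

Calling `preload_of(pid)` with the PID of the ssh server (for example from `pidof sshd`) returns `None` when the service was started without the variable, regardless of what the interactive shell has exported.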

@ling0329
Author

Thank you for your reply. We want to run ssh with rsocket because we want to run big data applications such as Hadoop and Spark over rsocket on FreeFlow. We have configured the LD_PRELOAD environment variable as follows:
[image: LD_PRELOAD configuration]
So I think the critical question is whether ssh uses epoll. If it does, we can conclude that big data applications cannot run over rsocket.

@bobzhuyb
Contributor

Running a big data (or any other) application over rsocket does not mean you need to run ssh over rsocket. ssh is usually used only for the control channel, while the actual data channel (where the heavy data goes) usually does not go through ssh. For example, the MPI control channel can go through TCP-based ssh, while the actual MPI APIs go through the RDMA network.

You should read my previous reply again, carefully: a process started by a Linux service may not respect what you set in your environment. A quick Google search turns up, for example, https://unix.stackexchange.com/questions/44370/how-to-make-unix-service-see-environment-variables
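The point about environments can be illustrated with a small Python sketch. It does not model a real service manager; it only shows that a child process sees a variable when the launcher passes it along, and does not when the launcher starts it with its own environment. `DEMO_PRELOAD` is a stand-in variable invented for this example (the real `LD_PRELOAD` is avoided so that nothing is actually preloaded), and `child_sees` is a hypothetical helper.

```python
import os
import subprocess
import sys

# The child just reports whether it can see the stand-in variable.
CHILD = "import os; print(os.environ.get('DEMO_PRELOAD', '<unset>'))"


def child_sees(env):
    """Run a child interpreter with the given environment and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,  # None means "inherit the parent's environment"
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Set the variable in this process, as a user would in a login shell.
os.environ["DEMO_PRELOAD"] = "/hypothetical/path/lib.so"

# A child launched from here inherits the variable.
inherited = child_sees(None)

# A launcher that builds its own environment (as a service manager does)
# never hands the variable to the child.
service_env = {k: v for k, v in os.environ.items() if k != "DEMO_PRELOAD"}
from_service = child_sees(service_env)

print(inherited, from_service)
```

The same reasoning applies to the ssh server: exporting LD_PRELOAD in a shell affects programs started from that shell, not a daemon that the init system started earlier with a different environment.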

@ling0329
Author

We tried capturing packets while establishing the ssh connection. The following was captured without LD_PRELOAD set:
[image: packet capture without LD_PRELOAD]
But with LD_PRELOAD set, we got this:
[image: packet capture with LD_PRELOAD]
An unexpected RST, ACK appears. Our preliminary inference is that setting LD_PRELOAD does affect the ssh connection, even though ssh still goes through a normal TCP socket.
Do you have any more suggestions?

@bobzhuyb
Contributor

If LD_PRELOAD really lets rsocket hijack the TCP socket's connect(), send(), etc., you should not be able to capture any TCP handshake or SSH handshake packets. How did you infer that LD_PRELOAD has taken effect?
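One hedged way to answer that question on Linux is to look at which shared objects are mapped into the process, via `/proc/<pid>/maps`. The sketch below assumes the rsocket preload library is named `librspreload.so`, as shipped with librdmacm; adjust `needle` if your build differs. The function names are invented for this illustration.

```python
from pathlib import Path
from typing import Set


def mapped_files(maps_text: str) -> Set[str]:
    """Collect the file paths that appear in the text of /proc/<pid>/maps.

    Each line is: address perms offset dev inode [pathname].
    Anonymous mappings have no pathname; pseudo-entries such as
    [stack] or [heap] do not start with '/', so both are skipped.
    """
    files = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5].startswith("/"):
            files.add(fields[5].strip())
    return files


def is_preloaded(maps_text: str, needle: str = "librspreload") -> bool:
    """Return True if any mapped file path contains `needle`."""
    return any(needle in path for path in mapped_files(maps_text))


def maps_of(pid: int) -> str:
    """Read /proc/<pid>/maps (Linux only; may need root for other users)."""
    return Path(f"/proc/{pid}/maps").read_text()
```

For example, `is_preloaded(maps_of(pid))` with the ssh server's PID tells you whether the library is loaded into that process. It only shows that the library is present, not that a particular connection actually went over RDMA.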

Honestly, I don't think this is related to FreeFlow. In addition, I generally suggest not using rsocket at all in production.

@ling0329
Author

Thank you for your suggestion. After discussion, we have decided to follow it and not use rsocket. Going back to the FreeFlow paper, we see that it mentions two approaches to converting TCP sockets to RDMA, rsocket and SDP, but rsocket has compatibility problems. Can we replace rsocket with SDP, or is there another way?

@bobzhuyb
Contributor

bobzhuyb commented Feb 14, 2019

There is basically no mature way to convert TCP sockets to RDMA, at least not a public one. Also, the performance of a converted socket would be far from optimal, since the socket interface requires at least one copy.

If you are serious about using RDMA to accelerate things, you can 1) re-implement everything using RDMA verbs, 2) switch to an RPC framework that has an RDMA option, like https://github.com/accelio/accelio, used by Ceph (there might be other RPC options, too), or 3) find an RDMA version of the app, if one exists. Check whether this page has something you want: http://hibd.cse.ohio-state.edu/

To be clear, the paper uses rsocket only to demonstrate the capability of FreeFlow, not to overcome rsocket's own limitations. Converting sockets to RDMA is outside the scope of the paper.
