Test m2 in test deployment #123

Closed
jdknives opened this issue Jan 21, 2020 · 12 comments
Comments

@jdknives
Member

jdknives commented Jan 21, 2020

Feature description

We need to test skywire-mainnet@milestone2 with dmsg@milestone2 in the test deployment and check whether the previous issues raised by Senyoret (#22, #28, #31) have been fixed. We should deploy the needed services to the test deployment and specifically check a few cases:

  1. Connect multiple visors to one hypervisor over dmsg. Try different endpoints, including the changes made by @nkryuchkov to app configuration and restarting.
  2. Use skysocks over stcp and try it over a dmsg transport as well (a command sketch follows below).
  3. Run dmsgpty on m2.
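For cases 1 and 2, the transport side can be exercised with skywire-cli along these lines. This is only a rough sketch: the add-tp invocation mirrors the one used later in this thread, the RPC address, remote PK, and the ls-tp form are placeholders/assumptions on my side, and --type dmsg is assumed by analogy with --type stcp.

./skywire-cli --rpc localhost:3435 node add-tp <remote-visor-pk> --type stcp
./skywire-cli --rpc localhost:3435 node add-tp <remote-visor-pk> --type dmsg
./skywire-cli --rpc localhost:3435 node ls-tp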
@jdknives changed the title from "Test m2 in production" to "Test m2 in test deployment" on Jan 21, 2020
@Darkren
Contributor

Darkren commented Jan 22, 2020

  1. Just redeployed the system, so this is a fresh run. The dmsg server has the following log:
[2020-01-22T13:20:20Z] INFO [messaging-server]: Serving server. local_addr="172.105.115.99:8080" local_pk=035915c609f71d0c7df27df85ec698ceca0cb262590a54f732e3bbd0cc68d89282
[2020-01-22T13:20:20Z] INFO [messaging-server]: Updating discovery entry... local_addr="172.105.115.99:8080" local_pk=035915c609f71d0c7df27df85ec698ceca0cb262590a54f732e3bbd0cc68d89282
[2020-01-22T13:20:20Z] INFO [messaging-server]: Accepting sessions... local_addr="172.105.115.99:8080" local_pk=035915c609f71d0c7df27df85ec698ceca0cb262590a54f732e3bbd0cc68d89282
[2020-01-22T13:20:20Z] INFO [messaging-server]: Started session. remote_pk=02b3631dd8a52117c7bc327aa019c9681e9d40b3839958ffc34951a0115514a8d8 remote_tcp=172.104.52.156:43958
[2020-01-22T13:20:21Z] INFO [messaging-server]: Started session. remote_pk=02af1ed222c20e866e9cc3fb6291a45d656036cba168597a6a1f9c1cb5afa17f60 remote_tcp=188.2.81.132:65128
[2020-01-22T13:21:13Z] INFO [messaging-server]: Stopping session... error="EOF" session=02b3631dd8a52117c7bc327aa019c9681e9d40b3839958ffc34951a0115514a8d8
[2020-01-22T13:21:13Z] INFO [messaging-server]: Stopped session. error=<nil> remote_pk=02b3631dd8a52117c7bc327aa019c9681e9d40b3839958ffc34951a0115514a8d8 remote_tcp=172.104.52.156:43958
[2020-01-22T13:21:54Z] INFO [messaging-server]: Started session. remote_pk=02b3631dd8a52117c7bc327aa019c9681e9d40b3839958ffc34951a0115514a8d8 remote_tcp=172.104.52.156:43964
[2020-01-22T13:22:37Z] INFO [messaging-server]: Started session. remote_pk=031ed711a49c78e56d618fe0038823ddb93710c3574d697d35ec22724d7977d3ba remote_tcp=178.155.4.237:21361
[2020-01-22T13:22:41Z] INFO [messaging-server]: Started session. remote_pk=0324d515d9c120d3511b444c15ec23af5548da72eecc9900e3559d66721eed87aa remote_tcp=178.155.4.237:27202

This happens because of the deployed visor's connection. It doesn't always happen, and there is nothing similar in the logs of the visor itself.

@Darkren
Contributor

Darkren commented Jan 23, 2020

  1. Again, visors A and C are local and B is deployed on the server. With all visors running, I do the following:
./skywire-cli --rpc localhost:3435 node add-tp 02b3631dd8a52117c7bc327aa019c9681e9d40b3839958ffc34951a0115514a8d8 --type stcp
./skywire-cli --rpc localhost:3436 node add-tp 02b3631dd8a52117c7bc327aa019c9681e9d40b3839958ffc34951a0115514a8d8 --type stcp

Then I call ls-tp on the deployed visor and see 2 transports established; all is ok. I shut down visors A and C, then call rm-tp on visor B. ls-tp shows no transports, but there are still some keys left in Redis; in particular, smembers on the transport:* keys still shows transports. This state prevents transport-discovery from being run: it doesn't start and shows no error in the logs.
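For reference, a minimal way to inspect the leftover keys directly in Redis; the transport:* pattern is the one mentioned above, and the exact schema of the values is an assumption on my side:

redis-cli --scan --pattern 'transport:*'
redis-cli SMEMBERS <one-of-the-listed-keys>
redis-cli DEL <one-of-the-listed-keys>   # manual cleanup so transport-discovery can start again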

@Darkren
Contributor

Darkren commented Jan 23, 2020

  1. Calling rm-tp on visor B to remove transports while A and C are running doesn't do anything. The response from B is still OK, but ls-tp shows that the transports are still there. Not sure what's going on with the physical connection at this point, but it should be correctly closed and the transports completely removed.

UPD: it seems the transports are re-added after removal, just as @jdknives suggested. If you run rm-tp and then ls-tp immediately, the transport is gone, but it reappears after a couple of seconds (see the sketch below).
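A minimal sketch of how this can be observed; the RPC address is whatever visor B exposes, and I'm assuming rm-tp takes the transport ID shown by ls-tp:

./skywire-cli --rpc <visor-B-rpc-addr> node rm-tp <transport-id>
./skywire-cli --rpc <visor-B-rpc-addr> node ls-tp   # transport is gone right away
sleep 5
./skywire-cli --rpc <visor-B-rpc-addr> node ls-tp   # the same transport is back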

@Darkren
Contributor

Darkren commented Jan 23, 2020

I had a really hard time trying to figure out why stcp doesn't work at all. The problem turned out to be in the setup node's config: there was a typo in the sessions_count field, so the actual session count in the dmsg client config was 0. This wasn't obvious at all from the logs. A session count of 0 should at least be logged as a warning, because otherwise it's really hard to diagnose the cause of the problem.
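For illustration, a minimal sketch of the relevant piece of config with the field spelled correctly; everything around sessions_count (the section name and the discovery address) is an assumption, only the field itself comes from the actual config:

{
    "dmsg": {
        "discovery": "<dmsg-discovery-address>",
        "sessions_count": 1
    }
}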

@Darkren
Contributor

Darkren commented Jan 24, 2020

  1. Sometimes the visor just keeps spamming Discovering dmsg servers. The dmsg discovery logs show that it responds to all requests correctly (at least the status code is 200). Nothing strange in the dmsg server logs (except several EOFs).

UPD:
Encountered this once again, attaching logs.

Visor:
(screenshot)

Dmsg server:
(screenshot)

Dmsg discovery:
(screenshot)

@Darkren
Contributor

Darkren commented Jan 25, 2020

  1. dmsgpty works just great. The only thing I noticed in the logs of the visor where the dmsgpty server runs:

(screenshot)

This error="close unix /tmp/dmsgpty.sock->: use of closed network connection" happens on each executed command. Really not a big deal, but we're probably just closing an already closed connection somewhere. Getting rid of this log message would make the execution flow cleaner.

@Darkren
Contributor

Darkren commented Jan 25, 2020

  1. Tried the following:
     1. start local visors A and C
     2. add a direct dmsg transport from C to A
     3. shut down visors C and A
     4. start the hypervisor
     5. start visors A and C again
     6. request nodes from the hypervisor via GET http://localhost:8080/api/nodes; the request hangs for a while (a reproduce sketch follows below)
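For step 6, the hang is easy to see with a plain curl and a timeout; --max-time is only there so the call gives up instead of blocking indefinitely:

curl --max-time 30 http://localhost:8080/api/nodes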

Log of visor A:
(screenshot)

Log of visor C:
(screenshot)

UPD: tried this once again just to be sure, and it happened again. The issue seems to be reproducible every time.

UPD: after the hypervisor responds, the logs look like this:
(screenshot)

@Darkren
Contributor

Darkren commented Jan 27, 2020

  1. Logs of the visor hosting skysocks while changing the password:

(screenshot)

Probably the restart happened too soon after the socket closed, because when I tried to start it from the hypervisor, skysocks started successfully. Not sure.

@Darkren
Contributor

Darkren commented Jan 27, 2020

  1. Log of the visor when trying to stop an app:

(screenshot)

Not a big deal, but Pop app: no such app should not really be there; I'd deal with it.

@Darkren
Contributor

Darkren commented Jan 27, 2020

  1. Changing the PK of skysocks-client, which is STOPPED at this point. Logs:

(screenshot)
(screenshot)

The logs seem ok, except that this request immediately starts the app even if it was stopped. Not sure if this is intended. But I get the following response from the API:
{
    "error": "stop app skysocks-client: unknown app"
}

UPD: it outputs Restarting app skysocks-client, but that doesn't actually happen, at least not in my case.

@Darkren
Contributor

Darkren commented Jan 27, 2020

  1. I did the following:
     1. Started visors A and C on the local machine
     2. Added transports C->B and A->B
     3. Tried changing the password and PK for skysocks
     4. Tried curling through skysocks; it works
     5. Requested a visor restart from the hypervisor for C, where skysocks-client is hosted. It went down and rose back up just fine, but then, the logs:

(screenshot)

Then tried curling through the proxy and got this:
(screenshot)

Restarting skysocks-client from the hypervisor doesn't save the day; the curl request hangs after that (a sketch of the proxy check is below).
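For completeness, a minimal sketch of how the proxy check can be done, assuming skysocks-client exposes a local SOCKS5 listener; the 1080 port is just a placeholder for whatever it is configured to use:

curl --max-time 30 -x socks5://127.0.0.1:1080 https://www.google.com/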

@jdknives
Member Author

Superseded by the individual tickets #143 to #147.
