Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust SDK does not wait for connection to be ready #938

Closed
roberthbailey opened this issue Jul 23, 2019 · 5 comments · Fixed by #953
Closed

Rust SDK does not wait for connection to be ready #938

roberthbailey opened this issue Jul 23, 2019 · 5 comments · Fixed by #953
Labels
kind/bug These are bugs.
Milestone

Comments

@roberthbailey
Copy link
Member

What happened: rust-simple doesn't work on a GKE cluster

What you expected to happen: gameserver to transition to Ready

How to reproduce it (as minimally and precisely as possible): Run the example code from my pending PR: #937

Anything else we need to know?:

Environment:

  • Agones version: master branch (pre-0.12.0)
  • Kubernetes version (use kubectl version): 1.12
  • Cloud provider or hardware configuration: GKE
  • Install method (yaml/helm): helm (via make install)
  • Troubleshooting guide log(s):
  • Others:

When launching the simple rust server, the gameserver starts up really fast, tries to connect to the sidecar and fails, and then restarts. This immediate restart causes the gameserver to become unhealthy, even though the second time around it successfully connects to the sidecar and is sending health() messages every 2 seconds.

It seems like the underlying issue is that the rust SDK isn't blocking when initially trying to connect to the sidecar.

In the C++ SDK, we have

  if (!pimpl_->channel_->WaitForConnected(
          gpr_time_add(gpr_now(GPR_CLOCK_REALTIME),
                       gpr_time_from_seconds(30, GPR_TIMESPAN)))) {
    return false;
  }

The Rust SDK has a comment that makes me think it's doing the same thing:

    /// Starts a new SDK instance, and connects to localhost on port 59357.
    /// Blocks until connection and handshake are made.
    /// Times out after 30 seconds.

but I don't see anything in the code that blocks, nor do I see anything in the Rust docs for Channel that matches the WaitForConnected function in the C++ gRPC docs.

@roberthbailey roberthbailey added the kind/bug These are bugs. label Jul 23, 2019
@roberthbailey
Copy link
Member Author

A couple of thoughts:

  1. The constructor in the Rust SDK shouldn't be calling Ready since gameserver should have control over it's own state transitions (and may not be immediately ready)
  2. It might make sense for the constructor to instead make a Health call and use that to block until the sidecar is ready using a CallOption with wait_for_ready and a timeout of 30 seconds.

@aLekSer
Copy link
Collaborator

aLekSer commented Jul 23, 2019

@thara can you please help on this ticket?

@thara
Copy link
Contributor

thara commented Jul 23, 2019

Thanks for your mention. I try to reproduce this in my free time.

The constructor in the Rust SDK shouldn't be calling Ready since gameserver should have control over it's own state transitions (and may not be immediately ready)

I agree with the suggestion.
I seem to have misread Go implementation when I wrote the SDK.

It might be good to add a method into Rust SDK like SDK::Connect in C++ one.

@roberthbailey
Copy link
Member Author

I don't know the rust idioms to know if that makes more sense. The go sdk does the (blocking) connection in the NewSDK function. The nodejs sdk also doesn't have a blocking connect afaict.

@markmandel
Copy link
Member

From memory, I had to have the C++ Connect function just because of how C++ did grpc. I think it's better to avoid that, if possible in the other SDKs, as it's a bit of a smoother developer experience.

markmandel added a commit to markmandel/agones that referenced this issue Jul 27, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.
- Add `ready` to the SDK documentation.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 27, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.
- Add `ready` to the SDK documentation.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 28, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.
- Add `ready` to the SDK documentation.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 29, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Add `ready` to the SDK documentation.

- Implements and documents a sdk.connect() function

- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 30, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Add `ready` to the SDK documentation.

- Implements and documents a sdk.connect() function

- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 30, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Add `ready` to the SDK documentation.

- Implements and documents a sdk.connect() function

- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 30, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Add `ready` to the SDK documentation.

- Implements and documents a sdk.connect() function

- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.

Fixes googleforgames#938
markmandel added a commit to markmandel/agones that referenced this issue Jul 30, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Add `ready` to the SDK documentation.

- Implements and documents a sdk.connect() function

- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.

Fixes googleforgames#938
roberthbailey pushed a commit that referenced this issue Jul 30, 2019
- Uses a simple polling mechanism to confirm the connection is live
- Add `ready` to the SDK documentation.

- Implements and documents a sdk.connect() function

- Tweaks the conformance tests to add a random delay to ensure future
  testing of SDK blocking.

Fixes #938
@markmandel markmandel added this to the 0.12.0 milestone Aug 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug These are bugs.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants