GS Shutdown sdk calls sometimes failed/timeout and leave Gameservers behind #624

Closed
ilkercelikyilmaz opened this issue Feb 25, 2019 · 4 comments
Labels
kind/bug These are bugs.

@ilkercelikyilmaz
Contributor

I was load testing GS allocation, so I modified the simple-udp gameserver to call sdk.Shutdown() 10 minutes after the GameServer state changes to Allocated. Because of the large number of allocation requests, gameserver creations, status changes, and deletions, some of the sdk.Shutdown calls didn't complete (200+ out of 10,000 calls), so 200+ GameServers were left in the Allocated state although they should have been deleted.
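
For reference, the test modification described above might look roughly like the following. This is a minimal sketch, not the actual patch to simple-udp: `shutdowner`, `shutdownFunc`, and `delayedShutdown` are hypothetical names, and the only thing assumed about the SDK is a `Shutdown()` method that returns an error.

```go
package main

import (
	"sync"
	"time"
)

// shutdowner is a hypothetical stand-in for the SDK client; only a
// Shutdown method returning an error is assumed.
type shutdowner interface {
	Shutdown() error
}

// shutdownFunc adapts a plain function to the shutdowner interface.
type shutdownFunc func() error

func (f shutdownFunc) Shutdown() error { return f() }

// delayedShutdown schedules a single Shutdown call the first time it
// observes the "Allocated" state.
type delayedShutdown struct {
	once  sync.Once
	sdk   shutdowner
	delay time.Duration
	errs  chan error // receives the result of the Shutdown call
}

func newDelayedShutdown(s shutdowner, delay time.Duration) *delayedShutdown {
	return &delayedShutdown{sdk: s, delay: delay, errs: make(chan error, 1)}
}

// onState is meant to be called on every observed state change, for
// example from a GameServer watch callback.
func (d *delayedShutdown) onState(state string) {
	if state != "Allocated" {
		return
	}
	d.once.Do(func() {
		time.AfterFunc(d.delay, func() {
			d.errs <- d.sdk.Shutdown()
		})
	})
}
```

In the load test the delay would be `10 * time.Minute`. Reading from `errs` is where a failed or timed-out call would show up, which is the information missing from this report.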

When Agones/Kubernetes is overwhelmed (e.g. by too many Allocation requests), it can fail to serve or complete the GS management activities. We should implement a monitor that watches whether a GS is still in use and deletes it once it no longer is. Issue #607 mentions a similar solution.
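
One possible shape for the selection step of such a monitor is sketched below. Everything here is an assumption made for illustration: the `gameServer` struct, the `maxSession` and `gracePeriod` values, and the idea that "no longer in use" can be approximated by time spent in the Allocated state. It does not reflect how Agones or #607 approach this.

```go
package main

import "time"

// gameServer is a hypothetical, simplified view of a GameServer record.
type gameServer struct {
	Name         string
	State        string        // e.g. "Ready", "Allocated"
	AllocatedFor time.Duration // time spent in the Allocated state
}

// maxSession is an assumed upper bound on session length; the load test
// in this issue shut servers down after 10 minutes.
const maxSession = 10 * time.Minute

// gracePeriod is an assumed allowance for a slow Shutdown to finish.
const gracePeriod = 2 * time.Minute

// overdue returns the names of Allocated servers that have outlived
// maxSession plus gracePeriod and are therefore candidates for deletion.
func overdue(servers []gameServer) []string {
	var names []string
	for _, gs := range servers {
		if gs.State == "Allocated" && gs.AllocatedFor > maxSession+gracePeriod {
			names = append(names, gs.Name)
		}
	}
	return names
}
```

A monitor would run this periodically over the current list of GameServers and delete whatever it returns.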

@markmandel
Member

So I have a few questions on this:

some of the sdk.Shutdown calls didn't complete

Why didn't they complete? Do we have logs? I have to review the code, but they should retry after a period if there is a failure (in case the master goes down, etc.). Did that not happen?
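
To make "retry after a period" concrete, here is a generic sketch of that kind of retry loop. It is not taken from the Agones code, which has not been checked here; `retryWithBackoff` and `errExhausted` are hypothetical names, and doubling the wait between attempts is just one common choice.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errExhausted marks a call that failed on every attempt.
var errExhausted = errors.New("retries exhausted")

// retryWithBackoff calls fn up to attempts times, sleeping between
// failures and doubling the wait each time. It returns nil on the first
// success, or an error wrapping errExhausted if every attempt fails.
func retryWithBackoff(attempts int, wait time.Duration, fn func() error) error {
	var last error
	for i := 0; i < attempts; i++ {
		if last = fn(); last == nil {
			return nil
		}
		// Only sleep if another attempt is coming.
		if i < attempts-1 {
			time.Sleep(wait)
			wait *= 2
		}
	}
	return fmt.Errorf("%w after %d attempts: %v", errExhausted, attempts, last)
}
```

Here `fn` would wrap the shutdown call. If the loop gives up, the returned error is what we would want to see in the logs.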

When the Agones/Kubernetes overwhelmed (e.g. too many Allocation requests), it can fail to serve/complete the GS management activities.

What does this mean exactly? Is the master going down? Are there contention issues? Is the retry mechanism not working?

Sounds like there is definitely an issue here, but I would like to dig more into its cause.

@markmandel added the kind/bug label Feb 25, 2019
@ilkercelikyilmaz
Contributor Author

I have not been able to reproduce it since I pulled the latest from master.
I will provide details when/if I can reproduce it.

@markmandel
Member

@ilkercelikyilmaz can this be closed now as well?

@markmandel
Member

Closing. Reopen if needed.
