Upgrade to Kubernetes 1.16 #1649

Closed
34 tasks done
roberthbailey opened this issue Jun 25, 2020 · 20 comments
Labels
area/operations (Installation, updating, metrics etc) · kind/breaking (Breaking change) · kind/feature (New features for Agones)

@roberthbailey
Member

roberthbailey commented Jun 25, 2020

As discussed on the monthly community meeting today, it's time to think about when we should upgrade our official support to Kubernetes 1.16.

It looks like we might be on track to do it for the Agones 1.8 release.

List of items to do for upgrading to 1.16 (this is copied from the 1.15 issue and may need to be updated):

  • Update e2e cluster to run against 1.16
    • Update deployment manage script
    • Recreate cluster with new scripts
    • Update kubectl in e2e-image/Dockerfile
  • Update prow cluster to use 1.16 (even though we aren't using it yet, we should keep it in sync)
    • Update deployment script
    • Recreate cluster with new scripts
  • Update the dev tooling to create 1.16 clusters
    • GKE
    • Minikube
    • Kind
    • Update kubectl
  • Update terraform submodules
    • GKE
    • Azure
    • EKS
  • Update documentation for creating clusters to 1.16
    • Usage requirements
    • GKE
    • Minikube
    • EKS
    • AKS
    • Helm documentation (no longer necessary)
  • Update links to k8s documentation
    • examples/fleet.yaml
    • examples/fleetautoscaler.yaml
    • examples/gameserver.yaml
    • site/content/en/docs/Reference/fleet.md
    • site/content/en/docs/Reference/fleetautoscaler.md
    • site/content/en/docs/Reference/gameserver.md
    • site/content/en/docs/Advanced/limiting-resources.md
  • Update to client-go 13.0 (based on compatibility matrix)
  • Move API references from beta to GA. For instance, scheduling.k8s.io/v1beta1 to scheduling.k8s.io/v1. See https://github.com/googleforgames/agones/search?l=Go&q=v1beta1
  • Update site/assets/templates/crd-doc-config.json
@roberthbailey roberthbailey added the kind/feature New features for Agones label Jun 25, 2020
@roberthbailey
Member Author

The Kubernetes v1.16 Release Notes have some points that we should keep in mind for this upgrade:

  • The 1.16 release marks the graduation of CRDs to general availability (GA).
    • The apiextensions.k8s.io/v1beta1 version of CustomResourceDefinition is deprecated and will no longer be served in v1.19. Use apiextensions.k8s.io/v1 instead.
    • There are some changes to the CustomResourceDefinition API type as it was promoted to apiextensions.k8s.io/v1
  • The 1.16 release marks the graduation of admission webhooks to general availability (GA).
    • The admissionregistration.k8s.io/v1beta1 versions of MutatingWebhookConfiguration and ValidatingWebhookConfiguration are deprecated and will no longer be served in v1.19. Use admissionregistration.k8s.io/v1 instead.
  • Many beta APIs are no longer served: use apps/v1, networking.k8s.io/v1, policy/v1beta1, networking.k8s.io/v1beta1, scheduling.k8s.io/v1
  • Aggregated discovery requests can now timeout. Aggregated API servers must complete discovery calls within 5 seconds (other requests can take longer).
  • Server-side apply feature is now beta
    • Server-side apply will now use the openapi provided in the CRD validation field to help figure out how to correctly merge objects and update ownership.
  • The CustomResourceDefaulting feature is promoted to beta and enabled by default. Defaults may be specified in structural schemas via the apiextensions.k8s.io/v1 API.
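
To make the "no longer served" item above concrete, here is a minimal Go sketch of a lookup from removed apiVersion/kind pairs to their replacements. It is only an illustration: the table covers the workload, NetworkPolicy and PodSecurityPolicy removals listed in the 1.16 release notes and is not exhaustive, and `apiKey`, `removedIn116` and `replacementFor` are hypothetical names that do not exist in Agones or client-go.

```go
package main

import "fmt"

// apiKey identifies a manifest by its apiVersion and kind.
// Hypothetical helper type for this sketch only.
type apiKey struct {
	APIVersion string
	Kind       string
}

// removedIn116 maps apiVersion/kind pairs that a 1.16 API server no longer
// serves to the apiVersion that should be used instead. Not exhaustive.
var removedIn116 = map[apiKey]string{
	{"extensions/v1beta1", "Deployment"}:        "apps/v1",
	{"apps/v1beta1", "Deployment"}:              "apps/v1",
	{"apps/v1beta2", "Deployment"}:              "apps/v1",
	{"extensions/v1beta1", "DaemonSet"}:         "apps/v1",
	{"apps/v1beta2", "DaemonSet"}:               "apps/v1",
	{"apps/v1beta1", "StatefulSet"}:             "apps/v1",
	{"apps/v1beta2", "StatefulSet"}:             "apps/v1",
	{"extensions/v1beta1", "ReplicaSet"}:        "apps/v1",
	{"apps/v1beta2", "ReplicaSet"}:              "apps/v1",
	{"extensions/v1beta1", "NetworkPolicy"}:     "networking.k8s.io/v1",
	{"extensions/v1beta1", "PodSecurityPolicy"}: "policy/v1beta1",
}

// replacementFor reports whether the given apiVersion/kind was removed in
// 1.16 and, if so, which apiVersion replaces it.
func replacementFor(apiVersion, kind string) (string, bool) {
	replacement, removed := removedIn116[apiKey{apiVersion, kind}]
	return replacement, removed
}

func main() {
	manifests := []apiKey{
		{"extensions/v1beta1", "Deployment"},
		{"apps/v1", "Deployment"},
		{"extensions/v1beta1", "NetworkPolicy"},
	}
	for _, m := range manifests {
		if r, removed := replacementFor(m.APIVersion, m.Kind); removed {
			fmt.Printf("%s %s: not served in 1.16, use %s\n", m.APIVersion, m.Kind, r)
		} else {
			fmt.Printf("%s %s: ok\n", m.APIVersion, m.Kind)
		}
	}
}
```

A check like this could be run over the Helm chart and install.yaml before recreating the test clusters, but the real source of truth is the release notes linked above.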

@roberthbailey
Member Author

We should switch to 1.16 after this next release is cut, as Azure is about to put 1.18 into GA (estimated date is Aug 20th) at which point they will drop official support for 1.15.

@markmandel
Member

markmandel commented Aug 25, 2020

Working on updating the dev GKE and e2e clusters to 1.16. Both are now powered by the GKE terraform module, so this takes care of several tasks in one go.

@thisisnotapril
Collaborator

Just heads-up that the final 1.16 patch is scheduled for 9/2

@roberthbailey
Member Author

Ugh. Always on the tail end.....

That being said, GKE's release notes for today just remove support for new clusters at 1.14 and are upgrading 1.14 -> 1.15 in the stable channel while the regular channel is being moved from 1.15 -> 1.16. So we are right in line with where GKE is at in terms of k8s version support.

@aLekSer
Collaborator

aLekSer commented Sep 3, 2020

I will add a PR for AKS and EKS soon.

@aLekSer
Collaborator

aLekSer commented Sep 3, 2020

Adding some notes from the Agones Community Meeting, mentioned by @roberthbailey :

  • 1.16 has a few API changes to be aware of: things graduating from beta to GA, and older beta APIs being dropped.
  • We reference beta APIs in our Go code: https://github.com/googleforgames/agones/search?l=Go&q=v1beta1
  • A good first step would be to move the resources in the Helm chart (like CRDs) from v1beta1 to v1.
  • E.g. "Many beta APIs are no longer served: use apps/v1, networking.k8s.io/v1, policy/v1beta1, networking.k8s.io/v1beta1, scheduling.k8s.io/v1". We should explore and see if we use any of the removed versions.

https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.16.md#deprecations-and-removals

The apiextensions.k8s.io/v1beta1 version of CustomResourceDefinition is deprecated and will no longer be served in v1.19. Use apiextensions.k8s.io/v1 instead.

@roberthbailey
Member Author

Most of this is in this issue above.

The important thing to catch when upgrading to 1.16 is that we aren't relying on APIs that are no longer being served (like scheduling.k8s.io/v1beta1).

For APIs that graduated to v1 in 1.16, if we immediately adopt the v1 API then we will prevent Agones from being installed on older versions of k8s. I'd suggest that we wait to take advantage of the newly promoted APIs until 1.17 (or even 1.18) to give folks a longer window to upgrade (even though as a project we won't be testing those versions any longer).
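
The compatibility trade-off above can be sketched in Go. This is a minimal illustration, assuming only what the release notes quoted earlier state (apiextensions.k8s.io/v1 and admissionregistration.k8s.io/v1 are first served in 1.16); `firstServedMinor`, `minorVersion` and `installable` are hypothetical names, and the parser assumes a `v<major>.<minor>...` version string on a 1.x cluster.

```go
package main

import "fmt"

// firstServedMinor records the first 1.x minor release that serves an
// apiVersion. Only the two GA graduations discussed in this issue are
// tracked; anything else is assumed to be served. Hypothetical table.
var firstServedMinor = map[string]int{
	"apiextensions.k8s.io/v1":         16,
	"admissionregistration.k8s.io/v1": 16,
}

// minorVersion extracts the minor number from a version string such as
// "v1.15.12-gke.2". Trailing patch and vendor suffixes are ignored.
func minorVersion(gitVersion string) (int, error) {
	var major, minor int
	if _, err := fmt.Sscanf(gitVersion, "v%d.%d", &major, &minor); err != nil {
		return 0, fmt.Errorf("parse %q: %w", gitVersion, err)
	}
	return minor, nil
}

// installable reports whether a manifest using apiVersion can be applied to
// a cluster running the given minor version.
func installable(apiVersion string, clusterMinor int) bool {
	first, tracked := firstServedMinor[apiVersion]
	return !tracked || clusterMinor >= first
}

func main() {
	for _, v := range []string{"v1.15.12-gke.2", "v1.16.15"} {
		minor, err := minorVersion(v)
		if err != nil {
			fmt.Println(err)
			continue
		}
		for _, api := range []string{"apiextensions.k8s.io/v1beta1", "apiextensions.k8s.io/v1"} {
			fmt.Printf("%s on %s installable: %v\n", api, v, installable(api, minor))
		}
	}
}
```

The sketch shows why staying on the beta CRD version for one more release keeps 1.15 clusters installable, while switching to v1 right away would not.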

@markmandel
Member

Working on upgrading client-go 👍

markmandel added a commit to markmandel/agones that referenced this issue Sep 5, 2020
Worth noting that cache.WaitForCacheSync has changed its internal
implementation to now use `err := wait.PollImmediateUntil(...)`, so
there is no more implicit 100ms sleep before the sync. Therefore it
didn't _actually_ wait to populate from Watch or allow events to fire -
it just made room for it to occur.

So the tests now have to use assert/require.Eventually() to make sure that the
system is in a state that matches what we expect before testing.

Work on googleforgames#1649
aLekSer added a commit that referenced this issue Sep 9, 2020
* Update k8s.io/client-go to v0.16.15

* Upgrade k8s.io/apiextensions-apiserver to v0.16.15

Note: overwrote grpc to keep at current version.

* Upgrade client-go and apimachinery to 0.16.5

Worth noting that cache.WaitForCacheSync has changed its internal
implementation to now use `err := wait.PollImmediateUntil(...)`, so
there is no more implicit 100ms sleep before the sync. Therefore it
didn't _actually_ wait to populate from Watch or allow events to fire -
it just made room for it to occur.

So the tests now have to use assert/require.Eventually() to make sure that the
system is in a state that matches what we expect before testing.

Work on #1649

Co-authored-by: Alexander Apalikov
markmandel added a commit to markmandel/agones that referenced this issue Sep 11, 2020
Both the build and the e2e image.

Work on googleforgames#1649
@markmandel
Member

markmandel commented Sep 11, 2020

Adding to this list:

  • Regenerate CRD Kubernetes client libraries

aLekSer pushed a commit that referenced this issue Sep 11, 2020
Both the build and the e2e image.

Work on #1649
markmandel added a commit to markmandel/agones that referenced this issue Sep 11, 2020
markmandel added a commit that referenced this issue Sep 14, 2020
Only a small change

Work on #1649

Co-authored-by: Robert Bailey
@markmandel
Member

A reminder - RC is next week, so probably makes sense to prioritise work for this ticket over others, so we can get support for 1.16 out the door.

@markmandel
Member

Did a review - looks like it's just the EKS/AKS stuff that needs updating. @aLekSer are you on that?

And to confirm:

> Move API references from beta to GA. For instance, scheduling.k8s.io/v1beta1 to scheduling.k8s.io/v1. See https://github.com/googleforgames/agones/search?l=Go&q=v1beta1

We're waiting on 1.17 to do this (@roberthbailey ) ?

@roberthbailey
Member Author

For things that just graduated to GA, we are waiting until 1.17 to migrate our configs (see #1799).

For things where beta was dropped in 1.16 (like scheduling.k8s.io/v1beta1) we need to move now. But I did a search and only found one instance which was updated already in #1714 so that task is complete.
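
The search mentioned above can also be approximated in Go. This is a rough sketch, not how the search was actually done (that was the GitHub code search linked in the task list): `findBetaImports` is a hypothetical helper, and the pattern only matches import paths under `k8s.io/api/`, so beta packages that live elsewhere would be missed.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// betaImport matches import paths such as k8s.io/api/scheduling/v1beta1.
// It deliberately ignores packages outside k8s.io/api.
var betaImport = regexp.MustCompile(`k8s\.io/api/[a-z]+/v1beta[0-9]+`)

// findBetaImports returns the sorted, de-duplicated beta API import paths
// found in the given Go source text.
func findBetaImports(src string) []string {
	seen := map[string]bool{}
	var out []string
	for _, m := range betaImport.FindAllString(src, -1) {
		if !seen[m] {
			seen[m] = true
			out = append(out, m)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Example input for illustration only, not a real Agones file.
	src := `import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/api/scheduling/v1beta1"
)`
	fmt.Println(findBetaImports(src))
}
```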

@aLekSer
Collaborator

aLekSer commented Sep 15, 2020

AKS is ready. For EKS, I have a draft PR and am still working on the fix.

@aLekSer aLekSer added the area/operations Installation, updating, metrics etc label Sep 15, 2020
@markmandel
Member

> For things where beta was dropped in 1.16 (like scheduling.k8s.io/v1beta1) we need to move now. But I did a search and only found one instance which was updated already in #1714 so that task is complete.

Sweet. Ticked the box.

@aLekSer
Collaborator

aLekSer commented Sep 16, 2020

All ticks are in place.
One more thing: we should add a Metrics checkbox to the list, since some metric names may change between releases. The changelog has a section about metrics:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.16.md#metrics-changes
I looked through our metrics, and none of them are affected right now.

@roberthbailey
Member Author

Thanks @aLekSer. I've created #1824 for the 1.17 upgrade so that we can start capturing any additional steps there. Please feel free to edit that issue to add any steps that you know are missing.

@markmandel
Member

Awesome work. Any objections to closing this issue then?

@roberthbailey
Member Author

I think we are all set.

@markmandel
Member

CLOSING!

@markmandel markmandel added this to the 1.9.0 milestone Sep 17, 2020
@aLekSer aLekSer unpinned this issue Sep 18, 2020
@markmandel markmandel added the kind/breaking Breaking change label Sep 22, 2020