rebase on upstream 1.4.0 or cherry-pick important fixes? #95

Open · jhoblitt opened this issue Jan 9, 2019 · 3 comments
jhoblitt commented Jan 9, 2019

The kubernetes_deployment resource is fairly painful to use without hashicorp#210, and requires running tf multiple times for a deployment to converge. The fix is part of the upstream 1.4.0 release. Are there plans to rebase this provider on upstream "soonish", or is it preferred to cherry-pick/back-port critical fixes from upstream?

sl1pm4t (Owner) commented Jan 9, 2019

Hi @jhoblitt, at this stage I don't intend to rebase this provider on upstream. This fork has diverged considerably from upstream, and I don't see the reconciliation effort as worthwhile right now. I'm optimistic upstream will catch up with the features in this provider and this fork can be abandoned.
So, for the time being, cherry-picking fixes from upstream will be the way to go.

Also, I'm curious: how does issue hashicorp#210 manifest? In the 1.5 years of using the kubernetes_deployment resource with this provider I've not seen the kind of issue you describe. Our deploys all work in a single apply.

jhoblitt (Author) commented Jan 9, 2019

@sl1pm4t I've also been hoping that upstream will pick up most of the additional resource types, but I'm in a bind: I need the ingress type and am trying to avoid maintaining an internal fork.

In fairness, I haven't yet tried to cherry-pick terraform-providers#210 to see if it resolves the problem I'm seeing, but the problem definitely isn't present with upstream 1.4.0 (note that switching between this fork and upstream also requires a minor change to the deployment syntax, which is a frustration).

An example of a failure is using a module to install tiller for the helm provider and then trying to use helm resources. This will fail on at least the first tf run, as the helm provider tries to talk to the tiller pod before the replica set/pods have finished provisioning. If the docker image pull is slow or the k8s cluster is busy, sometimes even a second tf run is too fast and fails again. This is on top of a strange error from the kubernetes_deployment resource itself, even though the deployment is properly created. E.g.:

https://github.com/lsst-sqre/terraform-gitlfs/blob/800eae562de6f698936f5d5498ee01dfd55bb822/tf/main.tf#L59-L78

module "tiller" {
  source = "git::https://github.com/lsst-sqre/terraform-tinfoil-tiller.git//?ref=sl1pm4t-1.3.0"

  namespace       = "kube-system"
  service_account = "tiller"
  tiller_image    = "gcr.io/kubernetes-helm/tiller:v2.11.0"
}

provider "helm" {
  version = "~> 0.7.0"

  service_account = "${module.tiller.service_account}"
  namespace       = "${module.tiller.namespace}"
  install_tiller  = false

  kubernetes {
    host                   = "${module.gke.host}"
    cluster_ca_certificate = "${base64decode(module.gke.cluster_ca_certificate)}"
  }
}

First run error on a 1.11.5-gke.5 cluster:

```
Error: Error applying plan:

2 error(s) occurred:

* module.tiller.kubernetes_deployment.tiller_deploy: 1 error(s) occurred:

* kubernetes_deployment.tiller_deploy: an error on the server ("service unavailable") has prevented the request from succeeding
* module.nginx_ingress.helm_release.nginx_ingress: 1 error(s) occurred:

* helm_release.nginx_ingress: error creating tunnel: "could not find tiller"

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.


[terragrunt] 2019/01/09 11:16:48 Detected 1 Hooks
[terragrunt] 2019/01/09 11:16:48 Hit multiple errors:
exit status 1
```
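
A partial mitigation (a sketch, not from this thread; it assumes kubectl is available wherever Terraform runs and that tiller's deployment is named tiller-deploy, the helm default) is an explicit rollout wait that downstream helm releases can depend on:

```hcl
# Sketch only: block until the tiller deployment finishes rolling out.
resource "null_resource" "wait_for_tiller" {
  # Interpolating a module output creates the dependency on the tiller
  # module (Terraform 0.11 has no depends_on for modules).
  triggers = {
    tiller_service_account = "${module.tiller.service_account}"
  }

  provisioner "local-exec" {
    command = "kubectl --namespace ${module.tiller.namespace} rollout status deployment/tiller-deploy"
  }
}
```

Each helm_release would then declare depends_on = ["null_resource.wait_for_tiller"]. Note this only sequences resources; it cannot delay the helm provider's own configuration, which is the limitation raised below.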

jimmiebtlr commented
Fairly sure that's a Terraform limitation: providers can't take data from resources and work on the first run.
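
Given that limitation, one common workaround (a sketch; assumes the module layout from the config above) is to split the run with a targeted apply, so tiller exists before anything configures the helm provider:

```sh
# First create tiller, then run the full apply against a live tiller.
terraform apply -target=module.tiller
terraform apply
```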
