
Add support for K8s pod #13571

Merged - 1 commit merged into hashicorp:master on Jun 8, 2017

Conversation

ashishth09 (Contributor)

I was just testing this, and it looks like default volumes are created as part of Pod creation, which leads to the error below.

==> Checking that code complies with gofmt requirements...
go generate $(go list ./... | grep -v /terraform/vendor/)
2017/04/12 17:58:48 Generated command/internal_plugin_list.go
TF_ACC=1 go test ./builtin/providers/kubernetes -v -run TestAccKubernetesPod_basic -timeout 120m
=== RUN   TestAccKubernetesPod_basic
--- FAIL: TestAccKubernetesPod_basic (3.24s)
	testing.go:280: Step 0 error: After applying this step, the plan was not empty:
		
		DIFF:
		
		UPDATE: kubernetes_pod.test
		  spec.0.containers.0.resources.#:                "1" => "0"
		  spec.0.containers.0.resources.0.limits:         "" => ""
		  spec.0.containers.0.resources.0.requests:       "" => ""
		  spec.0.containers.0.volume_mounts.#:            "1" => "0"
		  spec.0.containers.0.volume_mounts.0.mount_path: "/var/run/secrets/kubernetes.io/serviceaccount" => ""
		  spec.0.containers.0.volume_mounts.0.name:       "default-token-ndgzz" => ""
		  spec.0.containers.0.volume_mounts.0.read_only:  "true" => "false"
		  spec.0.containers.0.volume_mounts.0.sub_path:   "" => ""
		  spec.0.volumes.#:                               "1" => "0"
		  spec.0.volumes.0.aws_elastic_block_store:       "" => ""
		  spec.0.volumes.0.azure_disk:                    "" => ""
		  spec.0.volumes.0.azure_file:                    "" => ""
		  spec.0.volumes.0.ceph_fs:                       "" => ""
		  spec.0.volumes.0.cinder:                        "" => ""
		  spec.0.volumes.0.fc:                            "" => ""
		  spec.0.volumes.0.flex_volume:                   "" => ""
		  spec.0.volumes.0.flocker:                       "" => ""
		  spec.0.volumes.0.gce_persistent_disk:           "" => ""
		  spec.0.volumes.0.glusterfs:                     "" => ""
		  spec.0.volumes.0.host_path:                     "" => ""
		  spec.0.volumes.0.iscsi:                         "" => ""
		  spec.0.volumes.0.name:                          "default-token-ndgzz" => ""
		  spec.0.volumes.0.nfs:                           "" => ""
		  spec.0.volumes.0.persistent_volume_claim:       "" => ""
		  spec.0.volumes.0.photon_persistent_disk:        "" => ""
		  spec.0.volumes.0.quobyte:                       "" => ""
		  spec.0.volumes.0.rbd:                           "" => ""
		  spec.0.volumes.0.vsphere_volume:                "" => ""
		
		STATE:
		
		kubernetes_pod.test:
		  ID = default/tf-acc-pod-test-wm4gasjx7y
		  metadata.# = 1
		  metadata.0.annotations.% = 0
		  metadata.0.generate_name = 
		  metadata.0.generation = 0
		  metadata.0.labels.% = 1
		  metadata.0.labels.app = pod_label
		  metadata.0.name = tf-acc-pod-test-wm4gasjx7y
		  metadata.0.namespace = default
		  metadata.0.resource_version = 336409
		  metadata.0.self_link = /api/v1/namespaces/default/pods/tf-acc-pod-test-wm4gasjx7y
		  metadata.0.uid = ae63549f-1f7b-11e7-a206-22422a00791b
		  spec.# = 1
		  spec.0.active_deadline_seconds = 0
		  spec.0.containers.# = 1
		  spec.0.containers.0.args.# = 0
		  spec.0.containers.0.command.# = 0
		  spec.0.containers.0.image = nginx:1.7.9
		  spec.0.containers.0.image_pull_policy = IfNotPresent
		  spec.0.containers.0.name = cname
		  spec.0.containers.0.ports.# = 1
		  spec.0.containers.0.ports.0.container_port = 9080
		  spec.0.containers.0.ports.0.host_ip = 
		  spec.0.containers.0.ports.0.host_port = 0
		  spec.0.containers.0.ports.0.name = 
		  spec.0.containers.0.ports.0.protocol = TCP
		  spec.0.containers.0.resources.# = 1
		  spec.0.containers.0.volume_mounts.# = 1
		  spec.0.containers.0.volume_mounts.0.mount_path = /var/run/secrets/kubernetes.io/serviceaccount
		  spec.0.containers.0.volume_mounts.0.name = default-token-ndgzz
		  spec.0.containers.0.volume_mounts.0.read_only = true
		  spec.0.containers.0.volume_mounts.0.sub_path = 
		  spec.0.containers.0.working_dir = 
		  spec.0.dns_policy = ClusterFirst
		  spec.0.host_ipc = false
		  spec.0.host_network = false
		  spec.0.hostname = 
		  spec.0.image_pull_secrets.# = 0
		  spec.0.node_name = minikube
		  spec.0.node_selector.% = 0
		  spec.0.restart_policy = Always
		  spec.0.service_account_name = default
		  spec.0.subdomain = 
		  spec.0.termination_grace_period_seconds = 30
		  spec.0.volumes.# = 1
		  spec.0.volumes.0.aws_elastic_block_store.# = 0
		  spec.0.volumes.0.azure_disk.# = 0
		  spec.0.volumes.0.azure_file.# = 0
		  spec.0.volumes.0.ceph_fs.# = 0
		  spec.0.volumes.0.cinder.# = 0
		  spec.0.volumes.0.fc.# = 0
		  spec.0.volumes.0.flex_volume.# = 0
		  spec.0.volumes.0.flocker.# = 0
		  spec.0.volumes.0.gce_persistent_disk.# = 0
		  spec.0.volumes.0.glusterfs.# = 0
		  spec.0.volumes.0.host_path.# = 0
		  spec.0.volumes.0.iscsi.# = 0
		  spec.0.volumes.0.name = default-token-ndgzz
		  spec.0.volumes.0.nfs.# = 0
		  spec.0.volumes.0.persistent_volume_claim.# = 0
		  spec.0.volumes.0.photon_persistent_disk.# = 0
		  spec.0.volumes.0.quobyte.# = 0
		  spec.0.volumes.0.rbd.# = 0
		  spec.0.volumes.0.vsphere_volume.# = 0
FAIL
exit status 1
FAIL	github.com/hashicorp/terraform/builtin/providers/kubernetes	3.310s

@ashishth09 (Contributor Author)

Hi @radeksimko,
I made some more changes, mainly setting the Computed/ForceNew attributes according to my understanding of the APIs and some testing, but my earlier problem mentioned above remains.
When a Pod is created, the default secret is made available to it via volumes.
For example, the snippet below is obtained when describing a pod.

  default-token-ndgzz:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-ndgzz

So when the user specifies some volumes to be mounted in the tf file, what comes back from the API read is a Pod with two volumes: the one the user specified and the one k8s added by default.

So I was thinking of looping over the user's volumes from the tf configuration and just checking that each one is present in the volumes returned by the API during the Read phase, without setting them back in the schema (as the schema already has the user input).

Is there any other way to deal with cases like this, where some resources get attached/created over and above the user input?

@radeksimko radeksimko self-assigned this Apr 27, 2017
@radeksimko (Member) left a comment

Hi @ashishth09,
thank you very much for the PR. It was nearly at the top of my todo list. 😄

As you might know, this is the biggest resource (in terms of LOC) and possibly the most critical one in K8S. I'll try to keep reviewing the changes you make in a timely manner, but expect a few more feedback cycles on this PR as we get closer to merge time 😃

Overall the PR is looking like a pretty good start 👍 Out of interest - did you do this all by hand or did you use https://github.com/radeksimko/terraform-gen ?

So, I was thinking to loop over the user input of volumes in the tf configuration and just check it is present in the volumes obtained from the API in the Read operation phase without setting back the schema ( as schema already has the user input).

That is pretty much what we'll have to do. I did something similar for internal K8S annotations already. The filtering shouldn't be too hard; the tricky part is reliably checking whether a given volume or container is "internal" while avoiding ignoring any user-defined volumes and containers.

What about first filtering all volumes whose name is prefixed with default-token- and whose Type=Secret, and then double-checking via the Secrets API that the secret has this annotation?

kubernetes.io/service-account.name: default

The user shouldn't ever be defining such an annotation - at least we don't allow internal k8s annotations to be defined by the user in kubernetes_secret, and users generally shouldn't be touching annotations unless they're experimenting with alpha features.
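
A rough sketch of that check (not code from this PR; the v1 import path and the idea of passing the backing secret's annotations in are assumptions - the caller would look the secret up via the Secrets API first):

func isDefaultServiceAccountVolume(v v1.Volume, secretAnnotations map[string]string) bool {
	// Only volumes named default-token-* that are backed by a Secret qualify.
	if !strings.HasPrefix(v.Name, "default-token-") || v.Secret == nil {
		return false
	}
	// Double-check that the backing secret really belongs to the default service account.
	name, ok := secretAnnotations["kubernetes.io/service-account.name"]
	return ok && name == "default"
}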

Also, once we're happy with the implementation we'll need some docs. Feel free to use the mentioned generator for bootstrapping them.

}
in := p[0].(map[string]interface{})

if v, ok := in["active_deadline_seconds"].(*int64); ok {
Member:

I think the expected type here is int (without the pointer).
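
For illustration, the expansion with the plain int assertion might look roughly like this (obj standing in for the PodSpec being built is an assumption):

if v, ok := in["active_deadline_seconds"].(int); ok && v > 0 {
	// Terraform stores this attribute as a plain int; the K8S API expects *int64.
	ads := int64(v)
	obj.ActiveDeadlineSeconds = &ads
}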

Elem: &schema.Resource{
Schema: containerSchema(),
},
},
Member:

It looks like the pod spec schema is missing a few fields, specifically init_containers, host_pid and security_context. Is there any specific reason you left these out?

att["active_deadline_seconds"] = *in.ActiveDeadlineSeconds
}
if in.DNSPolicy != "" {
att["dns_policy"] = in.DNSPolicy
Member:

I think we'll need to cast to string here.

Contributor Author:

It looks like Terraform core is somehow taking care of serializing to the right type? It seems to work magically. I will check this again and add the cast.
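
For reference, the explicit cast being suggested is a one-liner, since DNSPolicy is a defined string type in the K8S API:

// Convert the typed string (api.DNSPolicy) to a plain string for the schema.
att["dns_policy"] = string(in.DNSPolicy)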

att["node_selector"] = in.NodeSelector
}
if in.RestartPolicy != "" {
att["restart_policy"] = in.RestartPolicy
Member:


if len(in.NodeSelector) > 0 {
att["node_selector"] = in.NodeSelector
}
Member:

I think this is duplicate code - we have NodeSelector already above.

}
}
}
`, name)
Member:

This is a good start, but we'll certainly need better test coverage - especially because the pod is such a critical resource in the context of K8S.

A few examples on my mind:

  • TestAccKubernetesPod_withVolumes
  • TestAccKubernetesPod_withVolumeMounts
  • TestAccKubernetesPod_withInitContainers

We might both come up with more as we test this and observe some failure scenarios.

if v.PersistentVolumeClaim != nil {
obj["persistent_volume_claim"] = flattenPersistentVolumeClaimVolumeSource(v.PersistentVolumeClaim)
}
//More values needed here
Member:

Have a look at flattenPersistentVolumeSource - I think we'll need something similar here. e.g.

if v.GCEPersistentDisk != nil {
	obj["gce_persistent_disk"] = flattenGCEPersistentDiskVolumeSource(v.GCEPersistentDisk)
}

//Is this right way to do this
//p := expandPersistentVolumeSource([]interface{}{v})
//vl[i].AWSElasticBlockStore = p.AWSElasticBlockStore
//vl[i].AzureDisk = p.AzureDisk
Member:

I think we won't be able to leverage expandPersistentVolumeSource here because it's a different type (Volume vs PersistentVolumeSource) and the VolumeSource is embedded directly within Volume.

The best way to do this is probably to steal some code from there and copy it here. Fortunately all of the source-specific expanders should work just fine, i.e. I'd expect something like this:

if v, ok := in["gce_persistent_disk"].([]interface{}); ok && len(v) > 0 {
	vl[i].GCEPersistentDisk = expandGCEPersistentDiskVolumeSource(v)
}

}
vl := make([]api.Volume, len(volumes))
for i, c := range volumes {
v := c.(map[string]interface{})
@radeksimko (Member), May 1, 2017:

Just for clarity, it would be nice to call it m (as in "map") rather than v. That frees up that variable name below, where you can use it for the "value".

While "value" is not a great variable name, the map here is more significant because it is used across more lines - as opposed to 3-line conditionals where it matters less.

},
"containers": {
Type: schema.TypeList,
Optional: true,
Member:

It looks like this field cannot be updated, so I think we'll need to mark it as such via ForceNew: true.

Contributor Author:

Deployments, replication controllers, etc. may reuse the pod's schema, and they do allow containers to be updated - sort of: they destroy and create a new set of containers internally. So I thought it would be better to leave this as it is, since adding ForceNew and reusing the same schema in a deployment would force the whole deployment to be destroyed. I am not sure how you would typically handle this - perhaps by not making the deployment depend on the pod schema? That would lead to a lot of code duplication though.

Member:

🤔 I'm not sure I understand entirely, but we either need to perform a real update/patch API operation for this field or mark it as ForceNew - there is no other expectation from the user: they either want it updated or recreated, certainly not ignored.

I will bump the kubernetes_replication_controller resource in priority to better understand how it is affected by pod in this context, but I won't get to it until Friday as I'm off to Google Next on Wed+Thu.

@radeksimko (Member), May 2, 2017:

Some food for thought:
I wonder if it's possible to reference a previously created PodTemplate via ObjectMeta:
https://github.com/kubernetes/kubernetes/blob/v1.5.3/pkg/api/types.go#L1968

Not that it would remove the coupling between a potential kubernetes_pod_template and kubernetes_pod, but at least it would be more obvious for these resources, and more importantly recreation wouldn't hurt if we could then just update the reference in kubernetes_replication_controller rather than the whole spec.

Recreating the whole RC just to update a PodSpec is indeed overkill - I agree with you here.

@radeksimko radeksimko added the waiting-response An issue/pull request is waiting for a response from the community label May 1, 2017
@ashishth09 (Contributor Author)

Hi @radeksimko,

I was waiting for your reply 😄 In the meantime I fixed most of the issues you mentioned. Right, testing is going to be key here - the schema has tons of fields. In some cases I skipped fields, as you noticed; I just wanted to see this up and running with the very basic things working first. No other reason for leaving them out 😉

I will update the PR soon with the new changes.

Thanks for the inputs and pointers. I didn't know about https://github.com/radeksimko/terraform-gen - it's going to be a great tool to use 👍

@ashishth09 force-pushed the provider/kubernetes/pod branch 3 times, most recently from 543ee12 to 11959a5 on May 8, 2017 16:40
@ashishth09 (Contributor Author)

Hi @radeksimko,
Before I add more changes, I'd like to get the versioning right.
I am following the v1.5 APIs, since the main provider.go has referred to them from the very beginning, so I assume that is the version we are supporting. For example, a v1.5 pod doesn't have initContainers, while v1.6 has it in addition to other fields that are not present in v1.5.
https://kubernetes.io/docs/api-reference/v1.5/#podspec-v1

@radeksimko (Member)

Hi @ashishth09,
that's a very good point. For the initial phase (until we get all stable K8S resources covered) I'd like to stick with 1.5.3. We can deal with 1.6+ later.

I was following the vendored K8S source code, and that does contain initContainers 🤔
https://github.com/kubernetes/kubernetes/blob/v1.5.3/pkg/api/types.go#L1776-L1777

My Google-fu allowed me to discover https://kubernetes.io/docs/concepts/workloads/pods/init-containers/ which says

NOTE: This feature is beta in Kubernetes 1.5.

so feel free to leave this one out as we only aim to cover stable fields/resources - sorry for the confusion.

@radeksimko (Member)

@ashishth09 I just spent some time reading more about Pods, PodTemplates, and RCs, and you're right in the sense that containers cannot be added or removed on existing pods, whereas RCs can update the attached pod with pretty much no limitations.

The TemplateRef attribute was removed in the early days of K8S and hasn't been added back since (kubernetes/kubernetes#8980), so that won't help us here.

In addition to that, the schema library doesn't have a way to express this behaviour cleanly (i.e. we have not implemented "ForceNewWhen" yet). To keep things simple I'd just fall back to the API validation and let invalid updates (ones which add or remove whole containers) pass through.
It's not ideal, but better than marking everything as ForceNew, as you mentioned. In other words, leave it as is.

I will go through the PR again and see if there's anything left. Based on your recent updates I'm assuming it's ready for another round of review.

@radeksimko (Member) left a comment

This is starting to look better 🙂

Besides what I mentioned above, I see that both acceptance tests are currently failing.

Let me know if you have any questions or need any help.
Thank you for all the work so far.

Default: false,
Description: `Run container in privileged mode. Processes in privileged containers are essentially equivalent to root on the host.`,
},
"read_only_root_filesystem ": {
Member:

I think there's a forgotten trailing space here.

}
log.Printf("[INFO] Pod %s created", out.Name)

d.SetId(buildId(out.ObjectMeta))
Member:

It would be safer to keep this line above the retry logic so that if the pod fails to reach "Running" for any reason we can keep it in the state and let the user remove it.
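
A minimal sketch of that ordering (not the PR's actual code; waitForPodRunning is a hypothetical helper wrapping the retry/wait logic):

log.Printf("[INFO] Pod %s created", out.Name)

// Record the ID first, so that if the pod never reaches Running it is still
// tracked in state and the user can taint or destroy it.
d.SetId(buildId(out.ObjectMeta))

// waitForPodRunning is assumed to poll until the pod reaches the Running phase.
if err := waitForPodRunning(conn, out.Namespace, out.Name, 5*time.Minute); err != nil {
	return err
}

return resourceKubernetesPodRead(d, meta)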

namespace, name := idParts(d.Id())
ops := patchMetadata("metadata.0.", "/metadata/", d)
if d.HasChange("spec") {
specOps, err := patchPodSpec("/spec", "spec", d)
Member:

Would you mind including the 0th index in the prefix too, so that we don't hide it in the implementation? It may sound like a nitpick, but it also keeps patchPodSpec reusable in case we change the parent schema in the future.

return fmt.Errorf("Failed to marshal update operations: %s", err)
}

log.Printf("[INFO] Updating pod%s: %s", d.Id(), ops)
Member:

Missing space here. 👓

return err
}

err = d.Set("spec", flattenPodSpec(pod.Spec, userSpec))
Member:

I'm not sure I understand this - why exactly do you need the config (userSpec) here? We should be able to filter out the internal volumes without it, I think - unless I'm missing something 🤔

obj := map[string]interface{}{}

if !userVolumeNames[v.Name] {
//Found a volume which was not provided by user in the tf configuration file and must be internal
Member:

This is a bit of a tricky situation. It is a convention and an expectation from the user that Terraform will detect any drift between the config and reality - i.e. if someone adds a volume outside of Terraform (e.g. by calling the K8S API directly or just by using kubectl) we need to be able to show that to the user.

At the moment, with the current implementation, we'd hide it, since we're treating the config as the single point of truth.

Also, we would be wiping those internal volumes on each update, which makes me think that DiffSuppressFunc would be a good candidate for this - instead of having the filtering logic in the flattener and expander.

Is there any reason you want to avoid the proposed solution I mentioned earlier?

Contributor Author:

I had this locally on my laptop from before I asked for help 😄 so I am yet to implement the logic you suggested. That will also take care of the userSpec question in the review comment above.

return vl, nil
}

func expandPodTemplateSpec(p []interface{}) (v1.PodTemplateSpec, error) {
Member:

As far as I can see this function is currently unused? 👀 I'm happy for you to help with the implementation of RC 😃, but I'd prefer to keep this PR focused only on Pod.

Value: v.Image,
})
}

Member:

I'm assuming there's still some work left to do in terms of missing fields?

Contributor Author:

I think we can only update active_deadline_seconds and the containers' images in a Pod.

es = append(es, fmt.Errorf("%s must be greater than 0", key))
}
return
}
Member:

Since they're all the same, how would you feel about merging all the above validation functions into one and calling it e.g. validatePositiveInteger?
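
A sketch of such a merged validator, assuming the standard helper/schema ValidateFunc signature:

func validatePositiveInteger(value interface{}, key string) (ws []string, es []error) {
	v, ok := value.(int)
	if !ok {
		es = append(es, fmt.Errorf("%s must be an integer", key))
		return
	}
	if v <= 0 {
		es = append(es, fmt.Errorf("%s must be greater than 0", key))
	}
	return
}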

@ashishth09 (Contributor Author)

Hi @radeksimko,

Thanks for the review comments. Right, there is still some work to do in terms of missing fields - mainly the container-level ones now; I think I have the Pod-level ones covered, but I will recheck.

I do have those tests passing though 😄 I am running them against minikube on my system.

go generate $(go list ./... | grep -v /terraform/vendor/)
2017/05/10 21:58:20 Generated command/internal_plugin_list.go
TF_ACC=1 go test ./builtin/providers/kubernetes -v -run TestAccKubernetesPod_basic -timeout 120m
=== RUN   TestAccKubernetesPod_basic
--- PASS: TestAccKubernetesPod_basic (3.26s)
PASS
ok  	github.com/hashicorp/terraform/builtin/providers/kubernetes	3.328s
==> Checking that code complies with gofmt requirements...
go generate $(go list ./... | grep -v /terraform/vendor/)
2017/05/10 21:59:30 Generated command/internal_plugin_list.go
TF_ACC=1 go test ./builtin/providers/kubernetes -v -run TestAccKubernetesPod_with_pod_security_context -timeout 120m
=== RUN   TestAccKubernetesPod_with_pod_security_context
--- PASS: TestAccKubernetesPod_with_pod_security_context (1.51s)
PASS
ok  	github.com/hashicorp/terraform/builtin/providers/kubernetes	1.586s

@radeksimko (Member)

@ashishth09
Copy link
Contributor Author

@radeksimko, thanks for putting that up. I will take a look.

@ashishth09 (Contributor Author)

@radeksimko,

I see the following lines in the gist:

		  spec.0.containers.0.resources.0.limits.% = 0
		  spec.0.containers.0.resources.0.requests.% = 1
		  spec.0.containers.0.resources.0.requests.cpu = 100m

The test case doesn't set those values.
I hope I am running the same test as you are 🤔

I need to check whether some default values are being set internally by k8s 😃 based on rules at the cluster level itself.
I know limit ranges can be set at the namespace level, so I need to check whether those would also come back as defaults during the read operation.

For requests I am unsure though - I don't think k8s will set any default value for requests, as those should come from the user.
Anyway, I will look into this in some detail. At this point it looks like our clusters might be prepped with different rules (limit ranges etc.), which might help find subtle bugs - which is good 😄

@ashishth09 (Contributor Author)

Hi @radeksimko,

So I created a limit range on a namespace and ran the test in the same namespace, and I was able to reproduce the issue. Now the problem is how to solve it 😃

The resources block can have limits and requests fields, which can get default values if someone has created a LimitRange resource on the cluster.

A Pod's container resources can't be updated, so no one can add/remove resources for a container outside Terraform either; hence no drift can occur from outside.

So should I just check whether the input value for resources in the tf file is the same as what we get when reading back the pod, and skip persisting it if it differs?

@ashishth09 (Contributor Author)

Hi @radeksimko,
I also found another case like this:

	DIFF:
		
		UPDATE: kubernetes_pod.test
		  spec.0.image_pull_secrets.#:      "1" => "0"
		  spec.0.image_pull_secrets.0.name: "bluemix-default-secret" => ""
		
		STATE:
		
		kubernetes_pod.test:
		  ID = default/tf-acc-test-fd11k346y7
		  metadata.# = 1
		  metadata.0.annotations.% = 0
		  metadata.0.generate_name = 
		  metadata.0.generation = 0

So if image_pull_secrets is not defined in the user's tf file but someone has defined the secret at the cluster level, then when we read back we get those values, which we should ignore.

So at this point we need to handle:

  • image_pull_secrets from the pod
  • resources from containers, as described in the previous comment.

My first thought is to just persist them if they are present in the tf file and ignore them otherwise.
Please let me know your thoughts.

@radeksimko (Member)

Hi @ashishth09 👋
see my answers below ⬇️

Containers resources can't be updated for a Pod. So no one can add/remove resources for a container outside terraform also. Hence no drift can be seen from outside as well.

The fact that part of a resource is not updatable doesn't mean that drift cannot occur - someone can still recreate the resource under the same name with different attributes. More importantly, we usually stick to this convention (of always detecting any possible drift) because it's how the most basic import mechanism (passthrough) works: it just calls Read(), which should populate all fields from the API without relying on any config.

See how we verify this behaviour in other resources by grepping for the _importBasic tests.
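
For reference, the usual shape of such an import test looks roughly like the sketch below (the config and check helper names are assumptions, not necessarily what this PR ends up with; helper/resource and helper/acctest are assumed to be imported in the test file):

func TestAccKubernetesPod_importBasic(t *testing.T) {
	resourceName := "kubernetes_pod.test"
	podName := fmt.Sprintf("tf-acc-test-%s", acctest.RandString(10))

	resource.Test(t, resource.TestCase{
		PreCheck:     func() { testAccPreCheck(t) },
		Providers:    testAccProviders,
		CheckDestroy: testAccCheckKubernetesPodDestroy,
		Steps: []resource.TestStep{
			{
				// Create the pod from config first...
				Config: testAccKubernetesPodConfigBasic(podName),
			},
			{
				// ...then import it and verify the state matches what Read() returns.
				ResourceName:      resourceName,
				ImportState:       true,
				ImportStateVerify: true,
			},
		},
	})
}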

My first thought is to just persist them if they are there in the tf file else ignore.

For all of the cases where drift may occur because there are some internal "subresources", the preferred approach is to detect such internal subresources and filter them out without checking any user config.

In this specific case I'd suggest we look up some details about the bluemix-default-secret secret via the K8S secrets API and verify that it's a default secret (there is hopefully something special about it which differentiates it from user-defined secrets).

I hope it makes sense. 😃

@ashishth09 (Contributor Author)

Hi @radeksimko,

Sure, I get your drift (pun intended 😃). Before proceeding any further to fine-tune things, I will need your review of image_pull_secrets and the containers' resource requirements.

From what I see, image_pull_secrets gets either the value you define, or, if you don't define one, whatever the service account has defined. So I made it Computed: true, Optional: true.

For the container resource requirements (https://kubernetes.io/docs/api-reference/v1.5/#resourcerequirements-v1) too: if there is a LimitRange defined on the cluster then its default values can be assigned as the containers' resource requirements; otherwise whatever the user provides is used. So I made the individual fields Computed: true, Optional: true. I am not sure if this is the right way.

@radeksimko (Member)

Hah, drift 😄

I will have a look tomorrow - I've been working on kubernetes_service_account over the past few days, which has very similar challenges to pod here: it gets a default secret created and assigned to it, which causes diffs.

I really struggled to reliably identify what's default (machine-managed) and what may come from the user, so I totally feel your pain.

@radeksimko (Member)

Hi @ashishth09,
just to give you a quick update - I spent some time looking for solutions to the drift problem mentioned above.

The cleanest approach which I think we should take is to enforce this new flag when creating the pod:
kubernetes/website@c1b88aa#diff-2c0d882359718c4c69c67099bed2156cR38

Unfortunately it was added in 1.6 so I'll need to bump the vendored k8s repo to support this. I thought originally it could be avoided until we finish implementing all v1 resources, but this seems like a good reason to do it sooner.
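
For context, once the vendored API is bumped, enforcing the flag is a small change when building the PodSpec - a sketch, assuming the 1.6 v1 types:

// Prevent kubelet from injecting the default service-account token volume,
// so Read() never sees volumes the configuration doesn't know about.
automount := false
spec.AutomountServiceAccountToken = &automount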

I will hold off on submitting Service Account because of the same problem, and RC because it will reuse part of the schema from your PR here. I will see if I can finish kubernetes_horizontal_pod_autoscaler, as that would be the last v1 resource we can cover prior to upgrading to 1.6.

Thank you for the effort so far & for the patience. I hope it all makes sense? 😅

@radeksimko (Member)

^ That is to address the secrets drift - I haven't looked into the other problem you mentioned about limit ranges yet.

@ashishth09 (Contributor Author)

I think by enforcing that flag we are giving up "You can access the API from inside a pod using automatically mounted service account credentials." I am not sure this is something users would need frequently - probably only when they need to debug something, or maybe never 😃 So we should be fine turning auto-mounting off.

I think enforcing that flag would only work for the "Mountable secrets", as seen in the output below:

kubectl  describe sa
Name:		default
Namespace:	default
Labels:		<none>

Image pull secrets:	myregistrykey

Mountable secrets: 	default-token-ndgzz

Tokens:            	default-token-ndgzz

So for a pod, I need to know whether an image_pull_secret is internal or not, and I don't think we can find that out without taking the user input into consideration.

Case 1) User specifies ["myregistrykey"]. We persist it even though it matches the service account "Image pull secrets".

Case 2) User specifies ["abcde"]. That secrets exists in the same namespace but it is not part of the current service account. We need to persist it.

@ashishth09 (Contributor Author)

Hi @radeksimko,

I think I will take this opportunity to learn the correct usage of Optional and Computed both set to true on a schema parameter. My understanding is that if something can be assigned server-side by the API even when the user doesn't provide a value, then it should be Optional + Computed in the schema.

Isn't the Image pull secrets field above exactly that? I mean, if I don't provide one in the tf file then one gets assigned if one is set at the service account level; otherwise it uses what the user provides.

So I don't have to worry if something like "myregistrykey" shows up in the Terraform state even though the tf configuration file doesn't have it, because I made the field Optional + Computed.
Let me know if I am missing something.

Thanks

@radeksimko (Member)

@ashishth09 you're right, but this rule only applies when the optional user input replaces the default value completely, not when the default value is something you add on top of. There's no such thing as "partially optional" - like half of the map/list being computed and half defined by the user. If a field is Optional & Computed, it just means it can be either computed or user-provided, but never both at the same time.

In other words - even if you specify your own image pull secrets, AFAIK the kubelet will assign the default secrets anyway, and you'd then be fighting the kubelet by wiping that default secret on every subsequent apply, as it would keep appearing in the diff.

Correct me if I'm wrong, but I remember looking at that specific behaviour a few weeks back and it behaved that way.


Re my current state: I'm dealing with vendoring issues with k8s 1.6.4, but overall the resource coverage (after adding service + HPA) is now as far as we can get without this upgrade, so the upgrade is my main focus.

@ashishth09 (Contributor Author)

@radeksimko,

Sure - I also had very little time last week to make any progress.
I just tried creating a secret with kubectl create secret docker-registry regsecret --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email> and used it in the tf configuration.

It picked up only that secret, rather than the default secret defined on the service account. There was only one secret - the one given by the user. So it looks like it really is Optional + Computed, holding only one of them at a time. Thanks for confirming the definition 😃 While testing I also found a bug in my code; I will push a fix today.

While we are at it 😃 (and I am being selfish, trying to learn things from you), I was wondering how Optional + Computed fields behave. Since the field is computed, if someone changes it outside Terraform and it is not present in the tf file, we will not know about the change. Is that fine? I sense some drift occurring here.

@radeksimko (Member)

Since the field is computed, if someone changes it outside Terraform and it is not present in the tf file, we will not know about the change.

If the field is not defined in the config then yes - it will behave exactly that way. That's why it's best to avoid Computed if it's not necessary and/or use Default instead where possible.

@radeksimko (Member)

FYI - I just opened #14923 and verified that it actually fixes our problem with pod here. Feel free to subscribe to that PR to get notified when it's merged - then you can rebase, avoid all the complex logic for mounted secrets, and we can have the discussion & debugging around the limit ranges.

Also, you were right that this doesn't solve the similar problem with the service account. I have just raised a question about that on SO: https://stackoverflow.com/questions/44262350/how-to-avoid-default-secret-being-attached-to-serviceaccount but I have a feeling it may not be possible and we'll probably end up filing an issue upstream (K8S).

@radeksimko (Member)

@ashishth09 The PR upgrading to 1.6.1 was merged and I'd like to get this PR to the finish line and eventually help you take it there next week, if possible. Let me know how you see it. 😃

If you don't have any time next week, that's 👌 as well, just let me know and I can finish it for you (and keep the authorship of existing commits, obvs).

@ashishth09 (Contributor Author)

Hi @radeksimko,
Just changed the API references and all the tests passed 👍 I don't want to slow this down due to my unavailability, so please take this forward. Thanks for all the support so far - greatly appreciated 👍

Two things (apart from some new fields in the new API) that I feel are pending:
a) the limit ranges issue
b) Even though a pod's only updatable fields are the container images and activeDeadlineSeconds, I think we still need to handle the other fields in the patchPodSpec function, so that if someone tries to update another field they at least get an error from the API saying it is not updatable.

I would be interested to know how we handle the limit range issue though. I had made those fields optional and computed.
Thanks

@radeksimko radeksimko removed the waiting-response An issue/pull request is waiting for a response from the community label Jun 6, 2017
@radeksimko (Member)

I will take care of the extra updatable fields, but I'm struggling to reproduce the issue with limit ranges 🤔 It looks like either the behaviour has changed in 1.6 or I've forgotten the exact configuration needed to trigger the problem.

I tried this:

resource "kubernetes_namespace" "test" {
  metadata {
  	name = "blablah"
  }
}

resource "kubernetes_pod" "test" {
  metadata {
    name = "radek-pod-test"
    namespace = "${kubernetes_namespace.test.metadata.0.name}"
  }

  spec {
    containers {
      image = "nginx:1.7.9"
      name  = "containername"
    }
  }
  depends_on = ["kubernetes_limit_range.test"]
}

resource "kubernetes_limit_range" "test" {
  metadata {
    name = "radek-test-lr"
    namespace = "${kubernetes_namespace.test.metadata.0.name}"
  }
  spec {
    limit {
      type = "Pod"
      min {
        cpu = "1m"
        memory = "1M"
      }
    }
    limit {
      type = "Container"
      default_request {
        cpu = "50m"
        memory = "24M"
      }
    }
  }
}

and saw no drift.

Would you mind sharing any more details about your setup and/or your configs? Thanks

@ashishth09 (Contributor Author) commented Jun 7, 2017

@radeksimko,

If there is a default request/limit set at the cluster level then K8s applies it to the Pod's resource requirements; if the Pod defines its own requirements then K8s applies those instead. Following this, I set limits/requests to be computed and the problem went away.

The change I made was:

func resourcesField() map[string]*schema.Schema {
	return map[string]*schema.Schema{
		"limits": {
			Type:        schema.TypeList,
			Optional:    true,
			Computed:    true,
			MaxItems:    1,
			Description: "Describes the maximum amount of compute resources allowed. More info: http://kubernetes.io/docs/user-guide/compute-resources/",
			Elem: &schema.Resource{
				Schema: map[string]*schema.Schema{
					"cpu": {
						Type:             schema.TypeString,
						Optional:         true,
						Computed:         true,
						ValidateFunc:     validateResourceQuantity,
						DiffSuppressFunc: suppressEquivalentResourceQuantity,
					},
					"memory": {
						Type:             schema.TypeString,
						Optional:         true,
						Computed:         true,
						ValidateFunc:     validateResourceQuantity,
						DiffSuppressFunc: suppressEquivalentResourceQuantity,
					},
				},
			},
		},
		"requests": {
			Type:     schema.TypeList,
			Optional: true,
			Computed: true,
			MaxItems: 1,
			Elem: &schema.Resource{
				Schema: map[string]*schema.Schema{
					"cpu": {
						Type:             schema.TypeString,
						Optional:         true,
						Computed:         true,
						ValidateFunc:     validateResourceQuantity,
						DiffSuppressFunc: suppressEquivalentResourceQuantity,
					},
					"memory": {
						Type:             schema.TypeString,
						Optional:         true,
						Computed:         true,
						ValidateFunc:     validateResourceQuantity,
						DiffSuppressFunc: suppressEquivalentResourceQuantity,
					},
				},
			},
		},
	}
}

I wanted your view on this change - whether you think that's the right way to do it or not 😃

My apologies for not being clearer on the topic 😞

@radeksimko (Member) commented Jun 7, 2017

@ashishth09 As far as I can tell limit ranges are namespace-specific:
https://kubernetes.io/docs/tasks/administer-cluster/cpu-memory-limit/ and resource quotas likewise: https://kubernetes.io/docs/concepts/policy/resource-quotas/

What do you mean by cluster-level limits and how does one set such limits?

@ashishth09 (Contributor Author)

@radeksimko,

Yes, you are right. The points I mentioned above apply to a namespace: any new pods created in that namespace go through this rule. If default limits are set and you have not defined any in the Pod, then the defaults are attached to the pod; otherwise the Pod's own limits are honored. This is what I observed.

I am not sure what happens when more than one default limit is defined in a namespace - whether K8s does some kind of max(limits1, limits2, ...) or min(limits1, limits2, ...) to determine what is applied if you don't provide one. Whatever the case, if we are sure this field is optional and computed then I guess we don't need to worry?

@radeksimko (Member)

I'd just like to understand how to reproduce the problem and possibly add a regression test - that's all 😄 - hence I'm wondering how you set such default cluster-wide limits. Is there any example with kubectl/YAML or anything like that you can share?

@ashishth09 (Contributor Author) commented Jun 7, 2017

No, there are no cluster-wide limits - as per the link you pasted in the previous comment, this works at the namespace level 😃
The problem came from https://gist.github.com/radeksimko/ee386a97e5cd809fe44b03fe373c1265#file-k8s-test-acc-failures-txt-L38-L40

When I first wrote the code I didn't know that the system applies default limits to the Pod.
I guess in your environment there were some default limits defined in the namespace you ran the test in at the time.

So, to get a similar environment at my end, I applied https://k8s.io/docs/tasks/configure-pod-container/limits.yaml to my local minikube cluster to test my fix of setting limits as optional + computed in the schema.

So you can't reproduce the same problem now 😃 Setting the value as computed along with optional solved it. I was just curious whether this was the right thing to do, or whether it might break under certain other conditions. My worry came mainly from the fact that the calculation of limits seems somewhat involved, if not very complicated. For example, there are unit conversions: 100m cpu is 100 millicpu, which is the same as 0.1 cpu. So I added a DiffSuppressFunc to handle this.
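
A sketch of what such a DiffSuppressFunc can look like, using the quantity parser from the K8S API machinery (the exact import path depends on the vendored version):

func suppressEquivalentResourceQuantity(k, old, new string, d *schema.ResourceData) bool {
	if old == "" || new == "" {
		return false
	}
	oldQ, err := resource.ParseQuantity(old)
	if err != nil {
		return false
	}
	newQ, err := resource.ParseQuantity(new)
	if err != nil {
		return false
	}
	// "100m" and "0.1" parse to the same CPU quantity, so the diff is suppressed.
	return oldQ.Cmp(newQ) == 0
}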

I think we would need to add some tests around this as well.

We should also consider how things behave when multiple limits are defined in the same namespace, as mentioned before. In general I think we might not need to worry about that, but there should be some test cases covering the internal unit conversion.

@ashishth09 (Contributor Author)

Just to add to that: initially I had no test specifically for limits, and the limits issue you showed in your gist still appeared. I was not setting any values in my test configuration (there was no limits block), but k8s set the default values and hence Terraform reported the diff below.

https://gist.github.com/radeksimko/ee386a97e5cd809fe44b03fe373c1265#file-k8s-test-acc-failures-txt-L38-L40

@radeksimko force-pushed the provider/kubernetes/pod branch from 16ef37d to 7906235 on June 8, 2017 08:23
@radeksimko (Member) commented Jun 8, 2017

So I added some DiffSuppressFunc to handle this.

This was certainly the right decision and I saw a test covering this, so that's 👌

I cannot reproduce the issue with drifting limits anymore, but I believe you, as I saw it in the past too 😄 so I'll keep the field Computed. The only downside of making fields Computed is that Terraform will ignore any out-of-band changes if the field is not defined in the config - e.g. if someone modifies the container resources of the pod, we won't notice unless they are explicitly defined in the config.

TL;DR If the field doesn't need to be computed it should not be.

I have pushed the following final touches to your branch:

  • documentation
  • tests to verify the pod is actually being deleted (which resulted in adding retry logic to Delete, as deleting a pod may take time)
  • test for import
  • removal of all volume filtering logic as it's unnecessary w/ AutomountServiceAccountToken set to false
  • modified security context in 2 tests because GKE (which is our only testing environment atm) doesn't support containers/processes running as root
  • renamed certain TypeList fields (unless it's the special case of MaxItems: 1) to singular, due to the way they're structured in HCL:
resource "kubernetes_pod" "test" {
  metadata {
    name = "terraform-example"
  }

  spec {
    container {
      image = "nginx:1.9.7"
      name  = "example"
    }
    container {
      image = "nginx:1.9.6"
      name  = "example-two"
    }
  }
}

I know there's another way to structure it with [], but it's not something we actively encourage people to do - at least not at this point.

@ashishth09 (Contributor Author)

Right - with the earlier code I used to write:

resource "kubernetes_pod" "test" {
  metadata {
    name = "terraform-example"
  }

  spec {
    containers = [{
      image = "nginx:1.9.7"
      name  = "example"
    },
      {
        image = "nginx:1.9.6"
        name  = "example-two"
      },
    ]
  }
}

@radeksimko changed the title from "[WIP] Add support for K8s pod" to "Add support for K8s pod" on Jun 8, 2017
@radeksimko radeksimko merged commit a5c208e into hashicorp:master Jun 8, 2017
@ghost commented Apr 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 11, 2020