
Executing a command inside the pod in Reconciliation loop #4302

Closed
psaini79 opened this issue Dec 11, 2020 · 11 comments

Comments

@psaini79

Type of question

Best practices

Question

What is the best way to run a command inside a Pod from the reconciliation loop?

What did you do?

I followed https://github.com/halkyonio/hal/blob/78f0b5ee8e27117b78fe9d6d5192bc5b04c0e5db/pkg/k8s/client.go and implemented the same approach in my reconciliation loop. It works, but I am wondering whether it is the correct way, since in my case the user cannot pass a kubeconfig location:

        reqLogger := log.WithValues("Request.Namespace", request.Namespace, "Request.Name", request.Name)
        reqLogger.Info("Reconciling TestOP")

        var kubeConfig clientcmd.ClientConfig
        var kubeClient kubernetes.Interface

        // Kube client config setup
        loadingRules := clientcmd.NewDefaultClientConfigLoadingRules()
        configOverrides := &clientcmd.ConfigOverrides{}
        kubeConfig = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(loadingRules, configOverrides)
        config, err := kubeConfig.ClientConfig()
        if err != nil {
                return reconcile.Result{}, err
        }
        kubeClient, err = kubernetes.NewForConfig(config)
        if err != nil {
                return reconcile.Result{}, err
        }

The following is the function that uses remotecommand.NewSPDYExecutor to execute the command:

// ExecCommand executes a command in the first container of the given pod
// and returns the command's stdout, or an error if the exec call fails.
func ExecCommand(podName string, cmd []string, kubeClient kubernetes.Interface, kubeConfig clientcmd.ClientConfig) (string, error) {

        var (
                execOut bytes.Buffer
                execErr bytes.Buffer
        )

        req := kubeClient.CoreV1().RESTClient().
                Post().
                Namespace("default").
                Resource("pods").
                Name(podName).
                SubResource("exec").
                VersionedParams(&corev1.PodExecOptions{
                        Command: cmd,
                        Stdout:  true,
                        Stderr:  true,
                }, scheme.ParameterCodec)

        config, err := kubeConfig.ClientConfig()
        if err != nil {
                return "", err
        }

        // Connect to the URL (constructed from req) using the SPDY protocol, which allows bidirectional streams.
        exec, err := remotecommand.NewSPDYExecutor(config, "POST", req.URL())
        if err != nil {
                return "", err
        }

        err = exec.Stream(remotecommand.StreamOptions{
                Stdout: &execOut,
                Stderr: &execErr,
                Tty:    false,
        })
        if err != nil {
                return execErr.String(), err
        }

        return execOut.String(), nil
}
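
For comparison, here is a minimal sketch of an alternative that does not require a kubeconfig path: controller-runtime's ctrl.GetConfigOrDie falls back to the in-cluster service-account config when the operator runs inside the cluster. The helper name below is illustrative, not from the SDK.

// kubeclient.go (sketch)
package controllers

import (
	"k8s.io/client-go/kubernetes"
	ctrl "sigs.k8s.io/controller-runtime"
)

// newKubeClient builds a clientset from the same REST config the manager
// uses. ctrl.GetConfigOrDie resolves the config from the --kubeconfig flag,
// the KUBECONFIG env var, the in-cluster service-account config, or
// ~/.kube/config, so the user does not have to pass a kubeconfig location.
func newKubeClient() (*kubernetes.Clientset, error) {
	cfg := ctrl.GetConfigOrDie()
	return kubernetes.NewForConfig(cfg)
}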

What did you expect to see?

I wanted to know what the best way is to read the kubeconfig inside an operator built with operator-sdk, so that we can execute a command inside the pod.

What did you see instead? Under which circumstances?

Environment

Operator type:

/language go

Kubernetes cluster type:

Testing/Deployment

$ operator-sdk version

operator-sdk version: "v1.2.0", commit: "215fc50b2d4acc7d92b36828f42d7d1ae212015c", kubernetes version: "v1.18.8", go version: "go1.15.3", GOOS: "linux", GOARCH: "amd64"

$ go version (if language is Go)

go version go1.15.5 linux/amd64

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.8", GitCommit:"336ccfb2560ff170c7ad8aad56944e43e4f6a170", GitTreeState:"clean", BuildDate:"2020-06-09T22:13:31Z", GoVersion:"go1.13.8 BoringCrypto", Compiler:"gc", Platform:"linux/amd64"}

Additional context

@openshift-ci-robot openshift-ci-robot added the language/go Issue is related to a Go operator project label Dec 11, 2020
@estroz estroz added the triage/needs-information Indicates an issue needs more information in order to work on it. label Dec 14, 2020
@estroz estroz added this to the Backlog milestone Dec 14, 2020
@jmrodri
Member

jmrodri commented Dec 14, 2020

@psaini79 I'd like to better understand what you're trying to accomplish. Based on the information above I see a few things.

  1. You are trying to execute a command on a pod from your operator. I'm assuming you are trying to run it on the operand pod.
  2. Why are you trying to execute a command on the pod? What is it you are trying to accomplish there?
  3. You stated, "I wanted to know what the best way is to read the kubeconfig inside operator-sdk so that we can execute a command inside the pod." Which kubeconfig are you trying to read? The operator-sdk CLI will use the one you have configured on your system, and the operator-sdk command isn't usually run in the cluster.

If you'd like to chat in real time, we can talk on the Kubernetes Slack in #kubernetes-operators.

@psaini79
Author

@jmrodri
Thanks for the reply. Below is my use case:

  1. I have a number of statefulsets created by the operator under a single CR, as they are all part of my application. E.g.:
apiVersion: example.com/v1
kind: ProvApp
metadata:
  name: provapp-sample
spec:
  myAppSpecs:
    - myAppName: testapp
      myAppSize: 1
      myAppSecretName: db-user-pass
    - myAppName: prodapp
      myAppStageSize: 50Gi
      myAppSize: 1
      myAppSecretName: secret-pass
  myRepoSpecs:
    - myRepoName: testrepo
      myRepoSecretName: secret-pass
  2. Whenever I delete testapp or any other statefulset from myAppSpecs, the operator logs in to the statefulset under myRepoSpecs (testrepo in this case) and cleans all the records of the deleted statefulset from it.

To achieve this, I need to execute a command on the repo statefulset pod (testrepo-0) to delete the records of testapp. What is the best way to achieve the above?

Sure, I will also connect with you on slack.

@coderanger

Mentioned on Slack but overall this is not something OSDK or Kubernetes provides. kubectl exec exists for debugging and development, but it's not really suitable for this kind of production-ready use case. You'll need to build something into your tooling for it. If you really want remote execution, you can run sshd in a sidecar. Or you can build an HTTP (REST, gRPC, whatever you like) API into your service to handle this. A lot of it depends on your security and stability requirements.
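
For illustration, a minimal sketch of the HTTP-API option mentioned above, assuming the repo pods expose a cleanup endpoint behind a Service; the service DNS name, port, and path below are hypothetical:

package controllers

import (
	"context"
	"fmt"
	"net/http"
)

// cleanupAppRecords asks the (hypothetical) repo service to delete the
// records belonging to appName, instead of exec'ing into the pod.
func cleanupAppRecords(ctx context.Context, appName string) error {
	// Hypothetical in-cluster Service DNS name, port, and endpoint path.
	url := fmt.Sprintf("http://testrepo.default.svc:8080/records/%s", appName)
	req, err := http.NewRequestWithContext(ctx, http.MethodDelete, url, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("cleanup of %s failed: %s", appName, resp.Status)
	}
	return nil
}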

@camilamacedo86
Contributor

camilamacedo86 commented Dec 14, 2020

Hi @psaini79,

Why not use finalizers? A finalizer lets you implement the actions/operations that should be done before your CR is effectively deleted. See: https://sdk.operatorframework.io/docs/building-operators/golang/advanced-topics/#handle-cleanup-on-deletion

Then, why not use the client provided in the controller to GET/LIST/DELETE/UPDATE all these resources as you wish in the finalizer, before the CR is deleted? For an example which indeed does HTTP requests, see here and here.
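
As a rough sketch of that pattern (the finalizer name, API types, and cleanup helper below are illustrative, not taken from the docs verbatim):

package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

	examplev1 "example.com/provapp/api/v1" // hypothetical module path
)

const provAppFinalizer = "example.com/finalizer" // illustrative finalizer name

type ProvAppReconciler struct {
	client.Client
}

// handleFinalizer adds the finalizer while the CR is alive, runs cleanup once
// a deletion timestamp appears, and then removes the finalizer so Kubernetes
// can finish deleting the CR.
func (r *ProvAppReconciler) handleFinalizer(ctx context.Context, cr *examplev1.ProvApp) (ctrl.Result, error) {
	if cr.ObjectMeta.DeletionTimestamp.IsZero() {
		if !controllerutil.ContainsFinalizer(cr, provAppFinalizer) {
			controllerutil.AddFinalizer(cr, provAppFinalizer)
			if err := r.Update(ctx, cr); err != nil {
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, nil
	}

	if controllerutil.ContainsFinalizer(cr, provAppFinalizer) {
		// Stand-in for the real cleanup, e.g. removing the app's records
		// from the repo before the CR goes away.
		if err := r.cleanupRepoRecords(ctx, cr); err != nil {
			return ctrl.Result{}, err
		}
		controllerutil.RemoveFinalizer(cr, provAppFinalizer)
		if err := r.Update(ctx, cr); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}

// cleanupRepoRecords is a hypothetical helper; its implementation (exec,
// HTTP call, etc.) is the subject of this issue.
func (r *ProvAppReconciler) cleanupRepoRecords(ctx context.Context, cr *examplev1.ProvApp) error {
	return nil
}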

@psaini79
Author

@camilamacedo86 and @coderanger

Thanks for the reply.

I discussed this with @coderanger on Slack, and it seems that while we can use the exec method from the operator, it should not be used, as exec is not the correct way to handle update and delete. Instead, I should use a sidecar container or a REST server specific to my app. A sidecar will not help in my case: the app runs on the testrepo statefulset, and I need to log in to the app and delete the record. A REST server could do this job, but it is extra work for me.

I am trying to understand how I can use the finalizer, as I am not deleting the CR but deleting a statefulset. For example:

apiVersion: example.com/v1
kind: ProvApp
metadata:
  name: provapp-sample
spec:
  myAppSpecs:
    - myAppName: testapp
      myAppSize: 1
      myAppSecretName: db-user-pass
    - myAppName: prodapp
      myAppStageSize: 50Gi
      myAppSize: 1
      myAppSecretName: secret-pass
    - myAppName: devapp
      myAppStageSize: 50Gi
      myAppSize: 1
      myAppSecretName: secret-pass
  myRepoSpecs:
    - myRepoName: testrepo
      myRepoSecretName: secret-pass

Let us say that, as per the above conf, I have 4 statefulsets (testapp, prodapp, devapp, and testrepo) up and running. The user then passes a new CR like the one below:

apiVersion: example.com/v1
kind: ProvApp
metadata:
  name: provapp-sample
spec:
  myAppSpecs:
    - myAppName: testapp
      myAppSize: 1
      myAppSecretName: db-user-pass
  myRepoSpecs:
    - myRepoName: testrepo
      myRepoSecretName: secret-pass

The operator logic reads the difference between the previous statefulset configuration and the new CR and performs the delete on prodapp and devapp. However, before doing any deletion, the operator connects to testrepo and executes a command to delete all their records and balance the data on testapp. Once that part is done, the r.Client.Delete function is called to delete the statefulsets.

I am maintaining all the statefulset configuration in the status struct along with their names, i.e. whenever there is a successful create operation, I record the statefulset name in the status struct, and when I find a difference between the status struct and the instance configuration, I execute a delete operation on the missing statefulsets.

Please let me know if I can use a finalizer per statefulset, and if yes, how?

@camilamacedo86
Contributor

Hi @psaini79,

If the user passes a new CR as shown above, then it means that you are changing/updating the CR. So, if you use the watch feature, it will retrigger the reconcile. The reconcile function is responsible for synchronizing the resources and their specifications according to the business logic implemented for them. In this way, it works like a loop, and it does not stop until all conditions match the implementation. So, could you not implement an idempotent solution that performs the required operations based on the CR state in the cluster?
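
For example, a rough sketch of such an idempotent pass, reusing the hypothetical ProvAppReconciler and cleanupRepoRecords helper from the finalizer sketch above (the spec fields and label are illustrative, not from the project):

package controllers // same hypothetical package as the sketch above

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"

	examplev1 "example.com/provapp/api/v1" // hypothetical module path
)

// reconcileStatefulSets converges the cluster towards the CR: anything the
// operator created that is no longer listed in spec gets cleaned up and
// deleted. Running it repeatedly is safe because it only acts on the diff.
func (r *ProvAppReconciler) reconcileStatefulSets(ctx context.Context, cr *examplev1.ProvApp) error {
	desired := map[string]bool{}
	for _, app := range cr.Spec.MyAppSpecs { // hypothetical spec field
		desired[app.MyAppName] = true
	}

	var existing appsv1.StatefulSetList
	if err := r.List(ctx, &existing,
		client.InNamespace(cr.Namespace),
		client.MatchingLabels{"app.kubernetes.io/managed-by": cr.Name}); err != nil { // illustrative label
		return err
	}

	for i := range existing.Items {
		sts := &existing.Items[i]
		if desired[sts.Name] {
			continue // still wanted; create/update is handled elsewhere
		}
		// Clean the repo records first, then delete the statefulset.
		if err := r.cleanupRepoRecords(ctx, cr); err != nil {
			return err
		}
		if err := r.Delete(ctx, sts); err != nil {
			return err
		}
	}
	return nil
}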

@nicklasfrahm

@camilamacedo86 I have been playing around with execution in the container via exec quite a lot recently, and I built myself the following helper. This might be good to add to the examples somewhere, or to make it part of the manager client:

// executors.go
package executors

import (
	"bytes"
	"net/http"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/remotecommand"
	"k8s.io/kubectl/pkg/scheme"
)

// Executor implements the remote execution in pods.
type Executor struct {
	KubeClient *kubernetes.Clientset
	KubeConfig *rest.Config
	Pod        types.NamespacedName
	Container  string
}

// ExecutorResult contains the outputs of the execution.
type ExecutorResult struct {
	Stdout bytes.Buffer
	Stderr bytes.Buffer
}

// NewExecutor creates a new executor from a kube config.
func NewExecutor(kubeConfig *rest.Config) Executor {
	return Executor{
		KubeConfig: kubeConfig,
		KubeClient: kubernetes.NewForConfigOrDie(kubeConfig),
	}
}

// Select configures the pod and container that commands are executed in.
func (e *Executor) Select(pod types.NamespacedName, container string) *Executor {
	e.Pod = pod
	e.Container = container
	return e
}

// Exec runs an exec call on the container without a shell.
func (e *Executor) Exec(command []string) (*ExecutorResult, error) {
	request := e.KubeClient.
		CoreV1().
		RESTClient().
		Post().
		Resource("pods").
		Namespace(e.Pod.Namespace).
		Name(e.Pod.Name).
		SubResource("exec").
		VersionedParams(&corev1.PodExecOptions{
			Command:   command,
			Container: e.Container,
			Stdout:    true,
			Stderr:    true,
			TTY:       true,
		}, scheme.ParameterCodec)

	result := new(ExecutorResult)
	exec, err := remotecommand.NewSPDYExecutor(e.KubeConfig, http.MethodPost, request.URL())
	if err != nil {
		return result, err
	}

	if err := exec.Stream(remotecommand.StreamOptions{Stdout: &result.Stdout, Stderr: &result.Stderr}); err != nil {
		return result, err
	}

	return result, nil
}

// main.go
...
	kubeConfig := ctrl.GetConfigOrDie()

	mgr, err := ctrl.NewManager(kubeConfig, ctrl.Options{
		Scheme:             scheme,
		MetricsBindAddress: metricsAddr,
		Port:               9443,
		LeaderElection:     enableLeaderElection,
		LeaderElectionID:   "example.com",
		Namespace:          "",
	})
	if err != nil {
		setupLog.Error(err, "Failed to start manager.")
		os.Exit(1)
	}

	if err = (&testcontroller.TestReconciler{
		Client:   mgr.GetClient(),
		Log:      ctrl.Log.WithName("controller").WithName("test"),
		Scheme:   mgr.GetScheme(),
		Executor: executors.NewExecutor(kubeConfig),
	}).SetupWithManager(mgr); err != nil {
		setupLog.Error(err, "Failed to create controller.", "controller", "test")
		os.Exit(1)
	}
...

The API is not perfect, because it does not support stdin, but it worked nicely for my use-case.
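
For context, a hypothetical call site inside Reconcile, assuming the Executor field wired up in main.go above (and that "k8s.io/apimachinery/pkg/types" is imported); the pod name, container, and command are illustrative:

	// Run a cleanup command in the repo pod and log its output.
	result, err := r.Executor.
		Select(types.NamespacedName{Namespace: "default", Name: "testrepo-0"}, "testrepo").
		Exec([]string{"sh", "-c", "cleanup-records testapp"}) // illustrative command
	if err != nil {
		return ctrl.Result{}, err
	}
	r.Log.Info("exec finished", "stdout", result.Stdout.String(), "stderr", result.Stderr.String())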

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 5, 2021
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 5, 2021
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci

openshift-ci bot commented Jun 4, 2021

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
