Introduce acsfleetctl binary #1074

johannes94 · 2023-06-06T07:56:40Z

Description

Move the Fleet-Manager CLI to its own binary instead of embedding it in the fleet-manager binary. Some motivating facts for this PR:

- The usage info for the CLI is massive because of all the flags added to the root fleet-manager command.
- Simple errors in CLI usage lead to panic logs with a massive stack trace. Crashing is fine for a service like fleet-manager serve, because we don't want it to start with wrong flags or configuration, but for the CLI it is unnecessary.
- Using glog.Fatal() on multiple occasions leads to unmanaged and unrecoverable panic output of stack traces and goroutines.

Slack thread: https://redhat-internal.slack.com/archives/C02HTD51SMD/p1685620429880799

Checklist (Definition of Done)

~~[ ] Unit and integration tests added~~
Added test description under Test manual
~~[ ] Documentation added if necessary (i.e. changes to dev setup, test execution, ...)~~
CI and all relevant tests are passing
~~[ ] Add the ticket number to the PR title if available, i.e. ROX-12345: ...~~
~~[ ] Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.~~
~~[ ] Add secret to app-interface Vault or Secrets Manager if necessary~~
~~[ ] RDS changes were e2e tested manually~~
~~[ ] Check AWS limits are reasonable for changes provisioning new resources~~

Test manual

Run local fleet-manager and test CLI commands against it:

make db/teardown db/setup db/migrate
make binary

# local crc cluster started and kubectl configured
./fleet-manager serve

export OCM_TOKEN=$(ocm token --refresh)
rhoas login --auth-url=https://auth.redhat.com/auth/realms/EmployeeIDP

./acsfleetctl centrals create --name "test-central-1" --region "standalone" --provider "standalone"
# Instance got created

./acsfleetctl centrals list
# Prints CentralRequestList with the instance created

central_id=$(./acsfleetctl centrals list | jq '.items[0].id' -r )
./acsfleetctl centrals get --id $central_id
# Print the central request as JSON

export RHOAS_TOKEN=$(rhoas authtoken)
./acsfleetctl admin centrals list
# Print CentralList (Admin API DTO) with the created central

./acsfleetctl centrals delete --id $central_id
# Prints {status_code: 202}

# To run tests locally run:
make db/teardown db/setup db/migrate
make ocm/setup OCM_OFFLINE_TOKEN=<ocm-offline-token> OCM_ENV=development
make verify lint binary test test/integration

openshift-ci · 2023-06-06T07:56:44Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

SimonBaeumer

LGTM, looks simpler than expected. I am convinced 👍

SimonBaeumer · 2023-06-06T09:00:28Z

internal/dinosaur/pkg/cmd/cliflags/cliflags.go

@@ -0,0 +1,51 @@
+// Package cliflags defines util methods for flags used in acsfleetctl
+package cliflags


There is a package: pkg/flags/flags.go

I deliberately didn't use it; see the comment changes I made in pkg/flags/flags.go. The problem is that the package calls glog.Fatal a lot when flags are not set, and glog.Fatal has a problem:

It is unrecoverable because it calls os.Exit(2) directly after logging instead of panicking (that's why I changed the comment). This does not allow us to intercept the exit at the top level with the recover function and change the output to what we want for the CLI. I didn't want to change the fleet-manager serve/migrate behaviour, which is why I created a separate cliflags package just for the CLI.

In this case I think we should return an error instead of os.Exit or panic.
What is your concern about serve and migrate behaviour?
I see these changes as uncritical because they concern CLI flag parsing, which does not crash in prod (unless we messed something up pretty badly).

Personally, I want this to be a panic for the CLI.

I think this makes it easier to handle errors that cause termination of the program. If a required flag is not set, I want to terminate the whole program. We could do this in two ways:

1. Return an error and propagate it through every function with if err != nil {return err}, then handle it at some point to terminate the application.
2. panic and immediately go up the stack without having to pass the error through to the top, then handle the error in recover and exit the application.

IMO option 2 yields more readable, less noisy code. According to [this post](https://stackoverflow.com/questions/35412449/why-did-go-add-panic-and-recover-in-addition-to-error-handling/35413011#35413011), that's also one of the reasons panic/recover was implemented in Go.
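To make the two options concrete, here is a minimal sketch in plain Go. It is not the code from this PR: `lookupID`, `getCentral`, `mustGetCentral` and `runRecovered` are hypothetical names, and a map stands in for the parsed CLI flags.

```go
package main

import (
	"errors"
	"fmt"
)

// lookupID is a hypothetical helper: it reads the "id" flag from a plain map
// that stands in for parsed CLI flags.
func lookupID(flags map[string]string) (string, error) {
	if id := flags["id"]; id != "" {
		return id, nil
	}
	return "", errors.New(`required flag "id" not set`)
}

// Option 1: return the error and let every caller pass it up.
func getCentral(flags map[string]string) (string, error) {
	id, err := lookupID(flags)
	if err != nil {
		return "", err
	}
	return "central " + id, nil
}

// Option 2: panic at the point of failure ...
func mustGetCentral(flags map[string]string) string {
	id, err := lookupID(flags)
	if err != nil {
		panic(err)
	}
	return "central " + id
}

// ... and recover once at the top level, turning the panic back into an error.
// Note that this only works for panics: a call to os.Exit inside cmd would end
// the process without running the deferred function.
func runRecovered(cmd func() string) (out string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return cmd(), nil
}

func main() {
	empty := map[string]string{}

	_, err := getCentral(empty)
	fmt.Println("option 1:", err)

	_, err = runRecovered(func() string { return mustGetCentral(empty) })
	fmt.Println("option 2:", err)
}
```

Both paths end with the same error value at the top level; the difference is whether intermediate functions carry an `error` return or rely on a single deferred `recover`.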

> What is your concern about serve and migrate behaviour?

No concern that cannot be addressed; I just didn't want to change anything in that regard in this specific PR.

I still see disadvantages in the proposed solution and am not convinced by it. The panic in the library shows its architectural weakness in two disadvantages:

1. Source code is duplicated for the sake of a panic, when the function could return an error instead and be supported by both libraries. This is a leaky abstraction, because the same problem is solved twice just to change its output format (here panic and glog.Fatal). Instead, there could be wrapper functions that panic or call glog.Fatal, and the caller decides which to use.
2. Deferred recovery is more complex to read than an if err != nil, because an engineer
   - must look into the deferred function
   - must keep in mind that a deferred function is executed

We had a short call to discuss this. Outcome:

- The duplicated flags package is removed; glog.Fatal is replaced by panic calls.
- The CLI implementation itself keeps calling the "MustGet*" functions from the flags package for required flags.

SimonBaeumer

I see similar messages twice now locally:

❯ ./acsfleetctl centrals get --debug
Error: required flag(s) "id" not set
Usage:
  acsfleetctl centrals get [flags]

Flags:
  -h, --help        help for get
      --id string   Central ID (required)

Global Flags:
      --debug   use debug output

Error executing command: required flag(s) "id" not set%

I also could not see any change when I removed the panic recovery, and no difference with the --debug flag enabled.

❯ ./acsfleetctl centrals --debug get
Error: required flag(s) "id" not set
Usage:
  acsfleetctl centrals get [flags]

Flags:
  -h, --help        help for get
      --id string   Central ID (required)

Global Flags:
      --debug   use debug output

Error executing command: required flag(s) "id" not set%
❯ ./acsfleetctl centrals --debug delete

@johannes94 Can you confirm this behaviour?

johannes94 · 2023-06-12T13:23:50Z

I can confirm the behaviour above. I removed the second log from the code, @SimonBaeumer.

openshift-ci · 2023-06-13T07:34:42Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johannes94, SimonBaeumer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [SimonBaeumer,johannes94]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the do-not-merge/work-in-progress label Jun 6, 2023

johannes94 requested a review from SimonBaeumer June 6, 2023 07:56

johannes94 temporarily deployed to development June 6, 2023 07:56 — with GitHub Actions Inactive

openshift-ci bot added the approved label Jun 6, 2023

johannes94 temporarily deployed to development June 6, 2023 08:33 — with GitHub Actions Inactive

SimonBaeumer reviewed Jun 6, 2023

johannes94 marked this pull request as ready for review June 6, 2023 09:10

openshift-ci bot removed the do-not-merge/work-in-progress label Jun 6, 2023

johannes94 temporarily deployed to development June 6, 2023 09:10 — with GitHub Actions Inactive

johannes94 had a problem deploying to development June 9, 2023 12:39 — with GitHub Actions Failure

johannes94 temporarily deployed to development June 9, 2023 12:39 — with GitHub Actions Inactive

johannes94 had a problem deploying to development June 9, 2023 12:45 — with GitHub Actions Failure

johannes94 temporarily deployed to development June 9, 2023 12:45 — with GitHub Actions Inactive

johannes94 force-pushed the jmalsam/cli-binary branch from b77a5f7 to 2026e5c Compare June 9, 2023 12:51

johannes94 had a problem deploying to development June 9, 2023 12:52 — with GitHub Actions Failure

johannes94 temporarily deployed to development June 9, 2023 12:52 — with GitHub Actions Inactive

johannes94 temporarily deployed to development June 12, 2023 06:17 — with GitHub Actions Inactive

johannes94 temporarily deployed to development June 12, 2023 06:40 — with GitHub Actions Inactive

SimonBaeumer reviewed Jun 12, 2023

johannes94 temporarily deployed to development June 12, 2023 10:12 — with GitHub Actions Inactive

johannes94 temporarily deployed to development June 12, 2023 15:19 — with GitHub Actions Inactive

johannes94 temporarily deployed to development June 12, 2023 17:09 — with GitHub Actions Inactive

johannes94 added 11 commits June 12, 2023 19:23

introduce new main faile and make target

11d6cee

use panic instead of glog.Fatal to prevent unmanaged CLI terminations

cf69034

use os.Exit(1) on cli panics

0954efd

use os.Exit(1) on rootCMD errors in CLI

3a7d018

fix command for recover call

6d8ef7b

remove duplicated flags package

095490d

remove RunE usage from create command

1a4d6a7

changed cliflags to flags for central create commmand

315bca2

remove error log since cobra is already logging errors

901b28a

remove recover call and debug flag

85cf269

remove recover call and debug flag

309ee90

johannes94 force-pushed the jmalsam/cli-binary branch from 2aff0a9 to 309ee90 Compare June 12, 2023 17:23

johannes94 temporarily deployed to development June 12, 2023 17:23 — with GitHub Actions Inactive

SimonBaeumer approved these changes Jun 13, 2023

openshift-ci bot assigned SimonBaeumer Jun 13, 2023

openshift-ci bot added the lgtm label Jun 13, 2023

johannes94 merged commit d149d29 into main Jun 13, 2023

johannes94 deleted the jmalsam/cli-binary branch June 13, 2023 07:42
