Introduce acsfleetctl binary #1074

Merged
merged 11 commits into from
Jun 13, 2023

Conversation

johannes94
Contributor

@johannes94 johannes94 commented Jun 6, 2023

Description

Move the Fleet-Manager CLI to its own binary instead of embedding it in the fleet-manager binary. Some motivating facts for this PR:

  • Usage info for the CLI is massive because of all the flags added to the root fleet-manager command.
  • Simple errors in CLI usage lead to panic logs with massive stack traces. Crashing is fine for a service like fleet-manager serve, because we don't want it to start with wrong flags/configuration, but for the CLI this is unnecessary.
  • Usage of glog.Fatal() on multiple occasions leads to unmanaged, unrecoverable output of stack traces and goroutine dumps.

Slack thread: https://redhat-internal.slack.com/archives/C02HTD51SMD/p1685620429880799

Checklist (Definition of Done)

  • [ ] Unit and integration tests added
  • [x] Added test description under Test manual
  • [ ] Documentation added if necessary (i.e. changes to dev setup, test execution, ...)
  • [x] CI and all relevant tests are passing
  • [ ] Add the ticket number to the PR title if available, i.e. ROX-12345: ...
  • [ ] Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.
  • [ ] Add secret to app-interface Vault or Secrets Manager if necessary
  • [ ] RDS changes were e2e tested manually
  • [ ] Check AWS limits are reasonable for changes provisioning new resources

Test manual

Run local fleet-manager and test CLI commands against it:

make db/teardown db/setup db/migrate
make binary

# local crc cluster started and kubectl configured
./fleet-manager serve

export OCM_TOKEN=$(ocm token --refresh)
rhoas login --auth-url=https://auth.redhat.com/auth/realms/EmployeeIDP

./acsfleetctl centrals create --name "test-central-1" --region "standalone" --provider "standalone"
# Instance got created

./acsfleetctl centrals list
# Prints CentralRequestList with the instance created

central_id=$(./acsfleetctl centrals list | jq '.items[0].id' -r )
./acsfleetctl centrals get --id $central_id
# Print the central request as JSON

export RHOAS_TOKEN=$(rhoas authtoken)
./acsfleetctl admin centrals list
# Print CentralList (Admin API DTO) with the created central

./acsfleetctl centrals delete --id $central_id
# Prints {status_code: 202}

# To run tests locally run:
make db/teardown db/setup db/migrate
make ocm/setup OCM_OFFLINE_TOKEN=<ocm-offline-token> OCM_ENV=development
make verify lint binary test test/integration

@openshift-ci
Contributor

openshift-ci bot commented Jun 6, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@johannes94 johannes94 requested a review from SimonBaeumer June 6, 2023 07:56
@johannes94 johannes94 temporarily deployed to development June 6, 2023 07:56 — with GitHub Actions Inactive
@openshift-ci openshift-ci bot added the approved label Jun 6, 2023
@johannes94 johannes94 temporarily deployed to development June 6, 2023 08:33 — with GitHub Actions Inactive
Member

@SimonBaeumer SimonBaeumer left a comment

LGTM, looks simpler than expected. I am convinced 👍

@@ -0,0 +1,51 @@
// Package cliflags defines util methods for flags used in acsfleetctl
package cliflags
Member

There is a package: pkg/flags/flags.go

Contributor Author

I didn't use it on purpose; see the changes I made to the comments in pkg/flags/flags.go. The problem is that the package calls glog.Fatal a lot if flags are not set, and glog.Fatal has a problem:

It is unrecoverable because it calls os.Exit(2) directly after logging instead of panicking (that's why I changed the comment). This does not allow us to intercept the exit at the top level with the recover function and shape the output the way we want for the CLI. I didn't want to change the fleet-manager serve/migrate behaviour, which is why I created a separate cliflags package just for the CLI.
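
To illustrate the point about intercepting the failure at the top level, here is a minimal sketch. It is not the code from this PR: mustGetID and runCommand are hypothetical names, and flags are modelled as a plain map rather than real CLI flags. It only shows that a panic can be turned into a normal error by a deferred recover, which is not possible once os.Exit has been called.

```go
package main

import "fmt"

// mustGetID is a hypothetical flag helper: it panics when the required
// "id" value is missing instead of calling os.Exit.
func mustGetID(flags map[string]string) string {
	id, ok := flags["id"]
	if !ok {
		panic(`required flag "id" not set`)
	}
	return id
}

// runCommand runs cmd and converts any panic into an ordinary error.
// This only works for panics: a helper that calls os.Exit (as glog.Fatal
// does after logging) ends the process immediately, and deferred
// functions never run.
func runCommand(cmd func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("command failed: %v", r)
		}
	}()
	cmd()
	return nil
}

func main() {
	err := runCommand(func() {
		fmt.Println("central id:", mustGetID(map[string]string{}))
	})
	if err != nil {
		// A real CLI would likely print to stderr and exit non-zero here.
		fmt.Println(err)
	}
}
```

With this shape the CLI decides in one place how a fatal usage error is presented, while the helper itself stays free of any output logic.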

Member

In this case I think we should return an error instead of os.Exit or panic.
What is your concern about serve and migrate behaviour?
I see these changes as uncritical because parsing CLI flags does not crash in prod (unless we messed something up pretty badly).

Contributor Author

@johannes94 johannes94 Jun 6, 2023

Personally, I want this to be a panic for the CLI.

I think this makes it easier to handle errors that cause termination of the program. If a required flag is not set I want to terminate the whole program. We could do this in two ways:

  • Return an error and propagate it through every function with if err != nil {return err}, then handle it at some point to terminate the application
  • panic and immediately go up the stack without having to pass the error through to the top, handle the error in recover and exit the application

IMO option 2 yields more readable/less noisy code. According to [this post](https://stackoverflow.com/questions/35412449/why-did-go-add-panic-and-recover-in-addition-to-error-handling/35413011#35413011), that's also one of the reasons panic/recover was implemented in Go.

Contributor Author

@johannes94 johannes94 Jun 6, 2023

What is your concern about serve and migrate behaviour?

No concern that cannot be addressed; I just didn't want to change anything in that regard in this specific PR.

Member

I still see disadvantages in the proposed solution and am not convinced by it. The panic in the library shows its architectural weakness in two disadvantages here:

  • Source code is duplicated for a panic when it could return an error instead and be supported by both libraries. It shows a leaky abstraction because the same problem is solved twice just to change its output format (here panic & glog.Fatal). However, there could be wrapper functions that panic or call glog.Fatal, and the caller decides which to use.
  • Deferred recovery is more complex to read compared to an if err != nil because an engineer
    1. must look into the deferred function
    2. must keep in mind that a deferred function is executed
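
The wrapper idea from the first bullet could look roughly like the sketch below. The function names are hypothetical, flags are modelled as a map, and the standard library's log.Fatalf is used as a stand-in for glog.Fatal so the example has no third-party dependency.

```go
package main

import (
	"fmt"
	"log"
)

// getString is the single implementation. It only reports the problem
// and leaves the failure policy to its callers.
func getString(flags map[string]string, name string) (string, error) {
	v, ok := flags[name]
	if !ok {
		return "", fmt.Errorf("required flag %q not set", name)
	}
	return v, nil
}

// mustGetString applies the "panic" policy, which a caller can recover from.
func mustGetString(flags map[string]string, name string) string {
	v, err := getString(flags, name)
	if err != nil {
		panic(err)
	}
	return v
}

// getStringOrFatal applies the "log and exit" policy. log.Fatalf stands in
// for glog.Fatal here; it logs and then exits the process, so nothing can
// intercept the failure.
func getStringOrFatal(flags map[string]string, name string) string {
	v, err := getString(flags, name)
	if err != nil {
		log.Fatalf("%v", err)
	}
	return v
}

func main() {
	flags := map[string]string{"region": "standalone"}
	fmt.Println(mustGetString(flags, "region"))
	if _, err := getString(flags, "id"); err != nil {
		fmt.Println("error:", err)
	}
}
```

The lookup logic exists once, and each binary picks the wrapper that matches how it wants to fail.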

Contributor Author

We had a short call to discuss this. Outcome:

  • Duplication of the flags package removed; glog.Fatal replaced by panic calls
  • CLI implementation itself stays with calling the "MustGet*" functions from the flags package for required flags.

@johannes94 johannes94 marked this pull request as ready for review June 6, 2023 09:10
@johannes94 johannes94 temporarily deployed to development June 6, 2023 09:10 — with GitHub Actions Inactive
@johannes94 johannes94 temporarily deployed to development June 9, 2023 12:39 — with GitHub Actions Inactive
@johannes94 johannes94 temporarily deployed to development June 9, 2023 12:45 — with GitHub Actions Inactive
@johannes94 johannes94 force-pushed the jmalsam/cli-binary branch from b77a5f7 to 2026e5c Compare June 9, 2023 12:51
@johannes94 johannes94 temporarily deployed to development June 9, 2023 12:52 — with GitHub Actions Inactive
@johannes94 johannes94 temporarily deployed to development June 12, 2023 06:17 — with GitHub Actions Inactive
@johannes94 johannes94 temporarily deployed to development June 12, 2023 06:40 — with GitHub Actions Inactive
Member

@SimonBaeumer SimonBaeumer left a comment

I see similar messages twice now locally:

❯ ./acsfleetctl centrals get --debug
Error: required flag(s) "id" not set
Usage:
  acsfleetctl centrals get [flags]

Flags:
  -h, --help        help for get
      --id string   Central ID (required)

Global Flags:
      --debug   use debug output

Error executing command: required flag(s) "id" not set%

And could not see any changes when I removed the panic recovery and no difference with the --debug flag enabled.

❯ ./acsfleetctl centrals --debug get
Error: required flag(s) "id" not set
Usage:
  acsfleetctl centrals get [flags]

Flags:
  -h, --help        help for get
      --id string   Central ID (required)

Global Flags:
      --debug   use debug output

Error executing command: required flag(s) "id" not set%
❯ ./acsfleetctl centrals --debug delete

@johannes94 Can you confirm this behaviour?

@johannes94 johannes94 temporarily deployed to development June 12, 2023 10:12 — with GitHub Actions Inactive
@johannes94
Contributor Author

I can confirm the behaviour above. I removed the second log from the code. @SimonBaeumer

@johannes94 johannes94 temporarily deployed to development June 12, 2023 15:19 — with GitHub Actions Inactive
@johannes94 johannes94 temporarily deployed to development June 12, 2023 17:09 — with GitHub Actions Inactive
@johannes94 johannes94 force-pushed the jmalsam/cli-binary branch from 2aff0a9 to 309ee90 Compare June 12, 2023 17:23
@johannes94 johannes94 temporarily deployed to development June 12, 2023 17:23 — with GitHub Actions Inactive
@openshift-ci
Contributor

openshift-ci bot commented Jun 13, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johannes94, SimonBaeumer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [SimonBaeumer,johannes94]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@johannes94 johannes94 merged commit d149d29 into main Jun 13, 2023
@johannes94 johannes94 deleted the jmalsam/cli-binary branch June 13, 2023 07:42

2 participants