More automation in getting_started.md guide #385

huchen2021 · 2022-02-15T09:33:27Z

What this PR does / why we need it:
At the moment, a developer has to manually deploy a management cluster, build images, copy files to the Docker container, and so on. Let's improve this experience so that we can delight developers and let them try out our software with a single command, in under 10 minutes.

Signed-off-by: chen hui [email protected]

Fixes #272

Signed-off-by: chen hui <[email protected]>

scripts/getting_started.sh

anusha94 · 2022-02-17T15:01:17Z

Suggest moving this script under the hack/ folder and renaming it to local-up.sh.
I have usually seen a local-up script that brings up a minimal cluster for people to get started with.

Co-authored-by: Anusha Hegde <[email protected]>

huchen2021 · 2022-02-18T02:42:49Z

This script just performs all the steps in the getting_started.md guide. It is not designed to accept any arguments for customizing a BYOH cluster; it only helps users try out our software with a single command. If users want to create a "real" BYOH cluster with a customized configuration, they should do it manually. It's better not to rename this script, or users may think they can bring up and configure their BYOH cluster with it.

codecov-commenter · 2022-02-18T04:59:55Z

Codecov Report

Merging #385 (c098ace) into main (1fab38a) will decrease coverage by 2.45%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #385      +/-   ##
==========================================
- Coverage   66.45%   63.99%   -2.46%     
==========================================
  Files          23       24       +1     
  Lines        1696     1811     +115     
==========================================
+ Hits         1127     1159      +32     
- Misses        495      580      +85     
+ Partials       74       72       -2

Impacted Files	Coverage Δ
agent/main.go	`18.32% <0.00%> (-3.11%)`	⬇️
agent/installer/cli-dev.go	`0.00% <0.00%> (ø)`
agent/reconciler/host_reconciler.go	`79.23% <0.00%> (+16.22%)`	⬆️

jamiemonserrate

Oh wow, this was a lot bigger than I expected. That's a lot of code 😸

Anyway, this feels like new functionality we are maintaining. I worry it's going to be super easy to break this. So I suggest that we add some tests for it. Not unit tests, but at least an e2e test of sorts that is included in our GitHub checks. Essentially a check that runs this script and verifies that we have a running cluster(s) at the end of it. Thoughts, @huchen2021?

And I assume this script would live in parallel to the existing BYOH getting started guide, kind of like an alternate experience. If users want white-glove service, they run this script. Otherwise, users can get their hands dirty and manually follow the steps, so that they know all the pieces that make up BYOH. So we would have to modify the README / existing getting started guide to hook this in. Yes?

anusha94 · 2022-02-21T02:15:43Z

> Essentially a check that runs this script, and verifies that we have a running cluster(s) at the end of it

@jamiemonserrate, I don't know if that's necessary. This is a fairly simple script that gets a BYOH workload cluster up. What breaking changes are you worried about? I am looking for an experience similar to k/k's hack/local-up-cluster.sh.
The current script hard-codes way too many values, and these will become stale pretty soon. If we let the user pass or set a few args while also providing default values, we should be good (I think).
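One way to read the suggestion above is to let environment variables override built-in defaults. The following is only a sketch under that assumption: the variable names `CLUSTER_NAME`, `NAMESPACE`, and `WORKER_COUNT` and their default values are hypothetical placeholders and are not taken from the script in this PR.

```shell
#!/bin/bash
# Hypothetical settings: each variable keeps the caller's value if it is
# already set and non-empty, and falls back to the default otherwise.
: "${CLUSTER_NAME:=byoh-cluster}"
: "${NAMESPACE:=default}"
: "${WORKER_COUNT:=1}"

echo "cluster=${CLUSTER_NAME} namespace=${NAMESPACE} workers=${WORKER_COUNT}"
```

With this pattern, a user could override a single value for one run (for example by prefixing the invocation with `WORKER_COUNT=2`), and after a release only the defaults would need refreshing.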

hack/getting_started.sh

jamiemonserrate · 2022-02-21T04:43:32Z

> What breaking changes are you worried of? I am looking for a similar experience like k/k's hack/local-up-cluster.sh

I guess I'm worried that it would be super easy for this script to go out of date. If we introduce a breaking change, it won't be until much later that we find out.

At the end of the day, it's code. I am always a pessimist (things break), and prefer writing tests as much as possible. Happy to back down if you feel this test is too distracting or too much work to maintain.

anusha94 · 2022-02-21T07:37:17Z

@huchen2021 Since running this script will change the state of the host, we should display a warning message and ask for the user's confirmation before proceeding. We already do that in our e2e test:

cluster-api-provider-bringyourownhost/Makefile

Lines 140 to 157 in fb51a6d

    
    define WARNING
    #####################################################################################################
    ** WARNING **
    These tests modify system settings - and do **NOT** revert them at the end of the test.
    A list of changes can be found below. We **highly** recommend running these tests in a VM.
    Running e2e tests locally will change the following host config
    - enable the kernel modules: overlay & bridge network filter
    - create a systemwide config that will enable those modules at boot time
    - enable ipv4 & ipv6 forwarding
    - create a systemwide config that will enable the forwarding at boot time
    - reload the sysctl with the applied config changes so the changes can take effect without restarting
    - disable unattended OS updates
    #####################################################################################################
    endef
    export WARNING
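A rough sketch of how a shell script could show a similar warning and ask for confirmation. The function names `printWarning` and `confirmProceed` are hypothetical, the warning text is a shortened paraphrase of the Makefile message, and none of this is the code that was eventually added to the script.

```shell
#!/bin/bash
# Hypothetical helper: print a shortened version of the warning text.
printWarning() {
    cat <<'EOF'
** WARNING **
This script modifies system settings and does NOT revert them.
We highly recommend running it in a VM.
EOF
}

# Hypothetical helper: succeed only when the reply is an explicit yes.
confirmProceed() {
    case "$1" in
        y|Y|yes|YES) return 0 ;;
        *) return 1 ;;
    esac
}

# Example wiring (commented out so that sourcing this sketch never blocks):
# printWarning
# printf 'Do you want to proceed [y/n]? '
# read -r reply
# confirmProceed "${reply}" || { echo "Aborted."; exit 1; }
```

Treating anything other than an explicit yes as a refusal keeps an accidental Enter from changing the host.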

anusha94 · 2022-02-22T05:43:31Z

> I guess I'm worried that it would be super easy for this script to go out of date. If we introduce a breaking change, it won't be until much later that we find out.

@jamiemonserrate 😄
Agreed. But it is there only to provide a one-script experience of BYOH. If we keep a few of the variables configurable and have sane default values, it won’t “break” as such. It will go stale, no doubt. We will have to update it once in a while (and I think it's okay if it doesn’t point to the latest artifacts all the time). We can do this as a one-time ritual after a release; we have to update our README and markdown files after every release anyway. So it should just be a matter of updating a few defaults.

huchen2021 · 2022-02-23T01:49:20Z

I totally agree with @anusha94. We need to take care of this for every new release, and that should be enough. What do you think, @jamiemonserrate?

huchen2021 · 2022-02-23T05:08:40Z

Done. Thanks for your comment.

huchen2021 · 2022-02-23T06:01:07Z

@jamiemonserrate @anusha94

huchen2021 · 2022-02-23T06:06:55Z

This script exits with code 1 if any error happens. It checks every step, and at the end it also checks that every node's status is OK. Based on that, I added this test to the GitHub workflows, and it passes. You can find more details in the get-started-suite of pull-390.

If we have to add a test for this script, @anusha94 @jamiemonserrate, what do you think about this approach? If you both think it is OK, I'll merge the test code into this PR.
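A minimal sketch of the kind of final check described above, assuming the node list comes from `kubectl get nodes --no-headers`, whose second column is the node STATUS. The helper name `allNodesReady` is hypothetical and this is not the code from the PR.

```shell
#!/bin/bash
# Hypothetical helper: reads `kubectl get nodes --no-headers` style output on
# stdin. Succeeds only if at least one node is listed and the STATUS column
# (field 2) of every node is exactly "Ready".
allNodesReady() {
    awk 'BEGIN { n = 0; bad = 0 }
         NF > 0 { n++; if ($2 != "Ready") bad++ }
         END { if (n > 0 && bad == 0) exit 0; exit 1 }'
}

# Example wiring (commented out; it needs a reachable cluster):
# nodes=$(kubectl get nodes --no-headers) || exit 1
# printf '%s\n' "${nodes}" | allNodesReady || exit 1
```

Because the helper only reads stdin, it can be exercised with canned input, without a cluster. A status such as `NotReady` or `Ready,SchedulingDisabled` makes it fail, which is stricter than some setups may want.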

hack/getting_started.sh

anusha94 · 2022-02-23T15:12:49Z

First of all, great job! Love that the script is running in ~8mins. But as we discussed earlier, I am leaning towards not having a test for this script. It is not expected to be at the latest state all the time and will require default updates with every release.

Co-authored-by: Anusha Hegde <[email protected]>

huchen2021 · 2022-02-24T03:16:30Z

You make a good point. But since this script only takes ~8 mins, it wouldn't add much burden to the GitHub workflow. Anyway, I can accept either way.

@jamiemonserrate What do you think?

jamiemonserrate · 2022-02-28T03:23:30Z

Happy to yield. Let's not test this with every PR then.

PS - Thanks for taking the effort to get that workflow set up, @huchen2021! Sorry, it was my bad that you had to do extra work.

huchen2021 · 2022-02-28T05:36:15Z

@jamiemonserrate It's fine. I was very happy to do this; it was a learning opportunity for me. Thanks for responding.

anusha94

LGTM.
Again, huge thank you for this PR 🎉

pshail · 2022-03-01T10:37:46Z

hack/getting_started.sh

+#!/bin/bash
+function isCmdInstalled() {
+    local cmd=$1
+    which ${cmd}


Recommend: use `command -v` instead of `which`. `which` is external to the shell and its output differs across platforms, whereas `command` is a shell built-in, e.g.
`$ which command
command: shell built-in command

$ command -v command
command

$ which docker
/usr/local/bin/docker

$ command -v docker
/usr/local/bin/docker
`

As shown above, the output of `which` varies even within a single platform, so it is not consistent.
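A sketch of how the helper from the diff could look with this recommendation applied. This is an illustration, not the change that was merged; it keeps the function name `isCmdInstalled` from the diff and discards the printed path, on the assumption that callers only need the exit status.

```shell
#!/bin/bash
# Returns 0 if the given command can be resolved by the shell, non-zero
# otherwise. `command -v` is specified by POSIX and built into the shell.
isCmdInstalled() {
    command -v "$1" >/dev/null 2>&1
}

if isCmdInstalled docker; then
    echo "docker is installed"
else
    echo "docker is not installed"
fi
```

Quoting `"$1"` also avoids word splitting if an argument ever contains spaces.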

pshail · 2022-03-01T10:38:17Z

Minor suggestion; the rest LGTM.

More automation in getting_started.md guide

da7d60b

Signed-off-by: chen hui <[email protected]>

vmwclabot added the cla-not-required label Feb 15, 2022

huchen2021 requested review from anusha94 and dharmjit February 15, 2022 09:33

anusha94 requested a review from jamiemonserrate February 17, 2022 13:03

anusha94 reviewed Feb 17, 2022

Update scripts/getting_started.sh

b51060f

Co-authored-by: Anusha Hegde <[email protected]>

comment from anusha

39fba82

Signed-off-by: chen hui <[email protected]>

github-actions bot added the area/test-release label Feb 18, 2022

jamiemonserrate reviewed Feb 21, 2022

anusha94 reviewed Feb 21, 2022

anusha94 reviewed Feb 21, 2022

chen hui added 2 commits February 22, 2022 09:09

comment from anusha

c4dbcf3

Signed-off-by: chen hui <[email protected]>

comment from anusha

a453996

Signed-off-by: chen hui <[email protected]>

comment from anusha

7c8c54d

Signed-off-by: chen hui <[email protected]>

huchen2021 requested review from jamiemonserrate and anusha94 February 23, 2022 05:08

fix exit code

58d0a82

Signed-off-by: chen hui <[email protected]>

anusha94 reviewed Feb 23, 2022

huchen2021 and others added 2 commits February 24, 2022 10:48

Update hack/getting_started.sh

7862713

Co-authored-by: Anusha Hegde <[email protected]>

Update hack/getting_started.sh

b5e0cc7

Co-authored-by: Anusha Hegde <[email protected]>

comment from anusha

cac3968

Signed-off-by: chen hui <[email protected]>

huchen2021 requested a review from anusha94 February 24, 2022 04:06

dharmjit reviewed Feb 25, 2022

chen hui added 2 commits February 25, 2022 07:10

comment from dharmjit

b21f855

Signed-off-by: chen hui <[email protected]>

comment from anusha

2f72499

Signed-off-by: chen hui <[email protected]>

huchen2021 requested review from dharmjit and pshail February 25, 2022 07:24

string is more stable than integer

c151bf9

Signed-off-by: chen hui <[email protected]>

comment from anusha

c098ace

Signed-off-by: chen hui <[email protected]>

anusha94 approved these changes Mar 1, 2022

pshail reviewed Mar 1, 2022

pshail approved these changes Mar 1, 2022

huchen2021 merged commit 67b626e into vmware-tanzu:main Mar 2, 2022
