Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More automation in getting_started.md guide #385

Merged
merged 14 commits into from
Mar 2, 2022

Conversation

huchen2021
Copy link
Contributor

@huchen2021 huchen2021 commented Feb 15, 2022

What this PR does / why we need it:
At the moment, a developer would have to manually deploy a management cluster, build images, copy files to the docker container etc. Lets improve this experience so that we can delight the developer and have them able to try out our software with just one command run, and in under 10 mins.

Signed-off-by: chen hui [email protected]

Fixes #272

scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
scripts/getting_started.sh Outdated Show resolved Hide resolved
@anusha94
Copy link
Contributor

Suggest to move this script under hack/ folder and change the name to local-up.sh
Usually have seen a local-up script that can be used to bring up a minimal cluster for people to get started with

@huchen2021
Copy link
Contributor Author

huchen2021 commented Feb 18, 2022

This is a script just do all steps in getting_started.md guide, not designed to accepted any arguments to customize their byoh cluster. It just help customer try out our software with just one command run. If customer want to create "real" byoh culster with customize configuration, do it by manually. It's better not to rename this script name, or it will made customer think they can bring up and configure their byoh cluster by this script.

Suggest to move this script under hack/ folder and change the name to local-up.sh Usually have seen a local-up script that can be used to bring up a minimal cluster for people to get started with

Signed-off-by: chen hui <[email protected]>
@codecov-commenter
Copy link

codecov-commenter commented Feb 18, 2022

Codecov Report

Merging #385 (c098ace) into main (1fab38a) will decrease coverage by 2.45%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #385      +/-   ##
==========================================
- Coverage   66.45%   63.99%   -2.46%     
==========================================
  Files          23       24       +1     
  Lines        1696     1811     +115     
==========================================
+ Hits         1127     1159      +32     
- Misses        495      580      +85     
+ Partials       74       72       -2     
Impacted Files Coverage Δ
agent/main.go 18.32% <0.00%> (-3.11%) ⬇️
agent/installer/cli-dev.go 0.00% <0.00%> (ø)
agent/reconciler/host_reconciler.go 79.23% <0.00%> (+16.22%) ⬆️

Copy link
Contributor

@jamiemonserrate jamiemonserrate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, this was a lot bigger than I expected. Thats a lot of code 😸

Anyways, this feels like new functionality we are maintaining. I worry its going to be super easy to break this. So I suggest that we add some tests for it. Not unit tests, but at least an e2e test of sorts, that is included in our github checks. Essentially a check that runs this script, and verifies that we have a running cluster(s) at the end if it. Thoughts @huchen2021 ?

And, I assume this script would live in parallel to the existing BYOH getting started guide. Kind of like an alternate experience. If users want white glove service, they run this script. Else, users can get their hands dirty, and manually follow the steps, so that they know all the pieces that make up BYOH. So we would have to modify the README / exisiting getting started guide, to hook this in. Yes?

@anusha94
Copy link
Contributor

Essentially a check that runs this script, and verifies that we have a running cluster(s) at the end if it

@jamiemonserrate, I don't know if that's necessary. This is sort of a simple script that will get a byoh workload cluster up. What breaking changes are you worried of? I am looking for a similar experience like k/k's hack/local-up-cluster.sh.
The current script is hard coding way too many values - and this will become stale pretty soon. If we provide a way for the user to pass / set few args while also having some default values, we should be good (I think)

hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
@jamiemonserrate
Copy link
Contributor

jamiemonserrate commented Feb 21, 2022

What breaking changes are you worried of? I am looking for a similar experience like k/k's hack/local-up-cluster.sh

I guess I'm worried that it would be super easy for this script to go out of date. If we introduce a breaking change, it won't be until much later that we find out.

At the end of the day, its code. I am always a pessimist (things break), and prefer writing tests as much as possible. Happy to back down if you feel this test is too distracting or too much work to maintain.

@anusha94
Copy link
Contributor

@huchen2021 Since running this script will change the state on the host, we should display a warning message and take user's confirmation whether to proceed or not. We already do that in our e2e test -

define WARNING
#####################################################################################################
** WARNING **
These tests modify system settings - and do **NOT** revert them at the end of the test.
A list of changes can be found below. We **highly** recommend running these tests in a VM.
Running e2e tests locally will change the following host config
- enable the kernel modules: overlay & bridge network filter
- create a systemwide config that will enable those modules at boot time
- enable ipv4 & ipv6 forwarding
- create a systemwide config that will enable the forwarding at boot time
- reload the sysctl with the applied config changes so the changes can take effect without restarting
- disable unattended OS updates
#####################################################################################################
endef
export WARNING

@anusha94
Copy link
Contributor

I guess I'm worried that it would be super easy for this script to go out of date. If we introduce a breaking change, it won't be until much later that we find out.

@jamiemonserrate 😄
Agreed. But it is there only to provide one-script experience of BYOH. If we keep few of the variables configurable and have sane default values, it won’t “break” as such. It will be stale, no doubt. We have to update this once in a while (and I think its okay and doesn’t have to be pointing to the latest artifacts all the time). We can do this as a one time ritual post a release, we anyway have to update our README and markdown files after every release. So it should be just about updating a few defaults.

chen hui added 2 commits February 22, 2022 09:09
Signed-off-by: chen hui <[email protected]>
Signed-off-by: chen hui <[email protected]>
@huchen2021
Copy link
Contributor Author

huchen2021 commented Feb 23, 2022

I guess I'm worried that it would be super easy for this script to go out of date. If we introduce a breaking change, it won't be until much later that we find out.

@jamiemonserrate 😄 Agreed. But it is there only to provide one-script experience of BYOH. If we keep few of the variables configurable and have sane default values, it won’t “break” as such. It will be stale, no doubt. We have to update this once in a while (and I think its okay and doesn’t have to be pointing to the latest artifacts all the time). We can do this as a one time ritual post a release, we anyway have to update our README and markdown files after every release. So it should be just about updating a few defaults.

I totally agreed with @anusha94. We need to take care of this for every new release. It should be enough. What do you think? @jamiemonserrate

Signed-off-by: chen hui <[email protected]>
@huchen2021
Copy link
Contributor Author

@huchen2021 Since running this script will change the state on the host, we should display a warning message and take user's confirmation whether to proceed or not. We already do that in our e2e test -

define WARNING
#####################################################################################################
** WARNING **
These tests modify system settings - and do **NOT** revert them at the end of the test.
A list of changes can be found below. We **highly** recommend running these tests in a VM.
Running e2e tests locally will change the following host config
- enable the kernel modules: overlay & bridge network filter
- create a systemwide config that will enable those modules at boot time
- enable ipv4 & ipv6 forwarding
- create a systemwide config that will enable the forwarding at boot time
- reload the sysctl with the applied config changes so the changes can take effect without restarting
- disable unattended OS updates
#####################################################################################################
endef
export WARNING

Done. Thanks for your comment.

@huchen2021
Copy link
Contributor Author

@jamiemonserrate @anusha94

@huchen2021
Copy link
Contributor Author

This script will exit with code 1 if any error happened. It do checks on every step, and at the end of script, it also check if every node status is ok. Base on that, I add this test into github workflows, and it pass. You can know more detail by get-started-suite of pull-390.

If we have to add a test for this script, @anusha94 @jamiemonserrate What do you think about this way? If you are all think this is ok, I'll merge the test code part it into this PR.

Signed-off-by: chen hui <[email protected]>
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Show resolved Hide resolved
hack/getting_started.sh Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
@anusha94
Copy link
Contributor

This script will exit with code 1 if any error happened. It do checks on every step, and at the end of script, it also check if every node status is ok. Base on that, I add this test into github workflows, and it pass. You can know more detail by get-started-suite of pull-390.

If we have to add a test for this script, @anusha94 @jamiemonserrate What do you think about this way? If you are all think this is ok, I'll merge the test code part it into this PR.

First of all, great job! Love that the script is running in ~8mins. But as we discussed earlier, I am leaning towards not having a test for this script. It is not expected to be at the latest state all the time and will require default updates with every release.

huchen2021 and others added 2 commits February 24, 2022 10:48
@huchen2021
Copy link
Contributor Author

This script will exit with code 1 if any error happened. It do checks on every step, and at the end of script, it also check if every node status is ok. Base on that, I add this test into github workflows, and it pass. You can know more detail by get-started-suite of pull-390.
If we have to add a test for this script, @anusha94 @jamiemonserrate What do you think about this way? If you are all think this is ok, I'll merge the test code part it into this PR.

First of all, great job! Love that the script is running in ~8mins. But as we discussed earlier, I am leaning towards not having a test for this script. It is not expected to be at the latest state all the time and will require default updates with every release.

You are make a good point. But since this script only consume ~8mins, It wouldn't be introduce too much burden for github workflow. Anyway, I can accept both way.

@jamiemonserrate What do you think?

Signed-off-by: chen hui <[email protected]>
@huchen2021 huchen2021 requested a review from anusha94 February 24, 2022 04:06
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
hack/getting_started.sh Outdated Show resolved Hide resolved
chen hui added 2 commits February 25, 2022 07:10
Signed-off-by: chen hui <[email protected]>
Signed-off-by: chen hui <[email protected]>
@jamiemonserrate
Copy link
Contributor

This script will exit with code 1 if any error happened. It do checks on every step, and at the end of script, it also check if every node status is ok. Base on that, I add this test into github workflows, and it pass. You can know more detail by get-started-suite of pull-390.
If we have to add a test for this script, @anusha94 @jamiemonserrate What do you think about this way? If you are all think this is ok, I'll merge the test code part it into this PR.

First of all, great job! Love that the script is running in ~8mins. But as we discussed earlier, I am leaning towards not having a test for this script. It is not expected to be at the latest state all the time and will require default updates with every release.

You are make a good point. But since this script only consume ~8mins, It wouldn't be introduce too much burden for github workflow. Anyway, I can accept both way.

@jamiemonserrate What do you think?

Happy to yield. Lets not test this with every PR then.

PS - Thanks for taking the effort to get that workflow setup @huchen2021! Sorry, this was my bad you had to do extra work.

@huchen2021
Copy link
Contributor Author

This script will exit with code 1 if any error happened. It do checks on every step, and at the end of script, it also check if every node status is ok. Base on that, I add this test into github workflows, and it pass. You can know more detail by get-started-suite of pull-390.
If we have to add a test for this script, @anusha94 @jamiemonserrate What do you think about this way? If you are all think this is ok, I'll merge the test code part it into this PR.

First of all, great job! Love that the script is running in ~8mins. But as we discussed earlier, I am leaning towards not having a test for this script. It is not expected to be at the latest state all the time and will require default updates with every release.

You are make a good point. But since this script only consume ~8mins, It wouldn't be introduce too much burden for github workflow. Anyway, I can accept both way.
@jamiemonserrate What do you think?

Happy to yield. Lets not test this with every PR then.

PS - Thanks for taking the effort to get that workflow setup @huchen2021! Sorry, this was my bad you had to do extra work.

@jamiemonserrate It's fine. I am very happy to do this. It's a study opportunity for me. Thanks for the responding.

Signed-off-by: chen hui <[email protected]>
Copy link
Contributor

@anusha94 anusha94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Again, huge thank you for this PR 🎉

#!/bin/bash
function isCmdInstalled() {
local cmd=$1
which ${cmd}
Copy link
Contributor

@pshail pshail Mar 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend : command -v over which -> which is external to shell and is divergent when used across platforms, command is built-in e.g
`$ which command
command: shell built-in command

$ command -v command
command

$ which docker
/usr/local/bin/docker

$ command -v docker
/usr/local/bin/docker
`

The above variability is within one platform in "which" output and hence not consistent

@pshail
Copy link
Contributor

pshail commented Mar 1, 2022

Minor suggest rest LGTM

@huchen2021 huchen2021 merged commit 67b626e into vmware-tanzu:main Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

More automation in getting_started.md guide
7 participants