Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement CI using CircleCI #421

Merged
merged 10 commits into from
Apr 5, 2023
Merged

Implement CI using CircleCI #421

merged 10 commits into from
Apr 5, 2023

Conversation

vprashar2929
Copy link
Contributor

@vprashar2929 vprashar2929 commented Mar 9, 2023

This PR is in continuation to this. It contains implementing RHOBS configuration CI using CircleCI and addresses some of the drawbacks of using GitHub Actions.
Currently the CI pipeline performs following actions:

  1. Run microshift on Linux VM that is provided by CircleCI. Since RHOBS configs are specific to OCP so it's best to use OCP env to deploy and test these instead of kind/minikube.
  2. Deploy's the generated manifests on microshift cluster with minimum resources.
  3. Run a small test using observatorium/up (Like in observatorium e2e.sh) to verify send/receive metrics.

Using circleci we are getting good resources out of the box than existing GitHub Actions runners.

@douglascamata
Copy link
Member

douglascamata commented Mar 9, 2023

hey @vprashar2929, can you mention the drawbacks of Github Actions that makes us fallback to Circle CI, please, as we spoke some time ago? It's valuable to have the reasoning of the decision for documentation purposes.

@simonpasquier
Copy link
Contributor

I'm really an outsider here but I agree with @douglascamata that more context would be helpful
I'd keep also in mind that CircleCI might reduce the free plan's resources at any time without prior notice. Do we have a backup plan for this possibility?

@vprashar2929
Copy link
Contributor Author

hey @vprashar2929, can you mention the drawbacks of Github Actions that makes us fallback to Circle CI, please, as we spoke some time ago? It's valuable to have the reasoning of the decision for documentation purposes.

So the only difference between the two according to our CI use case is in terms of out of the box (free of cost) resources which both has to offer. GitHub Actions provide 2 vCPU and 7GB or ram which are not sufficient enough to deploy complete RHOBS components in one go. Whereas CircleCI provides 4 vCPU and 15GB of ram.
Since we want to deploy RHOBS with minimum resources and run a simple test on generated manifests than currently it is easy to achieve using CircleCI instead of GitHub Actions.

@vprashar2929
Copy link
Contributor Author

I'm really an outsider here but I agree with @douglascamata that more context would be helpful I'd keep also in mind that CircleCI might reduce the free plan's resources at any time without prior notice. Do we have a backup plan for this possibility?

Oh reduce free plan resources hurts 😅. Well in that case what we can do is deploy each components individually and check its rollout status. But in this case we won't be able to deploy all components at once and run tests on it. It can only validate the deployments generated manifests.

@douglascamata
Copy link
Member

@simonpasquier @vprashar2929 my proposals for the case where CircleCI reduces the size of the free runner:

  1. We join the paid plan (lol) of whichever of the two vendors has the best cost/benefit ratio for our case.
  2. We run a private Github Action Runner for the builds of the RHOBS org.

I personally don't worry too much in case we need a backup plan. 👍

@vprashar2929
Copy link
Contributor Author

@douglascamata Absolutely agree with you. We can definitely do a cost analysis between the two and choose what best suits us according to cost. Provided we are willing to pay the bucks😄
Also I am open to self-hosted GitHub runners specifically for our RHOBS repo as in that case we can have a dedicated OCP cluster running and escape from resources crunch completely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vprashar2929 was wondering if we could better separate here the deploy procedures and the test ones, so that we can reuse the test part in a post-deploy job. What do you think? This is not a blocker though. I am happy with merging this as is now and then follow up with the separation and more tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI test procedures consist of just checking the rollout status of various deployments/statefulsets along with running observatorium-up to send/receive metrics.

The current post deploy job runs inside a docker with observatorium-up binary copied to Alpine-based container and a runtest.sh script is triggered inside the container. Since this script was already created specifically for post deploy job I think we can reuse this script for adding further checks to job. What are your thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. 👍

Copy link
Member

@douglascamata douglascamata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Would be great to have a review from other team members (cc @rhobs/team-monitoring)

@@ -0,0 +1,140 @@
#!/bin/bash

set -e
Copy link

@sthaha sthaha Mar 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it s best to have -u set for all bash script. I good template to follow in imho is

#!/usr/bin/env bash

set -eu -opipefail

main() {
  return 0
}

main "$@"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely will add -u flag. Will avoid any undefined behaviour in script

set -e
set -o pipefail

ARTIFACT_DIR="${ARTIFACT_DIR:-/tmp/artifacts}"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare -r for all readonly vars

set -o pipefail

ARTIFACT_DIR="${ARTIFACT_DIR:-/tmp/artifacts}"
INFO="INFO"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this required? where do we change this variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier I used two variables for Logging purposes INFO and ERROR. But since we are gathering Artifacts in case of any failure I think its best to remove this

res=$1
namespace=$2
log_info "Checking status of $res inside $namespace"
oc rollout status $res -n $namespace --timeout=5m || (must_gather "$ARTIFACT_DIR" && exit 1)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a subshell here?

Suggested change
oc rollout status $res -n $namespace --timeout=5m || (must_gather "$ARTIFACT_DIR" && exit 1)
oc rollout status $res -n $namespace --timeout=5m || {
must_gather "$ARTIFACT_DIR"
return 1
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not mandatory. We can use {} along with || as well as it will also allow executing multiple commands if the oc rollout status fails

Comment on lines 23 to 24
}
minio(){
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a blank line between functions will help with readability.

}
prereq(){
log_info "Deploying prerequisites on cluster"
oc apply -f pre-requisites.yaml
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use apply ? or create ?

minio(){
log_info "Deploying resources inside minio namespace"
oc create ns minio || true
sleep 5
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sleep is usually a code smell. It is better to "poll" or wait for something to happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure will add oc wait to wait for namespace to be created.

log_info "Deploying resources inside minio namespace"
oc create ns minio || true
sleep 5
oc process -f ../minio-template.yaml -p MINIO_CPU_REQUEST=30m -p MINIO_CPU_LIMITS=50m -p MINIO_MEMORY_REQUEST=50Mi -p MINIO_MEMORY_LIMITS=100Mi --local -o yaml | sed -e 's/storage: [0-9].Gi/storage: 0.25Gi/g' | oc apply -n minio -f -
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
oc process -f ../minio-template.yaml -p MINIO_CPU_REQUEST=30m -p MINIO_CPU_LIMITS=50m -p MINIO_MEMORY_REQUEST=50Mi -p MINIO_MEMORY_LIMITS=100Mi --local -o yaml | sed -e 's/storage: [0-9].Gi/storage: 0.25Gi/g' | oc apply -n minio -f -
oc process -f ../minio-template.yaml \
-p MINIO_CPU_REQUEST=30m \
-p MINIO_CPU_LIMITS=50m \
-p MINIO_MEMORY_REQUEST=50Mi \
-p MINIO_MEMORY_LIMITS=100Mi \
--local -o yaml |
sed -e 's/storage: [0-9].Gi/storage: 0.25Gi/g' |
oc apply -n minio -f -

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VIM effects😅

Comment on lines 121 to 140
case $1 in
metrics)
prereq
minio
dex
observatorium_metrics
observatorium
run_test
telemeter
;;
logs)
#TODO
;;
traces)
#TODO
;;
*)
echo "usage: $(basename "$0") { metrics | logs | traces }"
;;
esac
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
case $1 in
metrics)
prereq
minio
dex
observatorium_metrics
observatorium
run_test
telemeter
;;
logs)
#TODO
;;
traces)
#TODO
;;
*)
echo "usage: $(basename "$0") { metrics | logs | traces }"
;;
esac
main() {
case $1 in
metrics)
prereq
minio
dex
observatorium_metrics
observatorium
run_test
telemeter
;;
logs)
#TODO
;;
traces)
#TODO
;;
*)
echo "usage: $(basename "$0") { metrics | logs | traces }"
return 1
;;
esac
}
main "$@"

Copy link

@sthaha sthaha Mar 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just FYI instead of case, you can directly jump to a function like so

ci.metrics() {
	:
}

ci.logs() {
	:
}

ci.help() {
	local exit_code=${1:-0}

	local fns=($(declare -F -p | cut -f3 -d ' ' |
		grep '^et\.' | cut -f2- -d.))

	read -d '^' -r docstring <<EOF_HELP
Usage:
  $(basename "$0") <task>

Tasks:
$(printf "  - %s\n" "${fns[@]}")
^ after this won't be read
EOF_HELP
	echo -e "$docstring"
	exit "$exit_code"
}

is_function() {
	local fn=$1
	shift
	[[ $(type -t "$fn") == "function" ]]
	return $?
}

main() {
	local fn=${1:-''}
	local ci_fn="ci.$fn"
	if ! is_function "$ci_fn"; then
		ci.help 1
	fi
	# call the function passing in the reset of the args
	shift
	$ci_fn "$@"
	return $?
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome @sthaha 🙇🏼‍♂️. I haven't tried like this but this looks good. Will definitely use this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants