KIND infra, SRIOV provider, Run in-memory etcd #478
Conversation
force-pushed from 428988c to fe6fbe0
Looks great, have some nits and a big question.
cluster-up/cluster/kind/common.sh
Outdated
mkdir -p ${KUBEVIRTCI_CONFIG_PATH}/$KUBEVIRT_PROVIDER/cni-plugins
curl -L https://github.com/containernetworking/plugins/releases/download/v0.8.5/cni-plugins-linux-amd64-v0.8.5.tgz | tar xz -C ${KUBEVIRTCI_CONFIG_PATH}/$KUBEVIRT_PROVIDER/cni-plugins
for node in $(_get_nodes | awk '{print $1}'); do
    docker exec $node /bin/sh -c "curl -L https://github.com/containernetworking/plugins/releases/download/v0.8.5/cni-plugins-linux-amd64-v0.8.5.tgz | tar xz -C /opt/cni/bin"
    docker cp ${KUBEVIRTCI_CONFIG_PATH}/$KUBEVIRT_PROVIDER/cni-plugins/. $node:/opt/cni/bin
done
This is not related, and it leaves a tarball and a directory behind; also, the number of nodes we usually have is quite small, so it's 2 curls + unpacks vs. 1 curl + unpack.
My intent was to reduce the cluster-up time as much as possible, so I prefer to curl and unpack once and only then copy the content.
I can remove the tar and the directory right after the for loop, is that ok with you?
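For illustration, a sketch of that download-once flow with the cleanup after the loop; it reuses the release URL and the _get_nodes helper from the diff above, everything else here is assumed rather than taken from the final commit:

# Sketch only: download and unpack the CNI plugins once on the host,
# copy the result into every kind node, then clean up the host copy.
cni_plugins_dir="${KUBEVIRTCI_CONFIG_PATH}/$KUBEVIRT_PROVIDER/cni-plugins"
cni_archive_url="https://github.com/containernetworking/plugins/releases/download/v0.8.5/cni-plugins-linux-amd64-v0.8.5.tgz"

mkdir -p "$cni_plugins_dir"
curl -L "$cni_archive_url" | tar xz -C "$cni_plugins_dir"

for node in $(_get_nodes | awk '{print $1}'); do
    docker cp "$cni_plugins_dir/." "$node:/opt/cni/bin"
done

# Remove the unpacked plugins once every node has its copy.
rm -rf "$cni_plugins_dir"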
Well, but this is really not related to running etcd in memory; we can do that in a different PR.
No problem, I will remove this commit.
cluster-up/cluster/kind/common.sh
Outdated
@@ -210,9 +196,91 @@ function setup_kind() {
    prepare_config
}

function ensure_memory_dir_for_etcd_data() {
    if [ "$ETCD_DATA_DIR" == none ];then
        echo "ETCD_DATA_DIR is not defined" && exit 1
Better as two lines:
echo "ETCD_DATA_DIR is not defined"
exit 1
Done
cluster-up/cluster/kind/common.sh
Outdated
@@ -15,6 +15,8 @@ REGISTRY_NAME=${CLUSTER_NAME}-registry
MASTER_NODES_PATTERN="control-plane"
WORKER_NODES_PATTERN="worker"

ETCD_DATA_DIR=${ETCD_DATA_DIR:-none}
You don't need this; just check whether the variable is set before using it:
[ -z "$ETCD_DATA_DIR" ]
I added this to kind/common.sh for better visibility of that variable, so if one reviews, for example, the ensure_memory_dir_for_etcd_data
function, it will be easier to understand where this variable comes from.
What do you think?
If you want to do that, then just throw an error here if the variable is not set.
cluster-up/cluster/kind/common.sh
Outdated
@@ -210,9 +196,91 @@ function setup_kind() {
    prepare_config
}

function ensure_memory_dir_for_etcd_data() {
    if [ "$ETCD_DATA_DIR" == none ];then
ditto: [ -z "$ETCD_DATA_DIR" ]
cluster-up/cluster/kind/common.sh
Outdated
ls -lsah $ETCD_DATA_DIR

echo "mount etcd data directory as tempfs"
mount -t tmpfs tmpfs $ETCD_DATA_DIR
Maybe I am wrong, but when we run this on prow the tmpfs is done by k8s when we create a volume of type Memory, right? So we don't need to do this ourselves.
Great question :)
I added this logic to support the development environment flow, where you don't have k8s to create this mount for you.
On a local dev env this function will create the directory and mount it; if the mount already exists it will delete the mount's content to prevent conflicts with new data.
On CI, since we ask k8s to create the mount for us, this function will check that the mount exists, delete the mount's content to prevent conflicts with new data, and print its details.
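A minimal sketch of that flow (not the exact function from the PR; it assumes ETCD_DATA_DIR is provided by the environment as discussed above):

function ensure_memory_dir_for_etcd_data() {
    if [ -z "$ETCD_DATA_DIR" ]; then
        echo "ETCD_DATA_DIR is not defined"
        exit 1
    fi

    if mountpoint -q "$ETCD_DATA_DIR"; then
        # CI (or an already-prepared local mount): the directory is mounted,
        # just clear stale data so it does not conflict with the new cluster.
        rm -rf "${ETCD_DATA_DIR:?}"/*
    else
        # Local dev env: create the directory and back it with tmpfs ourselves.
        mkdir -p "$ETCD_DATA_DIR"
        mount -t tmpfs tmpfs "$ETCD_DATA_DIR"
    fi

    ls -lsah "$ETCD_DATA_DIR"
}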
At a local env we also have to remove it after we do cluster-down, right?
You're right, I will add a cleanup for this on cluster-down.
setup_kind

export ETCD_DATA_DIR="/mnt/etcd"
What will happen if we open more than one kind cluster? They cannot share the etcd directory. Better to add something that is unique per kind cluster.
Currently we deploy one job at a time on a CI node, which means there can be only one kind cluster at a given time.
I think we should tackle this when we need to support (if ever) multiple sriov lane jobs on a single CI node.
Also, it should be changed together with other changes we might want to apply, such as handling the cluster name (should be unique),
attaching PFs to each cluster (check which is available)
and any other change we would need to do.
What do you think?
But you also have assumptions here which could be avoided:
- That the previous deployment was properly cleaned.
- That someone experiments on the node and runs once manually and once through the CI.
- The above happening unintentionally, i.e. bugs.
As the possibility of this occurring does exist, it makes sense to do it right from the beginning. I would argue that not accounting for this in the original deployment scripts in the first place was the wrong approach, so we should try not to repeat it now.
Of course, you can reason not to do it now because it requires too much effort.
I would also not attempt to do/fix this until everything works.
I think we should first see whether running etcd in memory works for us and actually solves the problem.
If it does and the lane is green most of the time, we can start to think about more optimizations (running more than one sriov lane job on a single CI node).
Note that the number of sriov lane jobs that can run on a single CI node is bound by the number of PFs the sriov card provides, which is two at the current state.
I am already working on running multiple instances on one node (it already works locally), POC soon,
as was raised in the CI meeting.
Now that I think of it, if two sriov lanes run on the same node they will have different pods, so they are not going to share /mnt/etcd,
but we still benefit from using different etcd directories, if it's not too much effort.
Also, let's add more info to the name and use something like /mnt/kubevirt-kind...-[unique name]-etcd.
One thing you can use is the container of the first master, or the one from the registry container.
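For example, a unique path could be derived from names already defined in kind/common.sh; a sketch assuming CLUSTER_NAME and REGISTRY_NAME as shown in the diffs above:

# Use the registry container ID to make the etcd directory unique per run.
registry_container_id=$(docker inspect --format '{{.Id}}' "$REGISTRY_NAME" | cut -c1-12)
export ETCD_DATA_DIR="/mnt/kubevirt-kind-${CLUSTER_NAME}-${registry_container_id}-etcd"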
I think we should address this when we work on supporting running multiple jobs on the same CI node,
especially since we don't yet know in what way we are going to do it.
The way I see it, we need to manipulate sriov-cni on the CI cluster so it attaches a single PF to each prow job pod;
then we don't need to change the current code and the whole process will be transparent.
And like you said, since each job runs in a different pod there shouldn't be a problem with the directories.
What do you think?
I can add another directory for visibility so the path will be /mnt/kubevirt-kind/etcd,
is that ok?
I think @qinqon mentioned that it is not about running multiple lanes in parallel, so I do not understand why you refer to that in your answer.
Please explain why there is no need for a unique, per-run folder name in this case, regardless of usage by other features we may encounter in the future.
On CI unique names are not a must, since we create the in-memory directory using the emptyDir volume API (on the CI cluster) on the prow job pod, kubevirt/project-infra#709.
After discussing it with Quique I realized we need unique names for local environments, in case cluster-up is stopped in the middle for any reason.
I pushed a new commit to resolve this.
force-pushed from fe6fbe0 to 1195f36
force-pushed from 1195f36 to cc2cf46
force-pushed from cc2cf46 to e6e37c6
force-pushed from 6c40193 to 549396c
Changes:
Please let me know if I should squash the commits.
@ormergi, what do you mean? We cannot mount
We can mount. But it doesn't matter anymore because I found a way to run etcd in memory w/o the need to create a mount! :)
force-pushed from b869c81 to 68bbc66
/test check-up-kind-1.17-sriov
@ormergi this means that we don't need to mount a volume at the prow job? I understand that what /tmp means inside the kind node depends on what the kind project wants to do with it; are we sure they are not going to change it? In the future they could mount a container volume that writes to disk, and then we are back at square one. Although the solution is better than what we have now and is self contained.
setup_kind

RUN_ETCD_IN_MEMORY="true"
Remove this and leave true as the default.
Done
cluster-up/cluster/kind/common.sh
Outdated
@@ -15,6 +15,8 @@ REGISTRY_NAME=${CLUSTER_NAME}-registry
MASTER_NODES_PATTERN="control-plane"
WORKER_NODES_PATTERN="worker"

RUN_ETCD_IN_MEMORY=${RUN_ETCD_IN_MEMORY:-"false"}
All the variables that can be set outside of providers are prefixed with KUBEVIRT, so let's rename this to KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY. Also set the default to true; at the end that is what we want, and if there is a problem we can go back to false by changing just the prow job.
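Roughly (a sketch; the setup_etcd_in_memory helper name is hypothetical, see the config-patch sketch at the end of the page):

# Prefixed name with "true" as the default; the prow job can still override it to "false".
KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY=${KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY:-"true"}

if [ "$KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY" == "true" ]; then
    setup_etcd_in_memory  # hypothetical helper that applies the etcd-in-memory kind config patch
fi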
Done
Sorry, I didn't mean to state the obvious about /tmp, I am sure you're familiar with how it works.
force-pushed from 68bbc66 to 64da9ce
/test check-up-kind-1.17-sriov
cluster-up/cluster/kind/common.sh
Outdated
@@ -15,6 +15,9 @@ REGISTRY_NAME=${CLUSTER_NAME}-registry
MASTER_NODES_PATTERN="control-plane"
WORKER_NODES_PATTERN="worker"

KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY=${KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY:-"true"}
ETCD_DATA_DIR="/tmp/kind-cluster-etcd"
Rename this to ETCD_IN_MEMORY_DATA_DIR.
Done
cluster-up/cluster/kind/common.sh
Outdated
echo "Checking KIND cluster etcd data is mounted to RAM: $ETCD_DATA_DIR" | ||
docker exec "$CLUSTER_NAME-control-plane" df -h $(dirname $ETCD_DATA_DIR) | ||
docker exec "$CLUSTER_NAME-control-plane" du -h $ETCD_DATA_DIR | ||
[ $(echo $?) != 0 ] && echo "falid to check etcd data directory" && return 1 |
typo: failed
fixed
cluster-up/cluster/kind/common.sh
Outdated
docker exec "$CLUSTER_NAME-control-plane" df -h $(dirname $ETCD_DATA_DIR)
docker exec "$CLUSTER_NAME-control-plane" du -h $ETCD_DATA_DIR
This will check that it's mounted, but not that it's mounted in memory; can you check that it's using tmpfs?
No problem
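One way to do it inside that check function, a sketch assuming the same CLUSTER_NAME and ETCD_DATA_DIR used in the snippet above:

# Fail if the filesystem backing the etcd data directory inside the
# control-plane node is not tmpfs.
if ! docker exec "$CLUSTER_NAME-control-plane" df -T "$ETCD_DATA_DIR" | grep -q tmpfs; then
    echo "failed to verify that etcd data directory is mounted to RAM"
    return 1
fi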
Currently we encounter bad performance of etcd on the sriov provider cluster on the DinD setup. We get 'etcdserver: timeout' errors that cause jobs to fail often.

In cases where etcd has bad performance and the data shouldn't be persistent (e.g. on CI and dev environments) it is recommended [1] to use in-memory etcd.

To do that this commit:
- Adds a kubeadm ClusterConfiguration to the kind config, setting the etcd data directory to '/tmp/...' inside the kind cluster nodes. The '/tmp/' directory is already mounted to RAM as tmpfs.
- 'KUBEVIRT_WITH_KIND_ETCD_IN_MEMORY', expected values: "true", "false"; controls running etcd in memory on kind providers.
- 'ETCD_DATA_DIR', expects a directory path that is mounted to RAM; controls the path of the etcd data directory inside the kind cluster nodes.

Running etcd in memory should improve performance and will stabilize the sriov provider and lanes.

[1] kubernetes-sigs/kind#1922

Signed-off-by: Or Mergi <[email protected]>
force-pushed from 64da9ce to 393d0cd
/test check-up-kind-1.17-sriov
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qinqon

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold cancel
Currently we encounter bad performance of the KIND cluster on the DinD setup;
we get 'etcdserver: timeout' errors on SRIOV lane jobs, causing jobs to fail often.
In such cases it is recommended to run in-memory etcd:
kubernetes-sigs/kind#1922
kubernetes-sigs/kind#845
Running etcd in memory should improve performance and make the sriov lane more stable.
This PR patches the kind-config yaml used to create the cluster so that etcd runs from a specified path,
which will be backed by an in-memory volume, kubevirt/project-infra#709
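For reference, a minimal sketch of the kind of kubeadm patch this describes; the kind config path, the helper name and the exact patch shape are assumptions here, not necessarily what the final commit does:

# Append a kubeadm ClusterConfiguration patch to the kind config so etcd
# keeps its data under a tmpfs-backed path inside the node (e.g. /tmp/...).
function setup_etcd_in_memory() {
    # hypothetical config path; the real provider may keep it elsewhere
    local kind_config="${KUBEVIRTCI_CONFIG_PATH}/$KUBEVIRT_PROVIDER/kind.yaml"

    # Assumes the config does not already contain a kubeadmConfigPatches section.
    cat >> "$kind_config" <<EOF
kubeadmConfigPatches:
- |
  kind: ClusterConfiguration
  etcd:
    local:
      dataDir: $ETCD_IN_MEMORY_DATA_DIR
EOF
}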