Add ARM64 node_e2e test #29192

Merged: 1 commit, May 27, 2023

ike-ma (Contributor) commented Mar 30, 2023

Run node_e2e serial tests on GCP T2A (arm64) machine

  • Test locally with the following command
KUBE_BUILD_FOR_ARM64=true ARTIFACTS="${TMPDIR}" JENKINS_GCE_SSH_PRIVATE_KEY_FILE="${SSH_KEY}" \
  kubetest   --up   --test   --provider=gce   \
  --deployment=node --gcp-project="${PROJECT}"   --gcp-zone="${ZONE}" \
  "--node-args=--image-config-file=${IMAGE_CONFIG_OUT}"   \
  '--node-test-args="" --kubelet-flags=""'   --node-tests=true   \
  '--test_args=--nodes=1 --focus="\[Serial\]" --skip="\[Flaky\]|\[Slow\]|\[Benchmark\]|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]|\[NodeAlphaFeature:.+\]|\[NodeAlphaFeature\]|\[NodeFeature:Eviction\]|\[NodeFeature:NodeProblemDetector\]|\[NodeFeature:OOMScoreAdj\]"'   \
  '--timeout=300m' 2>&1 | tee -i "${TMPDIR}/build-log-$(date +%Y%m%d.%M.%S.%3N).txt"
  • Test Result: PASS
Ran 34 of 390 Specs in 1619.273 seconds
SUCCESS! -- 34 Passed | 0 Failed | 0 Pending | 356 Skipped
PASS

Cross-ref: kubernetes/kubernetes#117017 (Setup node_e2e to support ARM64)
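
To illustrate how the --focus and --skip expressions in the command above narrow 390 specs down to the serial subset, here is a minimal Python sketch. It assumes a spec runs when its full description matches the focus expression and does not match the skip expression. The spec descriptions are hypothetical, and Python's re module stands in for the Go regexp engine that Ginkgo actually uses.

```python
import re

# Expressions copied from the kubetest command in the PR description.
FOCUS = r"\[Serial\]"
SKIP = (
    r"\[Flaky\]|\[Slow\]|\[Benchmark\]"
    r"|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]"
    r"|\[NodeAlphaFeature:.+\]|\[NodeAlphaFeature\]"
    r"|\[NodeFeature:Eviction\]|\[NodeFeature:NodeProblemDetector\]"
    r"|\[NodeFeature:OOMScoreAdj\]"
)


def should_run(spec_name, focus=FOCUS, skip=SKIP):
    """Return True when the spec matches focus and does not match skip."""
    return bool(re.search(focus, spec_name)) and not re.search(skip, spec_name)


# Hypothetical spec descriptions, for illustration only.
specs = [
    "Kubelet restart [Serial] should keep pods running",
    "Eviction manager [Serial] [Slow] [NodeFeature:Eviction] evicts pods",
    "Container runtime [NodeConformance] should start a container",
]

selected = [s for s in specs if should_run(s)]
for s in specs:
    print(("RUN  " if should_run(s) else "SKIP ") + s)
```

Only the first hypothetical spec is selected: the second is serial but carries skipped tags, and the third never matches the focus expression.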

k8s-ci-robot added the cncf-cla: yes, needs-ok-to-test, and size/M labels Mar 30, 2023
k8s-ci-robot (Contributor) commented

Hi @ike-ma. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added the area/config, area/jobs, sig/node, and sig/testing labels Mar 30, 2023
ike-ma (Contributor Author) commented Mar 30, 2023

/cc @bobbypage

dims (Member) commented Mar 30, 2023

/ok-to-test

k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label Mar 30, 2023
- --scenario=kubernetes_e2e
- --
- --deployment=node
- --env=KUBE_BUILD_FOR_ARM64=true
Member

@ike-ma where is KUBE_BUILD_FOR_ARM64 used?

Contributor Author

Contributor

Is this still relevant? It seems to have been removed in kubernetes/kubernetes#117017.

Member

+1, this should no longer be needed?

Contributor Author

This has been removed

ike-ma (Contributor Author) commented Apr 4, 2023

/assign @bobbypage

SergeyKanzhelev (Member) commented

/assign @mmiranda96

mmiranda96 (Contributor) left a comment

Added a couple of comments.

# `gcloud compute --project <to-project> disks create <image name> --image=https://www.googleapis.com/compute/v1/projects/<from-project>/global/images/<image-name>`
# `gcloud compute --project <to-project> images create <image-name> --source-disk=<image-name>`
images:
  ubuntu:
Contributor

Do you plan to use COS as well? We can add both configs here.

ike-ma (Contributor Author), May 24, 2023

Currently only Ubuntu is in scope; I will add COS in a separate PR after more testing.

# `gcloud compute --project <to-project> images create <image-name> --source-disk=<image-name>`
images:
  ubuntu:
    image: ubuntu-gke-2204-1-24-arm64-v20230217
Contributor

I suggest that we use image-family instead; otherwise we'd need to update this job every time a new image is published.

Contributor Author

Done. Updated.
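
The reasoning behind the image-family suggestion can be sketched in Python. A GCE image family resolves to the most recent non-deprecated image in that family, so a job that names the family does not need editing when a new image is published. The family name, the two newer image names, and the creation dates below are hypothetical; only the first image name comes from this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Image:
    name: str
    family: str
    created: str  # ISO date, so string comparison orders chronologically
    deprecated: bool = False


def latest_in_family(images: List[Image], family: str) -> Optional[Image]:
    """Pick the newest non-deprecated image in a family, or None if there is none."""
    candidates = [i for i in images if i.family == family and not i.deprecated]
    return max(candidates, key=lambda i: i.created, default=None)


FAMILY = "example-ubuntu-arm64"  # hypothetical family name

images = [
    # Image name pinned in the original revision of this PR.
    Image("ubuntu-gke-2204-1-24-arm64-v20230217", FAMILY, "2023-02-17"),
    # Hypothetical newer releases, for illustration only.
    Image("ubuntu-gke-2204-1-24-arm64-v20230401", FAMILY, "2023-04-01"),
    Image("ubuntu-gke-2204-1-24-arm64-v20230501", FAMILY, "2023-05-01", deprecated=True),
]

picked = latest_in_family(images, FAMILY)
print(picked.name)
```

With a pinned image name the job would stay on the February image until someone edited the config; with the family it follows the newest usable image automatically.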

runcmd:
- echo "Test run from /tmp folder, remounting it"
- mount /tmp /tmp -o remount,exec,suid

Contributor

Not sure if having empty lines across a YAML list is supported.

Contributor Author

The empty lines will be ignored by the parser. I have tested it without any issue.

mmiranda96 (Contributor) commented

/retest

mmiranda96 (Contributor) commented

/assign @ndixita

SergeyKanzhelev (Member) commented

@ike-ma can you please review @mmiranda96's comments? I'd like to merge this so we can run this test against the changes in the k/k repository.

tzneal (Contributor) commented May 17, 2023

/cc

k8s-ci-robot requested a review from tzneal May 17, 2023 17:11
description: "Run serial node e2e tests on ARM64 environment on Ubuntu"
labels:
  preset-service-account: "true"
  preset-k8s-ssh: "true"
bobbypage (Member), May 18, 2023

This will likely need

preset-dind-enabled: "true"

because the ARM build uses a dockerized build, so docker-in-docker must be enabled.

For docker-in-docker, the container will also need to be marked as

   securityContext:
        privileged: true

Contributor Author

Thanks for the pointer, added.
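
The two requirements from this thread can be checked mechanically. The sketch below is a hypothetical helper, not part of test-infra: it assumes a job has been parsed into a dict with labels at the top level and containers under spec, and reports what is missing for docker-in-docker.

```python
def dind_problems(job: dict) -> list:
    """List what a parsed job config lacks for docker-in-docker (hypothetical check)."""
    problems = []
    labels = job.get("labels", {})
    if labels.get("preset-dind-enabled") != "true":
        problems.append('missing label preset-dind-enabled: "true"')
    containers = job.get("spec", {}).get("containers", [])
    for idx, container in enumerate(containers):
        security = container.get("securityContext", {})
        if not security.get("privileged", False):
            problems.append(f"container {idx} is not privileged")
    return problems


# Labels as they appeared in the reviewed revision (no dind preset yet).
before = {
    "labels": {"preset-service-account": "true", "preset-k8s-ssh": "true"},
    "spec": {"containers": [{"image": "example-image"}]},  # image name is a placeholder
}

# The same job with the two changes requested in the review.
after = {
    "labels": {
        "preset-service-account": "true",
        "preset-k8s-ssh": "true",
        "preset-dind-enabled": "true",
    },
    "spec": {
        "containers": [
            {"image": "example-image", "securityContext": {"privileged": True}}
        ]
    },
}

print(dind_problems(before))
print(dind_problems(after))
```

The first call reports both gaps and the second reports none, which mirrors the change made in response to the comment.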

- --deployment=node
- --env=KUBE_BUILD_FOR_ARM64=true
- --gcp-zone=us-central1-a
- --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/arm/image-config.yaml
Member

This will likely need the new flags introduced in https://github.dev/kubernetes/kubernetes/pull/117017, i.e.

--use-dockerized-build=true --target-build-arch=linux/arm64

Contributor Author

Done, updated with the latest command.

chendave (Member), May 25, 2023

There seems to be a typo in the config file path:

--image-config-file=/workspace/test-infra/jobs/e2e_node/arm/image-config.yaml

while it should be

--image-config-file=/workspace/test-infra/jobs/e2e_node/arm/image-config-serial.yaml

Can you confirm?

/hold

Contributor Author

Good catch. The file was named image-config.yaml in my own test. Updated to match.
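
A mismatch like this one, where the job points at a file name that does not exist in the repository, can be caught with a small check. The sketch below is hypothetical: it assumes the repository is checked out under /workspace/test-infra/ (as the path in the job suggests) and takes the set of repository files as an input rather than reading the disk.

```python
import re

WORKSPACE_PREFIX = "/workspace/test-infra/"  # assumed checkout location


def image_config_path(args):
    """Return the value of --image-config-file found in the job args, if any."""
    for arg in args:
        match = re.search(r"--image-config-file=(\S+)", arg)
        if match:
            return match.group(1)
    return None


def missing_config(args, repo_files):
    """Return the repo-relative config path if it is referenced but absent, else None."""
    path = image_config_path(args)
    if path is None or not path.startswith(WORKSPACE_PREFIX):
        return None
    relative = path[len(WORKSPACE_PREFIX):]
    return None if relative in repo_files else relative


repo_files = {"jobs/e2e_node/arm/image-config-serial.yaml"}

original_args = [
    "--deployment=node",
    "--node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/arm/image-config.yaml",
]
fixed_args = [
    "--deployment=node",
    "--node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/arm/image-config-serial.yaml",
]

print(missing_config(original_args, repo_files))
print(missing_config(fixed_args, repo_files))
```

The original args are flagged because image-config.yaml is not among the repository files, while the corrected args pass.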

ike-ma (Contributor Author) commented May 24, 2023

Addressed all review comments, and updated the test config based on #28888.

Tested with

go run /usr/local/google/home/ikema/go-k8s-oss/src/k8s.io/kubernetes/test/e2e_node/runner/remote/run_remote.go \
  --cleanup -vmodule=*=4 --ssh-env=gce \
  --results-dir=/tmp/e2e-node-results/ason --project=ikema-gke-dev-2 \
  --use-dockerized-build=true --target-build-arch=linux/arm64 \
  --zone=us-central1-a --ssh-user=ikema \
  --ssh-key=/usr/local/google/home/ikema/.ssh/google_compute_engine \
  --ginkgo-flags='--nodes=1 --focus="\[Serial\]" --skip="\[Flaky\]|\[Slow\]|\[Benchmark\]|\[NodeSpecialFeature:.+\]|\[NodeSpecialFeature\]|\[NodeAlphaFeature:.+\]|\[NodeAlphaFeature\]|\[NodeFeature:Eviction\]|\[NodeFeature:NodeProblemDetector\]|\[NodeFeature:OOMScoreAdj\]|\[NodeFeature:DevicePluginProbe\]|\[NodeConformance\]"' \
  --test-timeout=5h0m0s --image-config-file=/tmp/e2e-node-results/ason/image-config.yaml

containers:
- image: gcr.io/k8s-staging-test-infra/kubekins-e2e:v20230513-7e1db2f1bb-master
  args:
  - --repo=k8s.io/kubernetes=master
Member

We should stop writing new jobs with bootstrap.py and use decorate: true instead.

These jobs have been logging a warning to migrate for years.

This doesn't have to block adding the job, but please seriously consider migrating.

https://gist.github.com/dims/c1296f8ed42238baea0a5fcae45f4cf4

Contributor Author

Acknowledged. Thanks for the gist pointer; I will follow up in a separate PR.

BenTheElder (Member) left a comment

/approve
with a request to please switch to prow native cloning etc. (decorate: true) and stop using the bootstrap.py entrypoint script.

k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, ike-ma

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved and do-not-merge/hold labels May 24, 2023
Run node_e2e serial tests on GCP T2A (arm64) machine
ike-ma (Contributor Author) commented May 25, 2023

Addressed review comments and rebased onto upstream/master.

chendave (Member) commented

/hold cancel
/lgtm
thanks!

k8s-ci-robot removed the do-not-merge/hold label May 27, 2023
k8s-ci-robot added the lgtm label May 27, 2023
chendave (Member) commented

/test pull-test-infra-unit-test

k8s-ci-robot merged commit 553d4c1 into kubernetes:master May 27, 2023
k8s-ci-robot (Contributor) commented

@ike-ma: Updated the job-config configmap in namespace default at cluster test-infra-trusted using the following files:

  • key node-kubelet.yaml using file config/jobs/kubernetes/sig-node/node-kubelet.yaml
