Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cronjob version in manifest used for tests #30733

Merged
merged 10 commits into from
Mar 9, 2022

Conversation

ChrsMark
Copy link
Member

@ChrsMark ChrsMark commented Mar 8, 2022

What does this PR do?

Reopening #30685

Why is it important?

Cronjob objects have been move from batch/v1beta1 API to batch/v1 API of k8s after v1.21+. In this regard we need to update the spec used for our tests accordingly.

@ChrsMark ChrsMark self-assigned this Mar 8, 2022
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 8, 2022
@mergify
Copy link
Contributor

mergify bot commented Mar 8, 2022

This pull request does not have a backport label. Could you fix it @ChrsMark? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Mar 8, 2022
@ChrsMark ChrsMark added the backport-v8.2.0 Automated backport with mergify label Mar 8, 2022
@mergify mergify bot removed the backport-skip Skip notification from the automated backport with mergify label Mar 8, 2022
@ChrsMark ChrsMark added backport-skip Skip notification from the automated backport with mergify Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team labels Mar 8, 2022
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 8, 2022
@mergify mergify bot removed the backport-skip Skip notification from the automated backport with mergify label Mar 8, 2022
@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 8, 2022

@mdelapenya any ideas here about how we can get more about the failure? Running MODULE=kubernetes make integration-tests locally succeeds but no idea why it fails on the CI and the error message does not help.

@elasticmachine
Copy link
Collaborator

elasticmachine commented Mar 8, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview [preview](https://ci-stats.elastic.co/app/apm/services/beats-ci/transactions/view?rangeFrom=2022-03-09T16:09:46.567Z&rangeTo=2022-03-09T16:29:46.567Z&transactionName=BUILD Beats/beats/PR-{number}&transactionType=job&latencyAggregationType=avg&traceId=02342a3b859cf3dec4dab64ec5dd1c21&transactionId=5d5183840dd38284)

Expand to view the summary

Build stats

  • Start Time: 2022-03-09T16:19:46.567+0000

  • Duration: 63 min 30 sec

Test stats 🧪

Test Results
Failed 0
Passed 22827
Skipped 2441
Total 25268

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@mdelapenya
Copy link
Contributor

Hey @ChrsMark I was able to reproduce it locally:

Commands:

cd metricbeat
MODULE=kubernetes MAGEFILE_VERBOSE=true mage goIntegTest

Output:

Running target: GoIntegTest
Running dependency: Fields
Running dependency: moduleFieldsGo
Running dependency: libbeatAndMetricbeatCommonFieldsGo
exec: go list -m
Found Elastic Beats dir at /Users/mdelapenya/sourcecode/src/github.com/elastic/beats
exec: go run /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/dev-tools/cmd/module_fields/module_fields.go -beat metricbeat -license ASL2 /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/module
exec: go run -mod=readonly /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/libbeat/scripts/cmd/global_fields/main.go -es_beats_path /Users/mdelapenya/sourcecode/src/github.com/elastic/beats -beat_path /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat -out fields.yml
Generated fields.yml for metricbeat to /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/fields.yml
exec: go run -mod=readonly /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/dev-tools/cmd/asset/asset.go -pkg include -in fields.yml -out include/fields/fields.go -license ASL2 metricbeat
Running dependency: fieldsYML
exec: go run -mod=readonly /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/libbeat/scripts/cmd/global_fields/main.go -es_beats_path /Users/mdelapenya/sourcecode/src/github.com/elastic/beats -beat_path /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat -out fields.yml module
Generated fields.yml for metricbeat to /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/fields.yml
Running dependency: Dashboards
Setup mage...
Running dependency: github.com/elastic/beats/v7/dev-tools/mage.buildMage
exec: mage -f -goos=linux -goarch=amd64 -compile build/mage-linux-amd64
Setup kind...
exec: git rev-parse HEAD
Kubeconfig:  /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/build/kind/metricbeat-8-2-0-1f89a67ca7-snapshot/kubecfg
exec: kind create cluster --name metricbeat-8-2-0-1f89a67ca7-snapshot --kubeconfig /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/build/kind/metricbeat-8-2-0-1f89a67ca7-snapshot/kubecfg --wait 300s
Creating cluster "metricbeat-8-2-0-1f89a67ca7-snapshot" ...
 ✓ Ensuring node image (kindest/node:v1.20.2) 🖼 
 ✓ Preparing nodes 📦  
 ✓ Writing configuration 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
 ✓ Waiting ≤ 5m0s for control-plane = Ready ⏳ 
 • Ready after 27s 💚
Set kubectl context to "kind-metricbeat-8-2-0-1f89a67ca7-snapshot"
You can now use your cluster with:

kubectl cluster-info --context kind-metricbeat-8-2-0-1f89a67ca7-snapshot --kubeconfig /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/build/kind/metricbeat-8-2-0-1f89a67ca7-snapshot/kubecfg

Thanks for using kind! 😊
>> Running testing inside of kubernetes...
>> Applying module manifest to cluster...
exec: kubectl apply -f /Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/module/kubernetes/kubernetes.yml
clusterrole.rbac.authorization.k8s.io/kube-state-metrics created
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics created
serviceaccount/kube-state-metrics created
deployment.apps/kube-state-metrics created
service/kube-state-metrics created
statefulset.apps/basic-sts created
resourcequota/object-counts created
error: unable to recognize "/Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/module/kubernetes/kubernetes.yml": no matches for kind "CronJob" in version "batch/v1"
>> Done running testing inside of kubernetes...
Teardown kind...
exec: kind delete cluster --name metricbeat-8-2-0-1f89a67ca7-snapshot
Deleting cluster "metricbeat-8-2-0-1f89a67ca7-snapshot" ...
Teardown mage...
Error: failed modules: kubernetes

This seems the error:

error: unable to recognize "/Users/mdelapenya/sourcecode/src/github.com/elastic/beats/metricbeat/module/kubernetes/kubernetes.yml": no matches for kind "CronJob" in version "batch/v1"

@mdelapenya
Copy link
Contributor

mdelapenya commented Mar 8, 2022

If you take a look at the logs, the kind version uses kindest/node:v1.20.2, so for some reason it's not matching the project version. I've double checked that CronJobs are stable after 1.21

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 8, 2022

@mdelapenya thanks for taking the time to check this. I was about to note this to you. What is weird is that based on https://github.com/elastic/beats/pull/30732/files#diff-cca0972262451be53fcc0836d303fdfe24e6e04fedef3a1e9741c1941d80aa1cR21 we shouldn't test against v1.20 in this branch, what branch are you using locally?

(Master seems to be updated also)

My local run passes giving me:

...
Running dependency: github.com/elastic/beats/v7/dev-tools/mage.buildMage
exec: mage -f -goos=linux -goarch=amd64 -compile build/mage-linux-amd64
Setup kind...
exec: git rev-parse HEAD
Kubeconfig:  /Users/chrismark/go/src/github.com/elastic/beats/metricbeat/build/kind/metricbeat-8-2-0-1f89a67ca7-snapshot/kubecfg
exec: kind create cluster --name metricbeat-8-2-0-1f89a67ca7-snapshot --kubeconfig /Users/chrismark/go/src/github.com/elastic/beats/metricbeat/build/kind/metricbeat-8-2-0-1f89a67ca7-snapshot/kubecfg --wait 300s
Creating cluster "metricbeat-8-2-0-1f89a67ca7-snapshot" ...
 ✓ Ensuring node image (kindest/node:v1.23.4)
...

Signed-off-by: chrismark <[email protected]>
@ChrsMark ChrsMark requested a review from a team as a code owner March 8, 2022 13:34
@botelastic botelastic bot added the Team:Automation Label for the Observability productivity team label Mar 8, 2022
@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 8, 2022

Looking into the code (

args = append(args, "--image", fmt.Sprintf("kindest/node:%s", kubeVersion))
) it seems that the version fallback to the default of kind version if K8S_VERSION env variable is not present.

I wonder if we can get those verbose output by the run of CI, it would help.

Signed-off-by: chrismark <[email protected]>
@ChrsMark ChrsMark requested a review from a team as a code owner March 8, 2022 14:20
@v1v
Copy link
Member

v1v commented Mar 8, 2022

I wonder if we can get those verbose output by the run of CI, it would help.

If you configure the env variable MAGEFILE_VERBOSE in https://github.com/elastic/beats/blob/main/Jenkinsfile#L24 then it will available by default for all the mage commands.

Though, I don't know if that's what you mean and what's the implications in terms of debugging, as it might contain massive logs

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 8, 2022

Thanks @v1v let me try this at c058122

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 8, 2022

Looking into the verbose logs I find [2022-03-08T15:23:42.778Z] • Ensuring node image (kindest/node:v1.17.0) 🖼 ... which is indeed problematic and it seems that K8S_VERSION is not properly set at

withEnv(["K8S_VERSION=${v}", "KIND_VERSION=v0.12.0", "KUBECONFIG=${env.WORKSPACE}/kubecfg"]){
?

@v1v
Copy link
Member

v1v commented Mar 8, 2022

which is indeed problematic and it seems that K8S_VERSION is not properly set at

@ChrsMark, this is a bit complex to explain but in a nutshell:

k8s specific tests in the CI are defined in

k8sTest:
k8sTest: "v1.23.4,v1.22.7,v1.21.10"
stage: mandatory
(see a361e8d)

That's the reason K8S_VERSION is not defined, the failures are coming from the stage defined in :

So far the existing CI pipeline only configures the kind/k8s when there are changes in:

  • "^deploy/kubernetes/.*"
  • "^libbeat/docs/version.asciidoc"

If it's needed to run for the stage mage goIntegTest in metricbeat, then the env variable is needed to be define in the mage goIntegTest rather than in the CI, this particular requirement was never needed as far as I remember, and if it's added within the mage build system, then it will be decoupled from the CI.

Could you add the env variable to the build system itself?

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 9, 2022

Thanks for the explanation @v1v . We can handle the absence of K8S_VERSION and fallback to latest version instead at

args = append(args, "--image", fmt.Sprintf("kindest/node:%s", kubeVersion))
. However one thing that still seems to be problematic is that the version of kind is also not defined and we will hit the issue of not being able to actually use a latest k8s version based on https://github.com/kubernetes-sigs/kind/releases.

I wonder how kind is installed when tests are run by

goIntegTest:
mage: "mage goIntegTest"
withModule: true
stage: mandatory

@v1v
Copy link
Member

v1v commented Mar 9, 2022

I wonder how kind is installed when tests are run by

As far as I know both k8s/kind are only installed in

k8sTest:
k8sTest: "v1.23.4,v1.22.7,v1.21.10"
stage: mandatory

There are no installed while using

goIntegTest:
mage: "mage goIntegTest"
withModule: true
stage: mandatory
as mage: is a generic step and does nothing but configuring the Go/Mage/Environments.

To clarify this, the existing pipeline was refactored (with the Jenkinsfile.yml design) back in the days, and the k8s specific requirements were implemented in #17656

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 9, 2022

It seems that kind is expected to present in the environment based on

fmt.Println("Skipping kind setup; kind command missing")
. I don't know how it is expected to be available but I think this is something that we need to change and define kind version explicitly when we run
goIntegTest:
mage: "mage goIntegTest"
withModule: true
stage: mandatory
. Otherwise the test will be quite flaky in the long run.
Would it be possible to provide a way to define both k8s and kind version while running this goIntegTest step?

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 9, 2022

As we can see at https://beats-ci.elastic.co/blue/organizations/jenkins/Beats%2Fbeats/detail/PR-30733/8/pipeline/8526#step-16121-log-1 the k8s module tests are passing when run under deploy/kubernetes-k8sTest but not when running under metricbeat-goIntegTest because kind is not installed explicitly with a specific version on the latter and hence the k8s version falls back to a really old one.

Signed-off-by: chrismark <[email protected]>
@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 9, 2022

Will rebase on top of #30747 and hopefully it will go green.

Signed-off-by: chrismark <[email protected]>
@v1v
Copy link
Member

v1v commented Mar 9, 2022

/test

@v1v
Copy link
Member

v1v commented Mar 9, 2022

@ChrsMark and @mdelapenya , much better now:

Kapture 2022-03-09 at 16 49 24

You don't need to rebase anything since the builds in the CI are already doing that as part of the build itself.

Copy link
Contributor

@MichaelKatsoulis MichaelKatsoulis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ChrsMark
Copy link
Member Author

ChrsMark commented Mar 9, 2022

Thanks @v1v @mdelapenya for helping with this! Merging.

@ChrsMark ChrsMark merged commit 644d026 into elastic:main Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v8.2.0 Automated backport with mergify Team:Automation Label for the Observability productivity team Team:Cloudnative-Monitoring Label for the Cloud Native Monitoring team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants