Consolidate e2e log collection #2345

Merged
merged 1 commit into from
Jun 13, 2022

Conversation

jsturtevant
Contributor

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
We have several different entry points for our CI, and some of them use different logging scripts. This PR creates a binary that reuses the same logging as our conformance and e2e tests. It takes some basic configuration and runs the same logging functions used by the e2e_testsuite. An added benefit is that a user can also run it to collect the same logs as the e2e tests.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1372

Special notes for your reviewer:

This required moving around some of the capz functions in the e2e test package, similar to what was done in #2130.

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

Consolidate e2e logging methods

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 31, 2022
@k8s-ci-robot k8s-ci-robot requested review from Jont828 and shysank May 31, 2022 23:47
@jsturtevant jsturtevant force-pushed the reuse-logging branch 3 times, most recently from d7c8f56 to 22a83fa Compare June 1, 2022 18:30
@Jont828
Contributor

Jont828 commented Jun 1, 2022

@jsturtevant Is this ready for review or is it still WIP? Saw that there was a broken link in the tests.

@jsturtevant
Contributor Author

The other tests are passing. The folder structure isn't 100% the same, so I'm going to look into that, but the general approach is ready for review.

The broken link is to a file that is in this PR, so not sure how we handle that. It doesn't exist but when this gets published it would exist...

@Jont828
Contributor

Jont828 commented Jun 2, 2022

You can use <!-- markdown-link-check-disable-line --> to have it ignore the broken link for now. We use it for when we reference http://localhost:<port> as well.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 2, 2022
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 3, 2022
@jsturtevant
Contributor Author

> You can use <!-- markdown-link-check-disable-line --> to have it ignore the broken link for now. We use it for when we reference http://localhost:<port> as well.

Then we follow up with a separate PR to remove it? I feel it would be better to merge knowing the check is only failing on a link that will exist once this merges, but let me know what we typically do in this scenario.

@CecileRobertMichon
Contributor

/retitle Consolidate e2e log collection

@k8s-ci-robot k8s-ci-robot changed the title Consolitdate e2e log collection Consolidate e2e log collection Jun 3, 2022
@jsturtevant
Contributor Author

The folder structure for ci-entrypoint.sh is now the same as for ci-e2e.sh and ci-conformance.sh:

artifacts/
└── clusters
    ├── bootstrap
    │   ├── controllers
    │   │   ├── capi-controller-manager
    │   │   ├── capi-kubeadm-bootstrap-controller-manager
    │   │   ├── capi-kubeadm-control-plane-controller-manager
    │   │   └── capz-controller-manager
    │   └── resources
    │       └── default
    │           ├── AzureCluster
    │           ├── AzureClusterIdentity
    │           ├── AzureMachine
    │           ├── AzureMachineTemplate
    │           ├── Cluster
    │           ├── KubeadmConfig
    │           ├── KubeadmConfigTemplate
    │           ├── KubeadmControlPlane
    │           ├── Machine
    │           ├── MachineDeployment
    │           ├── MachineHealthCheck
    │           └── MachineSet
    └── capz-conf-r00mm3
        ├── azure-activity-logs
        ├── kube-system
        └── machines
            ├── capz-conf-r00mm3-md-win-8675ffb97b-8z89g
            └── capz-conf-r00mm3-md-win-8675ffb97b-xmmhl

@jsturtevant
Contributor Author

fyi @lzhecheng @nilo19 @andyzhangx

@CecileRobertMichon
Contributor

> The broken link is to a file that is in this PR, so not sure how we handle that. It doesn't exist but when this gets published it would exist...

This is a chicken-and-egg problem... no matter what, it will require a second PR if we want the job to pass on this PR. We can also ignore the linter and merge the PR with the failure, but that's not ideal.

@jsturtevant
Contributor Author

> This is a chicken-and-egg problem... no matter what, it will require a second PR if we want the job to pass on this PR. We can also ignore the linter and merge the PR with the failure, but that's not ideal.

Which approach is preferred?

@jsturtevant
Contributor Author

/test pull-cluster-api-provider-azure-e2e

@Jont828
Contributor

Jont828 commented Jun 3, 2022

An alternative approach would be to open a stub PR that includes just the file we are linking to in the docs (assuming the file isn't going to produce compile errors without the rest being present). Then you could rebase this PR onto that one. Otherwise, I would personally lean towards disabling the link check for now.

test/logger.go
// optional flags that default
namespace := flag.String("namespace", "", "namespace on management cluster to collect logs for")
artifactFolder := flag.String("artifacts-folder", getArtifactsFolder(), "folder to store cluster logs")
azureresourcegroup := flag.String("azure-rg", "", "azure resource group that contains the cluster")
Contributor

do we really need to take the RG as input here or can we figure that out from the azure cluster spec?

Contributor Author

I hadn't considered using the cluster spec. I will see whether that lets us avoid adding an additional flag.

Contributor Author

I fixed it at the place in the code where we were using an env variable instead of looking it up on the cluster spec.
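
The approach discussed in this thread (read the resource group from the cluster spec rather than taking it as input) might look roughly like the sketch below. It is not the code from this PR: `azureClusterSpec` is a hypothetical stand-in for the relevant part of the AzureCluster spec, `resolveResourceGroup` is a hypothetical helper, and the `AZURE_RESOURCE_GROUP` fallback variable name is an assumption made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// azureClusterSpec is a hypothetical stand-in that carries only the field
// this sketch needs; it is not the real capz type.
type azureClusterSpec struct {
	ResourceGroup string
}

// resolveResourceGroup prefers the resource group recorded on the cluster
// spec and falls back to an environment variable only when the spec value
// is empty. The getenv parameter is injected so the lookup can be tested.
// The variable name AZURE_RESOURCE_GROUP is an assumption.
func resolveResourceGroup(spec azureClusterSpec, getenv func(string) string) (string, error) {
	if spec.ResourceGroup != "" {
		return spec.ResourceGroup, nil
	}
	if rg := getenv("AZURE_RESOURCE_GROUP"); rg != "" {
		return rg, nil
	}
	return "", errors.New("resource group not set on cluster spec or in AZURE_RESOURCE_GROUP")
}

func main() {
	spec := azureClusterSpec{ResourceGroup: "capz-conf-r00mm3"}
	rg, err := resolveResourceGroup(spec, os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(rg)
}
```

Preferring the spec means the caller does not have to pass a value that the management cluster already knows, which is the concern raised in the review comment above.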

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 8, 2022
@jsturtevant jsturtevant force-pushed the reuse-logging branch 2 times, most recently from 4e49bd9 to 3fe3425 Compare June 8, 2022 18:49
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 8, 2022
@jsturtevant jsturtevant force-pushed the reuse-logging branch 2 times, most recently from 08626ea to d0f78a6 Compare June 8, 2022 21:17
@CecileRobertMichon CecileRobertMichon left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 9, 2022
@CecileRobertMichon CecileRobertMichon added this to the v1.4 milestone Jun 9, 2022
@CecileRobertMichon CecileRobertMichon left a comment

/assign @mboersma

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2022
@mboersma mboersma left a comment

/lgtm

This is a great improvement! Thanks for addressing my ticky-tack comments.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 13, 2022
@CecileRobertMichon
Contributor

/approve
/cherry-pick release-1.3

@k8s-infra-cherrypick-robot

@CecileRobertMichon: once the present PR merges, I will cherry-pick it on top of release-1.3 in a new PR and assign it to you.

In response to this:

/approve
/cherry-pick release-1.3

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 13, 2022
@CecileRobertMichon
Contributor

--- FAIL: TestAzureMachinePool_Validate (0.00s)
    --- FAIL: TestAzureMachinePool_Validate/HasValidImage (0.00s)
        azuremachinepool_test.go:66: 
            Unexpected error:
                <*field.Error | 0xc00031c040>: {
                    Type: "FieldValueForbidden",
                    Field: "spec",
                    BadValue: <string>"",
                    Detail: "can be set only if the MachinePool feature flag is enabled",
                }
                spec: Forbidden: can be set only if the MachinePool feature flag is enabled
            occurred

This failure seems to be coming from #2376, which is strange because it passed all tests before merging.

@jsturtevant
Contributor Author

is it happening consistently and for other prs?

/test pull-cluster-api-provider-azure-test

@CecileRobertMichon
Contributor

hmm retest passed 🤔

I have not seen it on other PRs, but let's keep an 👁️ out. If it's non-deterministic that's an issue.

@k8s-ci-robot k8s-ci-robot merged commit 2899790 into kubernetes-sigs:main Jun 13, 2022
@k8s-infra-cherrypick-robot

@CecileRobertMichon: #2345 failed to apply on top of branch "release-1.3":

Applying: refactor logging so it is same in all scripts
.git/rebase-apply/patch:28: trailing whitespace.
which you can also leverage to pull all the logs for machines which will dump logs to `${PWD}/_artifacts}` by default. The following works 
warning: 1 line adds whitespace errors.
Using index info to reconstruct a base tree...
M	scripts/ci-entrypoint.sh
M	test/e2e/e2e_suite_test.go
Falling back to patching base and 3-way merge...
Auto-merging test/e2e/e2e_suite_test.go
CONFLICT (content): Merge conflict in test/e2e/e2e_suite_test.go
Auto-merging scripts/ci-entrypoint.sh
CONFLICT (content): Merge conflict in scripts/ci-entrypoint.sh
Removing hack/log/log-dump.sh
Removing hack/log/log-dump-daemonset.yaml
Removing hack/log/log-dump-daemonset-windows.yaml
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 refactor logging so it is same in all scripts
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate log collection scripts for e2e tests
6 participants