🎁 Easier debugging, especially for the in-cluster sender #375

Merged: knative-prow bot merged 16 commits into knative-extensions:main from feature/debug-abillity on Oct 29, 2024

Conversation

cardil
Contributor

@cardil cardil commented Oct 18, 2024

Changes

  • 🎁 Easier debugging, especially for the in-cluster sender

/kind enhancement

Fixes #129

Release Note

Errors are easier to debug: the in-cluster jobs and their pods are logged, and users are directed to those logs.


knative-prow bot commented Oct 18, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@knative-prow knative-prow bot added kind/enhancement do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Oct 18, 2024
@knative-prow knative-prow bot requested review from dsimansk and rhuss October 18, 2024 18:16
@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 18, 2024
@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 18, 2024
@knative-prow knative-prow bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 18, 2024
@cardil cardil force-pushed the feature/debug-abillity branch from 8891e5a to 354fe78 Compare October 21, 2024 11:08
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2024
@knative-prow knative-prow bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 22, 2024
stringRE = regexp.MustCompile(`\s+"[^"]+"\s+`)
)

func errorHandler(err error, cmd *cobra.Command) bool {
Contributor Author

@cardil cardil Oct 22, 2024

This error handler prints errors like this:
Screenshot from 2024-10-22 19-42-11
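
To make the `stringRE` pattern from the diff above easier to reason about, here is a minimal sketch of what it matches. How the PR's `errorHandler` actually uses the pattern is not shown in this conversation, so the `quotedParts` helper and the sample messages are hypothetical; only the regular expression itself is copied from the diff.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// stringRE is copied from the diff: a double-quoted string that has
// whitespace on both sides.
var stringRE = regexp.MustCompile(`\s+"[^"]+"\s+`)

// quotedParts is a hypothetical helper (not from the PR). It returns every
// fragment matched by stringRE with the surrounding whitespace trimmed.
func quotedParts(msg string) []string {
	var parts []string
	for _, m := range stringRE.FindAllString(msg, -1) {
		parts = append(parts, strings.TrimSpace(m))
	}
	return parts
}

func main() {
	// Quoted string in the middle of a message: matched.
	fmt.Println(quotedParts(`job "sender-job" failed`))
	// Quoted string at the very end: not matched, because the pattern
	// requires trailing whitespace.
	fmt.Println(quotedParts(`cannot find pod "sender-pod"`))
}
```

Note that the pattern requires whitespace on both sides, so a quoted string at the start or end of a message is not matched.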

Contributor Author

With ff89790, I also added gathering of Kubernetes events. That should collect all the info we'll need for debugging: the job, its pods, and the events.

Here's an updated log of a successful run (converted to a JSON array for readability): https://gist.github.com/cardil/d8b7963c10e7b6e3befcc112916965c7

And here is a failure example (also as JSON array): https://gist.github.com/cardil/d1216a51f483f242ed39517f4ede9a9c

Contributor Author

Here's how well the error handling plays with the animation from the kn client tui package:

kn-event-fancy-err

Template: corev1.PodTemplateSpec{
Spec: corev1.PodSpec{
RestartPolicy: corev1.RestartPolicyNever,
ActiveDeadlineSeconds: deadline(),
Contributor Author

Using this deadline should be safe, because the image download time doesn't count toward it, as checked here: https://drive.google.com/file/d/19js79hNTqPJG7CogHH7O6-CRSijDXRNE/view

@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 23, 2024
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 24, 2024
@knative-prow knative-prow bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 24, 2024

codecov bot commented Oct 24, 2024

Codecov Report

Attention: Patch coverage is 74.88987% with 57 lines in your changes missing coverage. Please review.

Project coverage is 68.92%. Comparing base (b7dd9cb) to head (3d07d88).
Report is 1 commit behind head on main.

Files with missing lines    Patch %   Lines
pkg/k8s/job_gatherer.go     73.58%    9 Missing and 5 partials ⚠️
internal/cli/errors.go      82.35%    6 Missing and 3 partials ⚠️
pkg/sender/in_cluster.go    55.00%    6 Missing and 3 partials ⚠️
pkg/cli/context.go          12.50%    5 Missing and 2 partials ⚠️
pkg/errors/wrap.go          72.72%    4 Missing and 2 partials ⚠️
pkg/k8s/jobrunner.go        87.17%    5 Missing ⚠️
pkg/ics/send.go              0.00%    4 Missing ⚠️
internal/cli/send.go         0.00%    2 Missing ⚠️
internal/cli/build.go        0.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #375      +/-   ##
==========================================
+ Coverage   66.52%   68.92%   +2.39%     
==========================================
  Files          48       52       +4     
  Lines        1440     1622     +182     
==========================================
+ Hits          958     1118     +160     
- Misses        412      424      +12     
- Partials       70       80      +10     


@cardil cardil force-pushed the feature/debug-abillity branch from c705d11 to 52f8f71 Compare October 24, 2024 14:29
@cardil
Contributor Author

cardil commented Oct 24, 2024

/test all

@cardil cardil marked this pull request as ready for review October 25, 2024 11:29
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 25, 2024
go.mod: review thread outdated and resolved
@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 25, 2024
@cardil cardil force-pushed the feature/debug-abillity branch from f60b532 to 9f237ff Compare October 25, 2024 12:16
@cardil cardil changed the title 🎁 Easier to debug the failures, especially for in-cluster sender 🎁 Easier debugging of the failures, especially for the in-cluster sender Oct 25, 2024
@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 28, 2024
@rhuss
Contributor

rhuss commented Oct 29, 2024

Nice PR, great work!

I do not have much time to review it thoroughly, but I am happy to ack it. @cardil, is there anything you want to add before merging?

@cardil
Contributor Author

cardil commented Oct 29, 2024

@rhuss: Nice PR, great work!

Thanks. The PR works great, so I'm happy to merge it.

BTW, I think we can adopt very similar error processing across all of kn and func. WDYT?

@cardil cardil changed the title 🎁 Easier debugging of the failures, especially for the in-cluster sender 🎁 Easier debugging, especially for the in-cluster sender Oct 29, 2024
@dsimansk
Contributor

/approve
/lgtm

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Oct 29, 2024

knative-prow bot commented Oct 29, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cardil, dsimansk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot merged commit cd9cf60 into knative-extensions:main Oct 29, 2024
24 checks passed
@rhuss
Contributor

rhuss commented Oct 30, 2024

@cardil: BTW, I think we can adopt very similar error processing across all of kn and func. WDYT?

That sounds like a good idea. Maybe start by talking to the func team, as func already has more interactive elements anyway. Also, we need to be sure that the spinner is only active when there is a TTY. I'm not sure whether the PR already handles this: when no TTY is available (e.g. when the output goes to a pipe or a redirection), no control characters or colors should be printed.

@cardil cardil deleted the feature/debug-abillity branch October 31, 2024 15:35
@cardil
Contributor Author

cardil commented Oct 31, 2024

@rhuss: we need to be sure that the spinner is only active when there is a TTY. I'm not sure whether the PR already handles this: when no TTY is available (e.g. when the output goes to a pipe or a redirection), no control characters or colors should be printed.

That's already addressed in the knative.dev/client/pkg library, and in the bubblegum library as well.

cardil added a commit to cardil/kn-plugin-event that referenced this pull request Nov 5, 2024
openshift-merge-bot bot pushed a commit to openshift-knative/kn-plugin-event that referenced this pull request Dec 6, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/enhancement lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

The in-cluster-sender error should be debuggable
4 participants