Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Event ID and Source for Cloud Events without SelfLink #2676

Closed
afrittoli opened this issue May 22, 2020 · 13 comments · Fixed by #3761
Closed

Fix Event ID and Source for Cloud Events without SelfLink #2676

afrittoli opened this issue May 22, 2020 · 13 comments · Fixed by #3761
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.

Comments

@afrittoli
Copy link
Member

Expected Behavior

Event ID and Source are setup according to https://github.com/cloudevents/spec/blob/master/spec.md#producer without using

Actual Behavior

Right now the ID is a newly generated uuid and source is ObjectMeta.SelfLink.
That should be enough to ensure that that the source+id in globally unique, however SelfLink is deprecated, so it needs to be replaced.

Steps to Reproduce the Problem

  1. return cmp.Equal(x.Condition, y.Condition) && cmp.Equal(x.RetryCount, y.RetryCount)

Additional Info

https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/20190711-remove-selflink.md#risks-and-mitigations

@afrittoli afrittoli added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label May 22, 2020
@tom24d
Copy link

tom24d commented Jun 29, 2020

Hi @afrittoli ,

After thinking about Source attribute, I am wondering if we can introduce new func in the objectWithCondition to get CloudEventSource to populate it (also should be renamed). For example, GetCloudEventSource and its corresponding function in both run_types.go (This examples are from Knative impl though).

I think this would be a fair alternative to imitate SelfLink before the removal comes.

Renamed candidate: Runnable, Emittable, Reportable... I do not have a good sense for naming stuff😅

I would like to pick up this as my first contribution to Tekton.

Or, would you plan for another semantics for ID and Source?
It would be my pleasure to let me know if you have a better idea.

Thanks,

@tom24d
Copy link

tom24d commented Jul 1, 2020

I have missed #2684 and its context, and turns out it it not really good first issue. Never mind😅

@afrittoli
Copy link
Member Author

@tom24d sorry, I'm a bit late on replying to this. #2684 was an idea, but it would need to go with a TEP probably before going ahead, it's not something we've agreed upon yet.

I like the idea of using an interface for solving the source issue, and I don't like the existing name of the interface I created :D so I wouldn't mind changing it.

Emittable sounds like an interesting idea :)
How would you build the source for taskrun and pipeline runs? I see two alternatives, one a specific tekton installation as the source and the other (like today) would be to have the taskrun or pipeline run being the source (so the producer of the event).

@tom24d
Copy link

tom24d commented Jul 2, 2020

@afrittoli
No problem! Thanks for the reply.

one a specific tekton installation as the source

Is it implying we want to identify the sources for the case of Tektons in multi-cluster? That sounds needed for me, but I assume we want to avoid having DefaultCloudEventSourceForTaskRun, DefaultCloudEventSourceForPipelineRun,.. etc. I am wondering if we have new one field such as DefaultClusterName(=https://tekton.dev as default) in the ConfigMap, then composing the source by:

<DefaultClusterName>/pkg/apis/<version>/namespace/<ns>/**run/<name>

How does this sound?
Maybe I have seen a comment that Tekton folks are a bit worried about the increase of settings, but I do not remember where :(
Would you think it is worth?

@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot
Copy link
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 15, 2020
@tekton-robot
Copy link
Collaborator

@tekton-robot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vdemeester
Copy link
Member

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

@tekton-robot
Copy link
Collaborator

@vdemeester: Reopened this issue.

In response to this:

/remove-lifecycle rotten
/remove-lifecycle stale
/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tekton-robot tekton-robot reopened this Aug 17, 2020
@tekton-robot tekton-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 17, 2020
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 15, 2020
@tekton-robot
Copy link
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 15, 2020
@jmcshane
Copy link
Contributor

Just ran into this on a Kubernetes 1.20.2 cluster and ran into this issue. It is causing cloudevent sending to fail with this error message:

{"level":"warn","ts":"2021-02-13T21:03:45.224Z","logger":"tekton.github.com-tektoncd-pipeline-pkg-reconciler-pipelinerun.Reconciler","caller":"cloudevent/cloud_event_controller.go:145","msg":"Failed to send cloudevent: source: REQUIRED\n","commit":"95144d9","knative.dev/traceid":"d2d67f73-3676-417b-b631-c3b6a7e24cfb","knative.dev/key":"default/pipelinerun-7aed775c-ef40-462e-9ec5-5c6c4c45bf67","logging.googleapis.com/labels":{},"logging.googleapis.com/sourceLocation":{"file":"github.com/tektoncd/pipeline/pkg/reconciler/events/cloudevent/cloud_event_controller.go","line":"145","function":"github.com/tektoncd/pipeline/pkg/reconciler/events/cloudevent.SendCloudEventWithRetries.func1"}}

What this means is that the Kubernetes API is returning an empty string for this call:

https://github.com/tektoncd/pipeline/blob/master/pkg/reconciler/events/cloudevent/cloudevent.go#L98

There needs to be some sort of patch for this otherwise it fails very silently. I can submit a PR to change the source to something that will resolve, but its going to be a breaking change in some manner.

jmcshane pushed a commit to jmcshane/pipeline that referenced this issue Feb 14, 2021
SelfLink is deprecated and is causing an error on Kubernetes v1.20
clusters as outlined in the issue. This is a short term fix that
would set the source field to a value that should match the current
source URI without the value of the auto-populated selfLink field.

Another field could be used for the source field without issue, but
could cause concerns about backwards compatibility.
@vdemeester
Copy link
Member

/remove-lifecycle rotten

@tekton-robot tekton-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 24, 2021
tekton-robot pushed a commit that referenced this issue Feb 24, 2021
SelfLink is deprecated and is causing an error on Kubernetes v1.20
clusters as outlined in the issue. This is a short term fix that
would set the source field to a value that should match the current
source URI without the value of the auto-populated selfLink field.

Another field could be used for the source field without issue, but
could cause concerns about backwards compatibility.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants