Emit events for all TaskRun lifecycle events
Start emitting events for additional TaskRun lifecycle events:
- taskrun started
- taskrun timeout

Introduce pre-run and post-run functions that are invoked
asynchronously when the taskrun starts and completes, to emit
events.

These same functions shall be used to trigger any other async
behaviour on start/stop of taskruns.

Add documentation on events.

Fixes tektoncd#2328
Work towards tektoncd#2082
afrittoli committed Apr 16, 2020
1 parent 94354a2 commit 6859fb8
Showing 8 changed files with 95 additions and 110 deletions.
39 changes: 39 additions & 0 deletions docs/events.md
@@ -0,0 +1,39 @@
<!--
---
linkTitle: "Events"
weight: 2
---
-->
# Events

Tekton runtime resources, specifically `TaskRuns` and `PipelineRuns`,
emit events when they are executed, so that users can monitor their lifecycle
and react to it. Tekton emits [Kubernetes events](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#event-v1-core), which can be retrieved from the resource via
`kubectl describe [resource]`.

No events are emitted for `Conditions` today.

## TaskRuns

`TaskRun` events are generated for the following `Reasons`:
- `Started`: this is triggered the first time the `TaskRun` is picked by the
  reconciler from its work queue, so it only happens if webhook validation was
  successful. Note that this event does not imply that a step started executing,
  as several conditions must be met first:
  - task and bound resource validation must be successful
  - attached conditions must run successfully
  - the `Pod` associated with the `TaskRun` must be successfully scheduled
- `Succeeded`: this is triggered once all steps in the `TaskRun` are executed
  successfully, including post-steps injected by Tekton.
- `Failed`: this is triggered if the `TaskRun` is completed, but not successfully.
  Causes of failure may be: one of the steps failed, the `TaskRun` was cancelled or
  the `TaskRun` timed out.

## PipelineRuns

`PipelineRun` events are generated for the following `Reasons`:
- `Succeeded`: this is triggered once all `Tasks` reachable via the DAG are
  executed successfully.
- `Failed`: this is triggered if the `PipelineRun` is completed, but not
  successfully. Causes of failure may be: one of the `Tasks` failed or the
  `PipelineRun` was cancelled.
1 change: 1 addition & 0 deletions docs/pipelineruns.md
@@ -29,6 +29,7 @@ Creation of a `PipelineRun` will trigger the creation of
- [Workspaces](#workspaces)
- [Cancelling a PipelineRun](#cancelling-a-pipelinerun)
- [LimitRanges](#limitranges)
- [Events](events.md#pipelineruns)

## Syntax

6 changes: 3 additions & 3 deletions docs/taskruns.md
@@ -30,14 +30,14 @@ A `TaskRun` runs until all `steps` have completed or until a failure occurs.
- [Steps](#steps)
- [Results](#results)
- [Cancelling a TaskRun](#cancelling-a-taskrun)
- [Sidecars](#sidecars)
- [LimitRanges](#limitranges)
- [Events](events.md#taskruns)
- [Examples](#examples)
- [Example TaskRun](#example-taskrun)
- [Example with embedded specs](#example-with-embedded-specs)
- [Example Task Reuse](#example-task-reuse)
- [Using a `ServiceAccount`](#using-a-serviceaccount)
- [Sidecars](#sidecars)
- [LimitRanges](#limitranges)

---

## Syntax
4 changes: 4 additions & 0 deletions pkg/reconciler/event.go
@@ -31,6 +31,10 @@ func EmitEvent(c record.EventRecorder, beforeCondition *apis.Condition, afterCon
			c.Event(object, corev1.EventTypeNormal, "Succeeded", afterCondition.Message)
		} else if afterCondition.Status == corev1.ConditionFalse {
			c.Event(object, corev1.EventTypeWarning, "Failed", afterCondition.Message)
		} else {
			if beforeCondition == nil {
				c.Event(object, corev1.EventTypeNormal, "Started", "")
			}
		}
	}
}
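
The branch added in this hunk can be read in isolation as a small decision table. The following is a minimal, self-contained Go sketch of that logic, not the Tekton implementation: `condition`, `event`, and `eventFor` are hypothetical stand-ins for `apis.Condition`, the `record.EventRecorder` call, and `EmitEvent`, and the outer guard (skip when `after` is nil or unchanged) is an assumption, since the hunk only shows the closing braces of whatever encloses the branch.

```go
package main

import "fmt"

// status mirrors the three values a Kubernetes condition status can take.
type status string

const (
	statusTrue    status = "True"
	statusFalse   status = "False"
	statusUnknown status = "Unknown"
)

// condition is a hypothetical stand-in for apis.Condition.
type condition struct {
	Status  status
	Message string
}

// event is a hypothetical stand-in for what the recorder would receive.
type event struct {
	Type    string // "Normal" or "Warning"
	Reason  string
	Message string
}

// eventFor returns the event to emit for a condition transition,
// or nil when the transition should not produce an event.
func eventFor(before, after *condition) *event {
	// Assumed outer guard: nothing to emit without a new condition,
	// or when the condition pointer did not change.
	if after == nil || before == after {
		return nil
	}
	switch after.Status {
	case statusTrue:
		return &event{Type: "Normal", Reason: "Succeeded", Message: after.Message}
	case statusFalse:
		return &event{Type: "Warning", Reason: "Failed", Message: after.Message}
	default:
		// Unknown status: only the first observation counts as a start.
		if before == nil {
			return &event{Type: "Normal", Reason: "Started"}
		}
		return nil
	}
}

func main() {
	e := eventFor(nil, &condition{Status: statusUnknown})
	fmt.Println(e.Type, e.Reason)
}
```

Under these assumptions, a transition from `nil` to `Unknown` yields a `Started` event, which is the case the `nil to unknown` entry in `event_test.go` exercises.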
8 changes: 8 additions & 0 deletions pkg/reconciler/event_test.go
@@ -80,6 +80,14 @@ func TestEmitEvent(t *testing.T) {
			Status: corev1.ConditionTrue,
		},
		expectEvent: true,
	}, {
		name:   "nil to unknown",
		before: nil,
		after: &apis.Condition{
			Type:   apis.ConditionSucceeded,
			Status: corev1.ConditionUnknown,
		},
		expectEvent: true,
	}}

	for _, ts := range testcases {
77 changes: 0 additions & 77 deletions pkg/reconciler/taskrun/cancel.go

This file was deleted.

@@ -66,8 +66,7 @@ func cloudEventDeliveryFromTargets(targets []string) []v1alpha1.CloudEventDelive
}

// SendCloudEvents is used by the TaskRun controller to send cloud events once
// the TaskRun is complete. `tr` is used to obtain the list of targets but also
// to construct the body of the
// the TaskRun is complete. `tr` is used to obtain the list of targets
func SendCloudEvents(tr *v1alpha1.TaskRun, ceclient CEClient, logger *zap.SugaredLogger) error {
	logger = logger.With(zap.String("taskrun", tr.Name))

67 changes: 39 additions & 28 deletions pkg/reconciler/taskrun/taskrun.go
@@ -106,11 +106,15 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {

	// If the TaskRun is just starting, this will also set the starttime,
	// from which the timeout will immediately begin counting down.
	tr.Status.InitializeConditions()
	// In case node time was not synchronized, when controller has been scheduled to other nodes.
	if tr.Status.StartTime.Sub(tr.CreationTimestamp.Time) < 0 {
		c.Logger.Warnf("TaskRun %s createTimestamp %s is after the taskRun started %s", tr.GetRunKey(), tr.CreationTimestamp, tr.Status.StartTime)
		tr.Status.StartTime = &tr.CreationTimestamp
	if !tr.HasStarted() {
		tr.Status.InitializeConditions()
		// In case node time was not synchronized, when controller has been scheduled to other nodes.
		if tr.Status.StartTime.Sub(tr.CreationTimestamp.Time) < 0 {
			c.Logger.Warnf("TaskRun %s createTimestamp %s is after the taskRun started %s", tr.GetRunKey(), tr.CreationTimestamp, tr.Status.StartTime)
			tr.Status.StartTime = &tr.CreationTimestamp
		}
		// Run async startup hooks
		go c.preRunAsyncHook(ctx, tr)
	}

	// If the TaskRun is complete, run some post run fixtures when applicable
@@ -164,36 +168,20 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
	// If the TaskRun is cancelled, kill resources and update status
	if tr.IsCancelled() {
		before := tr.Status.GetCondition(apis.ConditionSucceeded)
<<<<<<< HEAD
		message := fmt.Sprintf("TaskRun %q was cancelled", tr.Name)
		err := c.failTaskRun(tr, v1beta1.TaskRunReasonCancelled, message)
		after := tr.Status.GetCondition(apis.ConditionSucceeded)
		reconciler.EmitEvent(c.Recorder, before, after, tr)
		go c.postRunAsyncHook(ctx, tr, before)
		return multierror.Append(err, c.updateStatusLabelsAndAnnotations(tr, original)).ErrorOrNil()
=======
		err := cancelTaskRun(tr, c.KubeClientSet, c.Logger)
		after := tr.Status.GetCondition(apis.ConditionSucceeded)
		reconciler.EmitEvent(c.Recorder, before, after, tr)
		return err
>>>>>>> Consolidate cancel and timeout logic
	}

	// Check if the TaskRun has timed out; if it is, this will set its status
	// accordingly.
	if tr.HasTimedOut() {
		before := tr.Status.GetCondition(apis.ConditionSucceeded)
<<<<<<< HEAD
		message := fmt.Sprintf("TaskRun %q failed to finish within %q", tr.Name, tr.GetTimeout())
		err := c.failTaskRun(tr, podconvert.ReasonTimedOut, message)
		after := tr.Status.GetCondition(apis.ConditionSucceeded)
		reconciler.EmitEvent(c.Recorder, before, after, tr)
		go c.postRunAsyncHook(ctx, tr, before)
		return multierror.Append(err, c.updateStatusLabelsAndAnnotations(tr, original)).ErrorOrNil()
=======
		err := timeoutTaskRun(tr, c.KubeClientSet, c.Logger)
		after := tr.Status.GetCondition(apis.ConditionSucceeded)
		reconciler.EmitEvent(c.Recorder, before, after, tr)
		return err
>>>>>>> Consolidate cancel and timeout logic
	}

	// Reconcile this copy of the task run and then write back any status
@@ -205,15 +193,31 @@ func (c *Reconciler) Reconcile(ctx context.Context, key string) error {
	return multierror.Append(merr, c.updateStatusLabelsAndAnnotations(tr, original)).ErrorOrNil()
}

// Run any async logic that may be required at start-up time. This method is used
// to emit events, notifications or any other async operation
func (c *Reconciler) preRunAsyncHook(ctx context.Context, tr *v1alpha1.TaskRun) {
	c.Logger.Infof("preRunAsyncHook: %s", tr.Name)

	// Emit event
	afterCondition := tr.Status.GetCondition(apis.ConditionSucceeded)
	reconciler.EmitEvent(c.Recorder, nil, afterCondition, tr)
}

// Run any async logic that may be required once the tr is successfully reconciled
// This method is used to emit events, notifications or any other async operation
func (c *Reconciler) postRunAsyncHook(ctx context.Context, tr *v1alpha1.TaskRun, beforeCondition *apis.Condition) {
	c.Logger.Infof("postRunAsyncHook: %s", tr.Name)

	// Emit event
	afterCondition := tr.Status.GetCondition(apis.ConditionSucceeded)
	reconciler.EmitEvent(c.Recorder, beforeCondition, afterCondition, tr)
}

func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun) error {
	// We may be reading a version of the object that was stored at an older version
	// and may not have had all of the assumed default specified.
	tr.SetDefaults(contexts.WithUpgradeViaDefaulting(ctx))

	if tr.Spec.Timeout == nil {
		tr.Spec.Timeout = &metav1.Duration{Duration: config.DefaultTimeoutMinutes * time.Minute}
	}

	if err := tr.ConvertTo(ctx, &v1beta1.TaskRun{}); err != nil {
		if ce, ok := err.(*v1beta1.CannotConvertError); ok {
			tr.Status.MarkResourceNotConvertible(ce)
@@ -366,7 +370,14 @@ func (c *Reconciler) reconcile(ctx context.Context, tr *v1alpha1.TaskRun) error

	after := tr.Status.GetCondition(apis.ConditionSucceeded)

	reconciler.EmitEvent(c.Recorder, before, after, tr)
	// If after is different from before and status is not Unknown, the taskrun
	// has completed its work - except for post-run tasks like emitting events,
	// recording metrics, sending cloud events.
	// Once tr.isDone becomes true, even when this key is queued, `reconcile`
	// won't be invoked so we won't pass through here again
	if tr.IsDone() && after != before {
		go c.postRunAsyncHook(ctx, tr, before)
	}
	c.Logger.Infof("Successfully reconciled taskrun %s/%s with status: %#v", tr.Name, tr.Namespace, after)

	return nil
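
The pre-run and post-run hooks introduced above follow a common pattern: fire a goroutine exactly once per lifecycle transition, and rely on the reconciler's own state checks to avoid firing it again. The sketch below illustrates that pattern with the standard library only. Everything in it is hypothetical: `run`, `reconciler`, `emit`, and `async` are simplified stand-ins, and the `sync.WaitGroup` is an addition of this sketch (the hunks above launch the hooks with a bare `go` statement) so that callers and tests can wait for the hooks to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// run is a hypothetical stand-in for a TaskRun, reduced to lifecycle flags.
type run struct {
	Name    string
	started bool
	done    bool
}

// reconciler is a hypothetical, stripped-down controller.
type reconciler struct {
	wg     sync.WaitGroup // tracks in-flight hooks (addition for this sketch)
	mu     sync.Mutex     // protects events, which hooks append to concurrently
	events []string
}

// emit records an event; it stands in for a call to an event recorder.
func (r *reconciler) emit(reason, name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.events = append(r.events, reason+" "+name)
}

// async launches a hook in its own goroutine and tracks its completion.
func (r *reconciler) async(hook func()) {
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		hook()
	}()
}

// reconcile performs one pass over the run. finished reports whether the
// underlying work completed during this pass.
func (r *reconciler) reconcile(tr *run, finished bool) {
	if tr.done {
		return // completed runs are skipped, so the post-run hook cannot fire twice
	}
	name := tr.Name // copy what the hooks need before launching them
	if !tr.started {
		tr.started = true
		r.async(func() { r.emit("Started", name) })
	}
	if finished {
		tr.done = true
		r.async(func() { r.emit("Completed", name) })
	}
}

func main() {
	r := &reconciler{}
	tr := &run{Name: "build"}
	r.reconcile(tr, false) // first pass: pre-run hook fires
	r.reconcile(tr, true)  // work completes: post-run hook fires
	r.reconcile(tr, true)  // already done: no hook fires
	r.wg.Wait()
	fmt.Println(len(r.events))
}
```

One design choice in the sketch is to copy the run's name before starting each goroutine, so the hooks never read state that a later reconcile pass might be mutating. The order in which the two events are recorded is not guaranteed, since the hooks run concurrently.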