Emit structured log events to ApplicationInsights (#308)
This adds some structured logging to Application Insights for various
events ("thing" Created / Finished for Workflow Run, Job, Job Partition,
and Task).
Tom Augspurger authored Jul 11, 2024
1 parent 419b963 commit a1d85b4
Showing 10 changed files with 291 additions and 2 deletions.
3 changes: 3 additions & 0 deletions cluster/dev-values.yaml
@@ -86,6 +86,9 @@ pctasks:
keyvault:
enabled: false

applicationinsights:
enabled: false

pcdev:
services:
pctasks:
4 changes: 4 additions & 0 deletions deployment/helm/deploy-values.template.yaml
@@ -90,6 +90,10 @@ pctasks:
enabled: true
url: "{{ tf.keyvault_url }}"

applicationinsights:
enabled: true
connection_string: "{{ tf.applicationinsights_connection_string }}"

pcingress:
services:
pctasks:
@@ -199,6 +199,11 @@ spec:
value: "{{ .Values.pctasks.run.keyvault.url }}"
{{- end }}

{{- if .Values.pctasks.run.applicationinsights.enabled }}
- name: PCTASKS_RUN__APPLICATIONINSIGHTS_CONNECTION_STRING
value: "{{ .Values.pctasks.run.applicationinsights.connection_string }}"
{{- end }}

livenessProbe:
httpGet:
path: "/_mgmt/ping"
4 changes: 4 additions & 0 deletions deployment/helm/published/pctasks-server/values.yaml
@@ -174,3 +174,7 @@ pctasks:
sp_tenant_id: ""
sp_client_id: ""
sp_client_secret: ""

applicationinsights:
enabled: false
connection_string: ""
4 changes: 4 additions & 0 deletions deployment/terraform/resources/output.tf
@@ -160,6 +160,10 @@ output "instrumentation_key" {
value = azurerm_application_insights.pctasks.instrumentation_key
}

output "applicationinsights_connection_string" {
value = azurerm_application_insights.pctasks.connection_string
}

## PCTasks Server

output "argo_wf_node_group_name" {
35 changes: 35 additions & 0 deletions docs/getting_started/telemetry.md
@@ -0,0 +1,35 @@
# Telemetry

## Structured Logs

The pctasks executor generates logs for the following events:

1. `WorkflowCreated`
2. `WorkflowFinished`
3. `JobCreated`
4. `JobFinished`
5. `JobPartitionCreated`
6. `JobPartitionFinished`
7. `TaskCreated`
8. `TaskFinished`

In general, a record is emitted when something is created or finished, at the Workflow, Job, JobPartition, and Task levels.

Depending on the level (Workflow, Job, JobPartition, Task), each record contains the following fields:

| Field | Record Levels | Description |
| ----------- | ----------------------- | --------------------------------------------------------------------- |
| type | All | The event type, from the list above |
| workflowId | All | The ID of the workflow, from the workflow definition |
| datasetId | All | The ID of the dataset, from the workflow definition |
| runId | All | The ID of the workflow run, generated by pctasks |
| recordLevel | All | The level (Workflow, Job, JobPartition, Task) this record belongs to. |
| jobId | Job, JobPartition, Task | The ID of the job, from the workflow definition |
| partitionId | JobPartition, Task | The ID of the partition, from the workflow definition and pctasks |
| taskId | Task | The ID of the task, from the workflow definition and pctasks |

Depending on the record, additional fields will be included:

* `status`: Present for "Finished" events, indicating success or failure of that operation.
* `errors`: Present for `JobFinished` and `TaskFinished` events when
`status="failed"`, containing a list of errors.
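
The field rules above can be sketched in Python. This is an illustration only, not the pctasks implementation: `build_record` and `LEVEL_FIELDS` are hypothetical names, and the exact values emitted for `recordLevel` and `status` are assumed from the table and the `status="failed"` example.

```python
from typing import Any, Dict, List, Optional

# ID fields added at each record level, following the table above.
# (Hypothetical helper; names are not taken from pctasks.)
LEVEL_FIELDS: Dict[str, List[str]] = {
    "Workflow": [],
    "Job": ["jobId"],
    "JobPartition": ["jobId", "partitionId"],
    "Task": ["jobId", "partitionId", "taskId"],
}


def build_record(
    level: str,
    action: str,
    ids: Dict[str, str],
    status: Optional[str] = None,
    errors: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble one structured log record for a Created/Finished event."""
    if level not in LEVEL_FIELDS:
        raise ValueError(f"unknown record level: {level}")
    if action not in ("Created", "Finished"):
        raise ValueError(f"unknown action: {action}")

    # Fields present on every record.
    record: Dict[str, Any] = {
        "type": f"{level}{action}",
        "recordLevel": level,
        "workflowId": ids["workflowId"],
        "datasetId": ids["datasetId"],
        "runId": ids["runId"],
    }
    # Level-specific ID fields.
    for name in LEVEL_FIELDS[level]:
        record[name] = ids[name]

    # "Finished" events carry a status; failed Job/Task events carry errors.
    if action == "Finished":
        record["status"] = status
        if status == "failed" and level in ("Job", "Task") and errors:
            record["errors"] = list(errors)
    return record
```

For example, `build_record("Task", "Finished", ids, status="failed", errors=["boom"])` yields a record with `type` set to `TaskFinished`, all three of `jobId`, `partitionId`, and `taskId`, and an `errors` list.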
16 changes: 16 additions & 0 deletions pctasks/run/pctasks/run/argo/client.py
@@ -197,6 +197,14 @@ def submit_workflow(
else:
kwargs = {}

if run_settings.applicationinsights_connection_string:
env.append(
EnvVar(
name="APPLICATIONINSIGHTS_CONNECTION_STRING",
value=run_settings.applicationinsights_connection_string,
)
)

# Enable local secrets for development environment
if run_settings.local_secrets:
for env_var in [
@@ -317,6 +325,14 @@ def submit_task(
else:
kwargs = {}

if run_settings.applicationinsights_connection_string:
env.append(
EnvVar(
name="APPLICATIONINSIGHTS_CONNECTION_STRING",
value=run_settings.applicationinsights_connection_string,
)
)

templates = [
IoArgoprojWorkflowV1alpha1Template(
name="run-workflow",
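
Both `client.py` hunks apply the same rule: the `APPLICATIONINSIGHTS_CONNECTION_STRING` variable is passed to the pod only when a connection string is configured. A minimal standalone sketch of that rule follows. The `EnvVar` dataclass is a stand-in for the model used in `client.py`, and `add_app_insights_env` is a hypothetical helper that does not appear in this commit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EnvVar:
    """Stand-in for the EnvVar model used in client.py (assumption)."""

    name: str
    value: str


def add_app_insights_env(
    env: List[EnvVar], connection_string: Optional[str]
) -> List[EnvVar]:
    """Append the Application Insights variable only when a value is set."""
    if connection_string:
        env.append(
            EnvVar(
                name="APPLICATIONINSIGHTS_CONNECTION_STRING",
                value=connection_string,
            )
        )
    return env
```

With `None` or an empty string the list is returned unchanged, which matches the Helm default of `enabled: false` with an empty `connection_string`.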
2 changes: 2 additions & 0 deletions pctasks/run/pctasks/run/settings.py
@@ -121,6 +121,8 @@ def section_name(cls) -> str:
# Type of workflow runner to use.
workflow_runner_type: WorkflowRunnerType = WorkflowRunnerType.ARGO

applicationinsights_connection_string: Optional[str] = None

@property
def batch_settings(self) -> BatchSettings:
if not (self.batch_url and self.batch_key and self.batch_default_pool_id):
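
The Helm template sets `PCTASKS_RUN__APPLICATIONINSIGHTS_CONNECTION_STRING`, while the settings class gains an `applicationinsights_connection_string` field. The sketch below shows how such a name could map to a section and field. It assumes a `PCTASKS_` prefix and a `__` separator, as the variable name suggests; `lookup_setting` is a hypothetical helper, not how pctasks actually loads settings.

```python
import os
from typing import Mapping, Optional

# Assumed naming convention: PCTASKS_<SECTION>__<FIELD>, all upper case.
PREFIX = "PCTASKS_"
SEPARATOR = "__"


def lookup_setting(
    section: str, field: str, environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the value of a section/field setting from the environment."""
    key = f"{PREFIX}{section.upper()}{SEPARATOR}{field.upper()}"
    return environ.get(key)
```

Under this assumption, `lookup_setting("run", "applicationinsights_connection_string")` reads the variable set by the Helm template and returns `None` when it is absent, consistent with the field's `Optional[str] = None` default.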