Support gracefully stopping traces on-demand #2828

Merged
merged 48 commits into from
Nov 8, 2022
Commits
b054923
Initial refactor
schmittjoseph Oct 25, 2022
ad8b7c1
Shim egress operations
schmittjoseph Oct 25, 2022
42f7db8
Unify egress operations
schmittjoseph Oct 25, 2022
8ba2c1b
Support intake of non-actionable operations
schmittjoseph Oct 25, 2022
b211782
Remove extra interpolation
schmittjoseph Oct 25, 2022
eafdeba
Support graceful stops
schmittjoseph Oct 25, 2022
9d67416
Track dump operations
schmittjoseph Oct 25, 2022
5f19392
Add stopping state
schmittjoseph Oct 25, 2022
7219aa5
Support gracefully stopping a trace
schmittjoseph Oct 25, 2022
3e7fbde
Mirror cancellation state
schmittjoseph Oct 25, 2022
83771f4
Handle when the operation service is starved of cpu
schmittjoseph Oct 25, 2022
d97b1b0
Simplify cancellation
schmittjoseph Oct 26, 2022
e39eb3e
Remove extra state
schmittjoseph Oct 26, 2022
76564f9
Fix egress cancel endpoint
schmittjoseph Oct 26, 2022
8040823
Report stoppable status
schmittjoseph Oct 26, 2022
24e29b6
code cleanup
schmittjoseph Oct 26, 2022
df6ef91
Add initial tests
schmittjoseph Oct 27, 2022
aa00608
Check new fields
schmittjoseph Oct 27, 2022
4f2ba09
Expand test coverage
schmittjoseph Oct 27, 2022
32c015a
Add test case
schmittjoseph Nov 1, 2022
932af32
Add comment
schmittjoseph Nov 1, 2022
5cb6345
Update comments
schmittjoseph Nov 1, 2022
720059b
Update openapi.json
schmittjoseph Nov 2, 2022
8b71b7d
Initial doc changes
schmittjoseph Nov 2, 2022
2aa86d3
Update docs
schmittjoseph Nov 2, 2022
af340ef
Update docs
schmittjoseph Nov 2, 2022
c06f5f9
Code cleanup
schmittjoseph Nov 2, 2022
b4bd955
Merge branch 'stop-trace-on-demand' of https://github.com/schmittjose…
schmittjoseph Nov 2, 2022
db53f7e
Update documentation/api/operations-stop.md
schmittjoseph Nov 2, 2022
1390850
Address PR feedback
schmittjoseph Nov 2, 2022
5bb6ce1
Address PR feedback
schmittjoseph Nov 3, 2022
cfde64a
Trigger PR update
schmittjoseph Nov 3, 2022
08e8ffd
Remove old test code
schmittjoseph Nov 3, 2022
53837b4
Stash
schmittjoseph Nov 7, 2022
a6eafe2
Merge remote-tracking branch 'origin/main' into stop-trace-on-demand
schmittjoseph Nov 7, 2022
62ddca2
Stash
schmittjoseph Nov 7, 2022
88e9d4c
Remove remaining JSFIX
schmittjoseph Nov 7, 2022
93f44c3
Clean leftover code
schmittjoseph Nov 7, 2022
f0cdb29
Fix merge fomatting issues
schmittjoseph Nov 7, 2022
e321ec0
Apply suggestions from code review
schmittjoseph Nov 7, 2022
890c7cb
Update src/Tools/dotnet-monitor/Trace/AbstractTraceOperation.cs
schmittjoseph Nov 7, 2022
8c00c94
Use egress cancellation token
schmittjoseph Nov 8, 2022
cc14f06
Merge branch 'stop-trace-on-demand' of https://github.com/schmittjose…
schmittjoseph Nov 8, 2022
07f6ede
Adress PR feedback
schmittjoseph Nov 8, 2022
52c07be
Version docs
schmittjoseph Nov 8, 2022
9e4eab8
Fix tests
schmittjoseph Nov 8, 2022
9d353e6
Code cleanup
schmittjoseph Nov 8, 2022
ca51f9c
Cleanup
schmittjoseph Nov 8, 2022
11 changes: 10 additions & 1 deletion documentation/api/definitions.md
@@ -267,6 +267,7 @@ Status of the egress operation.
|---|---|
| `Running` | Operation has been started. This is the initial state. |
| `Cancelled` | The operation was cancelled by the user. |
| `Stopping` | The operation is in the process of stopping at the request of the user. |
| `Succeeded` | Egress operation has been successful. Querying the operation will return the location of the egressed artifact. |
| `Failed` | Egress operation failed. Querying the operation will return detailed error information. |
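
A client that polls an operation needs to know when to stop polling. The sketch below models the states from the table above as a Python enum. Treating `Cancelled`, `Succeeded`, and `Failed` as terminal is an assumption drawn from the descriptions, and the names `OperationState`, `TERMINAL_STATES`, and `is_terminal` are hypothetical helpers, not part of dotnet-monitor.

```python
from enum import Enum


class OperationState(str, Enum):
    """Operation states as listed in the OperationState table."""

    RUNNING = "Running"
    CANCELLED = "Cancelled"
    STOPPING = "Stopping"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


# Assumption: an operation does not leave these states once it reaches them.
TERMINAL_STATES = frozenset(
    {OperationState.CANCELLED, OperationState.SUCCEEDED, OperationState.FAILED}
)


def is_terminal(status: str) -> bool:
    """Return True if the status string names a state with no further transitions."""
    # OperationState(status) raises ValueError for an unknown status string.
    return OperationState(status) in TERMINAL_STATES
```

With this helper, a polling loop would keep querying while `is_terminal(status)` is false, which includes the new `Stopping` state.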

@@ -281,6 +282,8 @@ Detailed information about an operation.
| `operationId` | guid | Unique identifier for the operation. |
| `createdDateTime` | datetime string | UTC DateTime string of when the operation was created. |
| `status` | [OperationState](#operationstate) | The current status of operation. |
| `egressProviderName` | string | (8.0+) The name of the egress provider that the artifact is being sent to. This will be null if the artifact is being sent directly back to the user from an HTTP request. |
| `isStoppable` | bool | (8.0+) Whether this operation can be gracefully stopped using [Stop Operation](operations-stop.md). Not all operations support being stopped. |

### Example

@@ -290,7 +293,9 @@ Detailed information about an operation.
"error": null,
"operationId": "67f07e40-5cca-4709-9062-26302c484f18",
"createdDateTime": "2021-07-21T06:21:15.315861Z",
"status": "Succeeded"
"status": "Succeeded",
"egressProviderName": "monitorBlob",
"isStoppable": false
}
```

@@ -303,6 +308,8 @@ Summary state of an operation.
| `operationId` | guid | Unique identifier for the operation. |
| `createdDateTime` | datetime string | UTC DateTime string of when the operation was created. |
| `status` | [OperationState](#operationstate) | The current status of operation. |
| `egressProviderName` | string | (8.0+) The name of the egress provider that the artifact is being sent to. This will be null if the artifact is being sent directly back to the user from an HTTP request. |
| `isStoppable` | bool | (8.0+) Whether this operation can be gracefully stopped using [Stop Operation](operations-stop.md). Not all operations support being stopped. |
| `process` | [OperationProcessInfo](#operationprocessinfo) | (6.3+) The process on which the operation is performed. |

### Example
@@ -312,6 +319,8 @@ Summary state of an operation.
"operationId": "67f07e40-5cca-4709-9062-26302c484f18",
"createdDateTime": "2021-07-21T06:21:15.315861Z",
"status": "Succeeded",
"egressProviderName": null,
"isStoppable": false,
"process": {
"pid": 21632,
"uid": "cd4da319-fa9e-4987-ac4e-e57b2aac248b",
7 changes: 5 additions & 2 deletions documentation/api/dump.md
@@ -45,12 +45,14 @@ Allowed schemes:

| Name | Type | Description | Content Type |
|---|---|---|---|
| 200 OK | stream | A managed dump of the process. | `application/octet-stream` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 200 OK | stream | A managed dump of the process when no egress provider is specified. | `application/octet-stream` |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many dump requests at this time. Try to request a dump at a later time. | |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -77,6 +79,7 @@ The managed dump containing all memory of the process, chunk encoded, is returne
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
```
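
The `Location` header in the sample response above ends with the operation ID, which is what the operations routes expect. As a minimal sketch, assuming the header always ends in `/operations/{operationId}` as in these samples, a client could extract the ID like this. The function name `operation_id_from_location` is hypothetical.

```python
import uuid


def operation_id_from_location(location: str) -> str:
    """Extract the operation GUID from a Location header value."""
    # Drop any query string and trailing slash, then keep the last path segment.
    path = location.split("?", 1)[0].rstrip("/")
    candidate = path.rsplit("/", 1)[-1]
    # uuid.UUID raises ValueError if the segment is not a well-formed GUID.
    return str(uuid.UUID(candidate))
```

Plain string splitting is used rather than a URL parser because the sample header value has no scheme.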

## Supported Runtimes
7 changes: 5 additions & 2 deletions documentation/api/gcdump.md
@@ -48,12 +48,14 @@ Allowed schemes:

| Name | Type | Description | Content Type |
|---|---|---|---|
| 200 OK | stream | A GC dump of the process. | `application/octet-stream` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 200 OK | stream | A GC dump of the process when no egress provider is specified. | `application/octet-stream` |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many GC dump requests at this time. Try to request a GC dump at a later time. | `application/problem+json` |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -80,6 +82,7 @@ The GC dump, chunk encoded, is returned as the response body.
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Transfer-Encoding: chunked
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
```

## Supported Runtimes
5 changes: 4 additions & 1 deletion documentation/api/livemetrics-custom.md
@@ -50,11 +50,13 @@ The expected content type is `application/json`.
| Name | Type | Description | Content Type |
|---|---|---|---|
| 200 OK | [Metric](./definitions.md#metric) | The metrics from the process formatted as json sequence. Each JSON object is a [metrics object](./definitions.md#metric)| `application/json-seq` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many requests at this time. Try to request metrics at a later time. | `application/problem+json` |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -83,6 +85,7 @@ Authorization: Bearer fffffffffffffffffffffffffffffffffffffffffff=
```http
HTTP/1.1 200 OK
Content-Type: application/json-seq
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
{
"timestamp": "2021-08-31T16:58:39.7514031+00:00",
5 changes: 4 additions & 1 deletion documentation/api/livemetrics-get.md
@@ -46,11 +46,13 @@ Allowed schemes:
| Name | Type | Description | Content Type |
|---|---|---|---|
| 200 OK | [Metric](./definitions.md#metric) | The metrics from the process formatted as json sequence. | `application/json-seq` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many requests at this time. Try to request metrics at a later time. | `application/problem+json` |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -66,6 +68,7 @@ Authorization: Bearer fffffffffffffffffffffffffffffffffffffffffff=
```http
HTTP/1.1 200 OK
Content-Type: application/json-seq
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
{
"timestamp": "2021-08-31T16:58:39.7514031+00:00",
5 changes: 4 additions & 1 deletion documentation/api/logs-custom.md
@@ -53,11 +53,13 @@ The expected content type is `application/json`.
|---|---|---|---|
| 200 OK | | The logs from the process formatted as [newline delimited JSON](https://github.com/ndjson/ndjson-spec). Each JSON object is a [LogEntry](definitions.md#logentry) | `application/x-ndjson` |
| 200 OK | | The logs from the process formatted as plain text, similar to the output of the JSON console formatter. | `text/plain` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many logs requests at this time. Try to request logs at a later time. | `application/problem+json` |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -98,6 +100,7 @@ The log statements logged at the Information level or higher for 1 minute is ret
```http
HTTP/1.1 200 OK
Content-Type: text/plain
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
2021-05-13 18:06:41Z info: Microsoft.AspNetCore.Hosting.Diagnostics[1]
=> RequestId:0HM8M726ENU3K:0000002B, RequestPath:/, SpanId:|4791a4a7-433aa59a9e362743., TraceId:4791a4a7-433aa59a9e362743, ParentId:
5 changes: 4 additions & 1 deletion documentation/api/logs-get.md
@@ -48,11 +48,13 @@ Allowed schemes:
|---|---|---|---|
| 200 OK | | The logs from the process formatted as [newline delimited JSON](https://github.com/ndjson/ndjson-spec). Each JSON object is a [LogEntry](definitions.md#logentry) | `application/x-ndjson` |
| 200 OK | | The logs from the process formatted as plain text, similar to the output of the JSON console formatter. | `text/plain` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many logs requests at this time. Try to request logs at a later time. | `application/problem+json` |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -78,6 +80,7 @@ The log statements logged at the Information level or higher for 1 minute is ret
```http
HTTP/1.1 200 OK
Content-Type: text/plain
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
info: Agent.RequestProcessor[3][ProcessRequest]
Processing request 353f398a-dc74-4adc-b107-ec35edd09968.
3 changes: 2 additions & 1 deletion documentation/api/operations-delete.md
@@ -3,7 +3,7 @@

# Operations - Delete

Cancel a running operation. Only valid against operations in the `Running` state. Transitions the operation to `Cancelled` state.
Cancel a running operation. Only valid against operations in the `Running` or `Stopping` state. Transitions the operation to `Cancelled` state. Cancelling an operation may result in an incomplete or unreadable artifact. To stop an operation early while still producing a valid artifact, use the [Stop Operation](operations-stop.md).
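
The rules for cancelling and stopping can be combined into one client-side check. The sketch below is illustrative only: it applies the two conditions stated in these docs (cancel is valid in the `Running` or `Stopping` state, stop is valid when `isStoppable` is `true`) to an operation object as returned by the operations routes. The function name `allowed_actions` and the action labels are hypothetical.

```python
def allowed_actions(operation: dict) -> set:
    """Return which of 'cancel' and 'stop' the docs allow for this operation."""
    actions = set()
    # Cancel (DELETE without ?stop=true) is documented as valid while the
    # operation is Running or Stopping.
    if operation.get("status") in ("Running", "Stopping"):
        actions.add("cancel")
    # Stop is documented as valid only when isStoppable is true.
    if operation.get("isStoppable") is True:
        actions.add("stop")
    return actions
```

A caller that wants a usable artifact would prefer `stop` when it is offered and fall back to `cancel` only when an incomplete artifact is acceptable.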

## HTTP Route

@@ -45,6 +45,7 @@ Authorization: Bearer fffffffffffffffffffffffffffffffffffffffffff=

```http
HTTP/1.1 200 OK
```

## Supported Runtimes

2 changes: 2 additions & 0 deletions documentation/api/operations-get.md
@@ -53,6 +53,8 @@ Content-Type: application/json
"operationId": "67f07e40-5cca-4709-9062-26302c484f18",
"createdDateTime": "2021-07-21T06:21:15.315861Z",
"status": "Succeeded",
"egressProviderName": "monitorBlob",
"isStoppable": true,
"process":
{
"pid":1,
25 changes: 22 additions & 3 deletions documentation/api/operations-list.md
@@ -65,7 +65,22 @@ Content-Type: application/json
{
"operationId": "67f07e40-5cca-4709-9062-26302c484f18",
"createdDateTime": "2021-07-21T06:21:15.315861Z",
"status": "Succeeded",
"status": "Succeeded",
"egressProviderName": "monitorBlob",
"isStoppable": false,
"process":
{
"pid":1,
"uid":"95b0202a-4ed3-44a6-98f1-767d270ec783",
"name":"dotnet-monitor-demo"
}
},
{
"operationId": "06ac07e2-f7cd-45ad-80c6-e38160bc5881",
"createdDateTime": "2021-07-21T20:22:15.315861Z",
"status": "Stopping",
"egressProviderName": null,
"isStoppable": false,
"process":
{
"pid":1,
@@ -76,7 +91,9 @@ Content-Type: application/json
{
"operationId": "26e74e52-0a16-4e84-84bb-27f904bfaf85",
"createdDateTime": "2021-07-21T23:30:22.3058272Z",
"status": "Failed",
"status": "Failed",
"egressProviderName": "monitorBlob",
"isStoppable": false,
"process":
{
"pid":11782,
@@ -105,7 +122,9 @@ Content-Type: application/json
{
"operationId": "67f07e40-5cca-4709-9062-26302c484f18",
"createdDateTime": "2021-07-21T06:21:15.315861Z",
"status": "Succeeded",
"status": "Succeeded",
"egressProviderName": "monitorBlob",
"isStoppable": false,
"process":
{
"pid":1,
58 changes: 58 additions & 0 deletions documentation/api/operations-stop.md
@@ -0,0 +1,58 @@

### Was this documentation helpful? [Share feedback](https://www.research.net/r/DGDQWXH?src=documentation%2Fapi%2Foperations-stop)

# Operations - Stop (8.0+)

Gracefully stops a running operation. Only valid against operations with the `isStoppable` property set to `true`; not all operations support being gracefully stopped. Transitions the operation to the `Succeeded` or `Failed` state, depending on whether the operation was successful.

Stopping an operation may not happen immediately; for example, stopping a trace may first collect rundown information. An operation in the `Stopping` state can still be cancelled using [Delete Operation](operations-delete.md).

## HTTP Route

```http
DELETE /operations/{operationId}?stop=true HTTP/1.1
```

## Host Address

The default host address for these routes is `https://localhost:52323`. This route is only available on the addresses configured via the `--urls` command line parameter and the `DOTNETMONITOR_URLS` environment variable.

## Authentication

Authentication is enforced for this route. See [Authentication](./../authentication.md) for further information.

Allowed schemes:
- `Bearer`
- `Negotiate` (Windows only, running as unelevated)

## Responses

| Name | Type | Description | Content Type |
|---|---|---|---|
| 202 Accepted | | The operation was successfully queued to stop. | `application/json` |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |

## Examples

### Sample Request

```http
DELETE /operations/67f07e40-5cca-4709-9062-26302c484f18?stop=true HTTP/1.1
Host: localhost:52323
Authorization: Bearer fffffffffffffffffffffffffffffffffffffffffff=
```
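
The same request can be built from a script. The following is a minimal sketch using only the Python standard library. It constructs the request object without sending it, and the function name `build_stop_request` plus the base URL and token arguments are illustrative assumptions rather than part of dotnet-monitor.

```python
import urllib.parse
import urllib.request


def build_stop_request(base_url: str, operation_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request that gracefully stops an operation."""
    # The stop route is the delete route with the stop=true query parameter.
    operation = urllib.parse.quote(operation_id)
    url = f"{base_url.rstrip('/')}/operations/{operation}?stop=true"
    return urllib.request.Request(
        url,
        method="DELETE",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. A real client would also need to handle TLS trust for the local endpoint and the 400 and 401 responses listed above.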

### Sample Response

```http
HTTP/1.1 202 Accepted
```

## Supported Runtimes

| Operating System | Runtime Version |
|---|---|
| Windows | .NET Core 3.1, .NET 5+ |
| Linux | .NET Core 3.1, .NET 5+ |
| MacOS | .NET Core 3.1, .NET 5+ |
1 change: 1 addition & 0 deletions documentation/api/operations.md
@@ -10,3 +10,4 @@ Operations are used to track long running operations in dotnet-monitor, specific
| [List Operations](operations-list.md) | Lists all the operations and their current state. |
| [Get Operation](operations-get.md) | Get detailed information about an operation. |
| [Delete Operation](operations-delete.md) | Cancels a running operation. |
| [Stop Operation](operations-stop.md) (8.0+) | Gracefully stops a running operation. |
6 changes: 5 additions & 1 deletion documentation/api/stacks.md
@@ -46,11 +46,13 @@ Allowed schemes:
|---|---|---|---|
| 200 OK | [CallStackResult](definitions.md#experimental-callstackresult-70) | Callstacks for all managed threads in the process. | `application/json` |
| 200 OK | text | Text representation of callstacks in the process. | `text/plain` |
| 202 Accepted | | When an egress provider is specified, the Location header containers the URI of the operation for querying the egress status. | |
| 202 Accepted | | When an egress provider is specified, the artifact has begun being collected. | |
| 400 Bad Request | [ValidationProblemDetails](definitions.md#validationproblemdetails) | An error occurred due to invalid input. The response body describes the specific problem(s). | `application/problem+json` |
| 401 Unauthorized | | Authentication is required to complete the request. See [Authentication](./../authentication.md) for further information. | |
| 429 Too Many Requests | | There are too many stack requests at this time. Try to request a stack at a later time. | `application/problem+json` |

> **NOTE: (8.0+)** Whether or not an egress provider is specified, if the request succeeds (response code 200 or 202), the Location header contains the URI of the operation. This URI can be used to query the status of the operation or change its state.
## Examples

### Sample Request
@@ -67,6 +69,7 @@ Accept: application/json
```http
HTTP/1.1 200 OK
Content-Type: application/json
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
{
"threadId": 30860,
@@ -103,6 +106,7 @@ Accept: text/plain
```http
HTTP/1.1 200 OK
Content-Type: text/plain
Location: localhost:52323/operations/67f07e40-5cca-4709-9062-26302c484f18
Thread: (0x68C0)
System.Private.CoreLib.dll!System.Threading.Monitor.Wait