Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add transaction_ignore_urls spec #333

Merged
merged 8 commits into from
Oct 5, 2020
25 changes: 25 additions & 0 deletions specs/agents/tracing-instrumentation-http.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,31 @@ Request and response headers, cookies, and form bodies should be sanitised (i.e.

Agents may may include additional patterns if there are common conventions specific to language frameworks.

##### `transaction_ignore_urls` configuration

Used to restrict requests to certain URLs from being instrumented.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should agent propagate data related to distributed tracing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the simplest thing to do is to treat the ignored URLs as if there's no agent at all. I expect most of the use cases to exclude health checks and static resources. For those cases, typically no downstream services are involved. Obviously, there can be exceptions to that rule.
Does any of our agents propagate the context for ignored URLs?

I'm also fine to not explicitly specify that bit. If it becomes an important question, we can follow up on it.

Copy link
Contributor

@SergeyKleyman SergeyKleyman Sep 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the use case for this feature is to completely exclude parts of the site/application from being monitored (which is what I assumed when I first heard about this feature) than why do we need the exception for errors, discussed a few comments above? I think it would be much cleaner mental model if agent doesn't do anything for ignored URLs - agent doesn't report anything, doesn't propagate distributed tracing data, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably onto something here. This is how we implemented error handling in the context of processing a request that corresponds to an ignored URL:

When the user manually captures an error we still send that error.

However, we won't catch exceptions that happen during the request processing and automatically capture an error, as we'd do for non-ignored URLs.

This might be different from how other agents handle it. If that's the case, I think it does make sense to align eventually but I'd like to handle that in a follow-up and not block the progress on this spec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So wouldn’t it be better to leave both error reporting and distributed tracing data propagation aspects unspecified?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM, unless all agents currently behave the same anyways. But seems like they don't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SergeyKleyman @felixbarny so if I understand the discussion correctly, the spec is OK as is? Or would you like a note that error handling and trace propagation remains unspecified for now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that existing implementations by the agents differ in error handling and trace propagation parts so the simplest approach at this point would be to have those aspects unspecified. I'm not sure a note is necessary - we can just not include those parts in the spec.


This property should be set to a list containing one or more strings.
When an incoming HTTP request is detected,
its request [`path`](https://tools.ietf.org/html/rfc3986#section-3.3)
will be tested against each element in this list.
For example, adding `/home/index` to this list would match and remove instrumentation from the following URLs:

```
https://www.mycoolsite.com/home/index
http://localhost/home/index
http://whatever.com/home/index?value1=123
```
beniwohli marked this conversation as resolved.
Show resolved Hide resolved

NOTE:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation for this exception?

Copy link
Contributor

@beniwohli beniwohli Sep 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has always been the case, transaction sampling/ignoring has no effect on exception tracking

All errors that are captured during a request to an ignored URL are still sent to the APM Server regardless of this setting.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean errors for ignored transactions won’t have parent_id, transaction_id, transaction, context and trace_id?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transaction_id will obviously have to be set to null or equivalent, I'm not so sure about the others. trace_id could theoretically be around from distributed tracing, but it might involve overhead to parse it and make it available for error collection, which is exactly what we want to avoid with this setting.

In the Python agent, we generate the context for exceptions separately, but that might be considered an implementation detail.


| | |
|----------------|---|
| Type | `List<`[`WildcardMatcher`](../../tests/agents/json-specs/wildcard_matcher_tests.json)`>` |
| Default | agent specific |
| Dynamic | `true` |
| Central config | `true` |

#### HTTP client spans

Expand Down