diff --git a/CHANGELOG.md b/CHANGELOG.md index 6c5b2d1a24..d23cf8cedd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,6 +15,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ([#1403](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1403)) - `opentelemetry-instrumentation-botocore` add support for `messaging.*` in the sqs extension. ([#1350](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1350)) +- `opentelemetry-instrumentation-starlette` Add support for regular expression matching and sanitization of HTTP headers. + ([#1404](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/1404)) ### Fixed diff --git a/instrumentation/opentelemetry-instrumentation-starlette/src/opentelemetry/instrumentation/starlette/__init__.py b/instrumentation/opentelemetry-instrumentation-starlette/src/opentelemetry/instrumentation/starlette/__init__.py index 65e97e9a65..cafb9f1365 100644 --- a/instrumentation/opentelemetry-instrumentation-starlette/src/opentelemetry/instrumentation/starlette/__init__.py +++ b/instrumentation/opentelemetry-instrumentation-starlette/src/opentelemetry/instrumentation/starlette/__init__.py @@ -36,8 +36,9 @@ def home(request): Exclude lists ************* -To exclude certain URLs from being tracked, set the environment variable ``OTEL_PYTHON_STARLETTE_EXCLUDED_URLS`` -(or ``OTEL_PYTHON_EXCLUDED_URLS`` as fallback) with comma delimited regexes representing which URLs to exclude. +To exclude certain URLs from tracking, set the environment variable ``OTEL_PYTHON_STARLETTE_EXCLUDED_URLS`` +(or ``OTEL_PYTHON_EXCLUDED_URLS`` to cover all instrumentations) to a string of comma delimited regexes that match the +URLs. For example, @@ -50,9 +51,14 @@ def home(request): Request/Response hooks ********************** -Utilize request/response hooks to execute custom logic to be performed before/after performing a request. The server request hook takes in a server span and ASGI -scope object for every incoming request. The client request hook is called with the internal span and an ASGI scope which is sent as a dictionary for when the method receive is called. -The client response hook is called with the internal span and an ASGI event which is sent as a dictionary for when the method send is called. +This instrumentation supports request and response hooks. These are functions that get called +right after a span is created for a request and right before the span is finished for the response. + +- The server request hook is passed a server span and ASGI scope object for every incoming request. +- The client request hook is called with the internal span and an ASGI scope when the method ``receive`` is called. +- The client response hook is called with the internal span and an ASGI event when the method ``send`` is called. + +For example, .. code-block:: python @@ -70,54 +76,93 @@ def client_response_hook(span: Span, message: dict): Capture HTTP request and response headers ***************************************** -You can configure the agent to capture predefined HTTP headers as span attributes, according to the `semantic convention `_. +You can configure the agent to capture specified HTTP headers as span attributes, according to the +`semantic convention `_. Request headers *************** -To capture predefined HTTP request headers as span attributes, set the environment variable ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST`` -to a comma-separated list of HTTP header names. +To capture HTTP request headers as span attributes, set the environment variable +``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST`` to a comma delimited list of HTTP header names. For example, - :: export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST="content-type,custom_request_header" -will extract ``content-type`` and ``custom_request_header`` from request headers and add them as span attributes. +will extract ``content-type`` and ``custom_request_header`` from the request headers and add them as span attributes. + +Request header names in Starlette are case-insensitive. So, giving the header name as ``CUStom-Header`` in the +environment variable will capture the header named ``custom-header``. + +Regular expressions may also be used to match multiple headers that correspond to the given pattern. For example: +:: -It is recommended that you should give the correct names of the headers to be captured in the environment variable. -Request header names in starlette are case insensitive. So, giving header name as ``CUStom-Header`` in environment variable will be able capture header with name ``custom-header``. + export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST="Accept.*,X-.*" -The name of the added span attribute will follow the format ``http.request.header.`` where ```` being the normalized HTTP header name (lowercase, with - characters replaced by _ ). -The value of the attribute will be single item list containing all the header values. +Would match all request headers that start with ``Accept`` and ``X-``. -Example of the added span attribute, +Additionally, the special keyword ``all`` can be used to capture all request headers. +:: + + export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST="all" + +The name of the added span attribute will follow the format ``http.request.header.`` where ```` +is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a +single item list containing all the header values. + +For example: ``http.request.header.custom_request_header = [","]`` Response headers **************** -To capture predefined HTTP response headers as span attributes, set the environment variable ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE`` -to a comma-separated list of HTTP header names. +To capture HTTP response headers as span attributes, set the environment variable +``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE`` to a comma delimited list of HTTP header names. For example, - :: export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE="content-type,custom_response_header" -will extract ``content-type`` and ``custom_response_header`` from response headers and add them as span attributes. +will extract ``content-type`` and ``custom_response_header`` from the response headers and add them as span attributes. -It is recommended that you should give the correct names of the headers to be captured in the environment variable. -Response header names captured in starlette are case insensitive. So, giving header name as ``CUStomHeader`` in environment variable will be able capture header with name ``customheader``. +Response header names in Starlette are case-insensitive. So, giving the header name as ``CUStom-Header`` in the +environment variable will capture the header named ``custom-header``. -The name of the added span attribute will follow the format ``http.response.header.`` where ```` being the normalized HTTP header name (lowercase, with - characters replaced by _ ). -The value of the attribute will be single item list containing all the header values. +Regular expressions may also be used to match multiple headers that correspond to the given pattern. For example: +:: -Example of the added span attribute, + export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE="Content.*,X-.*" + +Would match all response headers that start with ``Content`` and ``X-``. + +Additionally, the special keyword ``all`` can be used to capture all response headers. +:: + + export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE="all" + +The name of the added span attribute will follow the format ``http.response.header.`` where ```` +is the normalized HTTP header name (lowercase, with ``-`` replaced by ``_``). The value of the attribute will be a +single item list containing all the header values. + +For example: ``http.response.header.custom_response_header = [","]`` +Sanitizing headers +****************** +In order to prevent storing sensitive data such as personally identifiable information (PII), session keys, passwords, +etc, set the environment variable ``OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS`` +to a comma delimited list of HTTP header names to be sanitized. Regexes may be used, and all header names will be +matched in a case-insensitive manner. + +For example, +:: + + export OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS=".*session.*,set-cookie" + +will replace the value of headers such as ``session-id`` and ``set-cookie`` with ``[REDACTED]`` in the span. + Note: - Environment variable names to capture http headers are still experimental, and thus are subject to change. + The environment variable names used to capture HTTP headers are still experimental, and thus are subject to change. API --- diff --git a/instrumentation/opentelemetry-instrumentation-starlette/tests/test_starlette_instrumentation.py b/instrumentation/opentelemetry-instrumentation-starlette/tests/test_starlette_instrumentation.py index b168643211..a367ab0e42 100644 --- a/instrumentation/opentelemetry-instrumentation-starlette/tests/test_starlette_instrumentation.py +++ b/instrumentation/opentelemetry-instrumentation-starlette/tests/test_starlette_instrumentation.py @@ -38,6 +38,7 @@ set_tracer_provider, ) from opentelemetry.util.http import ( + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS, OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST, OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE, _active_requests_count_attrs, @@ -384,21 +385,12 @@ def create_app(self): def setUp(self): super().setUp() - self.env_patch = patch.dict( - "os.environ", - { - OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3", - OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3", - }, - ) - self.env_patch.start() self._instrumentor = otel_starlette.StarletteInstrumentor() self._app = self.create_app() self._client = TestClient(self._app) def tearDown(self) -> None: super().tearDown() - self.env_patch.stop() with self.disable_logging(): self._instrumentor.uninstrument() @@ -413,6 +405,9 @@ def _(request): headers={ "custom-test-header-1": "test-header-value-1", "custom-test-header-2": "test-header-value-2", + "my-custom-regex-header-1": "my-custom-regex-value-1,my-custom-regex-value-2", + "My-Custom-Regex-Header-2": "my-custom-regex-value-3,my-custom-regex-value-4", + "my-secret-header": "my-secret-value", }, ) @@ -426,6 +421,15 @@ async def _(websocket: WebSocket) -> None: "headers": [ (b"custom-test-header-1", b"test-header-value-1"), (b"custom-test-header-2", b"test-header-value-2"), + ( + b"my-custom-regex-header-1", + b"my-custom-regex-value-1,my-custom-regex-value-2", + ), + ( + b"My-Custom-Regex-Header-2", + b"my-custom-regex-value-3,my-custom-regex-value-4", + ), + (b"my-secret-header", b"my-secret-value"), ], } ) @@ -437,6 +441,14 @@ async def _(websocket: WebSocket) -> None: return app +@patch.dict( + "os.environ", + { + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS: ".*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,Regex-Test-Header-.*,Regex-Invalid-Test-Header-.*,.*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,my-custom-regex-header-.*,invalid-regex-header-.*,.*my-secret.*", + }, +) class TestHTTPAppWithCustomHeaders(TestBaseWithCustomHeaders): def test_custom_request_headers_in_span_attributes(self): expected = { @@ -446,12 +458,20 @@ def test_custom_request_headers_in_span_attributes(self): "http.request.header.custom_test_header_2": ( "test-header-value-2", ), + "http.request.header.regex_test_header_1": ("Regex Test Value 1",), + "http.request.header.regex_test_header_2": ( + "RegexTestValue2,RegexTestValue3", + ), + "http.request.header.my_secret_header": ("[REDACTED]",), } resp = self._client.get( "/foobar", headers={ "custom-test-header-1": "test-header-value-1", "custom-test-header-2": "test-header-value-2", + "Regex-Test-Header-1": "Regex Test Value 1", + "regex-test-header-2": "RegexTestValue2,RegexTestValue3", + "My-Secret-Header": "My Secret Value", }, ) self.assertEqual(200, resp.status_code) @@ -464,6 +484,13 @@ def test_custom_request_headers_in_span_attributes(self): self.assertSpanHasAttributes(server_span, expected) + @patch.dict( + "os.environ", + { + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS: ".*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,Regex-Test-Header-.*,Regex-Invalid-Test-Header-.*,.*my-secret.*", + }, + ) def test_custom_request_headers_not_in_span_attributes(self): not_expected = { "http.request.header.custom_test_header_3": ( @@ -475,6 +502,9 @@ def test_custom_request_headers_not_in_span_attributes(self): headers={ "custom-test-header-1": "test-header-value-1", "custom-test-header-2": "test-header-value-2", + "Regex-Test-Header-1": "Regex Test Value 1", + "regex-test-header-2": "RegexTestValue2,RegexTestValue3", + "My-Secret-Header": "My Secret Value", }, ) self.assertEqual(200, resp.status_code) @@ -496,6 +526,13 @@ def test_custom_response_headers_in_span_attributes(self): "http.response.header.custom_test_header_2": ( "test-header-value-2", ), + "http.response.header.my_custom_regex_header_1": ( + "my-custom-regex-value-1,my-custom-regex-value-2", + ), + "http.response.header.my_custom_regex_header_2": ( + "my-custom-regex-value-3,my-custom-regex-value-4", + ), + "http.response.header.my_secret_header": ("[REDACTED]",), } resp = self._client.get("/foobar") self.assertEqual(200, resp.status_code) @@ -527,6 +564,14 @@ def test_custom_response_headers_not_in_span_attributes(self): self.assertNotIn(key, server_span.attributes) +@patch.dict( + "os.environ", + { + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS: ".*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,Regex-Test-Header-.*,Regex-Invalid-Test-Header-.*,.*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,my-custom-regex-header-.*,invalid-regex-header-.*,.*my-secret.*", + }, +) class TestWebSocketAppWithCustomHeaders(TestBaseWithCustomHeaders): def test_custom_request_headers_in_span_attributes(self): expected = { @@ -536,12 +581,20 @@ def test_custom_request_headers_in_span_attributes(self): "http.request.header.custom_test_header_2": ( "test-header-value-2", ), + "http.request.header.regex_test_header_1": ("Regex Test Value 1",), + "http.request.header.regex_test_header_2": ( + "RegexTestValue2,RegexTestValue3", + ), + "http.request.header.my_secret_header": ("[REDACTED]",), } with self._client.websocket_connect( "/foobar_web", headers={ "custom-test-header-1": "test-header-value-1", "custom-test-header-2": "test-header-value-2", + "Regex-Test-Header-1": "Regex Test Value 1", + "regex-test-header-2": "RegexTestValue2,RegexTestValue3", + "My-Secret-Header": "My Secret Value", }, ) as websocket: data = websocket.receive_json() @@ -566,6 +619,9 @@ def test_custom_request_headers_not_in_span_attributes(self): headers={ "custom-test-header-1": "test-header-value-1", "custom-test-header-2": "test-header-value-2", + "Regex-Test-Header-1": "Regex Test Value 1", + "regex-test-header-2": "RegexTestValue2,RegexTestValue3", + "My-Secret-Header": "My Secret Value", }, ) as websocket: data = websocket.receive_json() @@ -589,6 +645,13 @@ def test_custom_response_headers_in_span_attributes(self): "http.response.header.custom_test_header_2": ( "test-header-value-2", ), + "http.response.header.my_custom_regex_header_1": ( + "my-custom-regex-value-1,my-custom-regex-value-2", + ), + "http.response.header.my_custom_regex_header_2": ( + "my-custom-regex-value-3,my-custom-regex-value-4", + ), + "http.response.header.my_secret_header": ("[REDACTED]",), } with self._client.websocket_connect("/foobar_web") as websocket: data = websocket.receive_json() @@ -624,6 +687,14 @@ def test_custom_response_headers_not_in_span_attributes(self): self.assertNotIn(key, server_span.attributes) +@patch.dict( + "os.environ", + { + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SANITIZE_FIELDS: ".*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_REQUEST: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,Regex-Test-Header-.*,Regex-Invalid-Test-Header-.*,.*my-secret.*", + OTEL_INSTRUMENTATION_HTTP_CAPTURE_HEADERS_SERVER_RESPONSE: "Custom-Test-Header-1,Custom-Test-Header-2,Custom-Test-Header-3,my-custom-regex-header-.*,invalid-regex-header-.*,.*my-secret.*", + }, +) class TestNonRecordingSpanWithCustomHeaders(TestBaseWithCustomHeaders): def setUp(self): super().setUp()