Skip to content

Latest commit

 

History

History
362 lines (306 loc) · 19.3 KB

0027-faas-fields.md

File metadata and controls

362 lines (306 loc) · 19.3 KB

0027: Function as a Service Fields

  • Stage: 2 (candidate)
  • Date: 2021-09-14

Using APM agents in the context of serverless environments (e.g. AWS Lambda, Azure Functions, etc.) allows to capture function as a service (faas) specific context that can be of great value for the end users and provide correlation points with other sources of data.

Extending ECS with a dedicated fields group or embedding it into exsting cloud fields would allow to capture this data in a meaningful, semantically aligned way and correlate the data accross different use cases (e.g. correlating AWS Lambda traces with corresponding Lambda metrics and logs).

The existing specification in OpenTelemetry can serve as a good orientation: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/semantic_conventions/faas.md#example

Fields

Discussing the initial proposal with Andrew Wilkins, we came up with an adapted proposal (compared to the proposal for stage 0) that would reuse as many as possible existing ECS fields:

New Fields

Field Type Example Description Use case
faas.id keyword arn:aws:lambda:us-west-2:123456789012:function:my-function The unique identifier of a serverless function. For AWS Lambda it's the function ARN (Amazon Resource Name) without a version or alias suffix. Correlation of traces, logs and metrics for a specific serverless function.
faas.name keyword my-function The name of a serverless function. Display name of a serverless function.
faas.version keyword 123 The version of a serverless function. Group / differentiate data by the version of a serverless function.
faas.coldstart boolean true Boolean value indicating a cold start of a function Can be used in the UI denote function coldstarts.
faas.execution keyword "af9d5aa4-a685-4c5f-a22b-444f80b3cc28" The execution ID of the current function execution. Allows correlation with CloudWatch logs and metrics
faas.trigger.type keyword "http" one of http,pubsub,datasource, timer, other Allows differentiating different function types
faas.trigger.request_id keyword e.g. 123456789 The iD of the trigger request , message, event, etc. Correlation of metrics and logs with the corresponding trigger request

Reusing existing service.* fields

For the initially proposed fields faas.name, faas.id, faas.version and faas.instance we decided to reuse the existing fields service.name, service.id, service.version and service.node.name.

Nesting cloud.* and service.* fields under _.origin.* and _.target.*

We identified a big overlap between the initially proposed faas.trigger.* fields with the already existing cloud.* and service. fields. Allowing to self-nest cloud and service fields under cloud.origin.* / cloud.target.* and service.origin.* / service.target.*, respectively, would allow to cover most of the faas.trigger.* fields.

Moreover, the proposal for nesting cloud fields would resolve other use cases as well (e.g. #1282).

Initially proposed New proposed nested cloud or service field
faas.trigger.name service.origin.name
faas.trigger.id service.origin.id
faas.trigger.version service.origin.version
faas.trigger.account.name cloud.origin.account.name
faas.trigger.account.id cloud.origin.account.id
faas.trigger.region cloud.origin.region

Done.

Usage

faas.id, faas.name & faas.version

Allows for correlating traces, logs and metrics for individual serverless functions and versions. faas.name will be used as the display name of serverless functions in the UI.

faas.coldstart

Will be used in the APM UI to mark function invocations that resultet from a coldstart. This is a useful information for the end users to differentiate coldstart behaviour from warmstart function invocations.

faas.execution & faas.trigger.request_id

These IDs will be used to correlate APM data (traces / transactions), logs and metrics of the faas function (e.g. from CloudWatch) as well as logs and metrics from the corresponding trigger for individual invocations.

faas.trigger.type

Indicates the type of the function trigger. Allows to group different function types.

service.origin.* & cloud.origin.*

Provides meta information on the origin service that triggered the faas function. End users can use this information to better understand the context, dependencies and causalities when analyzing and troubleshooting faas-related observability scenarios. For example, this information could provide insights on analysis questions like this: "Do function invocations that are triggered from cloud region us-east-1 behave similar to invocations from region eu-west-1?", etc.

Source data

Faas functions provide meta-information in their execution environment. APM agents use instrumentation techniques to read this information. For instance, AWS Lambda provides an event and a context object with each function invocation: https://docs.aws.amazon.com/lambda/latest/dg/python-context.html

The above fields will be derived by the APM agents from the AWS Lambda context object and the event object that are passed with an invocation of a Lambda function. Below is an example for the context and event object. The mapping to the proposed fields for this example is layed out in the following table

target ECS field source field
faas.id context.invokedFunctionArn
faas.name context.functionName
faas.version context.functionVersion
faas.coldstart No source field. Determined by the APM agent on the first Lambda function invocation.
faas.execution context.awsRequestId
faas.trigger.type No source field. Determined by the APM agent based on the event object type. Would be http in this example.
faas.trigger.request_id event.requestContext.requestId
service.origin.name ${event.requestContext.httpMethod} ${event.requestContext.resourcePath}/${event.requestContext.stage} -> GET /fetch_all/dev
service.origin.id event.requestContext.apiId
service.origin.version No source field. Determined by the APM agent based on the event object type whether it's API version 1.0 or 2.0.
cloud.origin.service.name api gateway
cloud.origin.account.id event.requestContext.accountId

AWS Lambda context object

Description available here.

context:

{
    "callbackWaitsForEmptyEventLoop": true,
    "functionVersion": "$LATEST",
    "functionName": "the-function-name",
    "memoryLimitInMB": "128",
    "logGroupName": "/aws/lambda/the-function-name",
    "logStreamName": "2021/08/13/[$LATEST]08834acf4e4f463b95b7b99aa8b34aff",
    "invokedFunctionArn": "arn:aws:lambda:us-west-2:XXXXXXXXXXXX:function:the-function-name",
    "awsRequestId": "649bf7d0-c6ae-432d-899d-da44ccd7ee95"
}

AWS Lambda event object

Description available here.

event:

{
    "resource": "/fetch_all",
    "path": "/fetch_all",
    "httpMethod": "GET",
    "headers": {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.5",
        "CloudFront-Forwarded-Proto": "https",
        "CloudFront-Is-Desktop-Viewer": "true",
        "CloudFront-Is-Mobile-Viewer": "false",
        "CloudFront-Is-SmartTV-Viewer": "false",
        "CloudFront-Is-Tablet-Viewer": "false",
        "CloudFront-Viewer-Country": "US",
        "Host": "02plqthge2.execute-api.us-east-1.amazonaws.com",
        "upgrade-insecure-requests": "1",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0) Gecko/20100101 Firefox/72.0",
        "Via": "2.0 969f35f01b6eddd92239a3e818fc1e0d.cloudfront.net (CloudFront)",
        "X-Amz-Cf-Id": "eDbpfDwO-CRYymEFLkW6CBCsU_H_PS8R93_us53QWvXWLS45v3NvQw==",
        "X-Amzn-Trace-Id": "Root=1-5e502af4-fd0c1c6fdc164e1d6361183b",
        "X-Forwarded-For": "76.76.241.57, 52.46.47.139",
        "X-Forwarded-Port": "443",
        "X-Forwarded-Proto": "https"
    },
    "multiValueHeaders": {
        "Accept": [
            "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
        ],
        "Accept-Encoding": [
            "gzip, deflate, br"
        ],
        "Accept-Language": [
            "en-US,en;q=0.5"
        ],
        "CloudFront-Forwarded-Proto": [
            "https"
        ],
        "CloudFront-Is-Desktop-Viewer": [
            "true"
        ],
        "CloudFront-Is-Mobile-Viewer": [
            "false"
        ],
        "CloudFront-Is-SmartTV-Viewer": [
            "false"
        ],
        "CloudFront-Is-Tablet-Viewer": [
            "false"
        ],
        "CloudFront-Viewer-Country": [
            "US"
        ],
        "Host": [
            "02plqthge2.execute-api.us-east-1.amazonaws.com"
        ],
        "upgrade-insecure-requests": [
            "1"
        ],
        "User-Agent": [
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0) Gecko/20100101 Firefox/72.0"
        ],
        "Via": [
            "2.0 969f35f01b6eddd92239a3e818fc1e0d.cloudfront.net (CloudFront)"
        ],
        "X-Amz-Cf-Id": [
            "eDbpfDwO-CRYymEFLkW6CBCsU_H_PS8R93_us53QWvXWLS45v3NvQw=="
        ],
        "X-Amzn-Trace-Id": [
            "Root=1-5e502af4-fd0c1c6fdc164e1d6361183b"
        ],
        "X-Forwarded-For": [
            "76.76.241.57, 52.46.47.139"
        ],
        "X-Forwarded-Port": [
            "443"
        ],
        "X-Forwarded-Proto": [
            "https"
        ]
    },
    "queryStringParameters": null,
    "multiValueQueryStringParameters": null,
    "pathParameters": null,
    "stageVariables": null,
    "requestContext": {
        "resourceId": "y3tkf7",
        "resourcePath": "/fetch_all",
        "httpMethod": "GET",
        "extendedRequestId": "IQumRELJIAMF6fQ=",
        "requestTime": "21/Feb/2020:19:09:40 +0000",
        "path": "/dev/fetch_all",
        "accountId": "571481734049",
        "protocol": "HTTP/1.1",
        "stage": "dev",
        "domainPrefix": "02plqthge2",
        "requestTimeEpoch": 1582312180890,
        "requestId": "6f3dffca-46f8-4c8b-800b-6bc1ea2554ec",
        "identity": {
            "cognitoIdentityPoolId": null,
            "accountId": null,
            "cognitoIdentityId": null,
            "caller": null,
            "sourceIp": "76.76.241.57",
            "principalOrgId": null,
            "accessKey": null,
            "cognitoAuthenticationType": null,
            "cognitoAuthenticationProvider": null,
            "userArn": null,
            "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:72.0) Gecko/20100101 Firefox/72.0",
            "user": null
        },
        "domainName": "02plqthge2.execute-api.us-east-1.amazonaws.com",
        "apiId": "02plqthge2"
    },
    "body": null,
    "isBase64Encoded": false
}

Scope of impact

  • Ingestion mechanisms:
    • APM server will extend the intake V2 API to accept the new fields and store them with the transaction documents
    • APM server will extend OpenTelemetry field mapping to account for these new fields
  • Usage mechanisms:
    • APM UI may utilize the new fields to provide Lambda / serverless specific visualizations (e.g. indicating cold starts on transactions in the waterfall view, showing meta information on lambda service views)
  • ECS project
    • the concept of self-nesting service and cloud fields under origin and target needs clear documentation that avoids confusion around when to use which of the fields. Tried to address this with the description in the schema for those fields in this PR.

Concerns

Nesting origin field to identify 3rd party

During stage 1 review @ebeahan identied the potential confusion over an established ECS pattern where the root entity defines the do'er and *.target.* the affected entity.

This proposal extends this pattern as there are 3 active parties involved. This puts the onus on ECS documentation being extremely clear on which field a user needs to query to get their intended results.

  • extended descriptio / footnote for service and cloud fields in this PR to avoid confusion about origin and target nesting of service and cloud fields

People

The following are the people that consulted on the contents of this RFC.

  • @AlexanderWert | author, sponsor
  • @axw | subject matter expert
  • @Mpdreamz | subject matter expert

References

RFC Pull Requests