0002: Service Environment Field

  • Stage: 2 (candidate)
  • Date: 2021-07-28

Fields

This RFC proposes adding one field to ECS to describe the environment ("production", "staging", "qa"...) from which a component of the application layer (service, application or function) emits an event.

We propose to standardise on the field already used by Elastic APM to qualify the environment of a service: service.environment.

No existing ECS field is impacted as no ECS field relates to this concept of "environment".

The service.environment field will supplement the existing fields of the service.* namespace such as service.name and service.version.

Fields (yaml)

---
- name: service
  title: Service
  group: 2
  short: Fields describing the service for or from which the data was collected.
  description: >
    The service fields describe the service for or from which the data was collected.

    These fields help you find and correlate logs for a specific
    service and version.
  type: group
  fields:

    - name: environment
      level: extended
      type: keyword
      short: Environment on which the data is collected
      description: >
        Environment on which the data is collected

      example: production, staging, qa, or dev

Usage

service.environment will typically be used as a "static attribute", defined primarily in the configuration of the APM agents and of the synthetics agent (i.e. Heartbeat). There are also use cases where service.environment will be defined in the Filebeat and Metricbeat collectors.

Usage with APM agents

To define service.environment in an APM agent configuration, use the system property -Delastic.apm.environment=staging or the environment variable export ELASTIC_APM_ENVIRONMENT=staging. See https://www.elastic.co/guide/en/apm/agent/java/current/config-core.html#config-environment

Example

java -javaagent:/path/to/elastic-apm-agent-<version>.jar \
     -Delastic.apm.service_name=www-frontend \
     -Delastic.apm.environment=production \
     -Delastic.apm.server_urls=https://apm-server.my-ecommerce.com:8200 \
     -Delastic.apm.secret_token= \
     -Delastic.apm.application_packages=com.myecommerce \
     -jar www-frontend.jar

Note: we may want to evolve the Java system property (-Delastic.apm.environment) and the environment variable (ELASTIC_APM_ENVIRONMENT) to better reflect the ECS field name. They could become -Delastic.apm.service_environment and ELASTIC_APM_SERVICE_ENVIRONMENT.
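The relationship between the system property and the environment variable named in the note above can be made explicit. The following is a minimal sketch, assuming only the naming pattern visible in this RFC (an `elastic.apm.` prefix for system properties and an upper-cased `ELASTIC_APM_` prefix for environment variables). The helper names are hypothetical and are not part of any agent API.

```python
# Hypothetical helpers: derive the two ways of passing an APM agent option
# from its configuration key, following the naming pattern shown in this RFC.


def system_property(key: str, value: str) -> str:
    """Return the Java system property flag for a configuration key."""
    return f"-Delastic.apm.{key}={value}"


def environment_variable(key: str, value: str) -> str:
    """Return the shell export statement for a configuration key."""
    return f"export ELASTIC_APM_{key.upper()}={value}"


if __name__ == "__main__":
    # Current option name.
    print(system_property("environment", "staging"))
    print(environment_variable("environment", "staging"))
    # Possible future name discussed in the note above.
    print(system_property("service_environment", "staging"))
    print(environment_variable("service_environment", "staging"))
```

Renaming the option would therefore change both forms at once, which is worth keeping in mind for backward compatibility.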

Usage with Heartbeat

Synthetics tests are the second major use case for qualifying the environment of a service. An important difference with APM agents is that a Heartbeat agent is likely to simultaneously run synthetics tests on multiple services spread across multiple environments. service.name and service.environment are therefore defined per heartbeat.monitors entry rather than globally.

The configuration could look like:

# EXPERIMENT - NOT SUPPORTED CONFIGURATION SNIPPET
heartbeat.monitors:
- type: http
  id: www-frontend-monitor
  name: Website Frontend Monitor
  service:
    name: www-frontend
    environment: production
  urls: ["https://www.my-ecommerce-company.com/status"]
  schedule: '@every 10s'
  check.response.status: 200
  timeout: 2s
- type: http
  id: www-frontend-monitor-staging
  name: Website Frontend Monitor - Staging
  service:
    name: www-frontend
    environment: staging
  urls: ["https://www.staging.my-ecommerce-company.com/status"]
  schedule: '@every 10s'
  check.response.status: 200
  timeout: 5s

Note: support for service.name in Heartbeat is pending elastic/beats#20330.
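To illustrate why the qualification is per monitor rather than global, here is a minimal sketch of how each monitor's `service` block could be copied onto the events it produces. The `MONITORS` list mirrors the experimental configuration above. `build_event` is a hypothetical helper and does not describe how Heartbeat is implemented.

```python
from typing import Any, Dict, List

# Mirrors the experimental heartbeat.monitors snippet above
# (not a supported configuration).
MONITORS: List[Dict[str, Any]] = [
    {
        "id": "www-frontend-monitor",
        "service": {"name": "www-frontend", "environment": "production"},
    },
    {
        "id": "www-frontend-monitor-staging",
        "service": {"name": "www-frontend", "environment": "staging"},
    },
]


def build_event(monitor: Dict[str, Any], status: str) -> Dict[str, Any]:
    """Build a check event carrying the service fields of its own monitor."""
    return {
        "monitor": {"id": monitor["id"], "status": status},
        # Copied per monitor, so a single Heartbeat instance can report on
        # several environments at once.
        "service": dict(monitor["service"]),
    }


events = [build_event(monitor, "up") for monitor in MONITORS]
for event in events:
    print(event["monitor"]["id"], event["service"]["environment"])
```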

Usage with Filebeat and Metricbeat

Note that all the Beats ship with a different example field to define the environment: fields.env (see filebeat.yml, metricbeat.yml, auditbeat.yml or heartbeat.yml). This fields.env field is at odds with the goal of standardisation because it cannot be standardised in ECS, due to the following ECS rules:

  • The namespace fields.* is not accepted in ECS; this namespace is dedicated to non-standardised fields.
  • ECS doesn't use abbreviations, and env is an abbreviation of environment.

Note that the fields.env field is just an example and is not likely to be used very broadly: for infrastructure elements such as servers, the delineation between environments is often difficult to establish, as servers frequently run production and non-production workloads simultaneously.

We propose to tackle the following questions later:

  • Standardising the characterization of the environment for infrastructure.
  • Specifying service.environment and service.name for service/application related data:
    • Log files collected by Filebeat: application log files collected on disk...
    • Metrics collected by Metricbeat: application metrics collected via Prometheus, Jolokia, JMX...

For the Metricbeat and Filebeat problems described above, the solution used by Heartbeat HTTP monitors (see above) is an interesting path forward.
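One conceivable interim step for application-related documents is to copy an existing `fields.env` value into `service.environment` when a document is processed. The sketch below only illustrates that mapping on a plain dictionary. `promote_env` is a hypothetical helper, not a Beats or Elasticsearch feature, and this RFC does not propose it.

```python
import copy
from typing import Any, Dict


def promote_env(document: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the document with fields.env copied to service.environment.

    An existing service.environment value is never overwritten.
    """
    result = copy.deepcopy(document)
    env = result.get("fields", {}).get("env")
    if env is not None:
        # setdefault keeps any service.environment that is already present.
        result.setdefault("service", {}).setdefault("environment", env)
    return result


doc = {
    "service": {"type": "prometheus"},
    "fields": {"env": "staging"},
}
print(promote_env(doc)["service"])
```

The input document is left untouched, and documents without `fields.env` pass through unchanged.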

Source data

  • Observability: data produced by the infrastructure and application layers. Data types are logs, metrics, distributed traces and uptime monitors.
  • Security: security data should also benefit from specifying the environment from which it is emitted, to offer filtering (SIEM...) on an Elastic cluster that spans multiple environments (e.g. "production" and "staging").
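As an illustration of the filtering mentioned above, the sketch below builds an Elasticsearch search body that restricts results to one environment with a `term` filter on `service.environment`. The function name is hypothetical; only the standard `bool` / `filter` / `term` query structure is assumed.

```python
import json
from typing import Any, Dict


def environment_filter(environment: str) -> Dict[str, Any]:
    """Build a search body matching only documents from one environment."""
    return {
        "query": {
            "bool": {
                # keyword fields are matched exactly, so a term filter suffices.
                "filter": [{"term": {"service.environment": environment}}]
            }
        }
    }


print(json.dumps(environment_filter("production"), indent=2))
```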

Note: sample JSON documents are stored at https://gist.github.com/cyrille-leclerc/81deca4852df7754246b70d4a01bb9b0

Sample APM Transaction JSON document

Sample of APM Transaction using service.name and service.environment

{
  "_index": "apm-7.10.0-transaction-000001",
  "_type": "_doc",
  "_id": "8tlyD3YBKOouY3Ui1sED",
  "_version": 1,
  "_score": null,
  "_source": {
    ...
    "service": {
      "node": {
        "name": "cyrillerclaptop.localdomain"
      },
      "environment": "staging",
      "framework": {
        "name": "Spring Web MVC",
        "version": "5.3.1"
      },
      "name": "frontend",
      "runtime": {
        "name": "Java",
        "version": "15"
      },
      "language": {
        "name": "Java",
        "version": "15"
      },
      "version": "1.0-SNAPSHOT"
    },
    ...
  }
}

Sample Heartbeat HTTP Monitor Check JSON document

service.environment would be added next to service.name.

{
  "_index": "heartbeat-7.10.0-2020.11.24-000001",
  "_type": "_doc",
  "_id": "J9uAD3YBKOouY3UiDwhY",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2020-11-28T15:36:58.475Z",
    "monitor": {
      "check_group": "8a44a150-318f-11eb-a924-acde48001122",
      "ip": "127.0.0.1",
      "status": "up",
      "duration": {
        "us": 2156
      },
      "id": "frontend-check",
      "name": "Frontend",
      "type": "http",
      ...
    },
    "url": {
      "domain": "localhost",
      "port": 8080,
      "path": "/actuator/health",
      "full": "http://localhost:8080/actuator/health",
      "scheme": "http"
    },
    "service": {
      "name": "frontend"
      // `service.environment` would be added here
    },
    ...
  }
}

Sample Filebeat JSON document generated via the logback-ecs-encoder library

service.environment would be added next to service.name

{
  "_index": "filebeat-7.10.0-2020.11.24-000001",
  "_type": "_doc",
  "_id": "2duDD3YBKOouY3UiPCLK",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2020-11-28T15:40:26.178Z",
    "event.dataset": "frontend.log",
    "service.name": "frontend", // `service.environment` would be added here
    "trace.id": "080e218993ca8b2916d8cc9bc9b38bc3",
    "message": "SUCCESS createOrder([OrderController.OrderForm@55b8d300list[[OrderProductDto@39448661 product = [Product@29587cf7 id = 4, name = 'Icecream', price = 5.0], quantity = 1]]]): price: 5.0, id:2509373",
    "input": {
      "type": "log"
    },
    "log.logger": "com.mycompany.ecommerce.controller.OrderController",
    "log.level": "INFO",
    "process.thread.name": "http-nio-8080-exec-3",
    ...
  }
}
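The samples above use two notations: the APM document nests fields as objects, while the logback-ecs-encoder document uses dotted keys. A consumer reading `service.environment` may need to handle both. The following is a minimal sketch under that assumption. `get_field` is a hypothetical helper, and the `service.environment` values in the example documents are illustrative since the field is not yet emitted by Filebeat.

```python
from typing import Any, Dict, Optional


def get_field(source: Dict[str, Any], path: str) -> Optional[Any]:
    """Read a field such as "service.environment" from either notation."""
    # Dotted-key form: {"service.environment": "staging"}
    if path in source:
        return source[path]
    # Nested form: {"service": {"environment": "staging"}}
    current: Any = source
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


nested = {"service": {"name": "frontend", "environment": "staging"}}
dotted = {"service.name": "frontend", "service.environment": "staging"}
print(get_field(nested, "service.environment"), get_field(dotted, "service.environment"))
```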

Sample Metricbeat JSON Document collected for a Prometheus metric

service.environment would be added next to service.type and service.address. We would also add service.name.

{
  "_index": "metricbeat-7.10.0-2020.11.24-000001",
  "_type": "_doc",
  "_id": "vNuID3YBKOouY3UivU-d",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2020-11-28T15:46:24.391Z",
    "metricset": {
      "name": "collector",
      "period": 10000
    },
    "service": {
      // `service.environment` would be added here. We would also add `service.name`
      "address": "http://localhost:8080/actuator/prometheus",
      "type": "prometheus"
    },
    "fields": {
      "env": "staging"
    },
    "agent": {
      "type": "metricbeat",
      "version": "7.10.0",
      "hostname": "cyrillerclaptop.localdomain",
      "ephemeral_id": "d4d682a8-6944-42cb-9273-87e07b620643",
      "id": "4acf759f-74b0-4198-80e2-94f705511512",
      "name": "cyrillerclaptop.localdomain"
    },
    "ecs": {
      "version": "1.6.0"
    },

    "prometheus": {
      "order_per_country": {
        "histogram": {
          "values": [],
          "counts": []
        }
      },
      "labels": {
        "quantile": "0.75",
        "instance": "localhost:8080",
        "job": "prometheus",
        "shipping_country": "US"
      }
    },
    ...
  },
  ...
}

Scope of impact

Concerns

Alternative field name proposed in Beats (Filebeat, Metricbeat, Heartbeat...)

Beats (Filebeat, Metricbeat, Heartbeat...) document in their configuration files (filebeat.yml, metricbeat.yml...) an alternative field to characterize environments: fields.env.

Example with filebeat.yml

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

"Service" namespace not suited to characterise infrastructure components

Using the service namespace for service.environment could seem awkward for some low-level messages (e.g. system events). The rationale is that service.environment will only be used to qualify components of the application layer (i.e. services, applications, functions) and will not be used to qualify the environment of an infrastructure component (server, virtual machine...).

Note that the environment field is a good candidate for reuse in namespaces other than service.* to cover the infrastructure use cases.

Different standardisation chosen by OpenTelemetry

OpenTelemetry has standardized deployment.environment, referring to Wikipedia: Deployment Environment. The benefit of deployment.environment is that it works better for the characterization of the infrastructure (e.g. physical server, VM). See https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/resource/semantic_conventions/deployment_environment.md
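Should data carrying the OpenTelemetry attribute need to be aligned with this RFC, the correspondence is a simple rename. A minimal sketch, assuming resource attributes arrive as a flat dictionary: `to_ecs_service_fields` and the mapping table are hypothetical illustrations, and `service.name` is included only because it is also an OpenTelemetry resource attribute.

```python
from typing import Dict

# OpenTelemetry resource attribute -> ECS field (illustrative mapping only).
ATTRIBUTE_TO_ECS: Dict[str, str] = {
    "deployment.environment": "service.environment",
    "service.name": "service.name",
}


def to_ecs_service_fields(resource_attributes: Dict[str, str]) -> Dict[str, str]:
    """Rename the known resource attributes and ignore everything else."""
    return {
        ecs_field: resource_attributes[attribute]
        for attribute, ecs_field in ATTRIBUTE_TO_ECS.items()
        if attribute in resource_attributes
    }


print(
    to_ecs_service_fields(
        {"deployment.environment": "staging", "service.name": "frontend"}
    )
)
```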

Real-world implementations

People

  • @cyrille-leclerc | co-author
  • @eyalkoren | co-author
  • @exekias | sponsor
  • @sqren | subject matter expert

References

RFC Pull Requests