feat: add ECS mode for Helm deployment #33

Merged: 14 commits, Jul 26, 2024
2 changes: 1 addition & 1 deletion .env.override
@@ -14,6 +14,6 @@ FRAUD_SERVICE_DOCKERFILE=./src/frauddetectionservice/Dockerfile.elastic
# *********************
# Elastic Collector
# *********************
COLLECTOR_CONTRIB_IMAGE=docker.elastic.co/beats/elastic-agent:8.15.0-7b611e39-SNAPSHOT
COLLECTOR_CONTRIB_IMAGE=docker.elastic.co/beats/elastic-agent:8.15.0-SNAPSHOT
OTEL_COLLECTOR_CONFIG=./src/otelcollector/otelcol-elastic-config.yaml
OTEL_COLLECTOR_CONFIG_EXTRAS=./src/otelcollector/otelcol-elastic-config-extras.yaml
27 changes: 22 additions & 5 deletions .github/README.md
@@ -49,21 +49,38 @@ Additionally, the OpenTelemetry Contrib collector has also been changed to the [
helm repo update open-telemetry

# deploy the configuration for the Elastic OpenTelemetry collector distribution
kubectl apply -f configmap-elastic.yaml
kubectl apply -f configmap-deployment.yaml

# deploy the demo through helm install
helm install -f values.yaml my-otel-demo open-telemetry/opentelemetry-demo
helm install -f deployment.yaml my-otel-demo open-telemetry/opentelemetry-demo
```

#### Kubernetes monitoring

This demo already enables cluster-level metrics collection with `clusterMetrics` and
Kubernetes events collection with `kubernetesEvents`.

In order to add Node level metrics collection and autodiscovery for Redis Pods
we can run an additional Otel collector Daemonset with the following:
In order to add Node-level metrics collection, we can run an additional OTel Collector DaemonSet with the following steps:

`helm install daemonset open-telemetry/opentelemetry-collector --values daemonset.yaml`
1. Create a secret in Kubernetes with the following command.
```
kubectl create secret generic elastic-secret-ds \
--from-literal=elastic_endpoint='YOUR_ELASTICSEARCH_ENDPOINT' \
--from-literal=elastic_api_key='YOUR_ELASTICSEARCH_API_KEY'
```
Don't forget to replace:
- `YOUR_ELASTICSEARCH_ENDPOINT`: your Elasticsearch endpoint (for example, `1234567.us-west2.gcp.elastic-cloud.com:443`).
- `YOUR_ELASTICSEARCH_API_KEY`: your Elasticsearch API key.

2. Execute the following command to deploy the OpenTelemetry Collector to your Kubernetes cluster:

```
# deploy the configuration for the Elastic OpenTelemetry collector distribution
kubectl apply -f configmap-daemonset.yaml

# deploy the Elastic OpenTelemetry collector distribution through helm install
helm install otel-daemonset open-telemetry/opentelemetry-collector --values daemonset.yaml
```
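The secret created in step 1 is surfaced to the collector pod as environment variables, which the DaemonSet's ConfigMap references via `${env:ELASTIC_ENDPOINT}` and `${env:ELASTIC_API_KEY}` placeholders. A minimal Python sketch of that substitution (a simplified illustration only; the collector's actual confmap resolver supports additional provider syntax):

```python
import os
import re

# Matches ${env:NAME} placeholders as used in the ConfigMap below.
PLACEHOLDER = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_env_placeholders(text: str) -> str:
    """Replace each ${env:NAME} with the value of environment
    variable NAME (empty string if unset) -- simplified sketch."""
    return PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), ""), text)

# Stand-in value for demonstration; in the cluster this comes from
# the elastic-secret-ds secret.
os.environ["ELASTIC_ENDPOINT"] = "https://demo.es.example:443"
print(resolve_env_placeholders("endpoints:\n  - ${env:ELASTIC_ENDPOINT}"))
```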

## Explore and analyze the data with Elastic

265 changes: 265 additions & 0 deletions kubernetes/elastic-helm/configmap-daemonset.yaml
@@ -0,0 +1,265 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: elastic-otelcol-agent-ds
namespace: default
labels:
app.kubernetes.io/name: otelcol

data:
relay: |
exporters:
debug:
elasticsearch:
endpoints:
- ${env:ELASTIC_ENDPOINT}
api_key: ${env:ELASTIC_API_KEY}
logs_dynamic_index:
enabled: true
metrics_dynamic_index:
enabled: true
mapping:
mode: ecs
processors:
batch: {}
elasticinframetrics:
add_system_metrics: true
add_k8s_metrics: true
resourcedetection/eks:
detectors: [env, eks]
timeout: 15s
override: true
eks:
resource_attributes:
k8s.cluster.name:
enabled: true
resourcedetection/gcp:
detectors: [env, gcp]
timeout: 2s
override: false
resource/k8s:
attributes:
- key: service.name
from_attribute: app.label.component
action: insert
attributes/k8s_logs_dataset:
actions:
- key: data_stream.dataset
value: "kubernetes.container_logs"
action: upsert
attributes/dataset:
actions:
- key: event.dataset
from_attribute: data_stream.dataset
action: upsert
resource/cloud:
attributes:
- key: cloud.instance.id
from_attribute: host.id
action: insert
resource/demo:
attributes:
- key: deployment.environment
value: "opentelemetry-demo"
action: upsert
resource/process:
attributes:
- key: process.executable.name
action: delete
- key: process.executable.path
action: delete
resourcedetection/system:
detectors: ["system", "ec2"]
system:
hostname_sources: [ "os" ]
resource_attributes:
host.name:
enabled: true
host.id:
enabled: false
host.arch:
enabled: true
host.ip:
enabled: true
host.mac:
enabled: true
host.cpu.vendor.id:
enabled: true
host.cpu.family:
enabled: true
host.cpu.model.id:
enabled: true
host.cpu.model.name:
enabled: true
host.cpu.stepping:
enabled: true
host.cpu.cache.l2.size:
enabled: true
os.description:
enabled: true
os.type:
enabled: true
ec2:
resource_attributes:
host.name:
enabled: false
host.id:
enabled: true
k8sattributes:
filter:
node_from_env_var: K8S_NODE_NAME
passthrough: false
pod_association:
- sources:
- from: resource_attribute
name: k8s.pod.ip
- sources:
- from: resource_attribute
name: k8s.pod.uid
- sources:
- from: connection
extract:
metadata:
- "k8s.namespace.name"
- "k8s.deployment.name"
- "k8s.statefulset.name"
- "k8s.daemonset.name"
- "k8s.cronjob.name"
- "k8s.job.name"
- "k8s.node.name"
- "k8s.pod.name"
- "k8s.pod.uid"
- "k8s.pod.start_time"
labels:
- tag_name: app.label.component
key: app.kubernetes.io/component
from: pod
receivers:
filelog:
retry_on_failure:
enabled: true
start_at: end
exclude:
# exclude collector logs
- /var/log/pods/default_otel-daemonset-opentelemetry-collector-agent*_*/opentelemetry-collector/*.log
include:
- /var/log/pods/*/*/*.log
include_file_name: false
include_file_path: true
operators:
- id: container-parser
type: container
hostmetrics:
collection_interval: 10s
root_path: /hostfs
scrapers:
cpu:
metrics:
system.cpu.utilization:
enabled: true
system.cpu.logical.count:
enabled: true
memory:
metrics:
system.memory.utilization:
enabled: true
process:
mute_process_exe_error: true
mute_process_io_error: true
mute_process_user_error: true
metrics:
process.threads:
enabled: true
process.open_file_descriptors:
enabled: true
process.memory.utilization:
enabled: true
process.disk.operations:
enabled: true
network:
processes:
load:
disk:
filesystem:
exclude_mount_points:
mount_points:
- /dev/*
- /proc/*
- /sys/*
- /run/k3s/containerd/*
- /var/lib/docker/*
- /var/lib/kubelet/*
- /snap/*
match_type: regexp
exclude_fs_types:
fs_types:
- autofs
- binfmt_misc
- bpf
- cgroup2
- configfs
- debugfs
- devpts
- devtmpfs
- fusectl
- hugetlbfs
- iso9660
- mqueue
- nsfs
- overlay
- proc
- procfs
- pstore
- rpc_pipefs
- securityfs
- selinuxfs
- squashfs
- sysfs
- tracefs
match_type: strict
kubeletstats:
auth_type: serviceAccount
collection_interval: 20s
endpoint: ${env:K8S_NODE_NAME}:10250
node: '${env:K8S_NODE_NAME}'
# Skip verification of the kubelet's self-signed TLS certificate;
# required to work for all CSPs without an issue
insecure_skip_verify: true
k8s_api_config:
auth_type: serviceAccount
metrics:
k8s.pod.cpu.node.utilization:
[Review comment, Member] Maybe we can also enable the `k8s.pod.memory.node.utilization`

[Reply, Collaborator/Author] Ah! Saw it now, sure I will create a follow-up PR. Thanks!

enabled: true
k8s.container.cpu_limit_utilization:
enabled: true
k8s.pod.cpu_limit_utilization:
enabled: true
k8s.container.cpu_request_utilization:
enabled: true
k8s.container.memory_limit_utilization:
enabled: true
k8s.pod.memory_limit_utilization:
enabled: true
k8s.container.memory_request_utilization:
enabled: true
k8s.node.uptime:
enabled: true
k8s.node.cpu.usage:
enabled: true
k8s.pod.cpu.usage:
enabled: true
extra_metadata_labels:
- container.id
service:
pipelines:
logs:
receivers: [filelog]
processors: [batch, k8sattributes, resourcedetection/system, resourcedetection/eks, resourcedetection/gcp, resource/demo, resource/k8s, resource/cloud, attributes/k8s_logs_dataset]
exporters: [debug, elasticsearch]
metrics:
receivers: [hostmetrics, kubeletstats]
processors: [batch, k8sattributes, elasticinframetrics, resourcedetection/system, resource/demo, resourcedetection/eks, resourcedetection/gcp, resource/k8s, resource/cloud, attributes/dataset, resource/process]
exporters: [debug, elasticsearch]
telemetry:
metrics:
address: ${env:MY_POD_IP}:8888
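Note that the filesystem scraper above uses `match_type: regexp`, so each `exclude_mount_points` entry is an unanchored regular expression (`/proc/*` means the literal `/proc` followed by zero or more slashes, matched anywhere in the mount path). A quick Python sketch of that matching logic (an illustration, not the collector's code):

```python
import re

# The patterns from the filesystem scraper's exclude_mount_points list;
# with match_type: regexp each is treated as a regular expression.
EXCLUDE_PATTERNS = [
    r"/dev/*", r"/proc/*", r"/sys/*", r"/run/k3s/containerd/*",
    r"/var/lib/docker/*", r"/var/lib/kubelet/*", r"/snap/*",
]

def is_excluded(mount_point: str) -> bool:
    """True if any exclusion pattern matches (unanchored search)."""
    return any(re.search(p, mount_point) for p in EXCLUDE_PATTERNS)

print(is_excluded("/var/lib/docker/overlay2/abc/merged"))  # True
print(is_excluded("/home/data"))                           # False
```

This is why virtual filesystems and container-runtime mounts never show up as filesystem metrics, while regular data mounts do.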
@@ -49,6 +49,7 @@ data:
- otlp/elastic
processors:
- batch
- resource
receivers:
- otlp
metrics:
@@ -57,6 +58,7 @@ data:
- debug
processors:
- batch
- resource
receivers:
- httpcheck/frontendproxy
- otlp