Kelemetry aggregates various data sources, including Kubernetes events, audit logs, and informers, into the form of traditional tracing, enabling visualization through Jaeger UI and automatic analysis.
Kubernetes is a distributed, asynchronous, declarative API, which makes it harder to explain than traditional RPC-based services: there is no clear causal relationship between events, and a change in one object indirectly causes changes in other objects, making the system difficult to understand and troubleshoot. Past attempts at tracing in Kubernetes were either limited to single components or excessively intrusive to individual components.
Kelemetry addresses the problem by associating events of related objects into the same trace. By recognizing object relations such as OwnerReferences, related events can be visualized together without prior domain-specific knowledge. The behavior of various components is recorded on the same timeline to reconstruct the causal hierarchy of the actual events.
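To make the idea of grouping related objects into one trace concrete, here is a minimal sketch in Python. It is not Kelemetry's implementation: the `Event` type, the `owners` map, and the functions `trace_root` and `group_into_traces` are hypothetical names invented for illustration, and the sketch assumes each object has at most one owner (real Kubernetes objects may list several owner references).

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical object identity for this sketch: (kind, namespace, name).
ObjectKey = Tuple[str, str, str]


@dataclass
class Event:
    obj: ObjectKey     # object the event is about
    timestamp: float   # seconds, arbitrary epoch
    source: str        # e.g. "audit" or "event"
    message: str


def trace_root(obj: ObjectKey, owners: Dict[ObjectKey, ObjectKey]) -> ObjectKey:
    """Follow owner links upward until an object with no recorded owner is reached."""
    seen = {obj}
    while obj in owners:
        obj = owners[obj]
        if obj in seen:  # defensive: stop if the owner chain is cyclic
            break
        seen.add(obj)
    return obj


def group_into_traces(
    events: List[Event], owners: Dict[ObjectKey, ObjectKey]
) -> Dict[ObjectKey, List[Event]]:
    """Group events by the root of their owner chain and sort each group by time."""
    traces: Dict[ObjectKey, List[Event]] = {}
    for ev in events:
        traces.setdefault(trace_root(ev.obj, owners), []).append(ev)
    for evs in traces.values():
        evs.sort(key=lambda e: e.timestamp)
    return traces


# Example data (made up): a Deployment owns a ReplicaSet, which owns a Pod.
deploy = ("Deployment", "default", "web")
rs = ("ReplicaSet", "default", "web-5d4f")
pod = ("Pod", "default", "web-5d4f-abcde")
owners = {rs: deploy, pod: rs}

events = [
    Event(pod, 3.0, "event", "Scheduled"),
    Event(deploy, 1.0, "audit", "update"),
    Event(rs, 2.0, "audit", "create"),
]
traces = group_into_traces(events, owners)
```

In this example all three events land in a single trace keyed by the Deployment, ordered by timestamp, which mirrors the idea of placing the behavior of different components on one timeline.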
- Collect audit logs
- Collect controller events (i.e. the "Events" section in `kubectl describe`)
- Record object diff associated with audit logs
- Connect objects based on owner references
- Collect data from custom sources (Plugin API)
- Connect objects with custom rules with multi-cluster support (Plugin API)
- Navigate trace with Jaeger UI and API
- Scalable for multiple large clusters
- Construct tailor-made metrics based on audit logs
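As an illustration of what an object diff attached to an audit log entry could look like, the following Python sketch compares two versions of an object represented as nested dictionaries. This is an assumption-laden example, not Kelemetry's actual diff format: `diff_objects` is a hypothetical helper, lists are compared as whole values, and a missing key is treated the same as a key set to `None`.

```python
from typing import Any, List, Tuple

Change = Tuple[str, Any, Any]  # (field path, old value, new value)


def diff_objects(old: Any, new: Any, path: str = "") -> List[Change]:
    """Return one (path, old, new) entry for every field that differs."""
    changes: List[Change] = []
    if isinstance(old, dict) and isinstance(new, dict):
        # Recurse into keys present on either side, in a stable order.
        for key in sorted(set(old) | set(new)):
            sub_path = f"{path}.{key}" if path else key
            changes.extend(diff_objects(old.get(key), new.get(key), sub_path))
    elif old != new:
        changes.append((path, old, new))
    return changes


# Example data (made up): an update that scales a workload and adds a label.
old_obj = {
    "metadata": {"name": "web"},
    "spec": {"replicas": 2, "paused": False},
}
new_obj = {
    "metadata": {"name": "web", "labels": {"tier": "fe"}},
    "spec": {"replicas": 3, "paused": False},
}
changes = diff_objects(old_obj, new_obj)
```

Here `changes` contains an entry for `metadata.labels` (added) and one for `spec.replicas` (2 to 3); unchanged fields such as `spec.paused` are omitted.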
graph TB
kelemetry[Kelemetry]
audit-log[Audit log] --> kelemetry
event[Event] --> kelemetry
watch[Object watch] --> kelemetry
kelemetry ---> |OpenTelemetry protocol| storage[Jaeger storage]
plugin[Kelemetry storage plugin] --> storage
user[User] --> ui[Jaeger UI] --> plugin
- Deployment for production
- Quickstart for trying out Kelemetry with a test cluster
- Development setup for developing Kelemetry
See Code of Conduct.
Kelemetry is licensed under the Apache License 2.0.