Note: this project was renamed to unvariance/collector (previously perfpod/memory-collector). See issue #34.
A Kubernetes-native collector for monitoring memory subsystem interference between pods. This project is under active development and we welcome contributors to help build this critical observability component.
Memory Collector helps Kubernetes operators identify and quantify performance degradation caused by memory subsystem interference ("noisy neighbors") by collecting metrics about:
- Memory bandwidth utilization
- Last Level Cache (LLC) usage
- CPU performance counters related to memory access
This data helps operators:
- Identify when pods are experiencing memory subsystem interference
- Quantify the performance impact of noisy neighbors
- Build confidence before deploying memory interference mitigation solutions
Memory subsystem interference can cause:
- 25%+ increase in cycles per instruction (CPI)
- 4x-13x increase in tail latency
- Reduced application performance even when CPU and memory limits are set
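To make the CPI figure above concrete, here is a minimal sketch of how cycles per instruction and its relative increase could be derived from two counter readings. The `CounterSample` type, the function names, and the numbers are illustrative assumptions, not part of this project's code or measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Hypothetical per-pod counter deltas over one sampling interval."""
    cycles: int
    instructions: int


def cpi(sample: CounterSample) -> float:
    """Cycles per instruction: higher means more cycles spent per unit of work."""
    if sample.instructions <= 0:
        raise ValueError("instructions must be positive")
    return sample.cycles / sample.instructions


def cpi_increase_pct(baseline: CounterSample, contended: CounterSample) -> float:
    """Relative CPI increase of `contended` over `baseline`, in percent."""
    base = cpi(baseline)
    return (cpi(contended) - base) / base * 100.0


# Illustrative numbers only (assumed, not measurements).
baseline = CounterSample(cycles=1_000_000, instructions=1_000_000)
contended = CounterSample(cycles=1_250_000, instructions=1_000_000)
increase = cpi_increase_pct(baseline, contended)
print(f"CPI increase: {increase:.1f}%")  # prints "CPI increase: 25.0%"
```

In this sketch the same amount of work (instructions) takes 25% more cycles under contention, which is the kind of slowdown that CPU and memory limits alone do not prevent.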
Common sources of interference include:
- Garbage collection
- Big data analytics
- Security scanning
- Video streaming/transcoding
- Container image decompression
The project is in active development across several areas:
- Implementing collection for Intel RDT and AMD QoS
- Collecting hardware performance counters: cycles, instructions, cache misses
- Defining Prometheus metrics
- Helm chart, DaemonSet implementation
- Prometheus integration
- Architecture documentation
- Benchmark suite with example workloads
- Integration testing framework
We welcome contributions! Here's how you can help:
- Code: Check our Good First Issues and Development Guide
- Use Cases: Share interference scenarios, test in your environment
- Discussion: Open GitHub Issues or email [email protected]
- Schedule a chat: https://yonch.com/collector
This project builds on research and implementation from:
- Google's CPI² system
- Meta's Resource Control implementation
- Alibaba Cloud's Alita system
- MIT's Caladan project
Code is licensed under the Apache License, Version 2.0.
Documentation is licensed under a Creative Commons Attribution 4.0 International License.