This is a collection of documentation, how-tos, tools and other information on
debugging and identifying Kubernetes/container workload failures, performance
and reliability considerations, and other kubernaughties.
There are many gotchas, mud pits and blind spots in running distributed systems, and Kubernetes is no different. Hopefully, this material helps you and your team.
The current focus is in-depth diagnosis of IO and resource contention, with notes, docs and tools on both.