Monitoring the machines involved
Each computer, or machine, has a finite capacity for work. Its ability to cope with more work tails off when resources are close to, or at, capacity. Resources include CPU, RAM, network IO, storage IO and storage space. The choice of storage can affect the performance characteristics: some are IO bound, others are throughput bound.
It's useful to monitor each computer while it is working. We may also want to check its available capacity both before and after a period of work (such as a test). This particularly applies to storage volumes (which may be logical and/or physical disks). Kafka nodes in particular fail and struggle to recover when they run out of storage.
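A minimal sketch of the before-and-after capacity check described above, assuming that df is an acceptable way to read volume usage. The function name snapshot_storage and the log file naming are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# snapshot_storage LABEL: hypothetical helper that records filesystem usage
# for this machine in a file named after the host and the given label.
snapshot_storage() {
    label="$1"
    out="$(hostname).storage.${label}.log"
    {
        # UTC timestamp so snapshots from several machines can be lined up.
        date -u +"%Y-%m-%dT%H:%M:%SZ"
        # -P gives stable, one-line-per-filesystem output; -k reports in KiB.
        df -P -k
    } > "$out"
    echo "$out"
}

snapshot_storage before
# ... run the test here ...
snapshot_storage after
```

Comparing the two files shows how much storage the test consumed on each volume, which helps spot a node that is heading towards a full disk.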
The monitoring needs to have a low overhead and be consistent, repeatable, and useful. Linux has many relevant utility programs, such as top and iostat. Here is an example of using iostat to record the IO each second for 400 seconds, enough to record the IO and CPU for a machine before, during and immediately after a 5 minute (300 second) test. The -m option reports throughput in megabytes per second and -t adds a timestamp to each report.
iostat -m -t 1 400 > `hostname`.iostat.log
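The 400 second figure can be derived from the test length plus a margin on either side. The sketch below assumes 50 seconds before and 50 seconds after the test; those margins, and the helper names total_samples and record_iostat, are hypothetical and not taken from the original setup.

```shell
#!/bin/sh
# total_samples LEAD_IN TEST TAIL: hypothetical helper that adds up the
# seconds to record before, during and after a test.
total_samples() {
    echo $(( $1 + $2 + $3 ))
}

# record_iostat SAMPLES: start iostat in the background, one report per
# second, and remember its process id in IOSTAT_PID so the caller can wait.
record_iostat() {
    iostat -m -t 1 "$1" > "$(hostname).iostat.log" &
    IOSTAT_PID=$!
}

# Assumed margins: 50 s lead-in, 300 s test, 50 s tail.
SAMPLES=$(total_samples 50 300 50)
echo "iostat -m -t 1 $SAMPLES"

# Usage (left commented out because it runs for the full duration):
# record_iostat "$SAMPLES"
# sleep 50            # lead-in, so the log captures the idle baseline
# ... run the test here ...
# wait "$IOSTAT_PID"  # returns once iostat has written its last report
```

Running iostat in the background keeps the recording independent of the test itself, so the log still covers the recovery period if the test finishes early.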