Paul Colby edited this page Dec 8, 2013 · 15 revisions

Disclaimer

The benchmarks on this page are not intended to be definitive. They are intended to show the order of magnitude of the overhead associated with using PMDA++ over PCP's C API. If performance is absolutely critical for you (for example, you're sampling hardware devices at very high speed, or monitoring highly constrained devices), then you should perform your own benchmarking on the relevant device(s) / platform(s).

Methodology

The benchmarking was performed by a basic benchmark.sh script, which works as follows:

  1. Start a pmie instance to sample a specified metric (such as trivial.time) from the PMDA-under-test, at a specified rate (such as once per millisecond, i.e. 1 kHz).
  2. Use pmval to fetch the PMDA's user or system time over a specified period (such as 10 seconds).
  3. Stop the pmie instance.

The above process is executed for both the C and C++ versions of the PMDA being tested to compare the overhead of the C++ wrapper over the underlying C API.

Note: depending on your version of PCP and its access controls, you may need to run the benchmark script as root, or as some other user with permission to monitor the PMDA-under-test via the proc PMDA.

Results

[Charts: "trivial time", "simple now", "simple numfetch"; each plots sys C, usr C, sys C++ and usr C++ against the sampling interval]

These graphs show that the performance overhead (if any) of using PMDA++ over the standard PCP API is smaller than the variation caused by other services running on the test machine. If someone has access to a really consistent, idle, spare machine and would like to do some longer-running tests (these were 60-second runs), then that would be great.

Other notes:

  • Sampling metrics once every half a millisecond (0.0005 seconds) only just got the system time up to ~0.16%. At this rate, pmcd was consuming roughly 50% of one core.
  • Sampling faster than once per 0.0005 seconds resulted in pmie reporting issues with clock skew.
  • PCP really is quite lean.

Chart Data

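    // Data for the "trivial time" chart. 'x' is the sampling interval in seconds;
    // the remaining columns are the system and user times for the C and C++ PMDAs.
    var trivialData = google.visualization.arrayToDataTable([
      [ 'x',    'sys C', 'usr C', 'sys C++', 'usr C++'],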
      [ 10,     0.00000, 0.00000, 0.00000, 0.00000 ],
      [ 1,      0.00033, 0.00000, 0.00000, 0.00000 ],
      [ 0.1,    0.00333, 0.00100, 0.00000, 0.00467 ],
      [ 0.01,   0.03183, 0.00850, 0.03383, 0.00217 ],
      [ 0.001,  0.10449, 0.00217, 0.09416, 0.00217 ],
      [ 0.0005, 0.16300, 0.00567, 0.16383, 0.00500 ]
    ]);

    // Data for the "simple now" chart; 'x' is the sampling interval in seconds.
    var simpleNowData = google.visualization.arrayToDataTable([
      [ 'x', 'sys C', 'usr C', 'sys C++', 'usr C++'],
      [ 10,     0.00000, 0.00000, 0.00000, 0.00000 ],
      [ 1,      0.00050, 0.00000, 0.00033, 0.00000 ],
      [ 0.1,    0.00267, 0.00000, 0.00383, 0.00433 ],
      [ 0.01,   0.03233, 0.00650, 0.03450, 0.00683 ],
      [ 0.001,  0.09866, 0.02250, 0.10400, 0.01650 ],
      [ 0.0005, 0.17016, 0.03800, 0.16849, 0.02817 ]
    ]);

    // Data for the "simple numfetch" chart; 'x' is the sampling interval in seconds.
    var simpleNumfetchData = google.visualization.arrayToDataTable([
      [ 'x', 'sys C', 'usr C', 'sys C++', 'usr C++'],
      [ 10,     0.00000, 0.00000, 0.00000, 0.00000 ],
      [ 1,      0.00050, 0.00017, 0.00050, 0.00017 ],
      [ 0.1,    0.00350, 0.00150, 0.00000, 0.00067 ],
      [ 0.01,   0.02950, 0.00650, 0.02817, 0.01233 ],
      [ 0.001,  0.10000, 0.02400, 0.10366, 0.01133 ],
      [ 0.0005, 0.15933, 0.03333, 0.16566, 0.03217 ]
    ]);
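
For anyone who wants to compare the C and C++ figures numerically rather than by eye, the following is a minimal sketch using the "trivial time" rows above. The helper name overheadByInterval and the choice to sum sys and usr into one total are assumptions made for illustration only; they are not part of benchmark.sh or PMDA++. The values are used exactly as they appear in the table, with no unit conversion.

```javascript
// Rows copied from the "trivial time" table above:
// [sampling interval in seconds, sys C, usr C, sys C++, usr C++]
const trivialRows = [
  [10,     0.00000, 0.00000, 0.00000, 0.00000],
  [1,      0.00033, 0.00000, 0.00000, 0.00000],
  [0.1,    0.00333, 0.00100, 0.00000, 0.00467],
  [0.01,   0.03183, 0.00850, 0.03383, 0.00217],
  [0.001,  0.10449, 0.00217, 0.09416, 0.00217],
  [0.0005, 0.16300, 0.00567, 0.16383, 0.00500]
];

// Hypothetical helper (not part of the benchmark script): for each row,
// total the sys and usr columns per implementation and take C++ minus C.
function overheadByInterval(rows) {
  return rows.map(([interval, sysC, usrC, sysCpp, usrCpp]) => {
    const c = sysC + usrC;
    const cpp = sysCpp + usrCpp;
    return {
      interval,           // seconds between samples
      hz: 1 / interval,   // equivalent sampling frequency
      c,
      cpp,
      delta: cpp - c      // positive means the C++ PMDA used more time
    };
  });
}

const overhead = overheadByInterval(trivialRows);
for (const row of overhead) {
  console.log(
    `${row.interval}s  C=${row.c.toFixed(5)}  C++=${row.cpp.toFixed(5)}  ` +
    `delta=${row.delta.toFixed(5)}`
  );
}
```

The sign of delta changes from row to row in this data set, which is consistent with the observation in Results that any overhead is smaller than the run-to-run variation. The same helper can be applied to the rows of the other two tables.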