In order to understand what your application is doing and to get a better overview over key functionality it is indispensable to gather metrics about those areas. A popular combination to achieve this is StatsD with the Graphite storage and graphing backend. This chapter will introduce those tools, how to install and configure them and how they can be used to gather insight into your application's metrics.
The Graphite project is a set of tools for collecting and visualizing data, all written in Python. It consists of a file based database format and tools to work with it called whisper, a set of network daemons to collect data - the carbon daemons - and a Django based web application to display data.
Graphite at its core uses a disk-based storage system called whisper. Every
metric is stored in its corresponding .wsp
file. These are fixed sized
databases, meaning that the amount of data a file can hold is pre-determined
upon creation. In order to achieve this, every database file contains so
called archives which store (timestamp, value) pairs with varying
granularity and length. For example a possible archive setup could be to store
a data point every 10 seconds for 6 hours, after that store 1 minute data for a
week and for long time archival use a data point every 10 minutes for 5 years.
This means once your data is older than the time period stored in the archive, a decision has to be made how to fit the data into the coarser slots in the next archive. In our example setup when going from the highest precision (10 seconds) to the next archive (1 minute) we have 6 values at our disposal to aggregate. The database can be configured to aggregate by average, sum, last, max or min functions, which uses the arithmetic mean, the sum of all values, the last one recorded or the biggest or smallest value respectively to fill the value in the next archive. This process is done every time a value is too old to be stored in an archive until it is older than the lowest resolution archive available and isn't stored anymore at all.
An important setting in this context is the x-files-factor you can set on
every database. This describes how many of the archives slots have to have a
value (as opposed to None
which is Python's NULL
value and recorded
when no value is received for a timestamp) to aggregate the next lower archive
to a value and not also insert None
instead. The value range for this
setting is decimal from 0 to 1 and defaults to 0.5. This means when we use the
default setting on our example archives, at least 3 of the 6 values for the
first aggregation and 5 of the 10 values for the second one have to have
actual values.
The disk format to store all this data is rather simple. Every file has a short header with the basic information about the aggregation function used, the maximum retention period available, the x-files-factor and the number of archives it contains. These are stored as 2 longs, a float and another long thus requiring 16 bytes of storage. After that the archives are appended to the file with their (timestamp, value) pairs stored as a long and a double value consuming 12 bytes per pair.
This design makes it possible to store a lot of datapoints relatively easy and
without consuming a lot of disk space (1 year's worth of 1 minute data for a
metric can be stored in a little more than 6MB). But it also means that some
considerations about usage have to be made upfront. Due to the database's
fixed size, the highest resolution archive should match the rate at which you
send data. If data is sent at lower intervals the archive ends up with a lot
of None
values and if more than one metric are received for a timestamp,
the last one will always overwrite previous ones. So if a low rate of events
during some times is expected, it might make sense to tune down the
x-files-factor to aggregate even if only one of ten values is available. And
depending on the type of metrics (counters for example) it might also make
more sense to aggregate with the sum function instead of the default
aggregation by averaging the values. Thinking about this before sending
metrics makes things a lot easier since it's possible to change these settings
afterwards, however to keep the existing data, the whisper file has to be
resized which takes some time and makes the file unavailable for storage
during that time.
In order to understand whisper files a little bit better, we can use the set of scripts distributed with whisper to take a look at a database file. First install whisper from the PyPi distribution:
% sudo pip install whisper
And create a whisper database with a 10 second retention for 1 minute by using
the whisper-create.py
command:
% whisper-create.py test.wsp 10s:1minute
Created: test.wsp (100 bytes)
% whisper-info.py test.wsp
maxRetention: 60
xFilesFactor: 0.5
aggregationMethod: average
fileSize: 100
Archive 0
retention: 60
secondsPerPoint: 10
points: 6
size: 72
offset: 28
% whisper-fetch.py test.wsp
1354509290 None
1354509300 None
1354509310 None
1354509320 None
1354509330 None
1354509340 None
The resulting file has six ten second buckets corresponding to the retention
period in the create command. As it is visible from the :file:`whisper-info.py`
output, the database uses default values for x-files-factor and aggregation
method, since we didn't specify anything different. And we also only have one
archive which stores data at 10 seconds per value for 60 seconds as we passed
as a command argument. By default :file:`whisper-fetch.py` shows timestamps as
epoch time. There is also --pretty
option to show them in a more human
readable format, but since the exact time is not important all examples show
epoch time. For updating the database with value there is the handy
:file:`whisper-update.py` command, which takes a timestamp and a value as
arguments:
% whisper-update.py test.wsp 1354509710:3
% whisper-fetch.py test.wsp
1354509690 None
1354509700 None
1354509710 3.000000
1354509720 None
1354509730 None
1354509740 None
Notice how the timestamps are not the same as in the example above, because more than a minute has past since then and if we had values stored at those points, they wouldn't be show anymore. However taking a look at the database file with :file:`whisper-dump.py` reveals a little more information about the storage system:
% whisper-dump.py test.wsp
Meta data:
aggregation method: average
max retention: 60
xFilesFactor: 0.5
Archive 0 info:
offset: 28
seconds per point: 10
points: 6
retention: 60
size: 72
Archive 0 data:
0: 1354509710, 3
1: 0, 0
2: 0, 0
3: 0, 0
4: 0, 0
5: 0, 0
In addition to the metadata :file:`whisper-info.py` already showed, the dump command also tells us that only one slot actually has data. And in this case the time passed doesn't matter. Since slots are only changed when new values need to be written to them, this old value will remain there until then. The reason why :file:`whisper-fetch.py` doesn't show these past values is because it will only show valid data within a given time (default 24h) until max retention from the invoked point in time. And it will also fetch the points from the retention archive that can cover most of the requested time. This becomes a bit more clear when adding a new archive:
% whisper-resize.py test.wsp 10s:1min 20s:2min
Retrieving all data from the archives
Creating new whisper database: test.wsp.tmp
Created: test.wsp.tmp (184 bytes)
Migrating data...
Renaming old database to: test.wsp.bak
Renaming new database to: test.wsp
% whisper-info.py test.wsp
maxRetention: 120
xFilesFactor: 0.5
aggregationMethod: average
fileSize: 184
Archive 0
retention: 60
secondsPerPoint: 10
points: 6
size: 72
offset: 40
Archive 1
retention: 120
secondsPerPoint: 20
points: 6
size: 72
offset: 112
% whisper-fetch.py test.wsp
1354514740 None
1354514760 None
1354514780 None
1354514800 None
1354514820 None
1354514840 None
Now the database has a second archive which stores 20 second data for 2
minutes and :file:`whisper-fetch.py` returns 20 second slots. That's because it
tries to retrieve as close to 24h (the default time) as possible and the 20
second slot archive is closer to that. For getting data in 10 second slots,
the command has to be invoked with the --from=
parameter and an epoch
timestamp less than 1 minute in the past.
These commands are a good way to inspect whisper files and to get a basic understanding how data is stored. So it makes sense to experiment with them a bit before going into the rest of the Graphite eco-system.
In order to make whisper files accessible to be written to from other network services, the Graphite project includes the carbon daemon suite. The suite consists of a carbon-cache, carbon-relay and carbon-aggregator daemon, which are all based on the Twisted framework for event-driven IO in Python.
The carbon-cache daemon is the most crucial of them as it provides the basic interface to the whisper backend and a scalable and efficient way for a large number of clients to store metrics. In order to minimize write delay for a big number of metrics depending on the disk seek time (each metric has its own file) the daemon employs queuing. Every metric has its own queue and an incoming value for a metric gets appended to it. A background thread then checks the queues for data points and writes them consecutively to the storage file. This way cost of an expensive disk seek gets amortized over several metric values that are written with one seek.
The daemon relies on two config files, :file:`carbon.conf` for general
configuration and :file:`storage-schemas.conf` for whisper storage configuration.
The general configuration file contains settings like network configuration
(carbon-cache can listen on different sockets like plain TCP and UDP or even
AMQP), cache sizes and maximum updates per second in its [cache]
section.
These settings are very useful when tuning the carbon daemon for the hardware
it's running on, but to get started the default settings from the example
config files will suffice. The storage schemas configuration file contains
information about which metrics paths are using which retention archives and
aggregation methods. A basic entry looks like this:
[default_1min_for_1day]
pattern = .*
retentions = 60s:1d
Each section has a name and a regex pattern which will be matched on the
metrics path sent. The pattern shown above will match any pattern and can be
used as a catch-all rule at the end of the configuration to match uncaught
metrics. The retentions
section is a comma separated list of retention
archives to use for the metrics path in the same format that
:file:`whisper-create.py` expects them.
In order to get a basic carbon cache instance running (default listener is TCP on port 2003), install it from PyPi and copy the example config files:
% cd /opt/graphite/conf
% cp carbon.conf.example carbon.conf
% cp storage-schemas.conf.example storage-schemas.conf
% /opt/graphite/bin/carbon-cache.py start
Starting carbon-cache (instance a)
% netstat -an | grep 2003
tcp4 0 0 *.2003 *.* LISTEN
The default installation creates its directories in /opt/graphite
but this
can also be changed within the configuration. After the carbon daemon has been
started, metrics can just be recorded by sending one or more values in the
format metric_path value timestamp\n
:
% echo "test 10 1354519378" | nc -w1 localhost 2003
% whisper-fetch.py /opt/graphite/storage/whisper/test.wsp |
tail -n 3
1354519260 None
1354519320 10.000000
1354519380 None
All metrics paths that are sent are relative to the
/opt/graphite/storage/whisper
directory and will be stored there. The
interface also supports sub folders, which can be created by separate the
metrics path with dots:
% echo "this.is.a.test 10 1354519680" | nc -w1 localhost 2003
% whisper-fetch.py /opt/graphite/storage/whisper/this/is/a/test.wsp| tail -n 3
1354519560 None
1354519620 None
1354519680 10.000000
This is all that's needed to collect metrics over the network. As mentioned
before, the carbon suite contains two more daemons carbon-relay
and
carbon-aggregator
. These can be used to improve the performance of the
system under higher load.
carbon-relay
acts as a router between different carbon-cache
or
carbon-aggregator
instances. The daemon reads the [relay]
section of
the :file:`carbon.conf` configuration file where the most important sections
are the interface and TCP port to listen to via the LINE_RECEIVER_*
settings, the DESTINATIONS
property, a comma separated list of
ipaddress:port
pairs of available carbon daemons and the type of relaying
to use. The carbon-relay
can be operated in rule based or consistent
hashing based relaying mode. In the second case, consistent hashing is
employed to balance the metrics between available destinations. When using the
relay based approach, the relay daemon needs a :file:`relay-rules.conf`
configuration file of the form:
[name]
pattern = <regex>
destinations = <list of destination addresses>
This follows the storage schema configuration file format and will route any
metric matching the pattern to the given destinations. There also has to be
exactly one section which additionally has the property default = true
which is used as the catch all rule if no other rule has matched a metric
path.
To easily navigate within hundreds of metrics, it's important to normalize the name. Here are a few naming schemes:
<ENV>.<CLUSTER>.<SERVER>.metric
<ENV>.<APPLICATION-TYPE>.<APPLICATION>.metric
<ENV>.APPLICATIONS.EVENTS.<APP-NAME>.deploy
Here a couple rules to choose an appropriate scheme:
- always put the most common part on the left of the name
- differentiate them by type (e.g.: hosts / applications)
Of course, you're free to adopt different schemes. The most important rule is to be consistent when naming your metrics.
To achieve this, a common solution is to have a small proxy between the tool reporting metrics and Graphite. This could be an HTTP proxy (like documented by Jason Dixon, or a simple script that listens on a port, and rewrites the metric.
Using this pattern, you'll get more control over the format and paths chosen by developers or operations.
StatsD is a network daemon listening for statistics and sends the aggregation to a backend. In our case we will see how it works with Graphite.
StatsD is a simple daemon listening for metrics. The first implementation was done by Etsy, and is written for node.js. but other implementation exists (Python, Ruby, C, etc).
To install the one by Etsy, you will need to install node.js (if it's not packaged for your OS, you can follow the instructions). Then, to actually run StatsD:
% git clone git://github.com/etsy/statsd.git
% cd statsd
% $EDITOR /etc/statsd/config.js
% node stats.js /etc/statsd/config.js
A basic configuration file will be similar to this:
{
graphitePort: 2303,
graphiteHost: "localhost",
port: 8125,
graphite: {
legacyNamespace: false,
prefixTimer: "aggregate",
globalPrefix: ""
}
}
StatsD listens on the UDP port 8125 for incoming statistics. As for
Graphite, the protocol is line based. You send a string similar to
name:1|c
. The first element (name) is the name of the statistic,
the colon acts as a separator with the value (1), and the pipe
separates the value with the type (c, for counter).
StatsD stores statistics in buckets. A value is attached to the statistic, and periodically (by default it's every 10 seconds), the statistics are aggregated and send to the backend.
A few types are supported, and we will now see them in detail.
The counter is the most basic type.
% echo "my.statistic:1|c" | nc -w 1 -u localhost 8125
This will add 1 to the statistic named "my.statistic". After the flush the value for this statistic will be 0. It's also possible to specify to statsd that we are sampling:
% echo "my.statistic:1|c|@0.1" | nc -w 1 -u localhost 8125
% echo "my.timer:43|ms" | nc -w 1 -u localhost 8125
This type is somewhat mis-named, since you can report more than time based metrics. You give it times in milliseconds, and it will compute the percentiles, average, standard deviation, sum, lower and upper bounds for the flush interval.
Gauges are arbitrary values.
% echo "my.statistic:23|g" | nc -w 1 -u localhost 8125
Gauges can be useful when you have a script that runs periodically and you want to report a value (e.g: count the number of rows in a database). The number is final, there's no additional processing.
However, there's a few things to know about gauges:
- if you send multiple values for the same gauge between 2 flushes, only the most recent one will be kept
- if you're sending a gauge for the same metric from two different places, only one of them will be kept
- if there's no new value during the time period, it will send the one from the previous period to Graphite
When using graphite, you have to be sure that the smallest time retention in Graphite is the same as the interval between two flushes in StatsD. If you're sending to Graphite two data points in the same time period, it will overwrite the first one.
A management interface is listening (by default) on the TCP port 8126. A few commands are supported:
stats
will output statistics about the current processcounters
will dump all the current counterstimers
will dump all the current times
The stats
output looks like this:
% telnet localhost 8125
stats
uptime: 334780
messages.last_msg_seen: 0
messages.bad_lines_seen: 1590
graphite.last_flush: 9
graphite.last_exception: 334780
END
Where:
uptime
is the number of seconds elapsed since the process startedmessages.last_msg_seen
is the number of seconds since the last message receivedmessages.bad_lines_seen
is the number of badly formatted line receivedgraphite.last_flush
is the number of seconds elapsed since the last flush to Graphitegraphite.last_exception
is the number of seconds elapsed since the last exception thrown by Graphite while flushing
This list is a suggestion of things you can collect and measure.
Every time you push an application or use your configuration manager to push changes, you could send an event to statsd. Something as simple as
% echo "<ENV>.APPLICATIONS.EVENTS.<APP-NAME>.deploy:1|c" | nc -w 1 -u localhost 8125
Now, in graphite, you can use the formula drawAsInfinite
to
represent this event as a vertical line.
When you're sending statistics to StatsD, you have to be careful about the size of the payload. If the size is greater than your network's MTU, the frame will be dropped. You can refer to this documentation to find the size that might work best for your network.
Tasseo is a Graphite specific live dashboard. It is lightweight, easily configurable and provides a near-realtime view of Graphite metric data. It is a ruby based Sinatra and javascript application.
Dashing is a dashboard framework allowing you to build your own custom dashboards. Dashboards can be created with premade widgets, or custom widgets can be written using scss, html and coffeescript. Data bindings allow reuse and manipulation of data from a variety of sources.
GDash is another dashboard for Graphite. Dashboards are created using a simple DSL.