A simple Resque plugin that times Resque jobs and saves some simple metrics about them back into redis. On top of these metrics you could build a simple auto-scaling mechanism based on the speed and ETA of queues. It also includes a hook/callback mechanism for recording/sending the metrics to your favorite tool (e.g. statsd/graphite).
gem install resque-metrics
If you are using bundler, add this to your Gemfile:
gem "resque-metrics"
And if you want the web-ui extensions:
gem "resque-metrics", :require => "resque/metrics/server"
Given a job, extend the job class with Resque::Metrics.
class SomeJob
  extend ::Resque::Metrics
  @queue = :jobs

  def self.perform(x, y)
    # sleep 10
  end
end
By default this records the total job count, the total count of jobs enqueued, the total time the jobs took, and the average time the jobs took. It also records the total number of job failures. These metrics are also tracked per queue and per job class. For the job above, you can fetch the recorded values with module methods:
Resque::Metrics.total_job_count #=> 1
Resque::Metrics.total_job_count_by_job(SomeJob) #=> 1
Resque::Metrics.total_job_count_by_queue(:jobs) #=> 10000
Resque::Metrics.total_job_time #=> 10000
Resque::Metrics.total_job_time_by_job(SomeJob) #=> 10000
Resque::Metrics.total_job_time_by_queue(:jobs) #=> 10000
Resque::Metrics.avg_job_time #=> 1000
Resque::Metrics.avg_job_time_by_job(SomeJob) #=> 1000
Resque::Metrics.avg_job_time_by_queue(:jobs) #=> 1000
Resque::Metrics.failed_job_count #=> 1
Resque::Metrics.failed_job_count_by_job(SomeJob) #=> 0
Resque::Metrics.failed_job_count_by_queue(:jobs) #=> 0
All values are recorded and returned as integers. For times, values are in milliseconds.
Resque::Metrics can also record forking metrics, but these are not on by default because `before_fork` and `after_fork` are singular hooks. If you don’t need to define your own fork hooks, you can simply add a line to an initializer:
Resque::Metrics.watch_fork
If you do define your own fork hooks:
Resque.before_fork do |job|
  # my own fork code
  Resque::Metrics.before_fork.call(job)
end
# Resque::Metrics.(before/after)_fork just returns a lambda so just assign it if you like
Resque.after_fork = Resque::Metrics.after_fork
Once enabled, this adds `fork` methods like `avg_fork_time`, etc. The latest Resque is required for fork recording to work.
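Because each fork hook holds a single callable, one way to run your own code alongside the metrics hook is to fold several callables into one lambda. The following is only a sketch: `chain_hooks` is a hypothetical helper, not part of Resque or Resque::Metrics, and the two lambdas are stand-ins for your own hook and for `Resque::Metrics.before_fork`.

```ruby
# Hypothetical helper (not part of Resque or Resque::Metrics): combine several
# hook callables into a single lambda that invokes them in order.
def chain_hooks(*hooks)
  lambda do |job|
    hooks.each { |hook| hook.call(job) }
  end
end

# Stand-ins for your own hook and for the lambda from Resque::Metrics.before_fork.
calls = []
my_hook      = lambda { |job| calls << [:mine, job] }
metrics_hook = lambda { |job| calls << [:metrics, job] }

combined = chain_hooks(my_hook, metrics_hook)
combined.call(:some_job)

# In an application this could presumably be wired up as:
#   Resque.before_fork = chain_hooks(my_hook, Resque::Metrics.before_fork)
```

The hooks run in the order given, so put your own hook first if the metrics hook should see the job last.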
Resque::Metrics can also record queue depth metrics. These are not on by default, as they need to run on an interval to be useful. You can record them manually by running in a console:
Resque::Metrics.record_depth
You can imagine placing this in a small script and using cron to run it. Once recorded, you’ll have access to:
Resque::Metrics.failed_depth #=> 1
Resque::Metrics.pending_depth #=> 1
Resque::Metrics.depth_by_queue(:jobs) #=> 1
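Depth and average job time are the two ingredients for the queue ETA and auto-scaling idea mentioned at the top. A rough sketch of that arithmetic follows; both helper methods are hypothetical (they are not provided by Resque::Metrics), they assume jobs take roughly the average time and that workers share the queue evenly, and all times are integer milliseconds.

```ruby
# Hypothetical helper: estimate how long a queue needs to drain, in milliseconds,
# assuming every job takes about avg_job_time_ms and work is spread evenly.
def estimated_drain_time_ms(depth, avg_job_time_ms, workers)
  raise ArgumentError, "workers must be positive" if workers < 1
  return 0 if depth <= 0

  # Integer ceiling division keeps the result an integer number of milliseconds.
  (depth * avg_job_time_ms + workers - 1) / workers
end

# Hypothetical helper: how many workers would drain the queue within target_ms.
def workers_needed(depth, avg_job_time_ms, target_ms)
  raise ArgumentError, "target_ms must be positive" if target_ms < 1
  return 0 if depth <= 0

  total_ms = depth * avg_job_time_ms
  [(total_ms + target_ms - 1) / target_ms, 1].max
end

# In an application the inputs could come from the recorded metrics, e.g.:
#   depth = Resque::Metrics.depth_by_queue(:jobs)
#   avg   = Resque::Metrics.avg_job_time_by_queue(:jobs)
eta     = estimated_drain_time_ms(120, 1000, 4)
workers = workers_needed(120, 1000, 60_000)
```

With 120 pending jobs averaging 1000 ms each, four workers give an estimate of 30000 ms, and two workers would be enough to finish within a minute.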
By default, Resque::Metrics keeps all its metrics in Resque’s redis instance, but supports plugging in other backends. Resque::Metrics itself supports redis and statsd. Here’s how you would enable statsd:
# list current backends
Resque::Metrics.backends
# build your statsd instance
statsd = Statsd.new 'localhost', 8125
# add a Resque::Metrics::Backend
Resque::Metrics.backends.append Resque::Metrics::Backends::Statsd.new(statsd)
If you already have a statsd object for your application, just pass it to Resque::Metrics::Backends::Statsd. The statsd client already supports namespacing; in addition, Resque::Metrics places all its metrics under ‘resque’ within that namespace.
Here’s a list of metrics emitted:
resque.job.<job>.complete.count
resque.job.<job>.complete.time
resque.queue.<queue>.complete.count
resque.queue.<queue>.complete.time
resque.complete.count
resque.complete.time
resque.job.<job>.enqueue.count
resque.job.<job>.enqueue.time
resque.queue.<queue>.enqueue.count
resque.queue.<queue>.enqueue.time
resque.enqueue.count
resque.enqueue.time
resque.job.<job>.fork.count
resque.job.<job>.fork.time
resque.queue.<queue>.fork.count
resque.queue.<queue>.fork.time
resque.fork.count
resque.fork.time
resque.job.<job>.failure.count
resque.queue.<queue>.failure.count
resque.failure.count
resque.depth.failed
resque.depth.pending
resque.depth.queue.<queue>
To write your own backend, create a class and implement whichever of the following methods you care about:
- increment_metric(metric, by = 1)
- set_metric(metric, val)
- set_avg(metric, num, total)
- get_metric(metric)
Resque::Metrics will in turn call each of these methods on every backend that responds to it (checked with respond_to?). Because get_metric returns a value, only the first backend that responds to it is used.
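As an illustration of the backend interface listed above, here is a minimal in-memory backend. It is a sketch, not one of the backends shipped with the gem: the class name `InMemoryBackend` is made up, and the `set_avg` behavior (storing the integer result of `num / total`, or 0 when `total` is zero) is an assumption about what that method is meant to do.

```ruby
# Hypothetical backend that keeps metrics in a Hash. Handy for tests or for
# seeing which metrics get written; nothing is persisted.
class InMemoryBackend
  def initialize
    @values = Hash.new(0) # unknown metrics read as 0
  end

  def increment_metric(metric, by = 1)
    @values[metric.to_s] += by
  end

  def set_metric(metric, val)
    @values[metric.to_s] = val
  end

  # Assumption: store the integer average num / total, and 0 when total is zero.
  def set_avg(metric, num, total)
    set_metric(metric, total < 1 ? 0 : num / total)
  end

  def get_metric(metric)
    @values[metric.to_s]
  end
end

backend = InMemoryBackend.new
backend.increment_metric("job_count")
backend.increment_metric("job_count", 2)
backend.set_avg("avg_job_time", 9000, 3)

# Registration would follow the same pattern as the statsd backend:
#   Resque::Metrics.backends.append backend
```

A backend that only forwards data somewhere else can leave out get_metric entirely, since reads fall through to the first backend that implements it.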
Resque::Metrics also has a simple callback/hook system so you can send data to your favorite agent. All hooks are passed the job class, the queue, and the time of the metric.
# Also `on_job_fork`, `on_job_enqueue`, and `on_job_failure` (`on_job_failure` does not include `time`)
Resque::Metrics.on_job_complete do |job_class, queue, time|
  # send to your metrics agent
  Statsd.timing "resque.#{job_class}.complete_time", time
  Statsd.increment "resque.#{job_class}.complete"
  # etc
end
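If you want the hook body to be testable without a running statsd, one option is to build the hook around an injected client. The sketch below makes some assumptions: `build_complete_hook` and `RecordingClient` are hypothetical names, the client only needs `timing(stat, ms)` and `increment(stat)` methods, and the metric names simply mirror the `resque.job.<job>.complete.*` and `resque.queue.<queue>.complete.*` pattern listed earlier.

```ruby
# Hypothetical factory: returns a callable with the same arguments as the
# on_job_complete hook, reporting to any client with #timing and #increment.
def build_complete_hook(client)
  lambda do |job_class, queue, time|
    ["job.#{job_class}", "queue.#{queue}"].each do |scope|
      client.timing("resque.#{scope}.complete.time", time)
      client.increment("resque.#{scope}.complete.count")
    end
  end
end

# Stand-in client that records what would have been sent.
class RecordingClient
  attr_reader :sent

  def initialize
    @sent = []
  end

  def timing(stat, ms)
    @sent << [:timing, stat, ms]
  end

  def increment(stat)
    @sent << [:increment, stat]
  end
end

client = RecordingClient.new
hook = build_complete_hook(client)
hook.call("SomeJob", :jobs, 1200)

# In an application the hook could be registered like this:
#   Resque::Metrics.on_job_complete do |job_class, queue, time|
#     hook.call(job_class, queue, time)
#   end
```

Swapping `RecordingClient` for a real statsd instance should not require touching the hook itself, as long as that client exposes the same two methods.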
- Check out the latest master to make sure the feature hasn’t been implemented or the bug hasn’t been fixed yet
- Check out the issue tracker to make sure someone hasn’t already requested it and/or contributed it
- Fork the project
- Start a feature/bugfix branch
- Commit and push until you are happy with your contribution
- Make sure to add tests for it. This is important so I don’t break it in a future version unintentionally.
- Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or it is otherwise necessary, that is fine, but please isolate it to its own commit so I can cherry-pick around it.
Copyright © 2011 Aaron Quint. See LICENSE.txt for further details.