Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update stats collectors to use tagged metrics #13895

Merged
merged 13 commits into from
Apr 9, 2020
18 changes: 15 additions & 3 deletions app/jobs/stats_collector_job.rb
Original file line number Diff line number Diff line change
Expand Up @@ -47,21 +47,33 @@ def perform
def run_collectors(stats_collectors)
stats_collectors.each do |collector_name, collector|
start_time = Time.zone.now
collector.new.collect_stats&.each { |metric_name, value| emit(metric_name, value) }

collector.new.collect_stats&.each { |obj| emit_or_fail(obj) }
rescue StandardError => error
log_error(collector_name, error)
ensure
datadog_report_time_segment(segment: "#{METRIC_GROUP_NAME}.#{collector_name}", start_time: start_time)
end
end

def emit(name, value, attrs: {})
# :reek:FeatureEnvy
def emit_or_fail(hash)
# when hash is { metric: metric_name", value: 123, "some_tag": "type1" }
return emit(hash[:metric], hash[:value], tags: hash.except(:metric, :value)) if hash[:metric] && hash[:value]
yoomlam marked this conversation as resolved.
Show resolved Hide resolved

# when hash is { "metric_name" => 123 }
return emit(hash.first[0], hash.first[1]) if hash.size == 1
Copy link
Contributor Author

@yoomlam yoomlam Apr 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle metric without tag.


fail "Unexpect metric object: #{hash}"
end

def emit(name, value, tags: {})
DataDogService.emit_gauge(
metric_group: METRIC_GROUP_NAME,
metric_name: name,
metric_value: value,
app_name: APP_NAME,
attrs: attrs
attrs: tags
)
end

Expand Down
75 changes: 28 additions & 47 deletions app/services/collectors/daily_counts_stats_collector.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,102 +3,83 @@
# Collect daily stats for basic objects. Each query should complete within 10 seconds.
# This collector is used by StatsCollectorJob.
class Collectors::DailyCountsStatsCollector
include Collectors::StatsCollector

TOTALS_METRIC_NAME_PREFIX = "daily_counts.totals"
INCREMENTS_METRIC_NAME_PREFIX = "daily_counts.increments"

def collect_stats
{}.tap do |stats|
stats.merge! flatten_stats total_counts_hash(TOTALS_METRIC_NAME_PREFIX)
stats.merge! flatten_stats daily_counts(INCREMENTS_METRIC_NAME_PREFIX)
[].tap do |stats|
stats.concat flatten_stats(TOTALS_METRIC_NAME_PREFIX, total_counts_hash)
stats.concat flatten_stats(INCREMENTS_METRIC_NAME_PREFIX, daily_counts)
Comment on lines +12 to +14
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return array since metric names are no longer unique when using tags.

end
end

private

def flatten_stats(stats_hash)
{}.tap do |stats|
stats_hash.each do |key_namespace, counts_hash|
stats.merge! prepend_namespace_to_keys(key_namespace, counts_hash)
end
end
end

def prepend_namespace_to_keys(namespace, hash)
hash.transform_keys { |key| "#{namespace}.#{key_to_name(key)}" }
end

def key_to_name(key)
return "nil" unless key

return key.map(&:parameterize).map(&:underscore).join(".") if key.instance_of?(Array)

return key.parameterize.underscore if key.instance_of?(String)

key.to_s.parameterize.underscore
end

def total_counts_hash(prefix)
def total_counts_hash
{
"#{prefix}.claimant.decision_review" => Claimant.group(:decision_review_type).count,
"claimant.decision_review" => Claimant.group(:decision_review_type).count,
# => {"HigherLevelReview"=>75769, "SupplementalClaim"=>336809, "Appeal"=>48793}
"#{prefix}.claimant.payee_code" => Claimant.group(:payee_code).count,
"claimant.payee_code" => Claimant.group(:payee_code).count,
# => {nil=>430595, "13"=>45, "17"=>1, "11"=>669, "00"=>14258, "60"=>75, "29"=>1, "10"=>15599, "12"=>89, ...

"#{prefix}.certification.office" => Certification.group(:certifying_office).count,
"certification.office" => Certification.group(:certifying_office).count,
# => {nil=>39579, "Nashville, TN"=>2657, "Little Rock, AR"=>1796, "Indianapolis, IN"=>2607, ...
"#{prefix}.certification.rep" => Certification.group(:vacols_representative_type).count,
"certification.rep" => Certification.group(:vacols_representative_type).count,
# => {nil=>55738, "Service Organization"=>77601, "Agent"=>3246, "None"=>8588, "Attorney"=>36092, ...

"#{prefix}.hearing" => Hearing.group(:disposition).count,
"hearing.disposition" => Hearing.group(:disposition).count,
# => {nil=>432, "postponed"=>577, "no_show"=>125, "held"=>1253, "cancelled"=>219}

"#{prefix}.hearing_virtual" => VirtualHearing.group(:hearing_type).count,
"hearing_virtual" => VirtualHearing.group(:hearing_type).count,
# => {"LegacyHearing"=>9}

"#{prefix}.distribution" => Distribution.all.group(:status).count,
"distribution" => Distribution.all.group(:status).count,
# => {"completed"=>5090, "error"=>1431}

"#{prefix}.case_review" => {
"case_review" => {
"judge" => JudgeCaseReview.count,
"attorney" => AttorneyCaseReview.count
},

"#{prefix}.decision_doc" => DecisionDocument.group(:appeal_type).count,
"decision_doc" => DecisionDocument.group(:appeal_type).count,
# => {"Appeal"=>8047, "LegacyAppeal"=>63192}

"#{prefix}.claim_establishment" => ClaimEstablishment.group(:decision_type).count,
"claim_establishment" => ClaimEstablishment.group(:decision_type).count,
# => {nil=>40, "full_grant"=>46292, "partial_grant"=>29677, "remand"=>111624}

"#{prefix}.req_issues" => RequestIssue.group(:decision_review_type).count,
"req_issue" => RequestIssue.group(:decision_review_type).count,
# => {nil=>78, "HigherLevelReview"=>172753, "Appeal"=>125898, "SupplementalClaim"=>717169}
"#{prefix}.decision_issues" => DecisionIssue.group(:benefit_type, :disposition).count,
"decision_issue" => DecisionIssue.group(:benefit_type, :disposition).count,
# => {["fiduciary", "remanded"]=>2, ["pension", "dismissed_matter_of_law"]=>31, ["vha", "Granted"]=>70, ...

"#{prefix}.dispatch" => Dispatch::Task.group(:type, :aasm_state).count
"dispatch" => Dispatch::Task.group(:type, :aasm_state).count
# => {["EstablishClaim", "started"]=>5, ["EstablishClaim", "unprepared"]=>64, ...
}
end

def daily_counts(prefix)
def daily_counts
{
"#{prefix}.appeal" => query_yesterdays(Appeal).group(:docket_type).count,
# => {"evidence_submission"=>52, "hearing"=>269, "direct_review"=>36}

prefix => {
"counts" => {
"appeal" => query_yesterdays(Appeal).count,
"legacy_appeal" => query_yesterdays(LegacyAppeal).count,
"appeal_series" => query_yesterdays(AppealSeries).count,
"hearing" => query_yesterdays(Hearing).count,
"legacy_hearing" => query_yesterdays(LegacyHearing).count
},

"#{prefix}.appeal.status" => count_groups(query_yesterdays(Appeal)) { |record| record.status.status },
"appeal" => query_yesterdays(Appeal).group(:docket_type).count,
# => {"evidence_submission"=>52, "hearing"=>269, "direct_review"=>36}

"appeal.status" => count_groups(query_yesterdays(Appeal)) { |record| record.status.status },
# => {:not_distributed=>349, :distributed_to_judge=>4, :cancelled=>2, :assigned_to_attorney=>2}

"#{prefix}.appeal_series.num_of_appeals" =>
"appeal_series.num_of_appeals" =>
count_groups(query_yesterdays(AppealSeries)) { |record| record.appeals.count },
# => {1=>555, 2=>56, 4=>12, 5=>3, 3=>20, 6=>1, 7=>2}

"#{prefix}.distribution" => query_yesterdays(Distribution).group(:status).count
"distribution" => query_yesterdays(Distribution).group(:status).count
# => {"completed"=>13, "error"=>3}
}
end
Expand Down
77 changes: 54 additions & 23 deletions app/services/collectors/request_issues_stats_collector.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,40 +5,71 @@
# The stats are typically used for monthly reporting.
# This collector is used by StatsCollectorJob.
class Collectors::RequestIssuesStatsCollector
METRIC_NAME_PREFIX = "request_issues.unidentified_with_contention"
include Collectors::StatsCollector

METRIC_NAME_PREFIX = "req_issues.w_contentions"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shorten name metric name prefix for readability in DataDog.


# :reek:FeatureEnvy
def collect_stats
start_of_month = Time.zone.now.prev_month.beginning_of_month
req_issues = unidentified_request_issues_with_contention(start_of_month, start_of_month.next_month)
[].tap do |stats|
group_count_hash = {
"status" => request_issues_with_contention.group(:closed_status).count,
"benefit" => request_issues_with_contention.group(:benefit_type).count,
"decis_review" => request_issues_with_contention.group(:decision_review_type).count,
yoomlam marked this conversation as resolved.
Show resolved Hide resolved

{}.tap do |stats|
stats[METRIC_NAME_PREFIX] = req_issues.count
"report" => monthly_report_hash
}
stats.concat flatten_stats(METRIC_NAME_PREFIX, group_count_hash)
end
end

req_issues.group(:closed_status).count.each do |status, count|
stats["#{METRIC_NAME_PREFIX}.closed_status.#{status || 'nil'}"] = count
end
private

req_issues.group(:benefit_type).count.each do |ben_type, count|
stats["#{METRIC_NAME_PREFIX}.benefit.#{ben_type}"] = count
end
def monthly_report_hash
{
"hlr_established" => HigherLevelReview
.where("establishment_processed_at >= ?", start_date)
.where("establishment_processed_at < ?", end_date).count,
"sc_established" => SupplementalClaim
.where("establishment_processed_at >= ?", start_date)
.where("establishment_processed_at < ?", end_date).count,

dr_counts_by_type = req_issues.group(:decision_review_type).count
dr_counts_by_type.each do |dr_type, count|
stats["#{METRIC_NAME_PREFIX}.decision_review.#{dr_type}"] = count
end
"030_end_products_established" => endproduct_establishment.where(source_type: "HigherLevelReview").count,
"040_end_products_established" => endproduct_establishment.where(source_type: "SupplementalClaim").count,

"created" => request_issues_with_contention.count,
"edited" => edited_request_issues_with_contention.count,
"unidentified_created" => request_issues_with_contention.where(is_unidentified: true).count,

# Could use `req_issues.group(:veteran_participant_id).count.count` but there's a count discrepancy
stats["#{METRIC_NAME_PREFIX}.vet_count"] = req_issues.map(&:decision_review).map(&:veteran_file_number).uniq.count
end
"vet_count" => request_issues_with_contention.map(&:decision_review).map(&:veteran_file_number).uniq.count
}
yoomlam marked this conversation as resolved.
Show resolved Hide resolved
end

private
def start_date
@start_date ||= Time.zone.now.prev_month.beginning_of_month
end

def end_date
@end_date ||= start_date.next_month
end

def endproduct_establishment(start = start_date, stop = end_date)
EndProductEstablishment.where.not(reference_id: nil)
.where("committed_at >= ?", start)
.where("committed_at < ?", stop)
end

def request_issues_with_contention(start = start_date, stop = end_date)
@request_issues_with_contention ||= RequestIssue.where.not(contention_reference_id: nil)
.where("created_at >= ?", start)
.where("created_at < ?", stop)
end

def unidentified_request_issues_with_contention(start_date, end_date)
RequestIssue.where.not(contention_reference_id: nil)
.where(is_unidentified: true)
.where("created_at >= ?", start_date)
.where("created_at < ?", end_date)
def edited_request_issues_with_contention(start = start_date, stop = end_date)
@edited_request_issues_with_contention ||= RequestIssue.where.not(contention_reference_id: nil)
.where.not(contention_updated_at: nil)
.where("contention_updated_at >= ?", start)
.where("contention_updated_at < ?", stop)
end
end
62 changes: 62 additions & 0 deletions app/services/collectors/stats_collector.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,62 @@
# frozen_string_literal: true

module Collectors::StatsCollector
yoomlam marked this conversation as resolved.
Show resolved Hide resolved
# Given metric_name_prefix and a hash like { "hearing.disposition" => Hearing.group(:disposition).count },
# the result will be an array of metrics like:
# [ { :metric => "#{metric_name_prefix}.hearing.disposition", :value => 300, "disposition" => "held" },
# { :metric => "#{metric_name_prefix}.hearing.disposition", :value => 100, "disposition" => "postponed" }
# ]
def flatten_stats(metric_name_prefix, stats_hash)
[].tap do |stats|
stats_hash.each do |metric_name, counts_hash|
unless valid_metric_name?(metric_name)
fail "Invalid metric name #{metric_name}; "\
"see https://docs.datadoghq.com/developers/metrics/#naming-custom-metrics"
end

stats.concat add_tags_to_group_counts(metric_name_prefix, metric_name, counts_hash)
end
end
end

protected

def add_tags_to_group_counts(prefix, metric_name, group_counts)
tag_key = to_valid_tag_key(metric_name.split(".").last)
Copy link
Contributor Author

@yoomlam yoomlam Apr 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use that last part of the metric_name as the tag_key.
Example: If metric_name="hearing.disposition", then tag_key="disposition"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest passing in metric_name and doing the split inside of the to_valid_tag_key, that way you have all of the context of creating a valid tag key encapsulated in the function, instead of split up

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this. The reason I did not encapsulate the split call into to_valid_tag_key is in case other collectors want to use it. I don't want to be doing too much manipulation to the name input argument. I could be convinced otherwise, but I'm leaning towards keeping it as it.


group_counts.map do |key, count|
{ :metric => "#{prefix}.#{metric_name}", :value => count, tag_key => to_valid_tag(group_key_to_name(key)) }
end
end

# See valid tag name rules at https://docs.datadoghq.com/tagging/#defining-tags
def to_valid_tag(name)
name.gsub(/[^a-zA-Z_\-\:\.\d\/]/, "__")
end

def to_valid_tag_key(name)
return "#{name}_" if %w[host device source service].include?(name)

tag_key = to_valid_tag(name)
end

def valid_metric_name?(metric_name)
# Actual limit is 200 but since the actual metric name in DataDog has
# "dsva_appeals.stats_collector_job." prepended, let's just stick with a 150 character limit.
return false if metric_name.length > 150

return true if metric_name =~ /\A[a-zA-Z][a-zA-Z\d_\.]*\Z/

false
end

def group_key_to_name(key)
return "nil" unless key

return key.map(&:underscore).map(&:parameterize).join(".") if key.instance_of?(Array)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example: when key=["EstablishClaim", "started"], return "establish_claim.started"


return key.underscore.parameterize if key.instance_of?(String)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


key.to_s.underscore.parameterize
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For numbers and booleans.

end
end
Loading