Add decorator for tracking execution statistics of check methods #10809

Merged
merged 8 commits into from
Dec 8, 2021

djova
Contributor

@djova djova commented Dec 7, 2021

What does this PR do?

Adds a new decorator, `tracked_method`, for tracking the performance of methods in a standardized way. It significantly reduces measurement boilerplate across the Database Monitoring (DBM) integrations and makes the measurements less error-prone. Tracking the performance and error rates of the various parts of the DBM integrations matters because it makes customer issues easier to troubleshoot.

We already have a number of internal debug metrics tracking the execution time and error rate of various operations in DBM. For example, `dd.postgres.collect_statement_samples.time` tracks the execution time of the `_collect_statement_samples` method for Postgres. All of these manual measurements will be updated to use the new decorator instead.
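
As a rough illustration of the idea (not the code merged in this PR), a decorator of this kind could look like the sketch below. `FakeCheck`, `StatementSamples`, the metric names `dd.<check>.operation.time` and `dd.<check>.operation.error`, and the `operation:`/`error:` tags are all assumptions made for the example. The sketch is Python 3 only, whereas the real change also had to work on Python 2.

```python
import functools
import time


class FakeCheck:
    """Hypothetical stand-in for an Agent check; records submitted metrics in memory."""

    name = "postgres"

    def __init__(self):
        self.submitted = []

    def histogram(self, metric, value, tags=None):
        self.submitted.append(("histogram", metric, value, tuple(tags or ())))

    def count(self, metric, value, tags=None):
        self.submitted.append(("count", metric, value, tuple(tags or ())))


def tracked_method(function):
    """Time each successful call and count each failure via the owning check."""

    @functools.wraps(function)
    def wrapper(self, *args, **kwargs):
        tags = ["operation:{}".format(function.__name__)]
        start = time.perf_counter()
        try:
            result = function(self, *args, **kwargs)
        except Exception as e:
            # Count the failure, tagged with the exception type, then re-raise.
            error_tags = tags + ["error:{}".format(type(e).__name__)]
            self.count("dd.{}.operation.error".format(self.name), 1, tags=error_tags)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        self.histogram("dd.{}.operation.time".format(self.name), elapsed_ms, tags=tags)
        return result

    return wrapper


class StatementSamples(FakeCheck):
    """Hypothetical check with one tracked method that succeeds and one that fails."""

    @tracked_method
    def _collect_statement_samples(self):
        return ["row-1", "row-2"]

    @tracked_method
    def _failing_query(self):
        raise ValueError("boom")
```

With a wrapper like this, each instrumented method needs only the one-line decorator instead of its own timing and try/except block, which is the boilerplate reduction described above.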

Motivation

Reduce boilerplate, and make debug performance measurements more standardized and less error-prone.

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have changelog/ and integration/ labels attached

@djova djova requested review from a team as code owners December 7, 2021 23:57
@djova djova changed the title from "dbm add tracked method decorator" to "add DBM tracked method decorator" Dec 7, 2021
@codecov

codecov bot commented Dec 8, 2021

Codecov Report

Merging #10809 (a4ab9b1) into master (41be2b4) will increase coverage by 0.22%.
The diff coverage is 90.10%.

Flag Coverage Δ
active_directory 100.00% <ø> (ø)
activemq_xml 82.31% <ø> (ø)
aerospike 86.97% <ø> (+0.36%) ⬆️
airflow 90.00% <ø> (ø)
amazon_msk 88.83% <ø> (ø)
ambari 85.75% <ø> (ø)
apache 95.08% <ø> (ø)
aspdotnet 93.87% <ø> (ø)
avi_vantage 91.92% <ø> (ø)
azure_iot_edge 82.00% <ø> (ø)
btrfs 82.91% <ø> (ø)
cacti 83.95% <ø> (ø)
cassandra_nodetool 94.19% <ø> (ø)
ceph 91.02% <ø> (ø)
cilium 85.84% <ø> (+1.88%) ⬆️
cisco_aci 95.83% <ø> (ø)
citrix_hypervisor 87.50% <ø> (ø)
clickhouse 95.63% <ø> (ø)
cloud_foundry_api 95.98% <ø> (+0.12%) ⬆️
cockroachdb 100.00% <ø> (ø)
consul 91.74% <ø> (ø)
coredns 95.74% <ø> (ø)
couch 95.19% <ø> (+0.24%) ⬆️
couchbase 81.45% <ø> (ø)
crio 100.00% <ø> (ø)
datadog_checks_base 90.31% <90.10%> (+0.34%) ⬆️
datadog_checks_dev 80.00% <ø> (+<0.01%) ⬆️
datadog_checks_downloader 80.64% <ø> (ø)
datadog_cluster_agent 97.50% <ø> (ø)
directory 94.87% <ø> (ø)
disk 91.61% <ø> (ø)
dns_check 93.84% <ø> (ø)
dotnetclr 100.00% <ø> (ø)
druid 97.70% <ø> (ø)
ecs_fargate 80.23% <ø> (ø)
eks_fargate 94.05% <ø> (ø)
elastic 90.52% <ø> (ø)
envoy 94.17% <ø> (ø)
etcd 93.87% <ø> (ø)
exchange_server 100.00% <ø> (ø)
external_dns 100.00% <ø> (ø)
fluentd 94.77% <ø> (ø)
gearmand 78.26% <ø> (+1.24%) ⬆️
gitlab 89.94% <ø> (ø)
gitlab_runner 91.94% <ø> (ø)
glusterfs 80.09% <ø> (+0.92%) ⬆️
go_expvar 92.73% <ø> (ø)
gunicorn 93.60% <ø> (ø)
haproxy 95.09% <ø> (+0.16%) ⬆️
harbor 81.29% <ø> (ø)
hazelcast 92.39% <ø> (ø)
hdfs_datanode 89.74% <ø> (ø)
hdfs_namenode 86.72% <ø> (ø)
http_check 90.98% <ø> (+1.74%) ⬆️
ibm_db2 94.84% <ø> (ø)
ibm_i 80.65% <ø> (ø)
ibm_mq 89.61% <ø> (ø)
ibm_was 96.06% <ø> (ø)
iis 94.91% <ø> (+38.49%) ⬆️
istio 77.46% <ø> (+0.56%) ⬆️
kafka_consumer 82.28% <ø> (ø)
kong 92.21% <ø> (ø)
kube_apiserver_metrics 97.35% <ø> (ø)
kube_controller_manager 96.85% <ø> (ø)
kube_dns 98.85% <ø> (ø)
kube_metrics_server 100.00% <ø> (ø)
kube_proxy 100.00% <ø> (ø)
kube_scheduler 96.20% <ø> (ø)
kubelet 89.61% <ø> (ø)
kubernetes_state 89.52% <ø> (ø)
kyototycoon 85.96% <ø> (ø)
lighttpd 83.64% <ø> (ø)
linkerd 85.14% <ø> (+1.14%) ⬆️
linux_proc_extras 96.22% <ø> (ø)
mapr 82.62% <ø> (ø)
mapreduce 81.77% <ø> (ø)
marathon 83.12% <ø> (ø)
marklogic 95.33% <ø> (ø)
mcache 93.52% <ø> (ø)
mesos_master 90.68% <ø> (ø)
mesos_slave 93.63% <ø> (ø)
mongo 94.45% <ø> (+0.49%) ⬆️
mysql 87.09% <ø> (+0.25%) ⬆️
nagios 89.53% <ø> (ø)
network 77.76% <ø> (+1.00%) ⬆️
nfsstat 95.20% <ø> (ø)
nginx 96.37% <ø> (+0.65%) ⬆️
nginx_ingress_controller 98.36% <ø> (ø)
openldap 96.33% <ø> (ø)
openmetrics 97.14% <ø> (ø)
openstack 51.45% <ø> (ø)
openstack_controller 90.74% <ø> (ø)
oracle 93.65% <ø> (+0.52%) ⬆️
pdh_check 95.65% <ø> (ø)
pgbouncer 90.45% <ø> (ø)
php_fpm 90.21% <ø> (+0.42%) ⬆️
postfix 88.04% <ø> (ø)
postgres 91.49% <ø> (+0.21%) ⬆️
powerdns_recursor 96.65% <ø> (ø)
process 85.07% <ø> (+0.28%) ⬆️
prometheus 94.17% <ø> (ø)
proxysql 98.97% <ø> (ø)
rabbitmq 94.40% <ø> (ø)
redisdb 87.44% <ø> (ø)
rethinkdb 97.93% <ø> (ø)
riak 99.22% <ø> (ø)
riakcs 93.61% <ø> (ø)
sap_hana 92.39% <ø> (ø)
scylla 100.00% <ø> (ø)
singlestore 90.81% <ø> (ø)
snmp 89.69% <ø> (+0.04%) ⬆️
snowflake 93.60% <ø> (ø)
sonarqube 95.69% <ø> (ø)
spark 93.51% <ø> (ø)
sqlserver 85.95% <ø> (ø)
squid 100.00% <ø> (ø)
ssh_check 91.58% <ø> (ø)
statsd 87.36% <ø> (+1.05%) ⬆️
supervisord 92.30% <ø> (ø)
system_core 91.04% <ø> (ø)
system_swap 98.30% <ø> (ø)
tcp_check 88.83% <ø> (ø)
teamcity 80.00% <ø> (ø)
tls 97.04% <ø> (+0.87%) ⬆️
tokumx 58.40% <ø> (?)
twemproxy 78.33% <ø> (ø)
twistlock 80.25% <ø> (ø)
varnish 84.57% <ø> (+0.24%) ⬆️
vault 95.04% <ø> (+0.55%) ⬆️
vertica 92.33% <ø> (ø)
voltdb 96.81% <ø> (ø)
vsphere 89.78% <ø> (+0.08%) ⬆️
win32_event_log 86.32% <ø> (+0.56%) ⬆️
windows_performance_counters 98.36% <ø> (ø)
windows_service 95.83% <ø> (ø)
wmi_check 92.91% <ø> (ø)
yarn 89.85% <ø> (ø)
zk 86.04% <ø> (+0.93%) ⬆️

Flags with carried forward coverage won't be shown.

djova added a commit that referenced this pull request Dec 8, 2021
Update all internal method invocation debug instrumentation to use new decorator added in #10809
Contributor

@coignetp coignetp left a comment

This looks interesting! I have a few suggestions; did you manage to measure the overhead of this decorator?

datadog_checks_base/datadog_checks/base/utils/db/utils.py (outdated, resolved)
datadog_checks_base/datadog_checks/base/utils/db/utils.py (outdated, resolved)
@djova djova force-pushed the djova/dbm-util-tracked-method branch from 874d4bf to e2c6696 December 8, 2021 15:28
@djova
Contributor Author

djova commented Dec 8, 2021

> This looks interesting! I have a few suggestions; did you manage to measure the overhead of this decorator?

Addressed all of your comments. I didn't measure the overhead. statsd is generally very low overhead, and in this case the metrics are aggregated directly in the Agent via the internal API, so the overhead should be even lower than for typical statsd metrics sent over the network.
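
For anyone who wants a number, one rough way to estimate the per-call cost of a timing wrapper is a `timeit` micro-benchmark like the sketch below. The `timed` wrapper here is a hypothetical stand-in that only appends the elapsed time to a list, so it measures wrapper and clock overhead, not the cost of metric submission through the Agent.

```python
import functools
import time
import timeit


def timed(function):
    """Hypothetical minimal timing wrapper; stores elapsed seconds in a list."""
    samples = []

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return function(*args, **kwargs)
        finally:
            samples.append(time.perf_counter() - start)

    wrapper.samples = samples
    return wrapper


def noop():
    return None


tracked_noop = timed(noop)


def per_call_overhead_us(calls=100000):
    """Estimate the extra microseconds per call added by the wrapper."""
    plain = timeit.timeit(noop, number=calls)
    wrapped = timeit.timeit(tracked_noop, number=calls)
    return (wrapped - plain) / calls * 1e6
```

Calling `per_call_overhead_us()` returns an estimate in microseconds. The value varies by machine and run, and on a noisy system it can even come out slightly negative, so it is best treated as an order-of-magnitude check.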

@djova djova changed the title from "add DBM tracked method decorator" to "add tracked method decorator" Dec 8, 2021
coignetp
coignetp previously approved these changes Dec 8, 2021
Contributor

@coignetp coignetp left a comment

👍 if CI is green

ofek
ofek previously approved these changes Dec 8, 2021
@ofek
Contributor

ofek commented Dec 8, 2021

Just please edit the title to be more descriptive

@djova djova changed the title from "add tracked method decorator" to "Add decorator for tracking execution statistics of check methods" Dec 8, 2021
@djova djova dismissed stale reviews from ofek and coignetp via a4ab9b1 December 8, 2021 18:18
@djova djova merged commit 4395d55 into master Dec 8, 2021
@djova djova deleted the djova/dbm-util-tracked-method branch December 8, 2021 20:03
djova added a commit that referenced this pull request Dec 9, 2021
* improve internal check execution instrumentation

Update all internal method invocation debug instrumentation to use new decorator added in #10809

* revert typo

* fix for py2

* use new base version
cswatt pushed a commit that referenced this pull request Jan 5, 2022
* dbm add tracked method decorator

Adds a new decorator `dbm_tracked_method` to be used for tracking the performance of methods in a standardized way. This will significantly reduce measurement boilerplate across the DBM integrations and make the measurements less error prone. It's important to track the performance and error rates of various parts of the DBM integrations in order to better troubleshoot customer issues.

We already have a bunch of internal debug metrics tracking the execution time and error rate of various operations in DBM. For example, `dd.postgres.collect_statement_samples.time` tracks the execution time of the `_collect_statement_samples` method for postgres. All of these manual measurements will be updated to use this new measurement decorator instead.

* move to new tracking module

* revert

* add option to disable

* standardize naming

* logging fix

* style

* fix py2
cswatt pushed a commit that referenced this pull request Jan 5, 2022
* improve internal check execution instrumentation

Update all internal method invocation debug instrumentation to use new decorator added in #10809

* revert typo

* fix for py2

* use new base version