-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pgbouncer] Fix stats deprecated in pgbouncer 1.8 #1016
[pgbouncer] Fix stats deprecated in pgbouncer 1.8 #1016
Conversation
Adds new column support while maintaining backward compat, detected using result set column names. See pgbouncer/pgbouncer@876d8a5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for the PR! (Also appreciate the tests!) Left a couple of quick thoughts as well, let me know what you think.
|
||
desc = scope['descriptors'] | ||
try: | ||
self.log.debug("Running query: %s" % query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be helpful to have a debug line here that also reports the output of the query to help troubleshoot possible issues in the future. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did add something like this but was worried it might be too much output to the log so omitted from the final commit. If you think it's a good idea I'll add it back.
pgbouncer/test/test_pgbouncer.py
Outdated
@@ -22,6 +23,9 @@ class TestPgbouncer(AgentCheckTest): | |||
CHECK_NAME = 'pgbouncer' | |||
|
|||
def test_checks(self): | |||
pgbouncer_version = os.environ.get('FLAVOR_VERSION', 'latest') | |||
pgbouncer_deprecated = pgbouncer_version in ('1.5', '1.7') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for writing the tests! I'm not sure we want to deprecate older versions of PGBouncer. Could we rename this to something like 'pgbouncer_pre18` or something similar here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea! I was trying to use something other than "old" and "new" 😅
(Separated commits for review purposes, probably should be squashed for merge.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if row[0] == self.DB_NAME: | ||
continue | ||
except pg.Error as e: | ||
self.log.exception("Not all metrics may be available") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should still log the PG exception, it might be useful. What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using log.exception
will print the exception details in the log per the docs, e.g.:
checks.pgbouncer: ERROR: Not all metrics may be available
Traceback (most recent call last):
File "/Users/sj26/Projects/datadog/integrations-core/pgbouncer/datadog_checks/pgbouncer/pgbouncer.py", line 106, in _collect_stats
cursor.execute(query)
File "/Users/sj26/Projects/datadog/integrations-core/venv/lib/python2.7/site-packages/psycopg2/extras.py", line 144, in execute
return super(DictCursor, self).execute(query, vars)
ProgrammingError: unrecognized configuration parameter "pools"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed, thanks !
|
||
assert len(row) == len(cols) + len(desc) | ||
tags = list(instance_tags) | ||
tags += ["%s:%s" % (tag, row[column]) for (column, tag) in descriptors] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should test if row
contains column
to avoid raising an exception if the column in descriptor
does not exist.
Something like:
tags += ["%s:%s" % (tag, row[column]) for (column, tag) in descriptors if column in row]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are pretty important and should never be missing. If they are missing then it should probably explode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add the conditional anyway as protection against column name changes 👍
('total_received', ('pgbouncer.stats.bytes_received_per_second', RATE)), | ||
('total_sent', ('pgbouncer.stats.bytes_sent_per_second', RATE)), | ||
('total_query_time', ('pgbouncer.stats.total_query_time', GAUGE)), | ||
('avg_req', ('pgbouncer.stats.avg_req', GAUGE)), | ||
('total_xact_time', ('pgbouncer.stats.total_transaction_time', GAUGE)), # >= 1.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a rate since the documentation say:
Total number of microseconds spent by **pgbouncer** when connected to PostgreSQL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was copying total_query_time — if this should change, so should that. Probably makes sense that all the "total" are rates, and all the "avg" are gauges. Will update.
except pg.Error as e: | ||
self.log.error("Connection error: %s" % str(e)) | ||
self.log.exception("Connection error") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also e
is never used which make the flake8
fail (that's why the travis tests are red).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove!
Thanks a lot for this PR @sj26 ! It will be available in the next version of the agent |
When using pgbouncer version 1.8 the pgbouncer check fails with:
This is due to some stats being renamed in an upstream commit as part of pgbouncer 1.8: pgbouncer/pgbouncer@876d8a5
The attached commit updates the datadog pgbouncer check a little, adds a test flavor version for pgbouncer 1.8, and modifies the check to cope with the renamed and added stats maintaining backwards compatibility.
(We're now running this version of the check in production.)
I didn't see this work had already been started in #1011 — feel free to take this commit and fold it in there, or whatever you'd like. (I've kept version and changelog updates out of here so it can be cherried or rebased.)