Add more prometheus metrics #1054
Conversation
Force-pushed 15408e9 to c0561ec (compare)
tapdb/sqlc/queries/metadata.sql (Outdated)
@@ -0,0 +1,2 @@
-- name: AssetsDBSize :one
SELECT pg_catalog.pg_database_size(current_database()) AS size;
Is this query precise enough? What if the database is multi-tenant and tapdb is just one of the users?
Given that it's built-in, I trust it to a certain degree. It seems to account for every byte at the time of execution, including unused allocated space, data marked as deleted, etc.
The query should return the same size for the DB regardless of the user and what they have access to within the DB.
What I mean is that if we're also running lnd in the same postgres DB, will it return the total size, or just the size of the DB with the attached user?
current_database() would refer to the database/schema you're using. Assuming there would be a different one for lnd and taproot-assets, I think this should work.
By the way, I tested this locally, where I have a single development timescale/postgres instance with different databases, and I get distinct values for each of the following queries:
SELECT pg_catalog.pg_database_size('taprootassets') AS size;
--> 16802671
SELECT pg_catalog.pg_database_size('loop') AS size;
--> 8135535
SELECT pg_catalog.pg_database_size('pool') AS size;
--> 8135535
select current_database();
--> taprootassets
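
For reference, a minimal sketch of how a per-scrape database size metric could be exposed with the Go Prometheus client, assuming a plain *sql.DB handle; the metric name, timeout, and function name are placeholder choices for illustration, not the PR's actual implementation:

```go
package monitoring

import (
	"context"
	"database/sql"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// registerDBSizeGauge exposes the result of
// pg_database_size(current_database()) as a gauge that is re-evaluated on
// every scrape.
func registerDBSizeGauge(db *sql.DB) {
	gauge := prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "tapd_assets_db_size_bytes",
		Help: "Total on-disk size of the tapd database in bytes.",
	}, func() float64 {
		ctx, cancel := context.WithTimeout(
			context.Background(), 5*time.Second,
		)
		defer cancel()

		var size int64
		row := db.QueryRowContext(
			ctx,
			"SELECT pg_catalog.pg_database_size(current_database())",
		)
		if err := row.Scan(&size); err != nil {
			// On error, report zero rather than failing the
			// whole scrape.
			return 0
		}

		return float64(size)
	})

	prometheus.MustRegister(gauge)
}
```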
monitoring/asset_proof_collector.go (Outdated)
defer cancel()

// Fetch all proofs.
proofs, err := a.cfg.AssetStore.FetchAssetProofs(ctxdb)
With the default scraping interval, I think this'll run every 60 seconds or so. Extrapolating out to, say, 100x the current scale, will this be fast enough?
We may want to consider a more precise query than what we have here.
Agreed.
If we want to stick with taking a snapshot of all asset proofs and keeping track of them, we don't have anything better currently. I believe we can stick with this for now, and our loadtests will indicate the next step.
Also, if we want to keep the loadtest behavior "random" (i.e. send a random asset X times), we can't really know which asset is going to be used, so it kind of makes sense to keep track of everything.
If we change the loadtest behavior so that specific assets are sent around in specific patterns, then we can change the above query and only track what we're interested in.
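
For illustration, a hedged sketch of what a narrower aggregate query could look like, pushing the counting and size summation into SQL instead of fetching every proof blob on each scrape. The table and column names (asset_proofs, proof_file) are assumptions made for the example, not the actual tapdb schema:

```go
package monitoring

import (
	"context"
	"database/sql"
)

// fetchProofStats returns the number of stored proofs and their total size in
// bytes with a single aggregate query, instead of loading every proof into
// memory. Table and column names here are illustrative placeholders.
func fetchProofStats(ctx context.Context,
	db *sql.DB) (count int64, totalBytes int64, err error) {

	row := db.QueryRowContext(ctx, `
		SELECT COUNT(*), COALESCE(SUM(length(proof_file)), 0)
		FROM asset_proofs`,
	)
	err = row.Scan(&count, &totalBytes)

	return count, totalBytes, err
}
```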
Force-pushed c0561ec to 5cabb7e (compare)
Looks pretty good. I agree that we should optimize the call that fetches the proof sizes and put it into a histogram. The rest seems good; I'll do a manual test once the comments are addressed.
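
As a rough sketch of the histogram idea (the metric name and bucket boundaries are placeholder choices, not necessarily what the PR ends up using):

```go
package monitoring

import "github.com/prometheus/client_golang/prometheus"

// proofSizeHistogram records the distribution of individual asset proof
// sizes, replacing a per-proof gauge with a fixed set of buckets.
var proofSizeHistogram = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name: "tapd_asset_proof_size_bytes",
	Help: "Distribution of individual asset proof sizes in bytes.",
	// Exponential buckets from 1 KiB up to roughly 16 MiB.
	Buckets: prometheus.ExponentialBuckets(1024, 4, 8),
})

func init() {
	prometheus.MustRegister(proofSizeHistogram)
}

// observeProofSizes feeds the sizes gathered during one collection pass into
// the histogram.
func observeProofSizes(proofSizes []int64) {
	for _, size := range proofSizes {
		proofSizeHistogram.Observe(float64(size))
	}
}
```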
Force-pushed 5cabb7e to 9ed81e6 (compare)
A couple of optimization suggestions. But this works; I tested it by running the load test locally with one SQLite-backed and one Postgres-backed instance.
Force-pushed 9ed81e6 to 5a0a75c (compare)
tACK, LGTM 🎉
@Roasbeef: review reminder
LGTM 🧞♂️
Description
This PR adds the missing metrics for the tapd monitoring package, based on this comment.
Specifically, in this PR we add: