
ASB Metrics #448

Closed
bonomat opened this issue Apr 26, 2021 · 9 comments · Fixed by #474

@bonomat
Member

bonomat commented Apr 26, 2021

As a maker, I would like a CLI command that shows, for all swaps:

  • when they were started
  • the amounts being swapped (incl. the exchange rate at start time)
  • the date of each step (fund/redeem/refund), including the rate at that time
  • the taker's ID

Hint: currently, old states are overwritten in the DB. We could instead add a new DB entry for each event, including its details.

@thomaseizinger
Contributor

thomaseizinger commented May 1, 2021

Rust-libp2p is exploring OpenMetrics: libp2p/rust-libp2p#2063

I don't have much experience with it, but OpenMetrics sounds like a useful standard to follow here: we wouldn't have to reinvent the wheel, and it would make it easy for users to later tap into the wider ecosystem of monitoring tools.

@bonomat bonomat self-assigned this May 4, 2021
@bonomat
Member Author

bonomat commented May 4, 2021

OpenMetrics (and likewise Prometheus) does not seem to be suitable for our needs.

It is designed to monitor services from a data perspective:

"Metrics are a specific kind of telemetry data. They represent a snapshot of the current state for a set of data. "

Examples range from CPU utilization and I/O load to service response times.

They continue with:

"They are distinct from logs or events, which focus on records or information about individual events."

Source

The latter is closer to what we need, i.e. recording and collecting events.

It becomes clearer that OpenMetrics does not fit when reading through the defined metric types.

It defines:

  • Gauge: for current measurements such as bytes of memory currently used or the number of items in a queue.
  • Counter: for measuring discrete events, such as the number of HTTP requests, CPU seconds spent, or bytes sent (see the sketch below this list).
  • StateSet: for a series of related boolean values. The idea is to record states using only booleans, i.e. each boolean indicates whether a certain state is currently active.
  • Info: sounds suitable, but it is meant for textual information which "SHOULD NOT" change during the process lifetime.
  • Histogram: for measuring distributions of discrete events.
  • Summary: similar to Histogram but a bit more flexible; an example can be found in the spec.
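
For illustration, a minimal sketch (assuming the commonly used prometheus Rust crate; none of this is existing project code) of how a Counter is used. It aggregates events into a single number and drops exactly the per-swap details we are after:

use prometheus::{Encoder, IntCounter, Registry, TextEncoder};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let registry = Registry::new();

    // A counter can only ever go up; it aggregates events into a single number.
    let swaps_started = IntCounter::new("swaps_started_total", "Number of swaps started")?;
    registry.register(Box::new(swaps_started.clone()))?;

    // Each swap start just increments the counter; the swap id, amounts and
    // rate are lost, which is exactly the information this issue asks for.
    swaps_started.inc();

    // Export in the Prometheus/OpenMetrics text format.
    let mut buffer = Vec::new();
    TextEncoder::new().encode(&registry.gather(), &mut buffer)?;
    println!("{}", String::from_utf8(buffer)?);
    Ok(())
}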

@bonomat
Member Author

bonomat commented May 4, 2021

I propose to follow a store-and-log approach instead:

  1. Store interesting events in the DB (create a new tree for metrics); see the sketch below this list.
  2. Add a new command which reads all events (or only those for a specific swap ID) and prints them to stdout.
    a) This allows us to decide on the format for the events later on.
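
For illustration, a minimal sketch of what that could look like with the sled and serde_json crates; the SwapEvent type and the function names here are hypothetical, not existing code:

use serde::{Deserialize, Serialize};
use uuid::Uuid;

// Hypothetical event record; field names are illustrative only.
// (Uuid serialization assumes the uuid crate's "serde" feature.)
#[derive(Serialize, Deserialize)]
struct SwapEvent {
    swap_id: Uuid,
    state: String,
    rate: f64,
    recorded_at: String,
}

// Append an event to a dedicated "metrics" tree instead of overwriting the
// current state entry.
fn record_event(db: &sled::Db, event: &SwapEvent) -> anyhow::Result<()> {
    let tree = db.open_tree("metrics")?;
    // Key by swap id + timestamp so earlier entries are never overwritten.
    let key = format!("{}:{}", event.swap_id, event.recorded_at);
    tree.insert(key.as_bytes(), serde_json::to_vec(event)?)?;
    Ok(())
}

// Print all recorded events (optionally filtered by swap id) to stdout.
fn print_events(db: &sled::Db, swap_id: Option<Uuid>) -> anyhow::Result<()> {
    let tree = db.open_tree("metrics")?;
    for entry in tree.iter() {
        let (_, value) = entry?;
        let event: SwapEvent = serde_json::from_slice(&value)?;
        if swap_id.map_or(true, |id| event.swap_id == id) {
            println!("{}", serde_json::to_string(&event)?);
        }
    }
    Ok(())
}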

@bonomat
Member Author

bonomat commented May 4, 2021

What I would like to see (the format of the log does not matter atm):

SwapDetail:
{
    `swap-id`: `Id`,
    `states`: `[State]`,
    `btc-amount`: `Number`,
    `xmr-amount`: `Number`,
    `counter-party`: `PeerId`
}

State:
{
    `recorded`: `Timestamp`, // when this state was recorded
    `rate`: `Number`,        // what the exchange rate was at this point
    `tx-id`: `TxId`          // if any
}
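
For reference, a rough translation of that outline into Rust types. This is purely illustrative; the type choices (bitcoin::Amount, chrono::DateTime, libp2p's PeerId, piconero as u64) are assumptions rather than the project's actual code:

use chrono::{DateTime, Utc};
use libp2p::PeerId;
use uuid::Uuid;

// One entry per swap, mirroring the outline above.
struct SwapDetail {
    swap_id: Uuid,
    states: Vec<State>,
    btc_amount: bitcoin::Amount,
    xmr_amount: u64, // piconero; a dedicated Monero amount type would be nicer
    counter_party: PeerId,
}

// One entry per recorded state transition.
struct State {
    recorded: DateTime<Utc>,      // when this state was recorded
    rate: f64,                    // exchange rate at this point
    tx_id: Option<bitcoin::Txid>, // if any
}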

@bonomat bonomat removed their assignment May 4, 2021
@da-kami
Member

da-kami commented May 4, 2021

In general:

Yeah, what you describe has nothing to do with Metrics but more with Events and plain swap state - once we aggregate these Events and analyze them, we have actual Metrics.

To be more concrete: what you outline above as a Metric should be named Swaps or SwapDetails, I think:

Swaps:
{
    `swap-id`: `Id`,
    `states`: `[State]`,
    `btc-amount`: `Number`,
    `xmr-amount`: `Number`,
    `counter-party`: `PeerId`
}

Only once we analyze multiple of these swaps would we actually have Metrics, as in "5 out of 10 swaps finished successfully with state ...".

I am fine with recording more swap details in the Sled DB in separate trees for a "quick solution" - but if we plan for the long run, it would be better to do what I outline below.


I had the alternative idea to improve the logs and let external software handle our "Metrics" through the logs. We already use swap and peer-id contexts within the logging and could easily extract swap/peer information using log frameworks such as Vector (see the sketch after the lists below).

Advantages:

  • We don't clutter our application code with recording events that don't serve any purpose other than log details
  • We improve our logs rather than adding additional "events" on top
  • We can easily visualize data (UI) by pumping it into other software - Vector supports various sinks - we could start with files to keep it simple (as a source we can plug directly into the journal)
  • We can define transformation rules without having to touch application code and do a release

Disadvantages:

  • We have to set it up and get into Vector. (It looks fairly straightforward; I started playing with it but don't have a working example yet...)
  • It does not work "out of the box", i.e. other ASB providers would have to set up e.g. Vector themselves. We could provide documentation, but it is somewhat of a higher entry barrier.
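
To make the logging route concrete, here is a rough sketch of emitting one structured tracing event per state transition, which a pipeline like Vector could then pick up from the journal; the helper and field names are illustrative, not the ASB's actual log schema:

use uuid::Uuid;

// Hypothetical helper: emit one structured log event per state transition.
// With a JSON log format these fields stay machine-readable, so a pipeline
// like Vector can extract swap/peer information without touching the app code.
fn log_state_transition(swap_id: Uuid, state: &str, rate: f64) {
    tracing::info!(
        swap_id = %swap_id, // Display-formatted swap id
        state,              // e.g. "BtcLocked"
        rate,               // exchange rate at this point
        "Advancing swap state"
    );
}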

@thomaseizinger
Contributor

Sad that OpenMetrics doesn't support more complex things than that. The big advantage would have been that we don't need to store this data. An ever-growing database is not particularly ideal, and metrics / logs are things that you usually don't care about past a certain point in time.

If we store things in a database, can we use this as an opportunity to start migrating to SQLite? The actual extraction of metrics could then be as simple as loading the database into any SQL tool and running a couple of queries against it.

That should reduce the required development effort significantly. Also, if we create a separate reporting database, deleting that one is safe if the user ever wants to clean up storage space.
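
As a rough illustration of that idea (hedged: the rusqlite crate and the swap_events schema below are assumptions, not an agreed design), a separate reporting database could look like this:

use rusqlite::{params, Connection};

// Open (or create) a dedicated reporting database. Because it only holds
// derived metrics data, deleting the file is always safe.
fn open_reporting_db(path: &str) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    conn.execute(
        "CREATE TABLE IF NOT EXISTS swap_events (
            swap_id     TEXT NOT NULL,
            state       TEXT NOT NULL,
            rate        REAL NOT NULL,
            recorded_at TEXT NOT NULL
        )",
        [],
    )?;
    Ok(conn)
}

// Append one row per state transition; analysis is then plain SQL, e.g.
// SELECT state, COUNT(*) FROM swap_events GROUP BY state;
fn insert_event(
    conn: &Connection,
    swap_id: &str,
    state: &str,
    rate: f64,
    recorded_at: &str,
) -> rusqlite::Result<usize> {
    conn.execute(
        "INSERT INTO swap_events (swap_id, state, rate, recorded_at) VALUES (?1, ?2, ?3, ?4)",
        params![swap_id, state, rate, recorded_at],
    )
}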

@bonomat
Member Author

bonomat commented May 4, 2021

> Yeah, what you describe has nothing to do with Metrics but more with Events and plain swap state - once we aggregate these Events and analyze them, we have actual Metrics.

That's a good summary :)

I had a quick look into Vector as well. It's a log analyzer, maybe comparable to LogStash:

> Vector is a high-performance observability data pipeline that allows you to collect, transform, and route all your logs and metrics.

It even has an export feature to send data to Prometheus (which AFAIK uses the OpenMetrics format).

If we want to go this way, we can ignore the DB and print more details into the logs.

@thomaseizinger : what do you think?

@da-kami
Member

da-kami commented May 4, 2021

> If we want to go this way, we can ignore the DB and print more details into the logs.
>
> @thomaseizinger : what do you think?

I think it boils down to either using a relational DB and then some tools on top of it for analysis, OR ignoring the DB and going with the logs. I am not sure which is the better approach, but I think the log approach would give us faster results.
Going with the DB would be a good thing though, because it would help us eventually refactor the current DB, which is old tech debt.

@bonomat
Member Author

bonomat commented May 4, 2021

After a quick chat with @da-kami:

  1. Analyzing and coming up with Metrics for the ASB falls into the role of the ASB provider, hence we can see this as a nice-to-have feature.
  2. It is hard to get the metrics right for everyone, hence we should strive for a flexible solution.
  3. A new DB (or sled tree) would add additional complexity because we would need to come up with an upgradable or flexible data schema. Additionally, it was mentioned that SQL should replace Sled eventually; we should not jump the gun on that topic just because we want some more information to analyze.

Conclusion: because of 1. + 2., the best option is to add more information to our logs. These can then be analyzed using tools like Vector, Logstash + Elasticsearch, or other tools.

@bonomat bonomat self-assigned this May 6, 2021
bors bot added a commit that referenced this issue May 11, 2021
474: Add more log details r=bonomat a=bonomat

Resolves #448 

1. The first commit adds an additional log statement of the exchange rate for each state update. This is useful because it allows us to measure profitability easily, i.e. by knowing what the exchange rate was when the swap started and what it was when it finalized.
2. The second commit changes a bunch of log messages.
3. The third commit adds a new command-line flag to toggle JSON format (see the sketch below).
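
For context, a hedged sketch of how such a toggle is typically wired up with tracing_subscriber; this is illustrative and not necessarily the exact code in #474:

// Requires tracing-subscriber with the "fmt", "env-filter" and "json" features.
fn init_tracing(json: bool) {
    let builder = tracing_subscriber::fmt().with_env_filter("info");

    if json {
        // Machine-readable output, convenient for log pipelines.
        builder.json().init();
    } else {
        // Human-readable output for interactive use.
        builder.init();
    }
}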




Co-authored-by: Philipp Hoenisch <[email protected]>
@bors bors bot closed this as completed in f03e8fa May 11, 2021