Add Metrics for Cache #763

Closed
alpe opened this issue Feb 4, 2021 · 4 comments · Fixed by #877

@alpe
Contributor

alpe commented Feb 4, 2021

It would be great if we could use Prometheus to collect metrics for the cache.
Useful metrics would be:

  • hits/misses
  • memory used/available

While there is a Rust lib for Prometheus, the integration into the metrics server (Go side) can be a bit complicated, since it is wrapped in cosmos-sdk/telemetry, which uses github.com/armon/go-metrics/prometheus.

A "simpler" solution may be to add a return value to the Go calls that contains cache hit and memory-used data, which is then fed into Prometheus on the Go side.

@webmaster128
Member

There is already a stats call, returning:

```rust
#[derive(Debug, Default, Clone, Copy)]
pub struct Stats {
    pub hits_memory_cache: u32,
    pub hits_fs_cache: u32,
    pub misses: u32,
}
```

This seems like a good start to me. It can easily be extended. It is not yet connected to Go in wasmvm, but I'd rather do that than add Prometheus here.

@webmaster128
Member

webmaster128 commented Apr 13, 2021

Hey @alpe, in #877 I started exposing a Metrics object that contains (when flattened):

```rust
struct Metrics {
    pub hits_pinned_memory_cache: u32,
    pub hits_memory_cache: u32,
    pub hits_fs_cache: u32,
    pub misses: u32,
    pub elements_pinned_memory_cache: usize,
    pub elements_memory_cache: usize,
    pub size_pinned_memory_cache: usize,
    pub size_memory_cache: usize,
}
```

Is this a useful set of data to start with? I especially wonder about hits/misses, which are sums that accumulate over time. Can these be processed nicely? Or should we always calculate current values (like a moving average of the hit ratio)?

@webmaster128 webmaster128 self-assigned this Apr 13, 2021
@alpe
Contributor Author

alpe commented Apr 14, 2021

> Is this a useful set of data to start with? I especially wonder about hits/misses, which are sums that accumulate over time. Can these be processed nicely? Or should we always calculate current values (like a moving average of the hit ratio)?

This looks good. Counter or Gauge (incr/decr) types are fine. We can always do calculations, windowing, etc. on them in Prometheus. Better to keep them simple here.
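
To illustrate why plain cumulative counters are enough, here is a minimal sketch of deriving a windowed hit ratio from two snapshots taken at different times. The `HitCounters` type and the `windowed_hit_ratio` function are hypothetical names for illustration only; they are not part of the crate and only mirror the hit/miss fields proposed above. In practice this kind of calculation would more likely be done by the monitoring system itself.

```rust
/// Cumulative counters, mirroring the hit/miss fields proposed in this thread.
/// Hypothetical type for illustration only.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct HitCounters {
    pub hits_pinned_memory_cache: u32,
    pub hits_memory_cache: u32,
    pub hits_fs_cache: u32,
    pub misses: u32,
}

impl HitCounters {
    /// Sum of all hit counters, widened to u64 to avoid overflow.
    fn total_hits(&self) -> u64 {
        u64::from(self.hits_pinned_memory_cache)
            + u64::from(self.hits_memory_cache)
            + u64::from(self.hits_fs_cache)
    }
}

/// Hit ratio for the interval between two snapshots.
/// Returns `None` if there were no lookups in the interval, or if a
/// counter went backwards (for example after a process restart).
pub fn windowed_hit_ratio(prev: &HitCounters, curr: &HitCounters) -> Option<f64> {
    let hits = curr.total_hits().checked_sub(prev.total_hits())?;
    let misses = u64::from(curr.misses).checked_sub(u64::from(prev.misses))?;
    let lookups = hits + misses;
    if lookups == 0 {
        return None;
    }
    Some(hits as f64 / lookups as f64)
}
```

Because only differences between snapshots matter, the reporting side can stay simple and the choice of window is left to whoever consumes the counters.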

Does it make sense to return the available memory cache? If it always fills up completely then it is not relevant, I guess.

@webmaster128
Member

> Does it make sense to return the available memory cache? If it always fills up completely then it is not relevant, I guess.

The value you are looking for is available indirectly as the cache size (configurable per node) minus metrics.size_memory_cache. I.e. we report the usage, not the available memory. Let's start with that.
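
The subtraction described above can be sketched as follows. The function name and the `cache_size_limit` parameter are assumptions for illustration; only `size_memory_cache` comes from the proposed `Metrics` struct, and both values are assumed to be in the same unit.

```rust
/// Remaining capacity of the memory cache.
/// `cache_size_limit` is a hypothetical name for the per-node configured
/// cache size; `size_memory_cache` is the reported usage from the metrics.
pub fn available_memory_cache(cache_size_limit: usize, size_memory_cache: usize) -> usize {
    // Saturate at zero in case the reported usage exceeds the limit.
    cache_size_limit.saturating_sub(size_memory_cache)
}
```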

@webmaster128 webmaster128 added this to the 0.14.0 milestone Apr 14, 2021
@mergify mergify bot closed this as completed in #877 Apr 14, 2021