`oximeter` needs a way to represent missing samples #4311
Comments
The omicron/oximeter/oximeter/src/types.rs Lines 344 to 349 in 9c33a62
There are, I think, some useful tricks we can play with
One simple (and maybe stupid) idea is to rewrite each tuple variant from
Another path would be to update the
Idea (2) above isn't quite right, since there's no way to have
This is probably the simplest overall approach. We put the
Closed by #4552
Code that produces data for `oximeter` to collect is required to generate a `Sample`, in the form of the `Producer` trait. That trait allows reporting errors, but those errors are divorced from the timeseries underlying them.

`oximeter` needs to grow the concept of a missing or failed sample specifically, to indicate that there should be data from a particular timeseries at some time, but that it could not be generated. To put a finer point on it, there's currently no way to distinguish whether a gap in a timeseries is due to a network partition between `oximeter` and the producer, or a failure to produce the sample at all.

One possible implementation is to make the `Sample` contain an enum like:

I'm not sure how valuable the `MissingCause` is, or if we just want a `String` there for folks to put whatever messages they want.

One trickier part about this is the database representation. We'd like to show these missing samples as "interleaved" with the raw data, which raises the question of how to model that in ClickHouse tables. A dumb first idea might be to make the actual measurement value column nullable, and add another nullable column for the error. I'm not sure what goes in that column, but some context about the error might be nice, such as a short string or a small enum with a few possible causes for such errors.