You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When oximeter fetches samples from a producer, it extracts a schema and a timeseries key. This is used to ID the specific instantiation of the timeseries within the schema, which is critical for being able to associate the records for a specific sample across the various tables. We use this to determine which fields go with which measurements.
We're currently using std::collections::hash_map::DefaultHasher to do this. There are no real stability guarantees around that though, which means we may lose the ability to associate records if that hash algorithm changes. We need to switch to a stable algorithm. XXHash or CityHash are attractive candidates.
I see no way to do this without breaking existing data. We need to do this ASAP, to minimize the data loss.
The text was updated successfully, but these errors were encountered:
#4251)
- Uses [bcs](https://crates.io/crates/bcs) as a stable binary format for
inputs to the `timeseries_key` function
- Uses [highway](https://crates.io/crates/highway) as a stable hash
algorithm (it's keyed, fast, portable, and well-distributed).
- Additionally, adds an EXPECTORATE test validating the stability of
timeseries_key values.
Fixes#4008 , and also
addresses the issue raised in
#4221 regarding stable
input
NOTE: This PR itself *also* breaks the stability of the `timeseries_key`
(hopefully for the last time), and will rely on
#4246 to wipe the metrics
DB for the next release.
When
oximeter
fetches samples from a producer, it extracts a schema and a timeseries key. This is used to ID the specific instantiation of the timeseries within the schema, which is critical for being able to associate the records for a specific sample across the various tables. We use this to determine which fields go with which measurements.We're currently using
std::collections::hash_map::DefaultHasher
to do this. There are no real stability guarantees around that though, which means we may lose the ability to associate records if that hash algorithm changes. We need to switch to a stable algorithm. XXHash or CityHash are attractive candidates.I see no way to do this without breaking existing data. We need to do this ASAP, to minimize the data loss.
The text was updated successfully, but these errors were encountered: