Improve detections for sorting data before dedup #5719

Open · fpetkovski opened this issue Sep 24, 2022 · 1 comment

fpetkovski (Contributor) commented Sep 24, 2022

This issue is just me documenting my findings from trying to remove the expensive sort that the querier does here. Some manual tests I did in a staging environment showed that the sort alone takes 50% of the entire time needed to fetch and deduplicate all series.

This issue also proposes some additional improvements that can be done as follow-ups to #5692, or even directly in that PR.

Problem

Thanks to #5296, we now have a mechanism to merge series coming from multiple stores on the fly.

That change alone is not sufficient to completely remove a very expensive re-sort of the data, which is mandatory for deduplication to work correctly. We need to re-sort series because the dedup process relies on series being sorted without taking their replica labels into account, so that series which are replicas of each other end up next to each other in the series set. Such sorting is not easy to achieve in stores without buffering, because the TSDB has no notion of replica labels and can only return series sorted with replica labels included in the comparison.
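To make the required ordering concrete, here is a minimal, self-contained Go sketch (not actual Thanos code; the Label type and helper names are illustrative) of the comparison deduplication needs: series ordered by the labels that remain once replica labels are stripped.

```go
package main

// Label is a simplified stand-in for a series label; Thanos uses the
// Prometheus labels package, but this sketch keeps things self-contained.
type Label struct {
	Name, Value string
}

// withoutReplicaLabels returns lset with the given replica labels dropped.
// Label sets are assumed to be sorted by label name, as in TSDB.
func withoutReplicaLabels(lset []Label, replicaLabels map[string]struct{}) []Label {
	out := make([]Label, 0, len(lset))
	for _, l := range lset {
		if _, ok := replicaLabels[l.Name]; ok {
			continue
		}
		out = append(out, l)
	}
	return out
}

// compareForDedup orders two series the way deduplication needs them:
// lexicographically by the labels that remain after replica labels are
// stripped. A store returning plain TSDB order instead compares full label
// sets, replica labels included, which is a different order.
func compareForDedup(a, b []Label, replicaLabels map[string]struct{}) int {
	a = withoutReplicaLabels(a, replicaLabels)
	b = withoutReplicaLabels(b, replicaLabels)
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i].Name != b[i].Name {
			if a[i].Name < b[i].Name {
				return -1
			}
			return 1
		}
		if a[i].Value != b[i].Value {
			if a[i].Value < b[i].Value {
				return -1
			}
			return 1
		}
	}
	return len(a) - len(b)
}
```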

PR #5692 adds detections which allow us to avoid this sort in certain cases. The sort is not necessary if:

  • Each store sends series in TSDB-sorted order.
  • Each store is responsible for sending data from only one replica.

Condition 2 is mandatory because data sent in the following order from a single store is not ready for dedup:

a=1, replica=1, z=1
a=1, replica=1, z=2
a=1, replica=2, z=1

After stripping or removing the replica label, the series end up out of order, and the two copies of a=1, z=1 are no longer adjacent, so they will not be properly deduplicated.
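Applying the hypothetical compareForDedup helper from the earlier sketch to these three series shows the inversion:

```go
// Continues the sketch above (same package).
func main() {
	replica := map[string]struct{}{"replica": {}}

	s1 := []Label{{"a", "1"}, {"replica", "1"}, {"z", "1"}}
	s2 := []Label{{"a", "1"}, {"replica", "1"}, {"z", "2"}}
	s3 := []Label{{"a", "1"}, {"replica", "2"}, {"z", "1"}}

	// The store sends s1, s2, s3 in TSDB order (replica labels included in
	// the comparison). With "replica" stripped, s2 ({a=1, z=2}) sorts after
	// s3 ({a=1, z=1}), so the stream is no longer sorted for dedup and the
	// two copies of {a=1, z=1} (s1 and s3) are not adjacent.
	println(compareForDedup(s1, s2, replica) < 0) // true: still in order
	println(compareForDedup(s2, s3, replica) < 0) // false: out of order
}
```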

The checks added in #5692 can still produce false positives when a store sends data for multiple replicas but the data is already sorted in the order needed for deduplication. A concrete case would be a layered setup where one querier connects to the proxy of another querier. The proxy in this case would properly sort data for dedup, but the multi-replica detection would trigger a global sort in the root querier. There can also be cases where a store-gateway sends multi-replica data sorted in an order appropriate for dedup, such as:

a=1, b=1, replica=1
a=1, b=1, replica=2
a=1, b=2, replica=1

Potential solutions

There are two potential solutions that come to mind. We can pursue either of them or come up with something else.

  1. Replace the multi-replica detection with a check that asserts that series coming from each store are sorted correctly for dedup. This can be done by keeping track of the last received series in the asyncResponseSet. We can then compare each incoming series with the previous one, without taking replica labels into account (see the sketch after this list). In this case we still need to wait for all series to be received before starting deduplication, because until we have seen the last series we cannot know whether the data is properly sorted for dedup.

  2. If we merge #5692 (Push replica labels at the end after Recv), replica labels can be propagated to stores in the storepb.SeriesRequest. This will allow us to sort series in a dedup-ready manner directly in the individual stores. However, this will be challenging to do without buffering, since we cannot iterate through series from the TSDB in an arbitrary order: the TSDB has its own sort order, which is not aware of replica labels. If we manage to achieve this, all detections can be removed from the querier and we can rely on stores sending data in dedup-ready order.
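A rough sketch of what option 1 could look like, reusing the hypothetical compareForDedup helper from above (sortednessChecker is an illustrative name, not an existing Thanos type): each response stream remembers the previously received series and flips a flag on the first out-of-order pair.

```go
// sortednessChecker tracks, per store response stream, whether the series
// received so far arrive in dedup-ready order (ignoring replica labels).
type sortednessChecker struct {
	replicaLabels  map[string]struct{}
	prev           []Label
	sortedForDedup bool
}

func newSortednessChecker(replicaLabels map[string]struct{}) *sortednessChecker {
	return &sortednessChecker{replicaLabels: replicaLabels, sortedForDedup: true}
}

// observe is called for every series as it is received from a single store.
func (c *sortednessChecker) observe(lset []Label) {
	if c.prev != nil && compareForDedup(c.prev, lset, c.replicaLabels) > 0 {
		c.sortedForDedup = false
	}
	c.prev = lset
}

// ok only gives a definitive answer once the last series has been observed,
// which is why deduplication still has to wait for the full stream.
func (c *sortednessChecker) ok() bool { return c.sortedForDedup }
```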

Note

We have been aiming at removing buffering in the seriesServer completely and streaming series on the fly into the engine. While this seems like it would remove a large bottleneck, in practice the benefits could be fairly marginal. PromQL evaluation is done step by step, which means we need access to all series immediately for the first step. As a result, the current engine expands all series as soon as it receives them, so removing the additional buffering only saves us one linear pass through all series.

fpetkovski changed the title from "Completely remove the need to sort data before deduplication" to "Improve detections for sorting data before dedup" on Sep 24, 2022
bwplotka (Member) commented
We discussed it on the last community call -> the solution would be to add an extra guarantee to the Store API:

  • to ensure all series are sorted
  • to allow sorting by/without labels (or removing them completely)

Then we should have a hint in the Info or Series response indicating whether this is done or not (:
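Purely as an illustration of the proposed hint (the field and type names below are hypothetical, not existing storepb definitions), the querier-side decision could boil down to something like this:

```go
// storeGuarantees is a hypothetical stand-in for whatever the Info or Series
// response would carry once the extra Store API guarantee exists.
type storeGuarantees struct {
	// SortedWithoutReplicaLabels would mean the store already returns series
	// sorted with the requested replica labels ignored or removed, i.e. in
	// dedup-ready order.
	SortedWithoutReplicaLabels bool
}

// needsResort reports whether the querier still has to do its global re-sort:
// it can only be skipped when every connected store advertises the guarantee.
func needsResort(stores []storeGuarantees) bool {
	for _, s := range stores {
		if !s.SortedWithoutReplicaLabels {
			return true
		}
	}
	return false
}
```

Whether the hint ends up in the Info call or on each Series response, the querier-side logic stays the same: skip the re-sort only when every store opts in.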
