
Severe performance penalty with label filter in queries #13869

Closed
JohanLindvall opened this issue Aug 13, 2024 · 0 comments · Fixed by #13922

JohanLindvall commented Aug 13, 2024

Describe the bug
We are using Loki 3.1.1, deployed with the official Helm charts in distributed mode.

Some of our Loki queries are unexpectedly slow. These queries filter on a label to reduce the query volume and then apply a large number of fairly expensive regexes.

To Reproduce
This is a typical query (abbreviated; I have removed about 50 negative regexes of the form !~".*some expression.*"):

(Note that severity_text comes from structured metadata)

{source_type="kubernetes", namespace="service"} |  severity_text="error" !~ `.*Error when executing service.*Check.*The client reset.*.*` != `Failed to index help data` !~ ...

The label filter on severity_text should filter the query input volume heavily, from ~900MiB down to ~20MiB, but the queries still time out after 60 seconds and/or consume a lot of CPU resources.

I believe we are being hit by #8914, which puts the very expensive line filters before the label filter expressions.
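To illustrate why that ordering hurts, here is a standalone sketch (plain Go, not Loki code; the entry type and the sample data are invented for illustration): when the cheap severity check runs first, the expensive regexes only ever see the small surviving subset.

```go
package main

import (
	"fmt"
	"regexp"
)

// entry is a stand-in for a log line plus its structured metadata.
type entry struct {
	line     string
	severity string // plays the role of the severity_text label
}

func main() {
	logs := []entry{
		{"Failed to index help data", "error"},
		{"request handled in 3ms", "info"},
		{"healthcheck ok", "info"},
	}
	// One of the ~50 expensive negative filters from the real query.
	expensive := regexp.MustCompile(`.*Error when executing service.*Check.*The client reset.*`)

	kept := 0
	for _, e := range logs {
		// Cheap label filter first: in the reported query this step alone
		// drops the input from ~900MiB to ~20MiB before any regex runs.
		if e.severity != "error" {
			continue
		}
		// The expensive regex filters only run on the surviving entries.
		if expensive.MatchString(e.line) {
			continue
		}
		kept++
	}
	fmt.Println("lines kept:", kept)
}
```

With the reordered pipeline the regexes run against every line in the ~900MiB input instead, which matches the CPU usage and timeouts we observe.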

If I convert the query to

{source_type="kubernetes", namespace="service"} |  severity_text="error" | line_format "{{__line__}}" !~ `.*Error when executing service.*Check.*The client reset.*.*` != `Failed to index help data` !~ ...

it completes in well under a second. This disables the line filter reordering in reorderStages.

Expected behavior
I expect the query to complete in under 1 second without the workaround.

Environment:

  • Running on AKS 1.29.7, Loki 3.1.1 deployed with Helm, distributed charts.

Analysis
To me, it seems as if the code in

func (m MultiStageExpr) reorderStages() []StageExpr

does not handle LabelFilterExpr, so line filters get moved ahead of label filter stages.
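As a rough illustration of what handling label filters there might look like, here is a minimal sketch with hypothetical stage types (not Loki's actual AST, and not necessarily the fix in #13922): line filters are only hoisted until the first label filter stage, so a selective structured-metadata filter stays ahead of the expensive regexes that follow it in the query.

```go
package main

import "fmt"

// Hypothetical, simplified stage types; not Loki's parser AST.
type Stage interface{ String() string }

type LineFilterStage struct{ Pattern string }
type LabelFilterStage struct{ Expr string }

func (s LineFilterStage) String() string  { return "line filter " + s.Pattern }
func (s LabelFilterStage) String() string { return "label filter " + s.Expr }

// reorderStages hoists line filters to the front so they run before more
// expensive stages, but stops hoisting once a label filter has been seen,
// keeping line filters that appear after it in the query after it at runtime.
func reorderStages(stages []Stage) []Stage {
	var lineFilters, rest []Stage
	barrierSeen := false
	for _, s := range stages {
		switch s.(type) {
		case LabelFilterStage:
			barrierSeen = true
			rest = append(rest, s)
		case LineFilterStage:
			if barrierSeen {
				rest = append(rest, s) // keep in place after the label filter
			} else {
				lineFilters = append(lineFilters, s)
			}
		default:
			rest = append(rest, s)
		}
	}
	return append(lineFilters, rest...)
}

func main() {
	pipeline := []Stage{
		LabelFilterStage{`severity_text="error"`},
		LineFilterStage{"!~ `.*Error when executing service.*`"},
		LineFilterStage{"!= `Failed to index help data`"},
	}
	for _, s := range reorderStages(pipeline) {
		fmt.Println(s.String())
	}
}
```

Running this prints the label filter first, followed by the line filters in their original order, which is the behavior I would expect for the query above.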
