Better enable discovering dropped log lines #2280

Closed

owen-d opened this issue Jul 1, 2020 · 6 comments

Labels: stale (A stale issue or PR that will automatically be closed.)

Comments

owen-d (Member) commented Jul 1, 2020

Finding dropped log lines has proved to be a problem for many users. We've been hesitant to log these at the API level out of concern about clutter. There are currently metrics covering the intersection of tenant and reason for dropped lines, but we should consider better supporting the use case of actually finding them.

Perhaps we can log the label sets at the promtail level?
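
For reference, below is a minimal sketch of how those existing per-tenant, per-reason counters could be turned into an alert. It assumes the counter is exposed as loki_discarded_samples_total with tenant and reason labels; the exact metric and label names may differ between Loki versions, and the rule is illustrative only.

# Hypothetical Prometheus alerting rule on Loki's discarded-lines counters.
# Metric and label names (loki_discarded_samples_total, tenant, reason) are
# assumptions and may differ between Loki versions.
groups:
  - name: loki-discarded-lines
    rules:
      - alert: LokiDroppingLogLines
        expr: |
          sum by (tenant, reason) (
            rate(loki_discarded_samples_total[5m])
          ) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Loki is dropping lines for tenant {{ $labels.tenant }} (reason: {{ $labels.reason }})"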

slim-bean (Collaborator) commented:

The error message we return to the client is fairly descriptive for these dropped lines:

// RateLimited is a reason for discarding log lines when the tenant's ingestion rate limit has been exceeded.
RateLimited = "rate_limited"
rateLimitErrorMsg = "Ingestion rate limit exceeded (limit: %d bytes/sec) while attempting to ingest '%d' lines totaling '%d' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
// LineTooLong is a reason for discarding too long log lines.
LineTooLong = "line_too_long"
lineTooLongErrorMsg = "Max entry size '%d' bytes exceeded for stream '%s' while adding an entry with length '%d' bytes"
// StreamLimit is a reason for discarding lines when we can't create a new stream
// because the limit of active streams has been reached.
StreamLimit = "stream_limit"
streamLimitErrorMsg = "Maximum active stream limit exceeded, reduce the number of active streams (reduce labels or reduce label values), or contact your Loki administrator to see if the limit can be increased"
// GreaterThanMaxSampleAge is a reason for discarding log lines which are older than the current time - `reject_old_samples_max_age`
GreaterThanMaxSampleAge = "greater_than_max_sample_age"
greaterThanMaxSampleAgeErrorMsg = "entry for stream '%s' has timestamp too old: %v"
// TooFarInFuture is a reason for discarding log lines which are newer than the current time + `creation_grace_period`
TooFarInFuture = "too_far_in_future"
tooFarInFutureErrorMsg = "entry for stream '%s' has timestamp too new: %v"
// MaxLabelNamesPerSeries is a reason for discarding a log line which has too many label names
MaxLabelNamesPerSeries = "max_label_names_per_series"
maxLabelNamesPerSeriesErrorMsg = "entry for stream '%s' has %d label names; limit %d"
// LabelNameTooLong is a reason for discarding a log line which has a label name too long
LabelNameTooLong = "label_name_too_long"
labelNameTooLongErrorMsg = "stream '%s' has label name too long: '%s'"
// LabelValueTooLong is a reason for discarding a log line which has a label value too long
LabelValueTooLong = "label_value_too_long"
labelValueTooLongErrorMsg = "stream '%s' has label value too long: '%s'"
// DuplicateLabelNames is a reason for discarding a log line which has duplicate label names
DuplicateLabelNames = "duplicate_label_names"
duplicateLabelNamesErrorMsg = "stream '%s' has duplicate label name: '%s'"

I'm reasonably sure we log this in promtail, but I haven't verified that in a while; it would be worth doing.

slim-bean (Collaborator) commented:

One area we should improve here is the promtail_mixin: its dashboard, metrics, and alerts should be updated to give better visibility into dropped lines.

It's tricky, though, because of the batched nature of promtail pushes: when we get an error back, we don't know how much of the batch succeeded versus failed. Loki doesn't provide that information currently, and it's not easy to fix...
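
As a client-side stopgap for the mixin, something like the rule below could alert on promtail's own counters. It assumes promtail exposes a promtail_dropped_entries_total counter for entries it gave up sending; the metric name, label set, and threshold are assumptions rather than the mixin's actual rules.

# Hypothetical addition to the promtail mixin alerts. promtail_dropped_entries_total
# is assumed to count entries promtail could not deliver; labels may differ.
groups:
  - name: promtail-dropped-entries
    rules:
      - alert: PromtailDroppingEntries
        expr: sum by (instance) (rate(promtail_dropped_entries_total[5m])) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "promtail on {{ $labels.instance }} is dropping log entries"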

owen-d (Member, Author) commented Jul 6, 2020

I think this will be greatly helped by LogQL v2's addition of metric extraction plus an exported dashboard. That will let us not only intersect the tenant and reason labels but also group by other labels (container, etc.), allowing discovery of where the dropped lines come from.
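
To make that concrete, here is a speculative sketch of the kind of metric-extraction query such a dashboard could be built on, written as a Loki ruler rule. It assumes the dropped entries show up as logfmt-formatted log lines reachable via a {job="promtail"} selector and carrying tenant, reason, and container fields; every one of those names is an assumption for illustration.

# Hypothetical Loki ruler rule: turn "dropped entry" log lines into a metric
# grouped by tenant, reason, and originating container. The selector and field
# names are placeholders, not an existing log format.
groups:
  - name: dropped-lines-discovery
    rules:
      - record: promtail:dropped_entries:rate5m
        expr: |
          sum by (tenant, reason, container) (
            rate({job="promtail"} |= "dropped" | logfmt [5m])
          )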

piroux commented Jul 6, 2020

Moreover, I would really love to be able to generate metrics, in a match block, on filtered logs which I am dropping intentionally.
Right now, as written in the docs, this is not possible with the current behaviour of the drop action:

When set to drop, entries will be dropped and no later metrics will be recorded.

So what I would need, in order to better discover the logs I drop, is to defer the drop action to the end of the match block and be able to add metrics stages before it.

I do not know if this is out of the scope of this issue; I guess you will tell me.
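
Until something like that exists, one workaround is to count the matching lines in a separate match stage placed just before the stage that drops them, since metrics stages in an earlier match block still run. A rough pipeline_stages sketch follows; the selector, app label, and metric name are placeholders, not anything taken from this issue.

pipeline_stages:
  # First match: same selector as the drop stage below, but only records a metric.
  - match:
      selector: '{app="myapp"} |= "debug"'
      stages:
        - metrics:
            dropped_debug_lines_total:
              type: Counter
              description: "lines about to be dropped by the next stage"
              config:
                match_all: true
                action: inc
  # Second match: the same selector again, this time actually dropping the lines.
  - match:
      selector: '{app="myapp"} |= "debug"'
      action: drop

The cost is evaluating the selector twice, but it yields a counter for intentionally dropped lines without changing the drop semantics.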

stale bot commented Aug 5, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Aug 5, 2020
stale bot closed this as completed Aug 13, 2020
cyriltovena pushed a commit to cyriltovena/loki that referenced this issue Jun 11, 2021

LinTechSo (Contributor) commented:

Hi, any updates?
