Promtail duplicates events during pod recreation #3443

Closed
jtackaberry opened this issue Mar 7, 2021 · 2 comments · Fixed by grafana/helm-charts#279
Labels
stale A stale issue or PR that will automatically be closed.

Comments

@jtackaberry (Contributor)

I have Promtail 2.1 and Loki 2.1 running in a single-node microk8s cluster, installed via their respective Helm charts. Promtail seems to get momentarily confused when a pod is recreated: events from the new pod are erroneously labeled under the previous pod name while the same events are also ingested under the current pod name, so the events end up duplicated.

In this scenario I have a k8s deployment called "stagehand" whose pod stagehand-6d4ff786c-vx8jw was deleted, and a replacement pod was automatically created as stagehand-6d4ff786c-zmn47. The log lines emitted during startup of the new pod zmn47 are the ones being duplicated.

Easiest just to demonstrate this with an annotated screenshot from Grafana:

[Annotated Grafana screenshot: the same startup log lines appear twice, once under pod=stagehand-6d4ff786c-vx8jw and once under pod=stagehand-6d4ff786c-zmn47.]

The duplication stops after a couple seconds.
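
If it helps anyone reproduce this, a Grafana Explore query along these lines makes the overlap visible. The selector labels (namespace, container) are taken from the Promtail target key in the log excerpt below; treat the exact queries as a sketch for this particular setup rather than a general recipe. The first query shows the raw streams (the same lines appear under both pod labels during the overlap window), and the second counts lines per pod label:

{namespace="stagehand", container="stagehand"}

sum by (pod) (count_over_time({namespace="stagehand", container="stagehand"}[1m]))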

Here's the relevant log excerpt from Promtail:

ts=2021-03-07T21:22:18.523526049Z caller=log.go:124 component=tailer level=info msg="Re-opening moved/deleted file /var/log/pods/stagehand_stagehand-6d4ff786c-vx8jw_57a4765b-fc1f-4ca2-9913-7d07eb384b8c/stagehand/0.log ..."
ts=2021-03-07T21:22:18.52364537Z caller=log.go:124 component=tailer level=info msg="Waiting for /var/log/pods/stagehand_stagehand-6d4ff786c-vx8jw_57a4765b-fc1f-4ca2-9913-7d07eb384b8c/stagehand/0.log to appear..."
level=info ts=2021-03-07T21:22:19.46996655Z caller=tailer.go:166 component=tailer msg="skipping update of position for a file which does not currently exist"
level=info ts=2021-03-07T21:22:19.470983766Z caller=tailer.go:95 component=tailer msg="position timer: exited" path=/var/log/pods/stagehand_stagehand-6d4ff786c-vx8jw_57a4765b-fc1f-4ca2-9913-7d07eb384b8c/stagehand/0.log
level=info ts=2021-03-07T21:22:19.471048676Z caller=tailer.go:166 component=tailer msg="skipping update of position for a file which does not currently exist"
level=info ts=2021-03-07T21:22:19.471100326Z caller=tailer.go:122 component=tailer msg="tail routine: started" path=/var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log
level=info ts=2021-03-07T21:22:19.471144555Z caller=tailer.go:138 component=tailer msg="tail routine: tail channel closed, stopping tailer" path=/var/log/pods/stagehand_stagehand-6d4ff786c-vx8jw_57a4765b-fc1f-4ca2-9913-7d07eb384b8c/stagehand/0.log
ts=2021-03-07T21:22:19.471176509Z caller=log.go:124 component=tailer level=info msg="Seeked /var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log - &{Offset:0 Whence:0}"
level=info ts=2021-03-07T21:22:19.471179506Z caller=tailer.go:131 component=tailer msg="tail routine: exited" path=/var/log/pods/stagehand_stagehand-6d4ff786c-vx8jw_57a4765b-fc1f-4ca2-9913-7d07eb384b8c/stagehand/0.log
level=info ts=2021-03-07T21:22:19.471219447Z caller=tailer.go:204 component=tailer msg="stopped tailing file" path=/var/log/pods/stagehand_stagehand-6d4ff786c-vx8jw_57a4765b-fc1f-4ca2-9913-7d07eb384b8c/stagehand/0.log
level=info ts=2021-03-07T21:22:19.478799466Z caller=filetargetmanager.go:261 msg="Adding target" key="{app=\"stagehand\", container=\"stagehand\", job=\"stagehand/stagehand\", namespace=\"stagehand\", node_name=\"k8s\", pod=\"stagehand-6d4ff786c-zmn47\"}"
level=info ts=2021-03-07T21:22:19.479332502Z caller=filetargetmanager.go:277 msg="Removing target" key="{app=\"stagehand\", container=\"stagehand\", job=\"stagehand/stagehand\", namespace=\"stagehand\", node_name=\"k8s\", pod=\"stagehand-6d4ff786c-vx8jw\"}"
ts=2021-03-07T21:22:19.479445147Z caller=log.go:124 component=tailer level=info msg="Seeked /var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log - &{Offset:0 Whence:0}"
level=info ts=2021-03-07T21:22:19.479442941Z caller=tailer.go:122 component=tailer msg="tail routine: started" path=/var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log
level=info ts=2021-03-07T21:22:19.482425419Z caller=tailer.go:95 component=tailer msg="position timer: exited" path=/var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log
level=info ts=2021-03-07T21:22:19.482528085Z caller=tailer.go:138 component=tailer msg="tail routine: tail channel closed, stopping tailer" path=/var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log
level=info ts=2021-03-07T21:22:19.482551112Z caller=tailer.go:131 component=tailer msg="tail routine: exited" path=/var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log
level=info ts=2021-03-07T21:22:19.482571404Z caller=tailer.go:204 component=tailer msg="stopped tailing file" path=/var/log/pods/stagehand_stagehand-6d4ff786c-zmn47_d54e002f-f0c2-4f97-aae0-10f4ebad6960/stagehand/0.log
level=info ts=2021-03-07T21:22:19.482585098Z caller=filetarget.go:169 msg="filetarget: watcher closed, tailer stopped, positions saved" path=/var/log/pods/*/stagehand/*.log

FWIW, I have Fluent Bit sending to Elasticsearch on the same node and there's no duplication there.
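
A likely contributing factor, judging from the last tailer line above, is that the target's path glob (/var/log/pods/*/stagehand/*.log) wildcards the pod directory, so the old target briefly matches and tails the new pod's 0.log before service discovery removes it, while the new target tails the same file under the new pod label. Below is a sketch of a scrape config that pins the glob to the pod UID; the meta labels are the standard kubernetes_sd ones, but this is an illustration of the idea, not necessarily the exact change merged in grafana/helm-charts#279:

scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Sketch: join <pod uid>/<container> and glob on
      # /var/log/pods/<ns>_<pod>_<uid>/<container>/*.log so each target
      # only ever matches its own pod's log directory.
      - source_labels:
          - __meta_kubernetes_pod_uid
          - __meta_kubernetes_pod_container_name
        separator: /
        target_label: __path__
        replacement: /var/log/pods/*$1/*.log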


stale bot commented Jun 3, 2021

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Jun 3, 2021
@stale stale bot closed this as completed Jun 11, 2021
cyriltovena pushed a commit to cyriltovena/loki that referenced this issue Jun 11, 2021
@zhoushuke

@jtackaberry did you fix the problem with that merged change? I'm hitting the same problem, but in my case the duplicated logs persist over time, whereas yours stopped after a couple of seconds.
By the way, my k8s cluster has the annotation "kubernetes.io/config.hash".
Sorry to take up your time.
