
argocd app logs [app] --follow no longer works in ArgoCD 2.3.1 #8947

Closed
3 tasks done
todaywasawesome opened this issue Mar 30, 2022 · 8 comments · Fixed by #8987
Labels
bug Something isn't working

Comments

@todaywasawesome
Contributor

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

argocd app logs [app] --follow no longer works in ArgoCD 2.3.1

After initially displaying output, it fails after roughly 30 seconds with an rpc error:
FATA[0032] stream read failed: rpc error: code = Unknown desc = net/http: request canceled (Client.Timeout or context cancellation while reading body)

This happens in both the CLI and the UI.

To Reproduce

  • Install Argo CD 2.3.1
  • Try to stream logs with argocd app logs [app] --follow
  • Wait ~30 seconds

Expected behavior

Following logs should continue to work.

Screenshots
(screenshot attached: Screen Shot 2022-03-30 at 11 50 19 AM, showing the stream read failure in the CLI)

Version

argocd version                   
argocd: v2.3.3+07ac038.dirty
  BuildDate: 2022-03-30T05:14:36Z
  GitCommit: 07ac038a8f97a93b401e824550f0505400a8c84e
  GitTreeState: dirty
  GoVersion: go1.18
  Compiler: gc
  Platform: darwin/arm64
argocd-server: v2.3.1+b65c169
  BuildDate: 2022-03-10T22:51:09Z
  GitCommit: b65c1699fa2a2daa031483a3890e6911eac69068
  GitTreeState: clean
  GoVersion: go1.17.6
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: v4.4.1 2021-11-11T23:36:27Z
  Helm Version: v3.8.0+gd141386
  Kubectl Version: v0.23.1
  Jsonnet Version: v0.18.0

Logs

:"argoproj.io/v1alpha1","resourceVersion":"55685256"}, "reason": "unchanged", "message": "unchanged Application \"elite-cluster-longhorn\""}
time="2022-03-30T17:45:43Z" level=info msg="unchanged Application" app=elite-cluster-nginx-ingress appSet=elite-cluster
1.6486623438576424e+09	DEBUG	events	Normal	{"object": {"kind":"ApplicationSet","namespace":"argocd","name":"elite-cluster","uid":"d22d39bd-1d94-49ed-ad82-38c392fd2c78","apiVersion":"argoproj.io/v1alpha1","resourceVersion":"55685256"}, "reason": "unchanged", "message": "unchanged Application \"elite-cluster-nginx-ingress\""}
time="2022-03-30T17:45:43Z" level=info msg="unchanged Application" app=elite-cluster-prometheus appSet=elite-cluster
1.6486623438592565e+09	DEBUG	events	Normal	{"object": {"kind":"ApplicationSet","namespace":"argocd","name":"elite-cluster","uid":"d22d39bd-1d94-49ed-ad82-38c392fd2c78","apiVersion":"argoproj.io/v1alpha1","resourceVersion":"55685256"}, "reason": "unchanged", "message": "unchanged Application \"elite-cluster-prometheus\""}
time="2022-03-30T17:45:43Z" level=info msg="unchanged Application" app=elite-cluster-system-upgrade-controller appSet=elite-cluster
1.648662343859787e+09	DEBUG	events	Normal	{"object": {"kind":"ApplicationSet","namespace":"argocd","name":"elite-cluster","uid":"d22d39bd-1d94-49ed-ad82-38c392fd2c78","apiVersion":"argoproj.io/v1alpha1","resourceVersion":"55685256"}, "reason": "unchanged", "message": "unchanged Application \"elite-cluster-system-upgrade-controller\""}
time="2022-03-30T17:45:43Z" level=info msg="unchanged Application" app=elite-cluster-cert-manager appSet=elite-cluster
1.6486623438601198e+09	DEBUG	events	Normal	{"object": {"kind":"ApplicationSet","namespace":"argocd","name":"elite-cluster","uid":"d22d39bd-1d94-49ed-ad82-38c392fd2c78","apiVersion":"argoproj.io/v1alpha1","resourceVersion":"55685256"}, "reason": "unchanged", "message": "unchanged Application \"elite-cluster-cert-manager\""}
time="2022-03-30T17:45:43Z" level=info msg="unchanged Application" app=elite-cluster-metallb appSet=elite-cluster
1.648662343860343e+09	DEBUG	events	Normal	{"object": {"kind":"ApplicationSet","namespace":"argocd","name":"elite-cluster","uid":"d22d39bd-1d94-49ed-ad82-38c392fd2c78","apiVersion":"argoproj.io/v1alpha1","resourceVersion":"55685256"}, "reason": "unchanged", "message": "unchanged Application \"elite-cluster-metallb\""}
time="2022-03-30T17:45:43Z" level=info msg="end reconcile" requeueAfter=20s
time="2022-03-30T17:45:43Z" level=info msg="finished unary call with code OK" grpc.code=OK grpc.method=Check grpc.request.deadline="2022-03-30T17:46:43Z" grpc.service=grpc.health.v1.Health grpc.start_time="2022-03-30T17:45:43Z" grpc.time_ms=0.041 span.kind=server system=grpc
FATA[0032] stream read failed: rpc error: code = Unknown desc = net/http: request canceled (Client.Timeout or context cancellation while reading body) 
@todaywasawesome todaywasawesome added the bug Something isn't working label Mar 30, 2022
@Paragon1970

@todaywasawesome (Dan), as discussed here in slack:
https://cloud-native.slack.com/archives/C01TSERG0KZ/p1648585025558609

Thanks again for taking the time to verify and submit this bug.

@danielhelfand
Contributor

Thanks @todaywasawesome and @Paragon1970 for opening this. Here's what I have so far:

  1. The app logs command is not retrying appropriately in this case (an unknown gRPC code is being returned, as shown in the logs above: code = Unknown).
  2. If you add logic like || st.Code() == codes.Unknown && follow, the --follow flag works again, but it still spams the timeout error above. It seems like not enough error info is being passed back to handle what is actually happening.

@danielhelfand
Contributor

It looks like #8404 introduced the ARGOCD_K8S_TCP_TIMEOUT environment variable. It defaults to timing out API calls after 32 seconds, which causes log following to stop in both the CLI and the UI.

K8sServerSideTimeout = env.ParseDurationFromEnv(EnvK8sTCPTimeout, 32*time.Second, 0, math.MaxInt32)

@Paragon1970

Nice work @danielhelfand and very much appreciated.
Thanks for the update.

@danielhelfand
Contributor

Sure thing @Paragon1970. Setting the env var above to a higher value should serve as a workaround for now.

This seems like something users should opt into, as opposed to the default being set as it is now. Hopefully an opt-in will be the long-term fix.

@Paragon1970

Hi @danielhelfand, I would also agree that this should be opt-in, since, as the document linked in #8404 states, its design consideration is stabilizing operations to clusters at the far edge.

I have attempted to add ARGOCD_K8s_TCP_TIMEOUT to the application server, but it has not made any difference. Still no dice.
Is this the correct location, or am I missing something?
It is configured and applied under spec.template.spec.containers.env.

ARGOCD_K8s_TCP_TIMEOUT is shown as set within the argocd-application-controller pod

argocd@argocd-application-controller-0:~$ env | grep ARGOCD_K8s_TCP_TIMEOUT 
ARGOCD_K8s_TCP_TIMEOUT=300

ArgoCD client / server are now on 2.3.3 (latest)

@danielhelfand
Contributor

You may have unfortunately discovered another bug. I get the following when I set the env var on the application-controller stateful set:

k logs statefulset/argocd-application-controller
time="2022-04-03T19:46:19Z" level=warning msg="Value in ARGOCD_K8S_TCP_TIMEOUT is 10m0s, which is greater than maximum 2.147483647s allowed"
time="2022-04-03T19:46:19Z" level=warning msg="Value in ARGOCD_K8S_TCP_TIMEOUT is 10m0s, which is greater than maximum 2.147483647s allowed"

The comparison of the configured duration to math.MaxInt32 does not allow this to be configured at the moment:

if dur > max {

The func has a max param to disallow values over a certain limit, but math.MaxInt32 is being interpreted as a duration in nanoseconds (~2.15s) rather than as a plain integer bound, so any reasonable timeout is rejected. Here's an example of comparing durations in the Go playground: https://go.dev/play/p/my5nte8wHum

@Paragon1970

Thanks for the above information @danielhelfand.
I visited the Go playground example you linked, which clearly illustrates the issue you describe. Thanks for this 👍
