chore(ci): Upgrade Knative to version 0.20 #2195

Closed

astefanutti
Member

Release Note

NONE

@astefanutti astefanutti added the status/wip Work in progress label Apr 8, 2021
@astefanutti
Member Author

Here is my current analysis of the Knative workflow failure:

  • Some tests that cover SinkBinding fail because the Knative eventing webhook does not inject the K_SINK environment variable, which leaves Integrations unable to resolve the sink channel:
    java.lang.IllegalArgumentException: Unable to find a resource definition for channel/sink/messages
  • The Knative eventing webhook fails because it reports the sink channel referenced in the SinkBinding does not exist:
     {
      "level":"error",
      "ts":"2021-05-07T08:19:20.022Z",
      "logger":"eventing-webhook.SinkBindings",
      "caller":"sinkbinding/controller.go:153",
      "msg":"Failed to get URI from Destination: %!w(*errors.StatusError=&{{{ } {   <nil>} Failure inmemorychannels.messaging.knative.dev \"messages\" not found NotFound 0xc001469e60 404}})",
      "commit":"a414aee",
      "knative.dev/pod":"eventing-webhook-6c759fbb5b-sjvc7",
      "knative.dev/traceid":"3f02244d-1dc7-4a99-82e3-caf29cdcbe87",
      "knative.dev/key":"test-53bbfc1e-9583-4fec-9bd6-04560089130c/knativegetpost1",
      "logging.googleapis.com/labels":{},
      "logging.googleapis.com/sourceLocation":{
        "file":"knative.dev/eventing/pkg/reconciler/sinkbinding/controller.go",
        "line":"153",
        "function":"knative.dev/eventing/pkg/reconciler/sinkbinding.(*SinkBindingSubResourcesReconciler).Reconcile"
      },
      "stacktrace":"knative.dev/eventing/pkg/reconciler/sinkbinding.(*SinkBindingSubResourcesReconciler).Reconcile\n\tknative.dev/eventing/pkg/reconciler/sinkbinding/controller.go:153\nknative.dev/pkg/webhook/psbinding.(*BaseReconciler).reconcile\n\tknative.dev/[email protected]/webhook/psbinding/reconciler.go:188\nknative.dev/pkg/webhook/psbinding.(*BaseReconciler).Reconcile\n\tknative.dev/[email protected]/webhook/psbinding/reconciler.go:140\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/[email protected]/controller/controller.go:523\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/[email protected]/controller/controller.go:461"
    }
  • But evidence from the tests and the Knative InMemoryChannel controller indicates that the sink channel actually exists
  • Running each of these tests individually succeeds, while running two of them sequentially produces the error, even though they run in different namespaces

I haven't been able to reproduce the error locally, so it seems to be environment-specific, which makes it hard to produce a simple reproducer. Someone with deeper knowledge of the Knative eventing internals may be able to help troubleshoot the issue and possibly identify the root cause.

@astefanutti astefanutti added help wanted Extra attention is needed area/continuous integration Related to CI and automated testing labels May 12, 2021
tadayosi added a commit to tadayosi/camel-k that referenced this pull request Jun 21, 2021
Due to apache#2195 and apache#2343, eventing is still required to stay lower than
v0.18.0.
@tadayosi
Member

tadayosi commented Jun 21, 2021

What I've found so far is that this change from knative-eventing in v0.18.0-dev triggered the failure in our test suite. So we can upgrade eventing up to v0.17.9 without failure.
knative/eventing#3348
And I can reliably reproduce the issue locally with minikube. Possibly it's only reproducible on Linux and not on Mac?

The change that caused the test failure gives me the impression that the root cause is related to some changes in the Kubernetes 1.18 API.

Another funny finding from my experiments is that when you kill the eventing-webhook pod each time in our Knative tests, like this:

kubectl delete -n knative-eventing po -l app=eventing-webhook

the tests seem to pass. If we desperately need to upgrade eventing to the latest version, that might be a possible workaround.

astefanutti pushed a commit that referenced this pull request Jun 21, 2021
Due to #2195 and #2343, eventing is still required to stay lower than
v0.18.0.
@astefanutti
Member Author

> What I've found so far is that this change from knative-eventing in v0.18.0-dev triggered the failure in our test suite. So we can upgrade eventing up to v0.17.9 without failure.
> knative/eventing#3348
> And I can reliably reproduce the issue locally with minikube. Possibly it's only reproducible on Linux and not on Mac?

IIRC, I tested it by running the e2e tests against an OCP 4.6 cluster, which is based on Kubernetes 1.19, but it's possible my testing procedure wasn't right. Did you manage to reproduce the exact same error reported by the Knative Eventing webhook?

> The change that caused the test failure gives me the impression that the root cause is related to some changes in the Kubernetes 1.18 API.
>
> Another funny finding from my experiments is that when you kill the eventing-webhook pod each time in our Knative tests, like this:
>
> kubectl delete -n knative-eventing po -l app=eventing-webhook
>
> the tests seem to pass. If we desperately need to upgrade eventing to the latest version, that might be a possible workaround.

That seems consistent with my experiments: each test passes individually because the Eventing webhook is executed only once!

@tadayosi
Member

> Did you manage to reproduce the exact same error reported by the Knative Eventing webhook?

I tested with a range of Eventing versions, including v0.18.0, v0.20.4, v0.21.4, and v0.22.1, and the errors that occurred vary slightly depending on the version. The error I had with v0.20.4 (if I remember correctly) is as follows:

{
  "level": "error",
  "ts": "2021-06-20T13:54:34.688Z",
  "logger": "eventing-webhook",
  "caller": "v1/sinkbinding_lifecycle.go:109",
  "msg": "URI could not be extracted from destination: ",
  "commit": "af9c290",
  "knative.dev/pod": "eventing-webhook-8489d86856-glgkz",
  "knative.dev/kind": "apps/v1, Kind=Deployment",
  "knative.dev/namespace": "test-90ba8b8a-f863-4ed1-912b-655a7efb17bd",
  "knative.dev/name": "knativegetpost1",
  "knative.dev/operation": "CREATE",
  "knative.dev/resource": "apps/v1, Resource=deployments",
  "knative.dev/subresource": "",
  "knative.dev/userinfo": "{system:serviceaccount:test-90ba8b8a-f863-4ed1-912b-655a7efb17bd:camel-k-operator f330ad09-9852-4c79-ac4e-fe7f8665a45a [system:serviceaccounts system:serviceaccounts:test-90ba8b8a-f863-4ed1-912b-655a7efb17bd system:authenticated] map[]}",
  "error": "inmemorychannels.messaging.knative.dev \"messages\" not found",
  "logging.googleapis.com/labels": {},
  "logging.googleapis.com/sourceLocation": {
    "file": "knative.dev/eventing/pkg/apis/sources/v1/sinkbinding_lifecycle.go",
    "line": "109",
    "function": "knative.dev/eventing/pkg/apis/sources/v1.(*SinkBinding).Do"
  },
  "stacktrace": "knative.dev/eventing/pkg/apis/sources/v1.(*SinkBinding).Do\n\tknative.dev/eventing/pkg/apis/sources/v1/sinkbinding_lifecycle.go:109\nknative.dev/pkg/webhook/psbinding.(*Reconciler).Admit\n\tknative.dev/[email protected]/webhook/psbinding/psbinding.go:221\nknative.dev/pkg/webhook.admissionHandler.func1\n\tknative.dev/[email protected]/webhook/admission.go:115\nnet/http.HandlerFunc.ServeHTTP\n\tnet/http/server.go:2042\nnet/http.(*ServeMux).ServeHTTP\n\tnet/http/server.go:2417\nknative.dev/pkg/webhook.(*Webhook).ServeHTTP\n\tknative.dev/[email protected]/webhook/webhook.go:248\nknative.dev/pkg/network/handlers.(*Drainer).ServeHTTP\n\tknative.dev/[email protected]/network/handlers/drain.go:96\nnet/http.serverHandler.ServeHTTP\n\tnet/http/server.go:2843\nnet/http.(*conn).serve\n\tnet/http/server.go:1925"
}
...
{
  "level": "warn",
  "ts": "2021-06-20T13:54:34.933Z",
  "logger": "eventing-webhook.SinkBindings",
  "caller": "psbinding/reconciler.go:147",
  "msg": "Failed to update resource status",
  "commit": "af9c290",
  "knative.dev/pod": "eventing-webhook-8489d86856-glgkz",
  "knative.dev/traceid": "232ab6fc-a146-4184-8536-1e8362e702d2",
  "knative.dev/key": "test-90ba8b8a-f863-4ed1-912b-655a7efb17bd/knativegetpost1",
  "error": "Operation cannot be fulfilled on sinkbindings.sources.knative.dev \"knativegetpost1\": the object has been modified; please apply your changes to the latest version and try again",
  "logging.googleapis.com/labels": {},
  "logging.googleapis.com/sourceLocation": {
    "file": "knative.dev/[email protected]/webhook/psbinding/reconciler.go",
    "line": "147",
    "function": "knative.dev/pkg/webhook/psbinding.(*BaseReconciler).Reconcile"
  }
}
{
  "level": "error",
  "ts": "2021-06-20T13:54:34.933Z",
  "logger": "eventing-webhook.SinkBindings",
  "caller": "controller/controller.go:538",
  "msg": "Reconcile error",
  "commit": "af9c290",
  "knative.dev/pod": "eventing-webhook-8489d86856-glgkz",
  "error": "Operation cannot be fulfilled on sinkbindings.sources.knative.dev \"knativegetpost1\": the object has been modified; please apply your changes to the latest version and try again",
  "logging.googleapis.com/labels": {},
  "logging.googleapis.com/sourceLocation": {
    "file": "knative.dev/[email protected]/controller/controller.go",
    "line": "538",
    "function": "knative.dev/pkg/controller.(*Impl).handleErr"
  },
  "stacktrace": "knative.dev/pkg/controller.(*Impl).handleErr\n\tknative.dev/[email protected]/controller/controller.go:538\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/[email protected]/controller/controller.go:524\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/[email protected]/controller/controller.go:461"
}
...
{
  "level": "error",
  "ts": "2021-06-20T13:54:34.950Z",
  "logger": "eventing-webhook.SinkBindings",
  "caller": "sinkbinding/controller.go:153",
  "msg": "Failed to get URI from Destination: %!w(*errors.StatusError=&{{{ } {   <nil>} Failure inmemorychannels.messaging.knative.dev \"messages\" not found NotFound 0xc001756060 404}})",
  "commit": "af9c290",
  "knative.dev/pod": "eventing-webhook-8489d86856-glgkz",
  "knative.dev/traceid": "4476c6cb-054a-4cae-937e-959d138a0f48",
  "knative.dev/key": "test-90ba8b8a-f863-4ed1-912b-655a7efb17bd/knativegetpost1",
  "logging.googleapis.com/labels": {},
  "logging.googleapis.com/sourceLocation": {
    "file": "knative.dev/eventing/pkg/reconciler/sinkbinding/controller.go",
    "line": "153",
    "function": "knative.dev/eventing/pkg/reconciler/sinkbinding.(*SinkBindingSubResourcesReconciler).Reconcile"
  },
  "stacktrace": "knative.dev/eventing/pkg/reconciler/sinkbinding.(*SinkBindingSubResourcesReconciler).Reconcile\n\tknative.dev/eventing/pkg/reconciler/sinkbinding/controller.go:153\nknative.dev/pkg/webhook/psbinding.(*BaseReconciler).reconcile\n\tknative.dev/[email protected]/webhook/psbinding/reconciler.go:188\nknative.dev/pkg/webhook/psbinding.(*BaseReconciler).Reconcile\n\tknative.dev/[email protected]/webhook/psbinding/reconciler.go:140\nknative.dev/pkg/controller.(*Impl).processNextWorkItem\n\tknative.dev/[email protected]/controller/controller.go:523\nknative.dev/pkg/controller.(*Impl).RunContext.func3\n\tknative.dev/[email protected]/controller/controller.go:461"
}

The last entry is identical to the one you showed in a previous comment.
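
For anyone comparing webhook logs across Eventing versions, a small filter can pick out the entries that mention the missing channel, whether the text appears in "error" or in "msg". The sketch below is a minimal example under assumptions: the type and function names are hypothetical, only a handful of the fields shown above are decoded, and the sample lines are shortened copies of the entries in this comment.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// webhookLogEntry holds the few fields of interest from the entries above.
type webhookLogEntry struct {
	Level  string `json:"level"`
	Logger string `json:"logger"`
	Caller string `json:"caller"`
	Msg    string `json:"msg"`
	Key    string `json:"knative.dev/key"`
	Error  string `json:"error"`
}

// mentionsMissingChannel reports whether a JSON log line says that the
// "messages" InMemoryChannel was not found, in either "error" or "msg".
func mentionsMissingChannel(line string) (bool, error) {
	var e webhookLogEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return false, err
	}
	const needle = `inmemorychannels.messaging.knative.dev "messages" not found`
	return strings.Contains(e.Error, needle) || strings.Contains(e.Msg, needle), nil
}

func main() {
	// Shortened copies of two entries from the log excerpt above.
	lines := []string{
		`{"level":"error","logger":"eventing-webhook","caller":"v1/sinkbinding_lifecycle.go:109","msg":"URI could not be extracted from destination: ","error":"inmemorychannels.messaging.knative.dev \"messages\" not found"}`,
		`{"level":"warn","logger":"eventing-webhook.SinkBindings","caller":"psbinding/reconciler.go:147","msg":"Failed to update resource status"}`,
	}
	for _, l := range lines {
		hit, err := mentionsMissingChannel(l)
		if err != nil {
			fmt.Println("skipping unparsable line:", err)
			continue
		}
		fmt.Println(hit)
	}
}
```

Matching on both fields matters here because the admission path reports the problem in "error", while the reconciler embeds it in a formatted "msg".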

@astefanutti
Member Author

@tadayosi right, that is the same error. Out of curiosity, do you reproduce it by running the e2e tests against Minikube, or by directly creating Integrations that rely on SinkBinding?

@tadayosi
Member

I'm using the e2e tests. Below is the minimal test run I use to reproduce it. I think it's sufficient because we need at least two runs to reproduce the error.

go test -timeout 20m -v ./e2e/knative -tags=integration -run TestRunChannel.+

@astefanutti
Member Author

Superseded by #2424. Thanks @tadayosi.

@astefanutti astefanutti removed help wanted Extra attention is needed status/wip Work in progress labels Jun 22, 2021
@astefanutti astefanutti deleted the pr-230 branch November 22, 2021 15:04