Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bundle's Etag is cached if activation failes #2220

Closed
michaelwittig opened this issue Mar 24, 2020 · 4 comments · Fixed by #2286
Closed

Bundle's Etag is cached if activation failes #2220

michaelwittig opened this issue Mar 24, 2020 · 4 comments · Fixed by #2286
Assignees
Labels

Comments

@michaelwittig
Copy link
Contributor

I think I found a bug: If bundles come with an Etag header but fail during activation they are still considered “downloaded/activated” and activation is never retried.

Expected Behavior

If a bundle can not be activated, the activation should be retried in the next cycle even if the Etag stays the same.

Actual Behavior

If a bundle ca nnot be activated, the activation is never retried if the Etag stays the same.

Steps to Reproduce the Problem

Configure two bundles
bundle a depends on bundle b
make the download of bundle b slow, so that bundle a loads before bundle b

Additional Info

Context:
we have OPA configured to fetch two bundles. A “global” bundle that contains some helper functions and a “service” specific bundle with the rules to protect a specific service. The “service” bundle depends on the “global” bundle.
If the global bundle loads/activates before the service bundle, things are working as expected.
If the service bundle loads/activates before the global bundle I get an error like this: rego_type_error: undefined function data.comp.authz.check which is correct because the global helper is not available
I expected that this ordering problem will be solved with the next download cycle of the bundles. But I think there might be a bug:
If no Etag headers are used, it works: The service bundle will be activated on the 2nd try.
If the Etag headers are used, it does not work: The service bundle activation will not be retried.

@michaelwittig
Copy link
Contributor Author

iam_opa_1       | {"level":"debug","msg":"Download starting.","time":"2020-03-24T08:15:17Z"}
iam_opa_1       | {"headers":{"User-Agent":["Open Policy Agent/0.17.1-istio (linux, amd64)"]},"level":"debug","method":"GET","msg":"Sending request.","time":"2020-03-24T08:15:17Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"headers":{"Accept-Ranges":["bytes"],"Cache-Control":["no-cache"],"Connection":["keep-alive"],"Content-Length":["477"],"Content-Type":["application/gzip"],"Date":["Tue, 24 Mar 2020 08:15:18 GMT"],"Etag":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"Last-Modified":["Mon, 23 Mar 2020 15:20:58 GMT"]},"level":"debug","method":"GET","msg":"Received response.","status":"200 OK","time":"2020-03-24T08:15:18Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"level":"debug","msg":"Download in progress.","time":"2020-03-24T08:15:18Z"}
iam_opa_1       | {"level":"debug","msg":"Bundle activation in progress. Opening storage transaction.","name":"service","plugin":"bundle","time":"2020-03-24T08:15:18Z"}
iam_opa_1       | {"level":"debug","msg":"Opened storage transaction (5).","name":"service","plugin":"bundle","time":"2020-03-24T08:15:18Z"}
iam_opa_1       | {"level":"debug","msg":"Closing storage transaction (5).","name":"service","plugin":"bundle","time":"2020-03-24T08:15:18Z"}
iam_opa_1       | {"level":"error","msg":"Bundle activation failed: 2 errors occurred:\ncomp/service/policy.rego:15: rego_type_error: undefined function data.comp.authz.check\ncomp/service/policy.rego:24: rego_type_error: undefined function data.comp.authz.check_with_binding","name":"service","plugin":"bundle","time":"2020-03-24T08:15:18Z"}
iam_opa_1       | {"level":"debug","msg":"Waiting 13.009118605s before next download/retry.","time":"2020-03-24T08:15:18Z"}
iam_opa_1       | {"level":"debug","msg":"Download starting.","time":"2020-03-24T08:15:31Z"}
iam_opa_1       | {"headers":{"If-None-Match":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"User-Agent":["Open Policy Agent/0.17.1-istio (linux, amd64)"]},"level":"debug","method":"GET","msg":"Sending request.","time":"2020-03-24T08:15:31Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"headers":{"Cache-Control":["no-cache"],"Connection":["keep-alive"],"Content-Type":["application/gzip"],"Date":["Tue, 24 Mar 2020 08:15:32 GMT"],"Etag":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"Last-Modified":["Mon, 23 Mar 2020 15:20:58 GMT"]},"level":"debug","method":"GET","msg":"Received response.","status":"304 Not Modified","time":"2020-03-24T08:15:32Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"level":"debug","msg":"Waiting 15.152126285s before next download/retry.","time":"2020-03-24T08:15:32Z"}
iam_opa_1       | {"headers":{"Accept-Ranges":["bytes"],"Cache-Control":["no-cache"],"Connection":["keep-alive"],"Content-Length":["1089"],"Content-Type":["application/gzip"],"Date":["Tue, 24 Mar 2020 08:15:35 GMT"],"Etag":["\"8bc7cfb0bedd9bc156e0e58760198181b386870a\""],"Last-Modified":["Tue, 24 Mar 2020 07:24:03 GMT"]},"level":"debug","method":"GET","msg":"Received response.","status":"200 OK","time":"2020-03-24T08:15:35Z","url":"http://bundleserver:4000/bundle.tar.gz"}
iam_opa_1       | {"level":"debug","msg":"Download in progress.","time":"2020-03-24T08:15:35Z"}
iam_opa_1       | {"level":"debug","msg":"Bundle activation in progress. Opening storage transaction.","name":"global","plugin":"bundle","time":"2020-03-24T08:15:35Z"}
iam_opa_1       | {"level":"debug","msg":"Opened storage transaction (6).","name":"global","plugin":"bundle","time":"2020-03-24T08:15:35Z"}
iam_opa_1       | {"level":"debug","msg":"Closing storage transaction (6).","name":"global","plugin":"bundle","time":"2020-03-24T08:15:35Z"}
iam_opa_1       | {"level":"info","msg":"Bundle downloaded and activated successfully. Etag updated to \"8bc7cfb0bedd9bc156e0e58760198181b386870a\".","name":"global","plugin":"bundle","time":"2020-03-24T08:15:35Z"}
iam_opa_1       | {"level":"debug","msg":"Waiting 18.136399609s before next download/retry.","time":"2020-03-24T08:15:35Z"}
iam_opa_1       | {"level":"debug","msg":"Download starting.","time":"2020-03-24T08:15:47Z"}
iam_opa_1       | {"headers":{"If-None-Match":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"User-Agent":["Open Policy Agent/0.17.1-istio (linux, amd64)"]},"level":"debug","method":"GET","msg":"Sending request.","time":"2020-03-24T08:15:47Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"headers":{"Cache-Control":["no-cache"],"Connection":["keep-alive"],"Content-Type":["application/gzip"],"Date":["Tue, 24 Mar 2020 08:15:48 GMT"],"Etag":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"Last-Modified":["Mon, 23 Mar 2020 15:20:58 GMT"]},"level":"debug","method":"GET","msg":"Received response.","status":"304 Not Modified","time":"2020-03-24T08:15:48Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"level":"debug","msg":"Waiting 12.142638725s before next download/retry.","time":"2020-03-24T08:15:48Z"}
iam_opa_1       | {"level":"debug","msg":"Download starting.","time":"2020-03-24T08:15:53Z"}
iam_opa_1       | {"headers":{"Authorization":["Bearer xxx"],"If-None-Match":["\"8bc7cfb0bedd9bc156e0e58760198181b386870a\""],"User-Agent":["Open Policy Agent/0.17.1-istio (linux, amd64)"]},"level":"debug","method":"GET","msg":"Sending request.","time":"2020-03-24T08:15:53Z","url":"http://bundleserver:4000/bundle.tar.gz"}
iam_opa_1       | {"level":"debug","msg":"Download starting.","time":"2020-03-24T08:16:00Z"}
iam_opa_1       | {"headers":{"If-None-Match":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"User-Agent":["Open Policy Agent/0.17.1-istio (linux, amd64)"]},"level":"debug","method":"GET","msg":"Sending request.","time":"2020-03-24T08:16:00Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"headers":{"Cache-Control":["no-cache"],"Connection":["keep-alive"],"Content-Type":["application/gzip"],"Date":["Tue, 24 Mar 2020 08:16:01 GMT"],"Etag":["\"3fabe82dce75ec32e18bbe1273961695ee8541a2\""],"Last-Modified":["Mon, 23 Mar 2020 15:20:58 GMT"]},"level":"debug","method":"GET","msg":"Received response.","status":"304 Not Modified","time":"2020-03-24T08:16:01Z","url":"http://localhost:8080/bundle.tar.gz"}
iam_opa_1       | {"level":"debug","msg":"Waiting 12.830341511s before next download/retry.","time":"2020-03-24T08:16:01Z"}

@patrick-east
Copy link
Contributor

patrick-east commented Mar 25, 2020

This does indeed look like an issue, I think the original design assumed that if the bundle failed to activate it would have required an update (and a new etag) so none of the test cases look for this behavior.

I see in the bundle plugin we do the "right" thing with the etag for the bundle status, only updating the etag if activation succeeds:

if err := p.activate(ctx, name, u.Bundle); err != nil {
p.logError(name, "Bundle activation failed: %v", err)
p.status[name].SetError(err)
return
}
p.status[name].SetError(nil)
p.status[name].SetActivateSuccess(u.Bundle.Manifest.Revision)
if u.ETag != "" {
p.logInfo(name, "Bundle downloaded and activated successfully. Etag updated to %v.", u.ETag)
} else {
p.logInfo(name, "Bundle downloaded and activated successfully.")
}
p.etags[name] = u.ETag

But over in the downloader we just always set the etag:

opa/download/download.go

Lines 128 to 132 in d0a760c

if d.f != nil {
d.f(ctx, Update{ETag: etag, Bundle: b, Error: err, Metrics: m})
}
d.etag = etag

Couple options come to mind to correct this...

  1. We could keep references to bundles by etag, even if they fail to activate (maybe just like last downloaded and last activated?). When we handle the download results if we get an etag back instead of just ignoring it if it doesn't match
    if etag, ok := p.etags[name]; ok && u.ETag == etag {
    p.logDebug(name, "Bundle download skipped, server replied with not modified.")
    p.status[name].SetError(nil)
    return
    }
    we could try to activate the bundle with the etag we were just given by the downloader.
  2. We could give the downloader an API to like set or clear the etag/cache state and have the bundle plugin call it when activation fails. The downside to this is that we don't know if we need to download a new version or just wait until some other bundles have been activated and try again. In its most efficient form I guess we probably want to avoid re-downloading a perfectly good bundle, which makes me lean towards option (1)

@michaelwittig
Copy link
Contributor Author

Thanks @patrick-east I came to a similar understanding of the problem when looking at plugin.go/download.go.

Unfortunately, I'm not an experienced go programmer. If I had to rearrange a few lines I would feel comfortable making the change. But this requires go skills that I don't have.

@patrick-east
Copy link
Contributor

@michaelwittig no worries! Its on the OPA teams radar now so its just a matter of time until someone picks it up

ashutosh-narkar added a commit to ashutosh-narkar/opa that referenced this issue Apr 10, 2020
Earlier the Etag on the downloader would be updated unconditionally after every attempt to download a bundle. This could lead to a situation wherein a bundle fails to activate and would remain in an unactivated state since any subsequent downloads of the same version of the bundle would not trigger the activation process. This change attempts to resolve the issue by allowing the client to reset the Etag on the downloader incase of downloader errors and bundle activation failures. The drawback now is that we could end up re-downloading the same version of a bundle multiple times till it successfully activates. This situtation is likely to occur when using multiple bundle sources where a bundle may depend on some other. Generally using multiple bundle sources isn't recommended so the extra network traffic as a result of the re-downloads although not ideal may not too harmful.

Fixes open-policy-agent#2220
Fixes open-policy-agent#2279

Signed-off-by: Ashutosh Narkar <[email protected]>
ashutosh-narkar added a commit that referenced this issue Apr 10, 2020
Earlier the Etag on the downloader would be updated unconditionally after every attempt to download a bundle. This could lead to a situation wherein a bundle fails to activate and would remain in an unactivated state since any subsequent downloads of the same version of the bundle would not trigger the activation process. This change attempts to resolve the issue by allowing the client to reset the Etag on the downloader incase of downloader errors and bundle activation failures. The drawback now is that we could end up re-downloading the same version of a bundle multiple times till it successfully activates. This situtation is likely to occur when using multiple bundle sources where a bundle may depend on some other. Generally using multiple bundle sources isn't recommended so the extra network traffic as a result of the re-downloads although not ideal may not too harmful.

Fixes #2220
Fixes #2279

Signed-off-by: Ashutosh Narkar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants