Work around Kubernetes bug causing garbage collection across namespaces to happen when ownerReferences are set #3986

Closed
sebgl opened this issue Nov 26, 2020 · 4 comments · Fixed by #3992
Comments

@sebgl
Contributor

sebgl commented Nov 26, 2020

This Kubernetes bug currently leads to very dangerous situations for ECK users.

As explained in the ECK documentation, the following may happen:

  • a user copies the Kubernetes Secret holding the elastic user password from one namespace to another, in order to use it in other applications
  • the Kubernetes garbage collection mechanism then detects the copied Secret and decides to garbage collect the entire owner resource tree (see the sketch after this list)
  • this means all PersistentVolumeClaims suddenly get deleted. If their storage class specifies a Delete reclaim policy, all Elasticsearch data gets deleted immediately. Even with the Retain reclaim policy, it is likely that new PVCs will get created along with fresh, empty PVs
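
For illustration, here is a minimal Go sketch of what the metadata of such a copied Secret can look like. The Secret name follows ECK's `<clusterName>-es-elastic-user` convention; the namespaces, cluster name and UID below are made-up placeholders. The point is that the copied Secret still carries an ownerReference to the Elasticsearch resource that only exists in the original namespace, which is the cross-namespace situation that confuses the garbage collector of affected Kubernetes versions.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// The Secret as it looks after being copied into the "apps" namespace.
	// Its ownerReference still designates the Elasticsearch resource that
	// exists only in the original "elastic" namespace.
	copied := corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "quickstart-es-elastic-user",
			Namespace: "apps", // copied here from the "elastic" namespace
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion: "elasticsearch.k8s.elastic.co/v1",
				Kind:       "Elasticsearch",
				Name:       "quickstart",      // owner living in "elastic", not "apps"
				UID:        "placeholder-uid", // UID of that Elasticsearch resource
			}},
		},
	}
	fmt.Printf("%s/%s is owned by %s %s\n",
		copied.Namespace, copied.Name,
		copied.OwnerReferences[0].Kind, copied.OwnerReferences[0].Name)
}
```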

This bug will be fixed in Kubernetes 1.20.

In general, we try not to work around Kubernetes bugs in ECK and instead encourage people to upgrade their Kubernetes version, but here the impact is dangerous enough to deserve being addressed in ECK directly.

I see several solutions to this problem:

1. Don't set ownerReferences on Secrets users are likely to copy around

I think there are two Secrets users may be tempted to copy across namespaces:

  • the elastic user password
  • the public-facing CA of Elasticsearch HTTP certificates

We could also potentially add secrets of other stack components to that list:

  • Kibana public CA
  • APM token
  • etc.

Those additional secrets are also impacted by the GC bug, but will not cause Elasticsearch data loss.

We could decide not to set an ownerReference on those Secrets, which means they will remain orphan Kubernetes resources if the user decides to delete the Elasticsearch cluster.
Overall this replaces the ownerReference bug with another bug (not cleaning up Secrets on deletion), but that second bug is much less dangerous.
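
As a rough illustration of option 1, here is a minimal controller-runtime sketch (not ECK's actual code): the reconciliation only sets the controller reference on Secrets that are unlikely to be copied around. The `copyProne` helper and the label key/values it reads are hypothetical, made up for this example.

```go
package reconcile

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

// copyProne returns true for Secrets users are tempted to copy across
// namespaces (elastic user password, public HTTP CA). The label key and
// values used here are hypothetical, purely for illustration.
func copyProne(secret *corev1.Secret) bool {
	switch secret.Labels["eck.k8s.elastic.co/secret-kind"] {
	case "elastic-user", "http-certs-public":
		return true
	}
	return false
}

// setOwner decides whether a Secret gets an ownerReference to the
// Elasticsearch resource. The owner must be registered in the scheme.
func setOwner(owner metav1.Object, secret *corev1.Secret, scheme *runtime.Scheme) error {
	if copyProne(secret) {
		// No ownerReference: the Secret is orphaned if the cluster is deleted,
		// but it can no longer trigger the cross-namespace GC bug.
		return nil
	}
	return controllerutil.SetControllerReference(owner, secret, scheme)
}
```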

2. Set the ownerReference of those Secrets to target a different, harmless resource

To be experimented with; I'm not sure it works.
We could bind the ownerReference of those Secrets to a different resource, and make that resource have an ownerReference to the Elasticsearch resource, so the parent-child tree looks as follows:

Elasticsearch <-- PlaceHolderWorkaroundResource <-- Secret

That intermediate resource could either be an existing resource we already manage, or a new placeholder resource: for example, an empty ConfigMap whose only purpose is to work around the ownerRef bug.

When the k8s garbage collector decides to clean up parent resources, I think it may (to be tested) limit itself to deleting that intermediate ConfigMap, which we don't care much about, and not touch other siblings in the dependency tree (i.e. PersistentVolumeClaims).
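
A sketch of what that ownership chain could look like, assuming a hypothetical placeholder ConfigMap; the resource names are illustrative and not something ECK actually creates:

```go
package workaround

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// buildChain wires Elasticsearch <-- placeholder ConfigMap <-- Secret,
// so that the Secret's direct owner is a resource we can afford to lose.
func buildChain(ns string, esUID, cmUID types.UID) (corev1.ConfigMap, corev1.Secret) {
	cm := corev1.ConfigMap{ObjectMeta: metav1.ObjectMeta{
		Name:      "quickstart-ownerref-workaround", // hypothetical placeholder name
		Namespace: ns,
		OwnerReferences: []metav1.OwnerReference{{
			APIVersion: "elasticsearch.k8s.elastic.co/v1",
			Kind:       "Elasticsearch",
			Name:       "quickstart",
			UID:        esUID,
		}},
	}}
	secret := corev1.Secret{ObjectMeta: metav1.ObjectMeta{
		Name:      "quickstart-es-elastic-user",
		Namespace: ns,
		OwnerReferences: []metav1.OwnerReference{{
			APIVersion: "v1",
			Kind:       "ConfigMap",
			Name:       cm.Name,
			UID:        cmUID,
		}},
	}}
	return cm, secret
}
```

If the copied Secret then triggers the GC, the hope (to be verified, as noted above) is that only the placeholder ConfigMap gets deleted and recreated by the operator.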

3. Don't set an ownerRef on those secrets, and handle the garbage collection in ECK

We already garbage collect some secrets related to resource associations (e.g. credentials for the Kibana user across Elasticsearch namespace + Kibana namespace).
We could do the same with those orphan secrets.
This may be trickier than it sounds, since we need to make sure we properly handle cache inconsistencies (for example: wrongly deleting the Secret because we don't see the Elasticsearch resource in the cache yet).
It's also not a "perfect" solution in the sense that deleting ECK would preserve the orphan secrets.
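
A minimal sketch of what option 3 could look like with controller-runtime (not the actual implementation proposed in the PR); the soft-owner label key is an assumption made up for this example:

```go
package gc

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Hypothetical label identifying the Elasticsearch resource that "softly"
// owns a Secret, replacing the ownerReference we no longer set.
const softOwnerNameLabel = "eck.k8s.elastic.co/owner-name"

// GarbageCollectSoftOwnedSecrets is meant to run when a deletion event is
// received for an Elasticsearch resource: at that point we know the owner is
// really gone, which sidesteps the "owner not yet in cache" inconsistency.
func GarbageCollectSoftOwnedSecrets(ctx context.Context, c client.Client, deletedES types.NamespacedName) error {
	var secrets corev1.SecretList
	if err := c.List(ctx, &secrets,
		client.InNamespace(deletedES.Namespace),
		client.MatchingLabels{softOwnerNameLabel: deletedES.Name},
	); err != nil {
		return err
	}
	for i := range secrets.Items {
		if err := c.Delete(ctx, &secrets.Items[i]); err != nil && !apierrors.IsNotFound(err) {
			return err
		}
	}
	return nil
}
```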


Order of operations. A reasonable plan to handle the above:

  1. Immediately implement solution 1. To be included in the next ECK patch release (1.3.1).
  2. Experiment with options 2 and 3, to be included in ECK 1.3.1 if easy enough, or in a later minor release.
@sebgl sebgl added the >bug Something isn't working label Nov 26, 2020
@barkbay
Contributor

barkbay commented Nov 30, 2020

  2. Set the ownerReference of those Secrets to target a different, harmless resource

IIUC the GC would delete both the Secret in the cluster namespace and the copied Secret in the other namespace. The first Secret would be recreated by the operator, but it would still lead to some weird situations from a user perspective, where the copied Secrets would be outdated or deleted.

@sebgl sebgl self-assigned this Nov 30, 2020
@david-kow
Contributor

Immediately implement solution 1. To be included in the next ECK patch release (1.3.1).

I'm afraid that we are replacing a very serious but rare bug with one that is mostly harmless but affects 100% of users. IIUC, all kubectl delete es ... will end up with orphaned Secret(s). What should the proper cleanup look like? Will we document it?

@sebgl
Contributor Author

sebgl commented Nov 30, 2020

I'm afraid that we are replacing a very serious but rare bug with one that is mostly harmless but affects 100% of users. IIUC, all kubectl delete es ... will end up with orphaned Secret(s). What should the proper cleanup look like? Will we document it?

#3992 proposes approach number 3 (don't set the ownerRef + garbage collect orphan secrets).
The garbage collection is best-effort, since it is handled when we receive a deletion event for the owner (e.g. Elasticsearch), which means you may end up with orphan Secrets if the operator is not running when the owner is deleted.

I feel like that's a very acceptable trade-off?

edit: it seems easy enough to also implement GC on operator startup to cover the case where the operator is not running when the owner is deleted.
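
A possible startup sweep to cover that case, again only a sketch under the same assumed soft-owner label as the previous snippet; it reads from the API server rather than an informer cache so that a not-yet-populated cache cannot cause a wrong deletion:

```go
package startup

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Same hypothetical label key as in the previous sketch.
const softOwnerNameLabel = "eck.k8s.elastic.co/owner-name"

var esGVK = schema.GroupVersionKind{
	Group: "elasticsearch.k8s.elastic.co", Version: "v1", Kind: "Elasticsearch",
}

// GarbageCollectOnStartup deletes Secrets whose soft owner no longer exists.
// Pass a client that reads directly from the API server, not from a cache,
// to avoid wrongly deleting Secrets whose owner simply isn't cached yet.
func GarbageCollectOnStartup(ctx context.Context, c client.Client) error {
	var secrets corev1.SecretList
	if err := c.List(ctx, &secrets, client.HasLabels{softOwnerNameLabel}); err != nil {
		return err
	}
	for i := range secrets.Items {
		s := &secrets.Items[i]
		owner := &unstructured.Unstructured{}
		owner.SetGroupVersionKind(esGVK)
		key := types.NamespacedName{Namespace: s.Namespace, Name: s.Labels[softOwnerNameLabel]}
		err := c.Get(ctx, key, owner)
		switch {
		case apierrors.IsNotFound(err):
			if delErr := c.Delete(ctx, s); delErr != nil && !apierrors.IsNotFound(delErr) {
				return delErr
			}
		case err != nil:
			return err
		}
	}
	return nil
}
```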

@david-kow
Contributor

@sebgl, I think I wasn't clear by just quoting - I was referring only to solution 1, in which we would ignore the orphaned secret.

As to your PR (approach 3), I'm all for it 👍
