-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add network policies #56
base: main
Are you sure you want to change the base?
Conversation
6c96fb6
to
2edc700
Compare
d98fa4b
to
bfe6ecb
Compare
c237036
to
4f189f6
Compare
This is fairly complex, will probably have to refactor it
4f189f6
to
2a4f782
Compare
1afc5cf
to
2a4f782
Compare
c10a54e
to
51b8643
Compare
833948e
to
c2e5475
Compare
994419e
to
08c463a
Compare
3301bd4
to
f5b33f7
Compare
"""The default policy name that will be created.""" | ||
return f"{self._charm.app.name}-network-policy" | ||
|
||
def apply_ingress_policy( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We expect a list of policies so we should probably call the method apply_ingress_policies
logger.error(msg) | ||
raise NetworkPoliciesHandlerError() | ||
|
||
def delete_network_policy(self, name: Optional[str] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should standardise on naming, we use ingress policies
and network policies
interchangeably.
src/charm.py
Outdated
] | ||
) | ||
except NetworkPoliciesHandlerError: | ||
event.defer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we simply change the status to blocked (or not catch the error and let the charm go to error state)? If something is broken we may not want to be deferring forever
We can't use the named port because we have named the ports on the Service and not on the Pod
5d189a0
to
70544be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where are the unit tests for this file 👀
|
||
|
||
class NetworkPoliciesHandlerError(Exception): | ||
"""Applying the network policies failed.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Applying the network policies failed.""" | |
"""Raised when applying the network policies failed.""" |
@@ -442,6 +449,9 @@ def _on_config_changed(self, event: ConfigChangedEvent) -> None: | |||
"""Event Handler for config changed event.""" | |||
self._handle_status_update_config(event) | |||
|
|||
def _cleanup(self, event: RemoveEvent) -> None: | |||
self.network_policy_handler.delete_ingress_policies() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a juju bug, but when I try removing kratos it gets into terminated state with unit lost. When I use --force --no-wait
, the charm is removed but in both cases the network policy is not deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I tried testing it a little more and when I test it manually, the policy is removed. I wrote an integration test to validate this and in the test the policy is not removed. This is very weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell the stop event is never fired, this is most likely a juju bug and not something we can control. Let's leave this as it is for now, it is the best we can do and juju should (hopefully) fix it in a later version.
def apply_ingress_policies( | ||
self, policies: List[IngressPolicyDefinition], name: Optional[str] = None | ||
) -> None: | ||
"""Apply an ingress network policy about a related application. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Apply an ingress network policy about a related application. | |
"""Apply an ingress network policy for a related application. |
tests/integration/conftest.py
Outdated
@@ -0,0 +1,14 @@ | |||
#!/usr/bin/env python3 | |||
# Copyright 2022 Canonical Ltd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Copyright 2022 Canonical Ltd. | |
# Copyright 2023 Canonical Ltd. |
(38813, []), | ||
] | ||
) | ||
except NetworkPoliciesHandlerError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we retry applying the network policy?
Right now if you deploy kratos without trust, it gets into BlockedStatus
, but if you trust it then it gets into active without creating the network policy.
51c1c17
to
b902862
Compare
In case applying the network policies fails we have several options:
Since it is not needed for the charm operation and it is not essential for security I think that (1) and (4) make most sense because they allow the charm to run despite any error that might occur and the user can make the decision to stop the charm if they deem it necessary. If we decide to go with (1) we can also provide an action for applying the network policies to allow the user to retry applying them if something goes wrong. cc @natalian98 @gruyaume |
Apply network policies to deny ingress traffic to the Kratos APIs when it isn't coming from the traefik ingress.
I had to open all the none Kratos ports as I couldn't figure out exactly what is needed by juju/pebble. (need to change that).
To test these changes you need to:
I am first creating the traefik relation and then the postgres one because I had some trouble with dns issues that where causing the migration to fail. First relating traefik and then postgres assures us that those issues are gone.
You must be able to see the network policy by running:
$ microk8s kubectl get -n {namespace} networkpolicies
To validate the networkpolicies you can
juju ssh
to a random pod (e.g. postgres and try to reach the kratos admin API):This should timeout. To get the kratos app ip simply run
juju status
.By doing the same test from the traefik pod, the http request must return.