-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add option to clear rules from default VPC security group and ACL #516
Conversation
Add option to clean (remove all rules) from the default security group created automatically by the VPC
scripts/fix_security_group.py
Outdated
import logging | ||
import os | ||
|
||
import requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not come out of the box with python. Is there anything else we can use? Otherwise we will have to do a pip install (not ideal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest revision has removed the "requests" external dependency, and uses the native "urllib" instead. Now both python scripts are only using native imports.
Tested these new scripts in schematics, everything is working there as well.
I'm pondering whether we'd not be better off doing this using the ibmcloud cli in bash to simplify. This may introduce a dependency on jq, but we already have it in other modules anyways. |
@vburckhardt I now have it working with pure python3 scripts, with no external dependencies other than python3 itself. SLZ had already had a dependency on python3 (uses scripts to do json conversions) so I figured this is safest. IBM CLI with bash introduces some other developer machine dependencies, such as:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we have some code duplication like get_http_response_str
and get_bearer_token
. Perhaps place those functions into a utils.py script and import into the 2 python scripts you have so that we remove the code duplication?
Hi @toddgiguere - you're correct on the dependencies, but those pre-reqs already exists for other modules, so I am not that worried about this. Schematics has got all of those installed as well. That said, my main worry here here is the complexity of the code, I think it's going to be a serious obstacle for maintenance going forward. It is also potentially a greater source of bugs. |
scripts/fix_security_group.py
Outdated
def get_bearer_token( | ||
refresh_token: str, cloud_base_domain: str = "cloud.ibm.com" | ||
) -> str: | ||
url = f"https://iam.{cloud_base_domain}/identity/token" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to break for customers going through the CSE endpoints or IAM VPE. The domain is private.iam.cloud.ibm.com in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the private endpoints for IAM and endpoints for IaaS resources (your other comment) related? If we have a single boolean that can be supplied to "use_private_endpoints" cover both cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest version now has support for private endpoints, enabled by setting new python parameter to "true" (default of "false"). Intent is for that input variable to come from the terraform call.
scripts/fix_security_group.py
Outdated
def get_security_group_rules( | ||
sg_id: str, region: str, access_token: str, cloud_base_domain: str = "cloud.ibm.com" | ||
) -> dict: | ||
url = f"https://{region}.iaas.{cloud_base_domain}/v1/security_groups/{sg_id}/rules?version=2023-04-11&generation=2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to break for VPC VPE. Eg: the domain is eu-de.private.iaas.cloud.ibm.com for the VPC VPE in Frankfurt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest version now has support for private endpoints, enabled by setting new python parameter to "true" (default of "false"). Intent is for that input variable to come from the terraform call.
@vburckhardt Would you like me to convert this to BASH/CLI/JQ then? If so I will need to add the ibmcloud_api_key as a new input to the module. Technically the api key is not always required, only if you need to clean the scripts, so if it is not required var I'll need to verify using validation, and if we do make it required it will be a breaking change. Also, will the CLI method take care of the private/public endpoints you commented about? |
@toddgiguere CLI can be set up to use private. See https://cloud.ibm.com/docs/cli?topic=cli-service-connection |
Latest changes:
|
I have added two shell scripts that implement the same removal of rules as the python scripts, written in BASH/CLI/JQ, for comparison. THESE ARE NOT CALLED FROM TERRAFORM YET! I am including them for a comparison, and we can then pick the version we would like to go with. I tested both scripts manually, including in bash 3.2 (mac native), and both scripts work, replacing the python in terraform will be trivial. ONE FINAL NOTE: I have the CLI login configured to do optional "private" endpoint, but I am unclear how this works with the |
variables.tf
Outdated
} | ||
|
||
variable "ibmcloud_api_visibility" { | ||
description = "IBM Cloud API visibility used for internal scripts. Must be 'public', 'private', or 'public-and-private'" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word "internal" might be confusing here (since this is a public module). Perhaps say "used for scripts run by this module" ?
+1 on bash scripts. Thanks for putting this together. Much of the python logic is low level and essentially trying to re-implement functions (eg: token exchange) that already exists in cli, so worried about carrying that baggage going forwards. This will help from an evolutionary perspective in term of maintaining same level of support in future - eg: things like authenticating with trusted profile, etc I think at some stage we should consolidate the common scripts across module, but this is outside the scope of this PR - eg: the login logic exists here https://github.com/terraform-ibm-modules/terraform-ibm-base-ocp-vpc/blob/main/scripts/reset_iks_api_key.sh#L27 , and internally as well (csutils module) https://github.ibm.com/GoldenEye/csutils-module/blob/master/scripts/ibm_cloud_login.sh . There is an existing issue in our internal board. |
I'll need to add ibmcloud_api_key as an input variable to this module for bash scripts. In this case, it could be "optional" and I can do one of our regex verifications that it needs to be included if the "clear" booleans are true, or we can make it required and have a potential breaking change. Thoughts? Also I'm still not sure how to use private endpoints through CLI for the |
I'd go with optional api key for now as input, with a check in the module that error out if clean_default_acl / sg is set. On private endpoint - this may help: |
Right now I'm checking for API Key != null inside of scripts. If the "clean" booleans are true, but API key variable is null in terraform, the following terraform error will appear. We can adjust the wording of the error.
|
converted the python scripts used for default acl and sg rule clean into bash/ibmcloud-cli/jq style of scripting. This also means I needed to add optional ibmcloud_api_key inputs in all modules.
Latest version I've removed the python scripts and replaced with shell scripts written in BASH, utilizing IBM Cloud CLI and JQ to do the rule clearing. I've split the IBM Cloud login to its own file, if we like how it functions we might be able to use it as a common script later. I tried to design it so it would be flexible, complete with optional parameters like region and resource group. The bash scripts should properly catch all errors and fail the terraform block, with no "silent fails". I've also added a short section in the main README to alert users that they will need the CLI installed to run this terraform. |
to see if this fixes random "no region targeted" errors in further scripts
Getting a strange error in the automated tests that I do not see when testing locally or in schematics, have not figured it out yet. In the security group clean, only for the management VPC, the CLI is returning "No region targeted". Is this something to do with the way the login to CLI works? Again, is not happening in either a local run or in a schematic run.
|
I think I've solved the IBMCloud CLI issues. The problem: My solution:
SPECIAL NOTE: this also solves another issue that will happen when we include this new VPC module version into SLZ parent. We will now be able to set the "resource_group" in the |
variables.tf
Outdated
} | ||
|
||
variable "ibmcloud_api_key" { | ||
description = "IBM Cloud API Key that will be used for authentication in scripts run in this module." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps say only required if setting clean_default_security_group
or clean_default_acl
to true ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a more verbose description for api key variable.
|
||
# Create a temporary home for CLI config, used by both login and further commands. | ||
# This ensures config separation between various terraform provision blocks on the same machine. | ||
ibmcloud_config_home=$(mktemp -d) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably be cleaning up the temp directory after? But I guess it would have to be done after the scripts have completed, which is outside the scope of this function. Perhaps there is a cli command to get current home, and delete it (if its not the default home)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I forgot the cleanup. It should be easy since I have it in an ENV already I can just remove the whole directory at the end of the script. However, if the script errors, it won't remove. I could look into doing a bash-style error handling block to remove it there, if that is important.
One thing I will do though is ignore an error in the removal, I don't want that to trigger error in terraform.
Another note is that mktemp
will always create the folder in the underlying OS official $TMPDIR location, so theoretically it will get cleaned up eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest version I've added a removal of the temp directory that will occur with normal script termination, and also if an error occurs anywhere (via bash trap ERR
mechanic). I've tested error scenarios at every level and it seems to be getting caught and removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, left some final comments
LGTM - @toddgiguere let me know once happy to merge |
@ocofaigh I have tested this branch with landing zone patterns and everything is working as expected, I think this is good to release and I can then finish on landing zone side. |
🎉 This PR is included in version 7.1.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
Description
Add new option, in the form of boolean input variables, to clear (remove) all rules from the default Security Group and ACL that are automatically created by the VPC provider. These default network objects are very permissive, and can affect security requirements and compliance.
IBM Cloud does not allow the removal of the default Security Group or ACL resources, so these changes will instead remove all rules inside those resources so that they will deny all by default.
Some notes about these changes:
Types of changes in this PR
No release required
Release required
x.x.X
): Change that fixes an issue and is compatible with earlier versions)x.X.x
): Change that adds functionality and is compatible with earlier versions)X.x.x
): Change that is likely incompatible with previous versions)Release notes content
Added new option, in the form of boolean input variables, to clear (remove) all rules from the default Security Group and ACL that are automatically created by the VPC provider. These default network objects are very permissive, and can affect security requirements and compliance. The defaults cannot be directly removed, so this change will instead remove all rules contained inside of them so that they deny all traffic by default. This will allow a consumer to then add their own least permissive network Security Groups and ACLs to their VPC.
Checklist for reviewers
Merge actions for mergers
Merge by using "Squash and merge".
Use a relevant conventional commit message that is based on the PR contents and any release notes provided by the PR author.
The commit message determines whether a new version of the module is needed, and if so, which semver increment to use (major, minor, or patch).