Add VPC endpoints #93
Now that we are no longer using a separate NAT gateway for each private subnet, the private subnets' route tables are all the same.
…TS and SSM services
* Remove egress via 443 to 0.0.0.0/0
* Allow egress via 443 to the STS and SSM security groups
* Allow egress via 443 to the S3 gateway endpoint
These are necessary for the CloudWatch Agent to work. This agent is responsible for forwarding our system logs to CloudWatch.
This is necessary because adding the VPC endpoints to the VPCs makes the VPC-internal DNS for sts.us-east-1.amazonaws.com, for example, resolve to the internal IP of the VPC endpoint instead of a public AWS IP.
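One way to spot-check this DNS behavior from an instance inside the VPC is to resolve the regional hostname and confirm that every answer is a private address. This is only a minimal sketch, assuming private DNS is enabled on the interface endpoint; the helper names are hypothetical and are not part of this PR.

```python
import ipaddress
import socket


def all_private(addresses):
    """Return True if there is at least one address and all are private IPs."""
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(a).is_private for a in addresses
    )


def resolve(hostname):
    """Return the set of IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def resolves_to_vpc_endpoint(hostname):
    """Hypothetical check: does this name resolve only to private IPs?"""
    return all_private(resolve(hostname))
```

Calling `resolves_to_vpc_endpoint("sts.us-east-1.amazonaws.com")` on an instance would be expected to return `True` only when the endpoint's private DNS record is in effect.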
The operations subnet could still access S3 via the internet gateway, but it may as well use the gateway endpoint since it has to use the interface endpoints for all the other AWS services. Also rename some route-related Terraform resources to avoid collision between operations resources and private resources.
This reverts commit 6b091f0. I realized that the guacamole instance still needs the NAT gateway and external web access in order to download the Docker images used in the guacamole docker composition.
This is necessary so that the guacamole instance can download the Docker images required for the guacamole Docker composition.
I amended the PR description. The commit a16b36a does not need to be reverted. As it turns out, the assessment VPCs were never set up like the shared services VPC with Sharp eyes!
…S calls STS used to be un-regioned, like S3, but now it is regioned. This is the one case where boto3 _does not_ do the right thing when you set the region. We have to set the region-specific endpoint URL manually. This is important since the STS VPC endpoint _only_ sets a local DNS record to override the _local region's_ public STS endpoint. If we don't do this then boto3 will reach out to the _global_ https://sts.amazonaws.com URL, and that DNS entry will still point to an external IP. See boto/boto3#1859 for more information about boto3's perverse behavior in the case of STS.
…STS calls STS used to be un-regioned, like S3, but now it is regioned. The AWS CLI by default uses the global endpoint URL https://sts.amazonaws.com instead of the region-specific one, even when you specify the region. Therefore, we have to set the region-specific endpoint URL manually. This is important since the STS VPC endpoint _only_ sets a local DNS record to override the _local region's_ public STS endpoint. If we don't specify the endpoint URL then the AWS CLI will reach out to the _global_ https://sts.amazonaws.com URL, and that DNS entry will still point to an external IP.
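For illustration, pinning an STS client to the regional endpoint in boto3 might look roughly like the following. This is a sketch rather than the code from this PR: the function names are hypothetical, and the URL pattern assumes a region in the standard `amazonaws.com` domain.

```python
def regional_sts_endpoint(aws_region):
    """Build the region-specific STS URL (assumes the amazonaws.com domain)."""
    return f"https://sts.{aws_region}.amazonaws.com"


def make_sts_client(aws_region):
    """Create an STS client that talks to the regional endpoint.

    boto3 is imported lazily so the URL helper works without it installed.
    """
    import boto3

    return boto3.client(
        "sts",
        region_name=aws_region,
        # Without this, the client may fall back to the global STS URL,
        # which the VPC endpoint's DNS record does not override.
        endpoint_url=regional_sts_endpoint(aws_region),
    )
```

The AWS CLI accepts the same URL through its `--endpoint-url` option.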
# Normally we would use a for_each here, but the private subnets are
# _all in the same AZ_ in this case. Why even have multiple subnets
# if you're not going to spread them around? Harumph!
One suggestion which is only a comment change, and otherwise: 🏆
Excellent work and really great job on the comments! 🐎
...except for one comment that needs a slight revision.
I noticed that previously we were using the name "aws_region" in the cloud-init templates, whereas I had used the name "region" in the code that I added. For consistency's sake, I changed the name in the code that I added.
Co-authored-by: Hillary <[email protected]>
Co-authored-by: dav3r <[email protected]>
@@ -139,7 +139,7 @@ resource "aws_network_acl_rule" "private_egress_to_operations_via_vnc" {
   network_acl_id = aws_network_acl.private[each.value].id
   egress = true
   protocol = "tcp"
-  rule_number = 120 + index(var.private_subnet_cidr_blocks, each.value)
+  rule_number = 130 + index(var.private_subnet_cidr_blocks, each.value)
🧙 Seems magical. Is this counting by 10 jazz useful? Does this break if index ≥ 10?
Yes, it breaks if `index >= 10`. Our numbering of the ACL rules has always sucked. Do you have any thoughts as to how it can be improved?
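To make the failure mode concrete, here is a small sketch of how `base + index` rule numbers from two rule families spaced 10 apart begin to overlap once there are more than 10 subnets. The function names are hypothetical; only the bases 120 and 130 come from the diff above.

```python
def rule_numbers(base, subnet_count):
    """Rule numbers produced by `base + index` for each subnet index."""
    return {base + i for i in range(subnet_count)}


def collisions(bases, subnet_count):
    """Return rule numbers claimed by more than one rule family."""
    seen, clashes = set(), set()
    for base in bases:
        for number in rule_numbers(base, subnet_count):
            if number in seen:
                clashes.add(number)
            else:
                seen.add(number)
    return clashes
```

With 10 subnets the families at 120 and 130 stay disjoint; with an 11th subnet, index 10 of the first family lands on the first rule number of the second.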
Wonderful. This makes a lot of sense. 💯
I had a question about preloading the Docker images into the Guac AMI so that we can remove the NAT gateway. We should discuss that. 🐳 🥑
My motto was always to keep swinging. Whether I was in a slump or feeling badly or having trouble off the field, the only thing to do was keep swinging. --Hank Aaron ⚾
This is a pretty big lift and some strong work 💪💪💪
I had some suggestions/questions, but I was also curious if we have a standard on capitalization of protocols? We seem to use, for example, `https` and `HTTPS` without clear purpose.
# https://sts.amazonaws.com URL, and that DNS entry will still point
# to an external IP.
#
# See this link for more information about boto3's perverse behavior
boto3's perverse behavior
🤣
Co-authored-by: Nick M. <[email protected]>
We do not have such a standard, but you are welcome to propose one. Capitalization is probably the way to go.
Co-authored-by: Nick M <[email protected]>
If I'm proposing, then yes, I would vote for defaulting to capitalization for protocols.
You could add something to this long-running PR.
…re-commit_version Update the `ansible-lint` Version in the pre-commit Configuration
🗣 Description
This pull request:
(It would be nice to remove the NAT gateway from the operations subnet, since the private subnets no longer require it to communicate with AWS services, but they still require it to pull the Docker images from Docker Hub for the Guacamole Docker composition.)
💭 Motivation and context
Resolves #94.
🧪 Testing
I deployed these changes to env0 in our staging COOL environment and verified that the cloud-init scripts, CloudWatch agent, and SSM agent are all running without errors on all private and operations instances. I also verified that the FreeIPA initialization for the guacamole instance still works.
I also redeployed with the external route removed from the operations subnet's route table and verified that everything still worked, with the exception of the downloading of the Docker images required for the Guacamole Docker composition. This serves as an ironclad check that all AWS API calls are being made to the VPC endpoints and none are being made to the public endpoints.
✅ Checklist