feature request: Disclose all external sites/URLs needed for deployment #1339
Comments
Hi @chrisdag, for the shorter term request, the nodes are provisioned using the Chef recipes that you can find here. Thanks
Hi @lukeseawalker, I finally got the time to do what you suggested -- I stood up a Squid logging proxy in a public subnet, deployed v2.4.1 through it and captured all of the logs. The short summary is that if I strip out all of the OS patching, EPEL and Python PyPI traffic, the list of API and external destinations is super small right now:
I'm able to ignore all the patch, OS, update and PyPI traffic because we use pre_install bootstrap scripts to override all software repos anyway, with links that point to internal private mirrors or Nexus Repository managers (for PyPI). From our perspective the offending entry was github.com, as our firewall was clearly blocking it -- that ended up being the only destination responsible for our rollback and deploy failures. Other than that, the only unusual thing was that the request to the ec2-metadata link was a plain HTTP request made over TCP:80 -- one of the few non-HTTPS connections to an Amazon-operated destination. I wrote a much longer blog post about this and have the full Squid logs available at https://bioteam.net/2019/11/aws-parallelcluster-private-deployment-in-hardened-vpcs/
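For anyone who wants to repeat the exercise, here is a minimal sketch of how the unique destinations could be pulled out of a Squid access log. It assumes Squid's default native log format (request method in the sixth field, URL or `host:port` in the seventh); the function names `destination` and `destinations` and the `SAMPLE` lines are made up for illustration and are not taken from the real logs.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical sample lines in Squid's native access.log layout:
# time elapsed client action/code bytes method URL ident hierarchy/peer type
SAMPLE = [
    "1573000000.100 200 10.0.1.15 TCP_TUNNEL/200 5120 CONNECT github.com:443 - HIER_DIRECT/192.0.2.10 -",
    "1573000001.200 150 10.0.1.15 TCP_TUNNEL/200 4096 CONNECT github.com:443 - HIER_DIRECT/192.0.2.10 -",
    "1573000002.300 12 10.0.1.15 TCP_MISS/200 310 GET http://mirror.example.com/repo/ - HIER_DIRECT/192.0.2.20 text/html",
]


def destination(line):
    """Return (host, port) for one access.log line, or None if unparseable."""
    fields = line.split()
    if len(fields) < 7:
        return None
    method, target = fields[5], fields[6]
    if method == "CONNECT":
        # HTTPS tunnels are logged as "host:port" with no scheme.
        host, _, port = target.rpartition(":")
        if not host or not port.isdigit():
            return None
        return host.lower(), int(port)
    parts = urlsplit(target)
    if not parts.hostname:
        return None
    try:
        port = parts.port
    except ValueError:  # malformed port in the logged URL
        return None
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname.lower(), port or default_port


def destinations(lines):
    """Count requests per (host, port) destination."""
    counts = Counter()
    for line in lines:
        dest = destination(line)
        if dest is not None:
            counts[dest] += 1
    return counts


if __name__ == "__main__":
    for (host, port), hits in sorted(destinations(SAMPLE).items()):
        print(f"{host}:{port}\t{hits}")
```

Feeding it the lines of a real `access.log` instead of `SAMPLE` gives a per-destination count that can be compared against the firewall or proxy allowlist. Plain HTTP destinations show up with port 80, which makes non-HTTPS requests easy to spot.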
@chrisdag Really enjoyed reading your blog post. Excellent work! Btw the github call comes from We owe you some 🍻
Resolving this: we enabled cluster creation in subnets with no internet access as part of the 3.1.1 release, and added a list of VPC endpoints and instructions to the official documentation.
We run ParallelCluster at scale in hardened VPCs that are not granted unrestricted access to the internet. Each external destination needs to be documented and whitelisted in a firewall or on a proxy server.
Prior versions of cfncluster/parallelcluster just talked to APIs and pulled templates, scripts and data from s3://, so this was easy to support and to configure security rules around.
But now we have deployments of the latest version failing because it looks like the bootstrap process is trying to pull artifacts or code from GitHub and our firewall is killing those sessions. The end result is that our users see frustrating rollbacks on deployment.
As a longer term feature request, can we ask that some sort of doc be created that lists the external destinations required for a successful deployment?
And shorter term, if someone could refresh my memory on which parts of the source code I should check by hand to gather a list of hosts to whitelist on our firewall, that would be great. From memory, I think some of the deployment and bootstrapping scripts are not actually in this project but in a different one?
Thanks!