AWS firewall configuration via security groups #239

Merged
ShishirPatil merged 19 commits into main from dev/shishir/firewall
May 4, 2022

Conversation

ShishirPatil
Member

@ShishirPatil ShishirPatil commented Mar 29, 2022

Adds firewalls

Fixes #272

@parasj
Contributor

parasj commented Apr 6, 2022

Follow up from today's meeting: we should reuse the same VPC + SG and not clear it (except in skylark deprovision). ReplicatorClient should remove instance IPs from the security groups FIRST then terminate instances.

@parasj parasj changed the title from "Firewalls" to "AWS firewall configuration via security groups" Apr 7, 2022
@ShishirPatil
Member Author

@parasj Can you list out the exact scenario you are thinking for which this would be useful? I can then reason about the best way to handle it.

@parasj
Contributor

parasj commented Apr 13, 2022

@ShishirPatil Setup:

  • If SG does not exist, create a new Skylark SG. If a VPC does not exist, create it and bind the Skylark SG.
  • At the start of a transfer (provision_gateways), add node IPs to the SG.
  • Run the transfer
  • In deprovision_gateways, first remove each IP from the SG, then terminate the instance (not the other way around, to avoid a race condition)

This will allow concurrent transfers to occur.
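
The setup steps above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the function bodies, the `gateways` mapping, and the port range are assumptions. `ec2` is any object exposing the boto3 EC2 client methods `authorize_security_group_ingress`, `revoke_security_group_ingress`, and `terminate_instances`.

```python
from typing import Any, Dict, Iterable, List


def ingress_rule(ip: str, from_port: int, to_port: int) -> List[Dict[str, Any]]:
    """Build the IpPermissions payload that allows a single IP (as a /32)."""
    return [{
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": f"{ip}/32"}],
    }]


def provision_gateways(ec2, group_id: str, ips: Iterable[str],
                       ports=(12000, 12999)) -> None:  # port range is assumed
    """At the start of a transfer, add each node IP to the shared SG."""
    for ip in ips:
        ec2.authorize_security_group_ingress(
            GroupId=group_id, IpPermissions=ingress_rule(ip, *ports))


def deprovision_gateways(ec2, group_id: str, gateways: Dict[str, str],
                         ports=(12000, 12999)) -> None:
    """Remove each IP from the SG first, then terminate the instance.

    `gateways` maps instance ID to public IP (a hypothetical shape).
    """
    for instance_id, ip in gateways.items():
        ec2.revoke_security_group_ingress(
            GroupId=group_id, IpPermissions=ingress_rule(ip, *ports))
        ec2.terminate_instances(InstanceIds=[instance_id])
```

Because the SG itself is never deleted here, several transfers could share it and each would only add and remove its own IPs.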

@parasj
Contributor

parasj commented Apr 13, 2022

In a new issue: we should remove the client from the SG. Instead, use port 22 to copy a JSON file containing the initial gateway ChunkRequests to each instance over SFTP (already implemented), then call curl via SSH to actually start the transfer.
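
A rough sketch of that flow, under stated assumptions: `sftp_put` and `ssh_run` are hypothetical callables standing in for whatever SFTP/SSH helpers the project already has, and the remote path and gateway URL are placeholders, not the real endpoint.

```python
import json
import shlex
from typing import Callable, List


def start_transfer_over_ssh(
    sftp_put: Callable[[bytes, str], None],  # hypothetical: upload bytes to a remote path
    ssh_run: Callable[[str], str],           # hypothetical: run a command over SSH, return stdout
    chunk_requests: List[dict],
    remote_path: str = "/tmp/chunk_requests.json",           # assumed location
    endpoint: str = "http://localhost:8080/chunk_requests",  # assumed gateway URL
) -> str:
    """Upload the initial chunk requests, then POST them from the instance itself."""
    # Step 1: copy the JSON payload to the instance over SFTP (port 22).
    sftp_put(json.dumps(chunk_requests).encode("utf-8"), remote_path)
    # Step 2: have the instance submit the file to its own gateway with curl.
    cmd = "curl -sS -X POST -H 'Content-Type: application/json' --data-binary @{} {}".format(
        shlex.quote(remote_path), shlex.quote(endpoint))
    return ssh_run(cmd)
```

With this shape, the client only needs port 22 on each instance, so its IP would not have to be in the security group.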

@ShishirPatil
Member Author

> @ShishirPatil Setup:
>
> * If SG does not exist, create a new Skylark SG. If a VPC does not exist, create it and bind the Skylark SG.
> * At the start of a transfer (provision_gateways), add node IPs to the SG.
> * Run the transfer
> * In deprovision_gateways, first remove each IP from the SG, then terminate the instance (not the other way around, to avoid a race condition)
>
> This will allow concurrent transfers to occur.

We cannot support concurrent transfers with the current architecture. On the source bucket side, we first check whether any instances are already up and, if so, simply reuse them. However, we don't check whether those instances are busy with another transfer.
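
One way to picture the missing check, as a sketch only: the `Gateway` type and its `busy` flag are hypothetical, since the thread does not say how a running instance would report that it is mid-transfer.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Gateway:
    instance_id: str
    busy: bool  # hypothetical flag: True while the instance serves another transfer


def reusable_gateways(running: Iterable[Gateway], needed: int) -> List[Gateway]:
    """Return up to `needed` running gateways that are not already in a transfer."""
    idle = [g for g in running if not g.busy]
    return idle[:needed]
```

A caller would then provision new instances for whatever shortfall remains, instead of reusing every running instance it finds.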

ShishirPatil and others added 11 commits May 2, 2022 16:29
* Switch to ILock from oslo.concurrency's lockutils
* Change behavior of remove_ip_from_security_group to instead remove
* Remove make_vpc from add_ip_to_security_group and instead call it explicitly
* Firewall rules also called old init jobs, fixed by redefining jobs
* Remove add_ip_to_security_group and remove_ip_from_security_group from GCP/Azure since those two clouds have different terminology
@ShishirPatil ShishirPatil merged commit 31f44bb into main May 4, 2022
@ShishirPatil ShishirPatil deleted the dev/shishir/firewall branch May 4, 2022 22:01
parasj added a commit that referenced this pull request May 9, 2022
parasj added a commit that referenced this pull request May 10, 2022
parasj added a commit that referenced this pull request May 10, 2022
* Clean up instance profiles in deprovision

* Fix bug in #239

* Fix pytype

* Update deprovision logic

* Cache pytype

* Fix pytype

* Disable pylint
parasj pushed a commit that referenced this pull request May 10, 2022
Skyplane now supports concurrent transfers in a secure manner. Every instance's IP is manually added to the SG at the start of a transfer and removed from the SG at the end of that transfer.
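
The add-at-start, remove-at-end pairing described in this commit message could be expressed as a context manager. This is a sketch under assumptions: `firewall` is a hypothetical object, and although the method names `add_ip_to_security_group` and `remove_ip_from_security_group` appear in this PR's commit list, the single-argument signatures used here are guesses.

```python
from contextlib import contextmanager
from typing import Iterable, Iterator


@contextmanager
def ips_in_security_group(firewall, ips: Iterable[str]) -> Iterator[None]:
    """Keep `ips` in the SG for the duration of a transfer, then remove them."""
    added = []
    try:
        for ip in ips:
            firewall.add_ip_to_security_group(ip)  # assumed signature
            added.append(ip)
        yield
    finally:
        # Runs even if the transfer raises, so no rule is left behind.
        for ip in added:
            firewall.remove_ip_from_security_group(ip)  # assumed signature
```

Tracking `added` separately means a failure partway through setup only removes the IPs that were actually admitted.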
parasj added a commit that referenced this pull request May 10, 2022
…ation (#323)

* Fix #318 by passing project_id during setup to region config serialization

* Try another fix

* Fig

* Fix project ID

* Fix bug with GCS transfers

* Add internal CLI commands

* Exit replicator client upon any errors (#324)

* AWS firewall configuration via security groups (#239)

Skyplane now supports concurrent transfers in a secure manner. Every instance's IP is manually added to the SG at the start of a transfer and removed from the SG at the end of that transfer.

* Fix bug in #239

* Add CLI option to use BBR for transfers (#331)

* Add option to use BBR in skylark cli

* format

* Update

* Fix pytype

* Open tunnel

* Bold font

* Update

* Update

* Update

* Increase connections

* Clean up instance profiles in deprovision (#334)

* Clean up instance profiles in deprovision

* Fix bug in #239

* Fix pytype

* Update deprovision logic

* Cache pytype

* Fix pytype

* Disable pylint

* Fix issue

Co-authored-by: Shishir Patil <[email protected]>
Status: Done
Development

Successfully merging this pull request may close these issues.

AWS Firewalls should use SSH + SFTP to remove the client IP from the skylark security group
2 participants