Remove dependencies on specific cloud provider packages in `TransferJob` and update test deprovision bucket names #884

sarahwooders · 2023-06-23T18:18:46Z

No description provided.

…ne-project#865) S3 buckets are currently not cleaned up properly after testing, which eventually leads to `TooManyBuckets` errors in later test runs. This adds cleanup logic to every integration test. Furthermore, the Hadoop JDK installs were removed, as they should no longer be required.
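The per-test cleanup described in the commit message above can be pictured with a small sketch. It is not the project's test code: `FakeObjectStore`, `temporary_bucket`, and the bucket name are hypothetical stand-ins, and an in-memory set replaces real S3 calls. The point is only that a `try`/`finally` around the test body deletes the bucket even when the test fails.

```python
from contextlib import contextmanager


class FakeObjectStore:
    """Hypothetical in-memory stand-in for a cloud object store client."""

    def __init__(self):
        self.buckets = set()

    def create_bucket(self, name):
        self.buckets.add(name)

    def delete_bucket(self, name):
        self.buckets.discard(name)


@contextmanager
def temporary_bucket(store, name):
    """Create a bucket for one test and always delete it afterwards."""
    store.create_bucket(name)
    try:
        yield name
    finally:
        # Runs on success and on failure, so buckets do not accumulate.
        store.delete_bucket(name)


store = FakeObjectStore()
test_failed = False
try:
    with temporary_bucket(store, "integration-test-bucket") as bucket:
        assert bucket in store.buckets
        raise RuntimeError("simulated test failure")
except RuntimeError:
    test_failed = True
```

In a pytest suite the same idea would typically live in a fixture that yields the bucket name and deletes the bucket after the `yield`.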

Instead of storing the vcpu limits as a hardcoded variable inside Planner, created a new .csv file that includes that information. When Planner is created, we read this csv file. Wrote test cases that include fake quota limits.
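A minimal sketch of reading vCPU limits from a CSV file, as the commit message above describes. The column names `region` and `vcpu_limit`, the function `load_vcpu_limits`, and the sample values are assumptions made for illustration; the actual quota file layout is not shown in this PR.

```python
import csv
import io
from typing import Dict, TextIO


def load_vcpu_limits(csv_file: TextIO) -> Dict[str, int]:
    """Parse rows of (region, vcpu_limit) into a region -> limit mapping.

    The column names are hypothetical; adjust them to the real file.
    """
    reader = csv.DictReader(csv_file)
    return {row["region"]: int(row["vcpu_limit"]) for row in reader}


# Fake quota data, in the spirit of the test cases mentioned above.
FAKE_QUOTAS = "region,vcpu_limit\nus-east-1,128\neu-west-1,32\n"
limits = load_vcpu_limits(io.StringIO(FAKE_QUOTAS))
```

Accepting an open file object rather than a path keeps the loader easy to test with fake quota data held in a string.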

Edited the `AzureBlobInterface` class:

* Wrote the logic for staging/uploading a block for a multipart upload in the method `upload_object`.
* Created two functions to 1) initiate the multipart upload and 2) complete the multipart upload.
* Since Azure works differently from S3 and GCS in that it doesn't provide a global upload ID for a destination object, I used the destination object name as the upload ID to stay consistent with the other object stores. This pseudo-upload ID keeps track of which blocks, and which block IDs, belong to each destination object in the CopyJob/SyncJob.
* Upon completion of uploading/staging all blocks, all blocks for a destination object are committed together.

More things to consider about this implementation:

* Upload ID handling: Azure doesn't really have a concept equivalent to AWS's upload IDs. Instead, blobs are created immediately and blocks are associated with a blob via block IDs. My workaround of using the blob name as the upload ID should work, since I only use `upload_id` to distinguish between requests in the `finalize()` method.
* Block IDs: Azure requires all block IDs to be of the same length. I've handled this by formatting the IDs to be of length len("{number of digits in max blocks supported by Azure (50000) = 5}{destination_object_key}").

---------

Co-authored-by: Sarah Wooders <[email protected]>
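The fixed-length block ID scheme from the Azure commit message above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `make_block_id` and the sample key are hypothetical, and the sketch only shows the zero-padding that keeps every ID for one destination object the same length. The Blob service REST API additionally expects block IDs to be Base64-encoded; that step is omitted here.

```python
MAX_BLOCKS = 50_000               # block limit quoted in the commit message
ID_DIGITS = len(str(MAX_BLOCKS))  # 5 digits


def make_block_id(part_number: int, dest_key: str) -> str:
    """Build a block ID as a zero-padded part number plus the destination key.

    Every ID for the same dest_key has length ID_DIGITS + len(dest_key).
    """
    if not 1 <= part_number <= MAX_BLOCKS:
        raise ValueError(f"part_number must be in 1..{MAX_BLOCKS}")
    return f"{part_number:0{ID_DIGITS}d}{dest_key}"


dest_key = "data/file.bin"  # hypothetical destination object key
first_id = make_block_id(1, dest_key)
last_id = make_block_id(MAX_BLOCKS, dest_key)
```

Because the destination key doubles as the pseudo-upload ID, the blocks belonging to one object can be found by matching the suffix of each block ID.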

* Modified the tests so that they load from an actual quota file instead of me defining a dictionary.
* Modified planner so that it can accept a file name for the quota limits (default to the skyplane config quota files)
* Added more tests for error conditions (no quota file is provided + quota file is provided but the requested region is not included in the quota file)

---------

Co-authored-by: Sarah Wooders <[email protected]>
Co-authored-by: Asim Biswal <[email protected]>

…oject#879)

sarahwooders and others added 30 commits June 15, 2023 06:57

fix planner

592e72a

Make MulticastDirectPlanner take in TransferConfig (skyplane-projec…

b2741c5

…t#872)

Disable cloudflare for integration tests (skyplane-project#864)

faba29b

Cleanup planner to take in TransferConfig and remove unused planner code

ac4d589

Merge remote-tracking branch 'upstream/0.3.1-release' into release

f656526

add copy tests

4225644

initial integration tests

53d3439

planner fixe

2429449

actualy disable cloudflare

84ebdc2

actualy disable cloudflare

9fd0cd8

test fixes

d2e4200

cleanup

00dd79b

add all cloud tests

17d863d

debugging pipeline

7096e63

fixed pipeline bug

801371c

cleanup

54ffa1f

cleanup

e27b90d

Merge branch '0.3.1-release' into release

f4627cf

Add pytest integration tests (skyplane-project#874)

d6c5430

Update integration-test-local.yml

dfd03d6

Update pyproject.toml

e458395

Update __init__.py

41b94e7

Update conf.py

b223c21

update poetry lock

b2ea7ab

merge

f7e2075

Merge branch 'release' of github.com:sarahwooders/skyplane into release

fcc97f2

sarahwooders added 17 commits June 22, 2023 16:49

change publish versions

c18b0ec

Upgrade poetry.lock and poetry publish version (skyplane-project#878)

f99a5c7

Update poetry-publish-nightly.yml

59db561

fix planner for one-sided transfers

1c7b5b2

format

f81e346

Merge remote-tracking branch 'upstream/0.3.1-release' into release

060f616

Add cloudflare keys to integration tests and fix planner (skyplane-pr…

bc3d7c0

…oject#879)

cleanup and have error reporting support multiple destinations

23a1783

cleanup

bdf5ea2

Merge remote-tracking branch 'upstream/0.3.1-release' into release

30c5ee3

cleanup

c783f42

Update logging for multiple destinations (skyplane-project#880)

3117d4a

cleanup tests

1b5d882

Merge remote-tracking branch 'upstream/0.3.1-release' into release

090105e

Merge remote-tracking branch 'upstream/main' into release

b9bd233

remove imports

533e91b

rename buckets

863bc33

sarahwooders changed the title ~~Remove dependencies on specific cloud provider packages in TransferJob~~ Remove dependencies on specific cloud provider packages in TransferJob and update test deprovision bucket names Jun 23, 2023

sarahwooders merged commit e9ce64f into skyplane-project:main Jun 23, 2023
