Clay pipeline v04 #173 (Merged)

Changes from all 13 commits (all authored by yellowcap):
- e7273d0 Catch case where datacube size is 0 on S1
- cf449f3 Reduce tile size to 256 and bump pipeline version to 04
- 2cdc663 Updated mgrs sampling strategy for v0.2
- 242eb22 Allow extracting index from array job index
- 6dc4cf5 Fix variable name in docs
- 731f696 Mute s3 sync command
- 888c9fd Increase dates per location and s1 match attempts
- 60166c9 Use logging instead of printing
- 9818534 Moved sync out of loop
- 08a9e8c Convert batch command to array job
- 98c7f16 Update submit script memory requirements and sample source
- c7d224e Update default sample source
- bc8684a Merge branch 'main' into clay-pipeline-v04
```diff
@@ -1,31 +1,40 @@
 import os

 import boto3

 batch = boto3.client("batch", region_name="us-east-1")

-NR_OF_TILES_IN_SAMPLE_FILE = 1517
+NR_OF_TILES_IN_SAMPLE_FILE = 2728

-PC_KEY = os.environ["PC_SDK_SUBSCRIPTION_KEY"]
+PC_KEY = "***"

-for i in range(NR_OF_TILES_IN_SAMPLE_FILE):
-    job = {
-        "jobName": f"fetch-and-run-{i}",
-        "jobQueue": "fetch-and-run",
-        "jobDefinition": "fetch-and-run",
-        "containerOverrides": {
-            "command": ["datacube.py", "--index", f"{i}", "--bucket", "clay-tiles-02"],
-            "environment": [
-                {"name": "BATCH_FILE_TYPE", "value": "zip"},
-                {
-                    "name": "BATCH_FILE_S3_URL",
-                    "value": "s3://clay-fetch-and-run-packages/batch-fetch-and-run.zip",
-                },
-                {"name": "PC_SDK_SUBSCRIPTION_KEY", "value": f"{PC_KEY}"},
-            ],
-            "resourceRequirements": [
-                {"type": "MEMORY", "value": "8000"},
-                {"type": "VCPU", "value": "4"},
-            ],
-        },
-    }
-
-    print(batch.submit_job(**job))
+job = {
+    "jobName": "fetch-and-run-datacube",
+    "jobQueue": "fetch-and-run",
+    "jobDefinition": "fetch-and-run",
+    "containerOverrides": {
+        "command": [
+            "datacube.py",
+            "--bucket",
+            "clay-tiles-04-sample-v02",
+            "--sample",
+            "https://clay-mgrs-samples.s3.amazonaws.com/mgrs_sample_v02.fgb",
+        ],
+        "environment": [
+            {"name": "BATCH_FILE_TYPE", "value": "zip"},
+            {
+                "name": "BATCH_FILE_S3_URL",
+                "value": "s3://clay-fetch-and-run-packages/batch-fetch-and-run.zip",
+            },
+            {"name": "PC_SDK_SUBSCRIPTION_KEY", "value": f"{PC_KEY}"},
+        ],
+        "resourceRequirements": [
+            {"type": "MEMORY", "value": "15500"},
+            {"type": "VCPU", "value": "4"},
+        ],
+    },
+    "arrayProperties": {"size": NR_OF_TILES_IN_SAMPLE_FILE},
+}
+
+print(batch.submit_job(**job))
```
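Commit 242eb22 ("Allow extracting index from array job index") implies the worker side of this change: each child task of the array job derives its tile number from the position AWS Batch assigns it. A minimal sketch of that lookup, using a hypothetical `resolve_tile_index` helper (the environment variable `AWS_BATCH_JOB_ARRAY_INDEX` is the one AWS Batch sets on array-job children; the `--index` fallback mirrors the old per-tile command line):

```python
import os


def resolve_tile_index(cli_index=None):
    """Return the tile index for this task.

    Inside an AWS Batch array job, each child task receives its
    position via the AWS_BATCH_JOB_ARRAY_INDEX environment variable.
    Fall back to an explicit --index argument for local runs.
    """
    env_index = os.environ.get("AWS_BATCH_JOB_ARRAY_INDEX")
    if env_index is not None:
        return int(env_index)
    if cli_index is not None:
        return int(cli_index)
    raise ValueError("No tile index available: not in an array job and no --index given")
```

This keeps the script runnable both as an array-job child and as a standalone debug run with `--index`.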
Review comment: So we are not creating the job array anymore?
Reply: We are. But instead of submitting 2500 individual tasks, we now submit one array job that bundles those subjobs under a single parent, which is much easier to manage.
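To illustrate the reply: a single `submit_job` call carrying `arrayProperties` replaces the old loop of per-tile submissions, and Batch fans it out into children numbered 0..size-1. A hypothetical payload builder, with names taken from the diff above (the `build_array_job` helper itself is illustrative, not part of the PR):

```python
def build_array_job(size, bucket, sample_url):
    """Build one AWS Batch array-job payload covering `size` tasks.

    One batch.submit_job(**job) call with this payload replaces `size`
    individual submissions; each child task identifies its slice of the
    work via AWS_BATCH_JOB_ARRAY_INDEX rather than a per-job --index.
    """
    return {
        "jobName": "fetch-and-run-datacube",
        "jobQueue": "fetch-and-run",
        "jobDefinition": "fetch-and-run",
        "containerOverrides": {
            "command": ["datacube.py", "--bucket", bucket, "--sample", sample_url],
        },
        # Batch spawns `size` child tasks under one parent job.
        "arrayProperties": {"size": size},
    }
```

Besides fewer API calls, the parent job gives one place to watch, cancel, or retry the whole batch instead of thousands of independent job IDs.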