
Ray implementation of maple inference pipeline #19

Merged (32 commits into main, Jul 29, 2024)
Conversation

kaylahardie (Contributor)

To run the ray implementation of the maple inference pipeline:

  • Use the `environment_maple.yml` file to create a conda environment with ray: run `conda env create -f environment_maple.yml`, then `conda activate maple_py310_ray`.
  • Create the directory structure by running `mpl_workflow_create_dir_struct.py`.
  • Add a sample input image to the `data/input_img_local` directory. I used the sample image here: https://drive.google.com/file/d/1YwQiPc7Ow-oSyEHuCBD97RxRbnzJ4_dW/view?usp=drive_link
  • Run `python3 maple_workflow.py --gpus_per_core=0` to run the pipeline on a CPU. I haven't tried running it on more than one CPU, and I've only run it on one image; when running it on two images locally it ran out of memory.
  • The results from the pipeline should be in the `data/ray_shapefiles` directory. You can use `compare_shapefile_features.py` to compare the features in two shapefiles, or use `ogrinfo -so -al` on the command line to examine a shapefile.
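To illustrate the kind of check `compare_shapefile_features.py` performs, here is a hypothetical sketch (not the repo's actual script) that diffs two lists of per-feature attribute dicts, such as you might read from two shapefile layers:

```python
def diff_feature_attributes(features_a, features_b):
    """Report indices of features whose attribute dicts differ.

    Each argument is a list of per-feature attribute dicts, e.g. as read
    from a shapefile layer. Features are compared positionally.
    """
    diffs = []
    for i, (a, b) in enumerate(zip(features_a, features_b)):
        if a != b:
            diffs.append(i)
    if len(features_a) != len(features_b):
        diffs.append("feature count differs")
    return diffs

a = [{"class": "lake", "area": 12.5}, {"class": "pond", "area": 3.1}]
b = [{"class": "lake", "area": 12.5}, {"class": "pond", "area": 3.2}]
print(diff_feature_attributes(a, b))  # → [1]
```

The real script works on shapefiles directly (e.g. via GDAL/OGR); this sketch only shows the comparison step on already-extracted attributes.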

…es to how the dataclasses represented the dict
…gle process so instead of having the virtual file in the ray dataset, adapted the code to store the image bytes in the dataset and create the gdal virtual file locally when needed
…ardcoded, this is needed for service account impersonation if running the code on your local computer and want it to be able to access gcs buckets
@kaylahardie (Contributor, Author)

The most recent commit makes it possible to run the ray version of the maple pipeline using pdg's Google Cloud Storage bucket instead of your local filesystem.

Here are the instructions for running the ray version of the maple pipeline locally (i.e. on your local computer) with access to GCS storage buckets. To give the ray program running on your local computer access to the GCS storage bucket, it's best practice to use service account impersonation:

  1. Make sure the email that has access to the pdg-project also has the roles/iam.serviceAccountTokenCreator role.
  2. Run `gcloud auth application-default login --impersonate-service-account=pdg-sa-01@pdg-project-406720.iam.gserviceaccount.com --scopes="https://www.googleapis.com/auth/cloud-platform"`. It will ask you to sign in, and then you will get an application default credentials file. The terminal should print something like: `Credentials saved to file: [/usr/local/google/home/kaylahardie/.config/gcloud/application_default_credentials.json]`
  3. Run `python3 ray_maple_workflow.py --gpus_per_core=0 --root_dir="gs://pdg-storage-default/workflows_optimization/maple_ray_pipeline" --adc_dir="/usr/local/google/home/kaylahardie/.config/gcloud/application_default_credentials.json"`, replacing the `--adc_dir` value with the credentials path you got from step 2.
  4. When the program finishes, the output shapefiles will be stored in the directory gs://pdg-storage-default/workflows_optimization/maple_ray_pipeline/data/ray_output_shapefiles/

Right now each run just overwrites the old output files; we can add some versioning to the output file names to avoid overwriting them if desired. The pipeline takes as input the data in the gs://pdg-storage-default/workflows_optimization/maple_ray_pipeline/data/input_img/ directory.

@kaylahardie (Contributor, Author)

The most recent commit makes it so each run doesn't overwrite the data from previous runs. Each run now creates a date directory, formatted as "%Y-%m-%d_%H-%M-%S", inside the gs://pdg-storage-default/workflows_optimization/maple_ray_pipeline/data/ray_output_shapefiles/ directory, and the output shapefiles for that run are stored in that date directory.
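A minimal sketch of how such a versioned output path can be built; the helper name and bucket prefix here are illustrative, not the actual pipeline code:

```python
from datetime import datetime

def versioned_output_dir(base_dir: str, now: datetime) -> str:
    """Build a per-run output directory named with the run's timestamp."""
    run_name = now.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{base_dir.rstrip('/')}/{run_name}/"

# Example: a run started at 2024-06-18 16:51:00 writes to its own directory,
# so it never clobbers output from earlier runs.
print(versioned_output_dir(
    "gs://pdg-storage-default/workflows_optimization/maple_ray_pipeline/data/ray_output_shapefiles",
    datetime(2024, 6, 18, 16, 51, 0),
))
# → gs://.../ray_output_shapefiles/2024-06-18_16-51-00/
```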

@kaylahardie kaylahardie marked this pull request as ready for review June 18, 2024 16:51
…ow.py). Changed the input img directory back to be the input_img_local (what it was originally)
@gugibugy (Contributor) left a comment

First pass, will take a look at the rest on Monday.

# Used to identify a specific predictor when multiple predictors are
# created to run inference in parallel. The counter is also used to
# know which GPU to use when multiple are available.
self.process_counter = 1  # TODO need to fix this process_counter
Contributor

This should be replaced by a call to `ray.get_gpu_ids`, which returns a list of the GPUs available to the worker. For now we could simply choose a value at random (if this becomes more complex we can readdress it).

Contributor Author

Wouldn't we want to keep track of which IDs have been used? If we just pick a random ID, we could randomly pick the same GPU twice, right?
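One way to address the concern above, sketched without the actual pipeline code: take the list of GPUs assigned to the worker (in the real pipeline this could come from `ray.get_gpu_ids()`) and hand them out round-robin with a counter instead of at random, so no GPU is repeated until all have been used. The class below is a hypothetical illustration, not code from this PR:

```python
from itertools import count

class GpuAssigner:
    """Round-robin over a fixed list of GPU IDs."""

    def __init__(self, gpu_ids):
        # Assumed input: e.g. the list returned by ray.get_gpu_ids().
        self._gpu_ids = list(gpu_ids)
        self._counter = count()

    def next_gpu(self):
        # Cycle through the available IDs; consecutive calls never repeat
        # the same GPU unless only one is available.
        return self._gpu_ids[next(self._counter) % len(self._gpu_ids)]

assigner = GpuAssigner([0, 1, 2])
print([assigner.next_gpu() for _ in range(5)])  # → [0, 1, 2, 0, 1]
```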

gugibugy's review comments
@tcnichol (Contributor)

This one looks good to me, I've been able to run everything.

@tcnichol (Contributor) left a comment

I'm going to mark this approved. It looks like all comments were addressed, it runs fine, and the environment works.

create_directory_if_not_exists(config.RAY_OUTPUT_SHAPEFILES_DIR)
shapefiles_dataset = data_per_image.map(
    fn=ray_write_shapefiles.WriteShapefiles,
    fn_constructor_kwargs={"config": config},
    concurrency=concurrency,
)
print("MAPLE Ray pipeline finished, done writing shapefiles", shapefiles_dataset.schema())
Contributor

You'll actually want to call materialize here instead of schema. materialize actually runs the pipeline for all the rows in the dataset, while my understanding is that schema only materializes a single row in order to infer the dataset's schema.

Contributor Author

How'd you find out that schema only materializes a single row? I updated the code; I'm just curious.

Contributor

When I was prototyping with running on more than one image, calling .schema() only produced results for a single row.
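The distinction discussed above can be sketched with plain Python lazy iteration; this is only an analogy for Ray Data's lazy execution, not its actual implementation. A lazy pipeline does only as much work as the consumer demands, so peeking at one row (as schema inference does) is far cheaper than consuming every row (as materialize does):

```python
processed = []

def process(row):
    # Stand-in for an expensive per-row pipeline stage.
    processed.append(row)
    return {"id": row}

# A lazy pipeline: nothing runs until rows are consumed.
pipeline = (process(row) for row in range(100))

# "schema-like" peek: pulls a single row to inspect its structure.
first = next(pipeline)
print(sorted(first), len(processed))  # → ['id'] 1

# "materialize-like" consumption: runs the stage for every remaining row.
rest = list(pipeline)
print(len(processed))  # → 100
```

This is why printing `.schema()` at the end of the workflow only exercised one image, while `.materialize()` forces the whole dataset through the pipeline.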

@kaylahardie (Contributor, Author)

I updated the README a little bit as well. The "Building Scalable ML Pipelines Using Ray" doc is also updated. Going to merge now.

@kaylahardie kaylahardie merged commit 2e6d1af into main Jul 29, 2024