[data] Add microbenchmark for reading and transforming images from preprocessed image files #37610

Merged


stephanie-wang
Contributor

Why are these changes needed?

MosaicML streaming and tf.data have their own specialized file formats to improve image load time. This adds a microbenchmark similar to read_images_comparison_microbenchmark_single_node, except that it first preprocesses the images into the right formats.
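To illustrate the idea behind preprocessing into a specialized format, here is a minimal, standard-library-only sketch. It is not the benchmark added in this PR and does not use the real TFRecord or MosaicML MDS formats: the length-prefixed shard layout and the names `write_shard`, `read_shard`, `read_files`, and `run_benchmark` are hypothetical, chosen only to show how packing many small files into one sequentially readable file can be compared against reading the files one by one. Random bytes stand in for encoded images.

```python
import os
import struct
import tempfile
import time


def write_shard(paths, shard_path):
    """Pack each file into one shard as [8-byte little-endian length][payload]."""
    with open(shard_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as f:
                payload = f.read()
            out.write(struct.pack("<Q", len(payload)))
            out.write(payload)


def read_shard(shard_path):
    """Yield record payloads from the shard in the order they were written."""
    with open(shard_path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:  # end of shard
                break
            (length,) = struct.unpack("<Q", header)
            yield f.read(length)


def read_files(paths):
    """Baseline: open and read every file individually."""
    for path in paths:
        with open(path, "rb") as f:
            yield f.read()


def timed(make_iter):
    """Consume an iterator of byte strings; return (total bytes, seconds)."""
    start = time.perf_counter()
    total = sum(len(chunk) for chunk in make_iter())
    return total, time.perf_counter() - start


def run_benchmark(num_files=100, size=1024):
    """Create fake 'image' files, preprocess them into a shard, time both reads."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(num_files):
            path = os.path.join(tmp, f"img_{i:04d}.bin")
            with open(path, "wb") as f:
                f.write(os.urandom(size))  # placeholder for encoded image bytes
            paths.append(path)

        shard_path = os.path.join(tmp, "images.shard")
        write_shard(paths, shard_path)  # the one-time preprocessing step

        raw_bytes, raw_s = timed(lambda: read_files(paths))
        shard_bytes, shard_s = timed(lambda: read_shard(shard_path))

    return {
        "raw_bytes": raw_bytes,
        "shard_bytes": shard_bytes,
        "raw_s": raw_s,
        "shard_s": shard_s,
    }
```

Calling `run_benchmark()` returns the byte counts and wall-clock times for both read paths. Timings on a local temporary directory depend heavily on the OS page cache, so they should not be read as representative of the real formats or of this PR's results.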

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Stephanie Wang <[email protected]>
@stephanie-wang
Contributor Author

@stephanie-wang stephanie-wang added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jul 27, 2023
@stephanie-wang stephanie-wang force-pushed the read-preprocssed-micro branch from c5a6e11 to 710e8fa Compare July 27, 2023 22:18
x
Signed-off-by: Stephanie Wang <[email protected]>
@stephanie-wang stephanie-wang merged commit e8db5da into ray-project:master Jul 27, 2023
stephanie-wang added a commit that referenced this pull request Jul 28, 2023
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
…eprocessed image files (ray-project#37610)

---------

Signed-off-by: Stephanie Wang <[email protected]>
Signed-off-by: NripeshN <[email protected]>
NripeshN pushed a commit to NripeshN/ray that referenced this pull request Aug 15, 2023
Adds a dep required by ray-project#37610.

Signed-off-by: Stephanie Wang <[email protected]>
Signed-off-by: NripeshN <[email protected]>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…eprocessed image files (ray-project#37610)

---------

Signed-off-by: Stephanie Wang <[email protected]>
Signed-off-by: e428265 <[email protected]>
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
Adds a dep required by ray-project#37610.

Signed-off-by: Stephanie Wang <[email protected]>
Signed-off-by: e428265 <[email protected]>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…eprocessed image files (ray-project#37610)

---------

Signed-off-by: Stephanie Wang <[email protected]>
Signed-off-by: Victor <[email protected]>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
Adds a dep required by ray-project#37610.

Signed-off-by: Stephanie Wang <[email protected]>
Signed-off-by: Victor <[email protected]>
Labels
@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer.
4 participants