
Implement InProcessReadingService #1139

Closed
wants to merge 11 commits into from

Conversation

Contributor

@ejguan ejguan commented Apr 21, 2023

Fixes #1107
Fixes #720
Fixes #616

Changes

  • Implement InProcessReadingService (Willing to take any suggestion on naming)
    • Control shuffle and sharding (noop)
    • Add support to pause/resume/limit
  • Make InProcessReadingService the default reading_service for DataLoader2.
    • reading_service then always has a value, so the reading_service is None logic can be removed.
  • Modify MultiProcessingReadingService
    • When num_workers=0, raise a warning
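
The defaulting behavior described above can be sketched with stand-in classes. These are illustrations only, not torchdata's actual implementations (whose signatures carry more parameters); the num_workers check follows the assert visible later in this thread rather than a warning.

```python
class InProcessReadingService:
    """Stand-in: runs the datapipe graph in the main process (the num_workers=0 case)."""

class MultiProcessingReadingService:
    """Stand-in: per this PR, num_workers=0 is rejected outright."""
    def __init__(self, num_workers: int = 2) -> None:
        assert num_workers > 0, "Please use `InProcessReadingService` for num_workers=0"
        self.num_workers = num_workers

class DataLoader2:
    """Stand-in: reading_service always ends up with a value."""
    def __init__(self, datapipe, reading_service=None) -> None:
        self.datapipe = datapipe
        # Defaulting here is what lets the `reading_service is None`
        # branches elsewhere be removed.
        self.reading_service = (
            reading_service if reading_service is not None else InProcessReadingService()
        )

dl = DataLoader2(datapipe=range(10))
assert isinstance(dl.reading_service, InProcessReadingService)
```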

@facebook-github-bot facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Apr 21, 2023
@facebook-github-bot

@ejguan has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ejguan ejguan requested a review from NivekT April 21, 2023 17:08
torchdata/dataloader2/reading_service.py (review thread outdated; resolved)
torchdata/dataloader2/dataloader2.py (review thread outdated; resolved)
worker_init_fn: Optional[Callable[[DataPipe, WorkerInfo], DataPipe]] = None,
worker_reset_fn: Optional[Callable[[DataPipe, WorkerInfo, SeedGenerator], DataPipe]] = None,
):
if num_workers == 0:
Contributor
I'm curious: if num_workers == 1, what is the benefit of using this instead of SingleProcessRS? Will it actually be faster (probably slower)?

Contributor Author

Using num_workers=1 creates a single worker process dedicated to loading data, while the main process works on model training. The use case might be preventing data loading from draining CPU resources away from training.

torchdata/dataloader2/reading_service.py (review thread resolved)
torchdata/dataloader2/reading_service.py (review thread outdated; resolved)
multiprocessing_context: Optional[str] = None,
worker_prefetch_cnt: int = 10,
main_prefetch_cnt: int = 10,
worker_init_fn: Optional[Callable[[DataPipe, WorkerInfo], DataPipe]] = None,
worker_reset_fn: Optional[Callable[[DataPipe, WorkerInfo, SeedGenerator], DataPipe]] = None,
) -> None:
assert num_workers > 0, "Please use `InProcessReadingService` for num_workers=0"
Contributor Author

I am making a BC-breaking change here. MPRS won't accept num_workers=0 anymore.

In previous commits, I tried adding a __new__ that returns InProcessReadingService when num_workers=0. However, it doesn't work well with pickling, because of cloning and multiprocessing.
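
The __new__ approach mentioned above can be sketched as follows (hypothetical stand-in classes, not torchdata's real ones). Even before pickling and worker processes enter the picture, it silently changes the constructed type, which is part of why the redirect is surprising:

```python
# Illustrative stand-ins only; not torchdata's actual classes.
class InProcessReadingService:
    pass

class MultiProcessingReadingService:
    def __new__(cls, num_workers: int = 2):
        if num_workers == 0:
            # Redirect to the in-process service instead of erroring.
            return InProcessReadingService()
        return super().__new__(cls)

    def __init__(self, num_workers: int = 2) -> None:
        self.num_workers = num_workers

rs = MultiProcessingReadingService(num_workers=0)
# Because __new__ returned an instance of a different class, __init__ is
# skipped and the caller silently holds the other type:
print(type(rs).__name__)  # InProcessReadingService
```

Clones of such an object sent across process boundaries then round-trip through pickle as the substituted class, not the one that was named at the call site.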

@ejguan ejguan changed the title Implement SingleProcessingReadingService Implement InProcessReadingService Apr 24, 2023

@ejguan ejguan requested a review from NivekT April 24, 2023 16:28
@@ -26,15 +26,15 @@ def _random_fn(data):
Used to validate the randomness of subprocess-local RNGs are set deterministically.
"""
  py_random_num = random.randint(0, 2 ** 32)
- np_random_num = np.random.randint(0, 2 ** 32)
+ np_random_num = np.random.randint(0, 2 ** 32 - 1)
Contributor Author

Somehow this problem wasn't revealed until this PR.
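
As background (my reading, not stated in the PR): Python's random.randint includes both endpoints, while np.random.randint excludes high, and NumPy also validates that the requested range fits the target integer dtype, which is platform-dependent by default. Forcing a 32-bit dtype reproduces the bound check anywhere:

```python
import random
import numpy as np

# random.randint(a, b) samples from [a, b], inclusive of both ends:
assert all(0 <= random.randint(0, 3) <= 3 for _ in range(100))

# np.random.randint(low, high) samples from [low, high), excluding high:
draws = np.random.randint(0, 3, size=1000)
assert draws.min() >= 0 and draws.max() <= 2

# The range must also fit the requested dtype; a 32-bit dtype cannot
# represent values near 2 ** 32, so the bound check rejects it:
try:
    np.random.randint(0, 2 ** 32, dtype=np.int32)
except ValueError:
    print("high is out of bounds for int32")
```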


Contributor

@NivekT NivekT left a comment

Thanks! LGTM! Let's also add it to the documentation table here:

https://pytorch.org/data/beta/dataloader2.html#readingservice


Comment on lines 456 to 486
if hasattr(dp, "resume") and callable(dp.resume):
dp.resume()
Contributor Author

I need to skip QueueWrapper here as well.
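
The guard in the snippet above, extended with the skip described here, might look like the following sketch. All classes are illustrative stand-ins (the real QueueWrapper lives in torchdata's internals and proxies a worker's request queue):

```python
# Sketch of resuming datapipes in a graph while skipping the
# process-boundary wrapper; every class here is a stand-in.
class QueueWrapper:
    """Stand-in for the datapipe that proxies a worker's request queue."""
    def resume(self):
        raise RuntimeError("resume is driven by the worker loop, not here")

class Prefetcher:
    """Stand-in for a datapipe that supports pause/resume."""
    def __init__(self):
        self.paused = True
    def resume(self):
        self.paused = False

def resume_graph(datapipes):
    for dp in datapipes:
        if isinstance(dp, QueueWrapper):
            continue  # skip: its resume is handled on the worker side
        if hasattr(dp, "resume") and callable(dp.resume):
            dp.resume()

graph = [Prefetcher(), QueueWrapper(), Prefetcher()]
resume_graph(graph)  # does not raise: the QueueWrapper is skipped
assert all(not dp.paused for dp in graph if isinstance(dp, Prefetcher))
```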


@facebook-github-bot

@ejguan merged this pull request in 8f9d123.

Labels
CLA Signed · Merged
Projects
None yet
3 participants