Adding mux_longest DataPipe #372

Closed
wants to merge 1 commit into from

ninginthecloud

Summary:
OSS issue discussion: #346
This diff updates the mux_longest DataPipe.

mux_longest: Yields one element at a time from each of the input Iterable DataPipes (functional name: mux_longest). That is, it yields one element from the 1st input DataPipe, then one element from the 2nd DataPipe in the next iteration, and so on. It skips over DataPipes that are exhausted and ends when all input DataPipes are exhausted. This is the same behavior as the current MultiplexerIterDataPipe in pytorch (https://github.com/pytorch/pytorch/blob/4fb7fa081e4fb5df3bf7bc85dcb9a3a9a3ac7133/torch/utils/data/datapipes/iter/combining.py#L375-L390)

mux_longest example:

>>> from torchdata.datapipes.iter import IterableWrapper
>>> dp1, dp2, dp3 = IterableWrapper(range(5)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25))
>>> list(dp1.mux_longest(dp2, dp3))
[0, 10, 20, 1, 11, 21, 2, 12, 22, 3, 13, 23, 4, 14, 24]
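
The example above uses inputs of equal length, so the skipping of exhausted inputs is not visible. A minimal plain-Python sketch of the described semantics is shown here for illustration; `mux_longest_sketch` is a hypothetical helper written for this comment, not the torchdata implementation, and it works on ordinary iterables rather than DataPipes.

```python
from typing import Any, Iterable, Iterator


def mux_longest_sketch(*iterables: Iterable[Any]) -> Iterator[Any]:
    """Round-robin over the inputs, skipping exhausted ones (hypothetical helper)."""
    # One live iterator per input, kept in input order.
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                value = next(it)
            except StopIteration:
                # This input is exhausted: drop it and continue with the rest.
                continue
            still_active.append(it)
            yield value
        # Stop once every input has been exhausted.
        iterators = still_active


# Equal-length inputs reproduce the interleaving from the example above.
equal = list(mux_longest_sketch(range(5), range(10, 15), range(20, 25)))
print(equal)

# Uneven inputs: shorter inputs drop out, longer ones keep yielding.
uneven = list(mux_longest_sketch(range(3), range(10, 11), range(20, 25)))
print(uneven)  # [0, 10, 20, 1, 21, 2, 22, 23, 24]
```

With uneven inputs, the second input drops out after the first round and the first input after the third, while the longest input continues until it is exhausted.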

Reviewed By: NivekT, ejguan

Differential Revision: D35805772

@facebook-github-bot added the CLA Signed and fb-exported labels Apr 26, 2022
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D35805772

@NivekT changed the title from "Update mux_longest data pipe" to "Adding mux_longest DataPipe" Apr 26, 2022
ninginthecloud added a commit to ninginthecloud/data that referenced this pull request Apr 27, 2022
Summary:
Pull Request resolved: pytorch#372

OSS issue discussion: pytorch#346
This diff updates the `mux_longest` DataPipe.

`mux_longest`: Yields one element at a time from each of the input Iterable DataPipes (functional name: ``mux_longest``). That is, it yields one element from the 1st input DataPipe, then one element from the 2nd DataPipe in the next iteration, and so on. It skips over DataPipes that are exhausted and ends when all input DataPipes are exhausted. This is the same behavior as the current `MultiplexerIterDataPipe` in pytorch (https://github.com/pytorch/pytorch/blob/4fb7fa081e4fb5df3bf7bc85dcb9a3a9a3ac7133/torch/utils/data/datapipes/iter/combining.py#L375-L390)

`mux_longest` example:

```
>>> from torchdata.datapipes.iter import IterableWrapper
>>> dp1, dp2, dp3 = IterableWrapper(range(5)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25))
>>> list(dp1.mux_longest(dp2, dp3))
[0, 10, 20, 1, 11, 21, 2, 12, 22, 3, 13, 23, 4, 14, 24]
```

Reviewed By: NivekT, ejguan

Differential Revision: D35805772

fbshipit-source-id: 095409427cf3714fb5f94bd99a090a6603526225

ninginthecloud added a commit to ninginthecloud/data that referenced this pull request Apr 27, 2022
Summary:
Pull Request resolved: pytorch#372
fbshipit-source-id: db629550c51a5cd9ac90ee77e9942686f995e079

ninginthecloud added a commit to ninginthecloud/data that referenced this pull request Apr 28, 2022
Summary:
Pull Request resolved: pytorch#372
fbshipit-source-id: 91d4c09fb8b956492f3322463d9b19ac40b8ad78
Summary:
Pull Request resolved: pytorch#372
fbshipit-source-id: 1d467c8fa8b6eac0d2b47a21779b73346ec07ebd