[Discussion] Is the implementation of `cycler` efficient? #742
Comments
I would support this proposal because I believe the two have different use cases. Python also has this functionality via `more_itertools.repeat_each`: https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.repeat_each By the way, which tool are you using for profiling? It looks great.
Adding an operation sounds good. I will probably also modify the docstrings to mention that the other option exists and may be more suitable for some use cases. I am using Scalene for profiling.
Closing this as #748 has landed.
TL;DR: In most cases, users might be better off using `.flatmap(lambda x: [x for _ in range(n_repeat)])` rather than `.cycle(n_repeat)`.

Here is the implementation: basically, `Cycler` reads from the source DataPipe `n` times.

Things to consider:
- The source DataPipe is read `n` times, unless you use `in_memory_cache`.
- If `shuffle` is used afterwards, I believe `.flatmap(lambda x: [x for _ in range(n_repeat)])` is strictly better than `.cycle(n_repeat)`.
- With `input = [0, 1, 2]` and `n_repeat = 2`, the major difference is that `.cycle` returns `[0, 1, 2, 0, 1, 2]`, whereas `.flatmap(...)` returns `[0, 0, 1, 1, 2, 2]`.
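To make the points above concrete, here is a minimal plain-Python sketch. It does not use torchdata; `CountingSource`, `cycle_n`, and `repeat_each_n` are hypothetical stand-ins that only mimic the behaviour described in this issue (re-reading the source `n` times versus emitting each item `n` times in a single pass).

```python
from itertools import chain, repeat


class CountingSource:
    """Hypothetical stand-in for a source DataPipe; counts items it yields."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0

    def __iter__(self):
        for x in self.data:
            self.reads += 1  # each yielded item is one (possibly expensive) read
            yield x


def cycle_n(source, n):
    # Mimics the described Cycler behaviour: iterate the whole source n times.
    for _ in range(n):
        yield from source


def repeat_each_n(source, n):
    # Mimics .flatmap(lambda x: [x for _ in range(n)]): one pass over the
    # source, each item emitted n times in a row.
    return chain.from_iterable(repeat(x, n) for x in source)


src_a = CountingSource([0, 1, 2])
cycled = list(cycle_n(src_a, 2))

src_b = CountingSource([0, 1, 2])
repeated = list(repeat_each_n(src_b, 2))

print(cycled, src_a.reads)    # [0, 1, 2, 0, 1, 2] 6
print(repeated, src_b.reads)  # [0, 0, 1, 1, 2, 2] 3

# Ignoring order (roughly what a full shuffle afterwards would do),
# both outputs contain the same items.
print(sorted(cycled) == sorted(repeated))  # True
```

In this sketch the repeat-each variant reads the source 3 times instead of 6, and the two outputs differ only in ordering.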
Questions:
- Should we add `.repeat()`, which basically does `.flatmap(lambda x: [x for _ in range(n_repeat)])`?
- Should users be advised to use `.flatmap(...)` instead, unless they specifically want the ordering `[0, 1, 2, 0, 1, 2]`?

@VitalyFedyunin @ejguan Let me know what you think.