[DataLoader2] Saving and restoring initial seed generator #998
Commits on Feb 8, 2023
[DataLoader2] Saving and restoring initial seed generator
[ghstack-poisoned]
Commit 05c0bf1
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 45b6998
Commits on Feb 9, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
Changes to `DataLoader2`:
- Modifying `state_dict` to store the `initial_seed_generator` that is saved at the beginning of an epoch.
- Modifying `from_state` and `load_state_dict` to restore `initial_seed_generator` if the user sets the parameter to `True`.
- Within `__iter__`, skipping the re-seeding process if no manual seed has been specified AND the `seed_generator` was explicitly restored.

---

### Consideration

I decided to modify the existing APIs. Alternatively, we can create a new method. The basic idea is that we want to allow users to restore `dl2._seed_generator` to the previously saved version; at the same time, we need to skip the logic that redoes seeding in `__iter__` (hence the new variable `_skip_iteration_seeding` is needed).

I see 2 main scenarios:
1. Users want to restore DataPipe and ReadingService but not the initial state of RNG
   - I think lots of current users (including some internals) are in this category.
   - This should work by default because `restore_initial_seed_generator=False` unless the user explicitly changes it.
2. Users actively want to restore DP, RS, and initial state of RNG
   - Users will need to set an extra variable to `True` and we will make sure `_skip_iteration_seeding=True` so no re-seeding will happen in the first subsequent call of `__iter__`.

Finally, if users change their mind at any point (after restoring) and want to manually set `seed`, that `seed` will override any other behavior and will be used.

[ghstack-poisoned]
Commit 9d6f38d
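The restore-and-skip behavior described in this commit message can be sketched with a toy class. This is not the torchdata implementation: `ToyLoader` is a hypothetical stand-in, `random.Random` substitutes for the real seed generator, and the re-seeding rule inside `__iter__` is an assumption made only so the example runs. The names `restore_initial_seed_generator` and `_skip_iteration_seeding` are taken from the description.

```python
import copy
import random


class ToyLoader:
    """Hypothetical stand-in for DataLoader2; models only the seeding logic."""

    def __init__(self):
        self._seed = None  # manual seed requested by the user, if any
        self._seed_generator = random.Random(0)  # stand-in for the real seed generator
        self._initial_seed_generator = None  # snapshot taken at the start of an epoch
        self._skip_iteration_seeding = False

    def seed(self, seed):
        self._seed = seed

    def __iter__(self):
        if self._seed is not None:
            # A manual seed overrides everything else, even after a restore.
            self._seed_generator = random.Random(self._seed)
            self._seed = None
        elif not self._skip_iteration_seeding:
            # Default path (assumed rule): derive a fresh generator for the new epoch.
            self._seed_generator = random.Random(self._seed_generator.getrandbits(64))
        # Only the first __iter__ after a restore skips re-seeding.
        self._skip_iteration_seeding = False
        self._initial_seed_generator = copy.deepcopy(self._seed_generator)
        order = list(range(5))
        self._seed_generator.shuffle(order)
        return iter(order)

    def state_dict(self):
        return {"initial_seed_generator": copy.deepcopy(self._initial_seed_generator)}

    def load_state_dict(self, state, restore_initial_seed_generator=False):
        if restore_initial_seed_generator:
            self._seed_generator = copy.deepcopy(state["initial_seed_generator"])
            self._skip_iteration_seeding = True


dl = ToyLoader()
first_epoch = list(dl)
state = dl.state_dict()

resumed = ToyLoader()
resumed.load_state_dict(state, restore_initial_seed_generator=True)
replayed = list(resumed)  # replays the checkpointed epoch
```

With the flag left at its default of `False`, `load_state_dict` in this sketch leaves the generator untouched, which corresponds to scenario 1.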
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 389e567
Commits on Feb 10, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 955e412
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 90278bf
Commits on Feb 13, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 1a8ebdd
Commits on Feb 15, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit fa1f93f
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 4653af3
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 15f774e
Commits on Feb 16, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 5de743f
Commits on Feb 28, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 18a5c26
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 9f87c01
Commits on Mar 17, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
Changes to `DataLoader2`:
- Modifying `state_dict` to store `randomness_state`, which includes:
  - `_seed: int`
  - `_reset_seed: bool` - flag indicating whether `_seed` needs to be set
  - `_seed_generator` - the latest version at the time when `state_dict` is called
  - `_initial_seed_generator` - the version that is saved at the beginning of every epoch
- Modifying `from_state` and `load_state_dict` to restore `randomness_state`
- Adding a method `_restore_checkpoint_beginning_of_epoch`
  - This sets `self._seed_generator = self._initial_seed_generator`, allowing users to re-create an epoch from the beginning.

---

### Considerations

Storing the randomness states provides more flexibility for users to restore as they see fit. The decision to do that should not be controversial.

I decided to add a new method for checkpointing at the beginning of the epoch, to ensure that users are not confused about what randomness is restored by default. The basic idea is that we want to allow users to restore `dl2._seed_generator` to the previously saved version. From that point on, they can create a new `__iter__` and continue from the beginning of the epoch.
- Note that since `_seed` and `_reset_seed` are also saved, if the users were planning to use a different seed or if there was a need to re-seed, those remain valid after restoring the checkpoint.
- Finally, if users change their mind at any point (after restoring) and want to manually set `seed`, that `seed` will override any other behavior and will be used.

[ghstack-poisoned]
Commit 3cdd2ea
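A minimal sketch of the revised design, under stated assumptions: `ToyLoader2` is a hypothetical stand-in for `DataLoader2`, `random.Random` substitutes for the real seed generator, and the seeding rule in `__iter__` is invented for illustration. Only the four `randomness_state` fields and the assignment inside `_restore_checkpoint_beginning_of_epoch` come from the description above.

```python
import copy
import random


class ToyLoader2:
    """Hypothetical stand-in for DataLoader2; models only the randomness state."""

    def __init__(self):
        self._seed = None
        self._reset_seed = True  # whether `_seed` still needs to be applied
        self._seed_generator = random.Random(0)  # stand-in for the real seed generator
        self._initial_seed_generator = copy.deepcopy(self._seed_generator)

    def seed(self, seed):
        self._seed = seed
        self._reset_seed = True

    def __iter__(self):
        if self._seed is not None and self._reset_seed:
            # Apply a pending manual seed (assumed rule for this sketch).
            self._seed_generator = random.Random(self._seed)
        self._reset_seed = False
        # Snapshot the generator before the epoch consumes any randomness.
        self._initial_seed_generator = copy.deepcopy(self._seed_generator)
        order = list(range(5))
        self._seed_generator.shuffle(order)
        return iter(order)

    def state_dict(self):
        return {
            "randomness_state": {
                "_seed": self._seed,
                "_reset_seed": self._reset_seed,
                "_seed_generator": copy.deepcopy(self._seed_generator),
                "_initial_seed_generator": copy.deepcopy(self._initial_seed_generator),
            }
        }

    def load_state_dict(self, state):
        rs = state["randomness_state"]
        self._seed = rs["_seed"]
        self._reset_seed = rs["_reset_seed"]
        self._seed_generator = copy.deepcopy(rs["_seed_generator"])
        self._initial_seed_generator = copy.deepcopy(rs["_initial_seed_generator"])

    def _restore_checkpoint_beginning_of_epoch(self):
        # Rewind to the snapshot taken at the start of the checkpointed epoch.
        self._seed_generator = self._initial_seed_generator


dl = ToyLoader2()
dl.seed(7)
epoch1 = list(dl)
epoch2 = list(dl)
state = dl.state_dict()  # taken after the second epoch

replay = ToyLoader2()
replay.load_state_dict(state)
replay._restore_checkpoint_beginning_of_epoch()
replayed_epoch2 = list(replay)  # re-creates the second epoch

resume = ToyLoader2()
resume.load_state_dict(state)
continued = list(resume)  # continues with the latest generator instead
```

Keeping the rewind in a separate method means a plain `load_state_dict` in this sketch always continues from the latest generator, and replaying an epoch is an explicit opt-in.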
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit ef850ed
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit b715066
Commits on Mar 24, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 5e56222
Update on "[DataLoader2] Saving and restoring initial seed generator"
Differential Revision: [D44390519](https://our.internmc.facebook.com/intern/diff/D44390519)
[ghstack-poisoned]
Commit 0433509
Commits on Mar 27, 2023
Update on "[DataLoader2] Saving and restoring initial seed generator"
[ghstack-poisoned]
Commit 0d4854c