[Feature] Pass replay buffers to SyncDataCollector #2384

Merged: 14 commits, Aug 13, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: d4949410af9604e64c4d179608ebec7377710758
Pull Request resolved: #2384

pytorch-bot bot commented Aug 9, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2384

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 21 Unrelated Failures

As of commit e917d9b with merge base 2b975da (image):

NEW FAILURE - The following job has failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 9, 2024
@vmoens
Contributor Author

vmoens commented Aug 9, 2024

Test:

import time
import tempfile
from torchrl.collectors import SyncDataCollector, RandomPolicy
from torchrl.envs import GymEnv, ParallelEnv
from torchrl.data import ReplayBuffer, LazyMemmapStorage

if __name__ == "__main__":
    # Replay buffer passed to the collector, which fills it during collection.
    with tempfile.TemporaryDirectory() as tmpdir:
        env = ParallelEnv(8, lambda: GymEnv("CartPole-v1"), mp_start_method="fork")
        env.reset()
        rb = ReplayBuffer(storage=LazyMemmapStorage(1000, scratch_dir=tmpdir, ndim=2), batch_size=100)
        collector = SyncDataCollector(env, RandomPolicy(env.action_spec), replay_buffer=rb, total_frames=20_000, frames_per_batch=100)

        for i, c in enumerate(collector):
            if i == 1:
                # Start the clock on the second iteration so warm-up is not timed.
                t0 = time.time()
            # Nothing is yielded: the data is already in the replay buffer.
            assert c is None
            rb.sample()
        print(f"{(collector.total_frames-collector.frames_per_batch) / (time.time() - t0):4.4f}fps")
        collector.shutdown()
        if not env.is_closed:
            env.close()
        del collector, env

    # Baseline: the collector yields each batch and the caller extends the buffer.
    with tempfile.TemporaryDirectory() as tmpdir:
        env = ParallelEnv(8, lambda: GymEnv("CartPole-v1"), mp_start_method="fork")
        env.reset()
        rb = ReplayBuffer(storage=LazyMemmapStorage(1000, scratch_dir=tmpdir, ndim=2), batch_size=100)
        collector = SyncDataCollector(env, RandomPolicy(env.action_spec), total_frames=20_000, frames_per_batch=100)

        for i, c in enumerate(collector):
            if i == 1:
                # Start the clock on the second iteration so warm-up is not timed.
                t0 = time.time()
            # Write the yielded batch into the buffer by hand.
            rb.extend(c)
            rb.sample()
        print(f"{(collector.total_frames-collector.frames_per_batch) / (time.time() - t0):4.4f}fps")
        collector.shutdown()
        if not env.is_closed:
            env.close()
        del collector, env

I get 4700 fps with the replay buffer passed to the collector and 4800 fps with the collector and replay buffer kept separate. RAM usage of the first option will be lower, though.
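
The two loops in the test differ only in who writes into the buffer. A framework-free sketch of that contract follows; `ToyBuffer` and `toy_collector` are hypothetical stand-ins invented for illustration, not TorchRL classes, and the sketch says nothing about how `SyncDataCollector` is implemented internally.

```python
import random
from typing import Iterator, List, Optional


class ToyBuffer:
    """Fixed-capacity ring buffer standing in for a replay buffer (hypothetical)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: List[int] = []
        self._cursor = 0  # next slot to write once the buffer is full

    def extend(self, batch: List[int]) -> None:
        for item in batch:
            if len(self.items) < self.capacity:
                self.items.append(item)
            else:
                self.items[self._cursor] = item  # overwrite the oldest entry
            self._cursor = (self._cursor + 1) % self.capacity

    def sample(self, batch_size: int) -> List[int]:
        return random.choices(self.items, k=batch_size)


def toy_collector(
    total_frames: int,
    frames_per_batch: int,
    buffer: Optional[ToyBuffer] = None,
) -> Iterator[Optional[List[int]]]:
    """Yield batches of fake frames, or None when a buffer is supplied."""
    for start in range(0, total_frames, frames_per_batch):
        batch = list(range(start, start + frames_per_batch))
        if buffer is not None:
            buffer.extend(batch)  # the collector owns the write
            yield None
        else:
            yield batch  # the caller owns the write


# Buffer handed to the collector: nothing is yielded, sampling still works.
rb_inside = ToyBuffer(1000)
for out in toy_collector(20_000, 100, buffer=rb_inside):
    assert out is None
    rb_inside.sample(100)

# Collector and buffer kept separate: the caller extends the buffer.
rb_outside = ToyBuffer(1000)
for out in toy_collector(20_000, 100):
    rb_outside.extend(out)
    rb_outside.sample(100)
```

Both variants end with the same buffer contents; the difference is whether the caller ever holds a batch of its own.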

github-actions bot commented Aug 9, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_single 62.5186ms 60.9024ms 16.4197 Ops/s 16.9765 Ops/s $\color{#d91a1a}-3.28\%$
test_sync 41.6958ms 34.8739ms 28.6748 Ops/s 30.2512 Ops/s $\textbf{\color{#d91a1a}-5.21\%}$
test_async 56.4582ms 31.6238ms 31.6217 Ops/s 31.6605 Ops/s $\color{#d91a1a}-0.12\%$
test_simple 0.5079s 0.4280s 2.3363 Ops/s 2.3291 Ops/s $\color{#35bf28}+0.31\%$
test_transformed 0.6482s 0.5833s 1.7145 Ops/s 1.7203 Ops/s $\color{#d91a1a}-0.34\%$
test_serial 1.3482s 1.2664s 0.7896 Ops/s 0.7906 Ops/s $\color{#d91a1a}-0.13\%$
test_parallel 1.2132s 1.1373s 0.8793 Ops/s 0.8883 Ops/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[True-True-True-True-True] 0.1533ms 24.8888μs 40.1788 KOps/s 39.7002 KOps/s $\color{#35bf28}+1.21\%$
test_step_mdp_speed[True-True-True-True-False] 65.4220μs 14.3886μs 69.4996 KOps/s 69.5977 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-True-True-False-True] 58.1380μs 14.3609μs 69.6337 KOps/s 68.5476 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[True-True-True-False-False] 73.9580μs 8.2323μs 121.4722 KOps/s 121.4476 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[True-True-False-True-True] 59.4410μs 26.4662μs 37.7841 KOps/s 37.5699 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[True-True-False-True-False] 48.9310μs 15.9177μs 62.8232 KOps/s 63.0319 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-True-False-False-True] 59.6010μs 15.8985μs 62.8989 KOps/s 62.9777 KOps/s $\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-True-False-False-False] 39.1030μs 9.8158μs 101.8765 KOps/s 102.5355 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-True-True-True] 85.8440μs 28.4376μs 35.1647 KOps/s 34.9470 KOps/s $\color{#35bf28}+0.62\%$
test_step_mdp_speed[True-False-True-True-False] 76.4220μs 17.5615μs 56.9428 KOps/s 57.2959 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[True-False-True-False-True] 51.7470μs 16.0235μs 62.4083 KOps/s 62.6481 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[True-False-True-False-False] 33.5830μs 9.9267μs 100.7386 KOps/s 100.7535 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[True-False-False-True-True] 93.7460μs 29.8644μs 33.4847 KOps/s 33.4685 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[True-False-False-True-False] 74.6690μs 19.0064μs 52.6139 KOps/s 52.6725 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[True-False-False-False-True] 72.1040μs 17.3879μs 57.5113 KOps/s 57.1287 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-False-False-False] 64.4200μs 11.3559μs 88.0601 KOps/s 88.1349 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[False-True-True-True-True] 69.2790μs 28.1947μs 35.4677 KOps/s 35.1358 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[False-True-True-True-False] 79.3080μs 17.4698μs 57.2416 KOps/s 56.7906 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-True-True-False-True] 52.0970μs 18.6332μs 53.6676 KOps/s 53.1117 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-True-True-False-False] 51.0650μs 11.0832μs 90.2266 KOps/s 90.0927 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-True-False-True-True] 69.8710μs 29.9212μs 33.4212 KOps/s 33.5655 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-True-False-True-False] 67.6160μs 18.9954μs 52.6444 KOps/s 52.9163 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-False-False-True] 77.7950μs 19.5313μs 51.1997 KOps/s 49.6615 KOps/s $\color{#35bf28}+3.10\%$
test_step_mdp_speed[False-True-False-False-False] 68.0870μs 12.3682μs 80.8528 KOps/s 80.0259 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-False-True-True-True] 3.4287ms 31.6484μs 31.5972 KOps/s 30.5036 KOps/s $\color{#35bf28}+3.58\%$
test_step_mdp_speed[False-False-True-True-False] 74.1390μs 20.4671μs 48.8589 KOps/s 48.2773 KOps/s $\color{#35bf28}+1.20\%$
test_step_mdp_speed[False-False-True-False-True] 70.4610μs 19.7928μs 50.5233 KOps/s 47.6435 KOps/s $\textbf{\color{#35bf28}+6.04\%}$
test_step_mdp_speed[False-False-True-False-False] 43.6420μs 12.4049μs 80.6130 KOps/s 79.8317 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[False-False-False-True-True] 99.0240μs 32.3465μs 30.9153 KOps/s 30.4769 KOps/s $\color{#35bf28}+1.44\%$
test_step_mdp_speed[False-False-False-True-False] 71.2420μs 21.9294μs 45.6008 KOps/s 45.3094 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[False-False-False-False-True] 75.8720μs 20.8738μs 47.9070 KOps/s 46.2766 KOps/s $\color{#35bf28}+3.52\%$
test_step_mdp_speed[False-False-False-False-False] 77.2540μs 13.8462μs 72.2218 KOps/s 71.7349 KOps/s $\color{#35bf28}+0.68\%$
test_values[generalized_advantage_estimate-True-True] 10.7101ms 9.7443ms 102.6236 Ops/s 103.2798 Ops/s $\color{#d91a1a}-0.64\%$
test_values[vec_generalized_advantage_estimate-True-True] 39.2164ms 36.2044ms 27.6209 Ops/s 27.9889 Ops/s $\color{#d91a1a}-1.31\%$
test_values[td0_return_estimate-False-False] 0.2138ms 0.1811ms 5.5213 KOps/s 5.9110 KOps/s $\textbf{\color{#d91a1a}-6.59\%}$
test_values[td1_return_estimate-False-False] 29.1629ms 24.4086ms 40.9692 Ops/s 41.3088 Ops/s $\color{#d91a1a}-0.82\%$
test_values[vec_td1_return_estimate-False-False] 38.7179ms 36.1425ms 27.6683 Ops/s 27.9579 Ops/s $\color{#d91a1a}-1.04\%$
test_values[td_lambda_return_estimate-True-False] 39.7602ms 35.2662ms 28.3557 Ops/s 28.9975 Ops/s $\color{#d91a1a}-2.21\%$
test_values[vec_td_lambda_return_estimate-True-False] 38.7522ms 36.1649ms 27.6511 Ops/s 27.9281 Ops/s $\color{#d91a1a}-0.99\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.1246ms 8.3451ms 119.8309 Ops/s 120.8908 Ops/s $\color{#d91a1a}-0.88\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.6667ms 2.0179ms 495.5568 Ops/s 508.4449 Ops/s $\color{#d91a1a}-2.53\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5187ms 0.3614ms 2.7670 KOps/s 2.7775 KOps/s $\color{#d91a1a}-0.38\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 46.3480ms 45.0103ms 22.2171 Ops/s 22.5972 Ops/s $\color{#d91a1a}-1.68\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.5400ms 3.1020ms 322.3704 Ops/s 323.2781 Ops/s $\color{#d91a1a}-0.28\%$
test_dqn_speed 7.0050ms 1.3006ms 768.8471 Ops/s 774.9026 Ops/s $\color{#d91a1a}-0.78\%$
test_ddpg_speed 3.5905ms 2.7224ms 367.3261 Ops/s 369.3114 Ops/s $\color{#d91a1a}-0.54\%$
test_sac_speed 8.7999ms 8.1440ms 122.7892 Ops/s 124.2335 Ops/s $\color{#d91a1a}-1.16\%$
test_redq_speed 15.6912ms 12.9670ms 77.1186 Ops/s 76.9316 Ops/s $\color{#35bf28}+0.24\%$
test_redq_deprec_speed 14.7597ms 13.4132ms 74.5532 Ops/s 73.9116 Ops/s $\color{#35bf28}+0.87\%$
test_td3_speed 10.8072ms 8.2791ms 120.7867 Ops/s 121.4797 Ops/s $\color{#d91a1a}-0.57\%$
test_cql_speed 38.6821ms 36.4797ms 27.4125 Ops/s 25.1806 Ops/s $\textbf{\color{#35bf28}+8.86\%}$
test_a2c_speed 9.6194ms 7.7071ms 129.7497 Ops/s 127.5641 Ops/s $\color{#35bf28}+1.71\%$
test_ppo_speed 8.7600ms 8.0730ms 123.8702 Ops/s 123.5889 Ops/s $\color{#35bf28}+0.23\%$
test_reinforce_speed 8.0524ms 6.7561ms 148.0140 Ops/s 151.1997 Ops/s $\color{#d91a1a}-2.11\%$
test_iql_speed 35.0586ms 32.7877ms 30.4993 Ops/s 30.2712 Ops/s $\color{#35bf28}+0.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.6589ms 5.0909ms 196.4285 Ops/s 192.3303 Ops/s $\color{#35bf28}+2.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8371ms 0.4912ms 2.0360 KOps/s 2.0135 KOps/s $\color{#35bf28}+1.12\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6747ms 0.4687ms 2.1337 KOps/s 2.0907 KOps/s $\color{#35bf28}+2.06\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.3755ms 4.9491ms 202.0565 Ops/s 196.9878 Ops/s $\color{#35bf28}+2.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.1216s 0.5676ms 1.7618 KOps/s 2.0440 KOps/s $\textbf{\color{#d91a1a}-13.81\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6381ms 0.4608ms 2.1700 KOps/s 2.1578 KOps/s $\color{#35bf28}+0.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3471ms 1.7005ms 588.0644 Ops/s 595.2886 Ops/s $\color{#d91a1a}-1.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.9361ms 1.6054ms 622.8949 Ops/s 627.4118 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.7123ms 5.3916ms 185.4729 Ops/s 190.0682 Ops/s $\color{#d91a1a}-2.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.6277ms 0.6411ms 1.5598 KOps/s 1.6025 KOps/s $\color{#d91a1a}-2.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7838ms 0.6117ms 1.6348 KOps/s 1.6692 KOps/s $\color{#d91a1a}-2.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3386ms 5.0591ms 197.6653 Ops/s 192.4336 Ops/s $\color{#35bf28}+2.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.4146ms 0.4993ms 2.0026 KOps/s 2.0031 KOps/s $\color{#d91a1a}-0.02\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6459ms 0.4662ms 2.1449 KOps/s 2.1302 KOps/s $\color{#35bf28}+0.69\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.7353ms 5.2208ms 191.5433 Ops/s 199.9401 Ops/s $\color{#d91a1a}-4.20\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7153ms 0.4970ms 2.0122 KOps/s 1.9348 KOps/s $\color{#35bf28}+4.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6432ms 0.4668ms 2.1424 KOps/s 2.0963 KOps/s $\color{#35bf28}+2.20\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.0165ms 5.2374ms 190.9350 Ops/s 190.1828 Ops/s $\color{#35bf28}+0.40\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.5003ms 0.6429ms 1.5555 KOps/s 1.5901 KOps/s $\color{#d91a1a}-2.17\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7394ms 0.6092ms 1.6414 KOps/s 1.6767 KOps/s $\color{#d91a1a}-2.10\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1393s 6.5725ms 152.1485 Ops/s 159.7288 Ops/s $\color{#d91a1a}-4.75\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 17.6582ms 13.1226ms 76.2041 Ops/s 76.6153 Ops/s $\color{#d91a1a}-0.54\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.9840ms 1.1721ms 853.1826 Ops/s 850.6005 Ops/s $\color{#35bf28}+0.30\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1263s 6.2940ms 158.8805 Ops/s 117.3738 Ops/s $\textbf{\color{#35bf28}+35.36\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1326s 15.7308ms 63.5694 Ops/s 75.6031 Ops/s $\textbf{\color{#d91a1a}-15.92\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.6972ms 1.1217ms 891.5206 Ops/s 881.1268 Ops/s $\color{#35bf28}+1.18\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1255s 6.4573ms 154.8635 Ops/s 155.6585 Ops/s $\color{#d91a1a}-0.51\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 17.3844ms 13.2166ms 75.6625 Ops/s 73.9693 Ops/s $\color{#35bf28}+2.29\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.8453ms 1.2817ms 780.2150 Ops/s 767.7556 Ops/s $\color{#35bf28}+1.62\%$
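
The figures in the table suggest that Ops is the reciprocal of Mean and that Change is the relative difference between Ops and Ops on Repo HEAD. This is inferred from the numbers, not taken from the benchmark tooling, and the helper names below are hypothetical. A minimal check against the `test_single` row:

```python
def ops_from_mean(mean_seconds: float) -> float:
    """Operations per second implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


def relative_change_pct(ops: float, head_ops: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - head_ops) / head_ops * 100.0


# test_single (CPU): Mean 60.9024ms, 16.4197 Ops/s here vs 16.9765 Ops/s on HEAD.
single_ops = ops_from_mean(60.9024e-3)
single_change = relative_change_pct(16.4197, 16.9765)
print(f"{single_ops:.4f} Ops/s, {single_change:+.2f}%")

# test_cql_speed (CPU): 27.4125 Ops/s here vs 25.1806 Ops/s on HEAD.
cql_change = relative_change_pct(27.4125, 25.1806)
print(f"{cql_change:+.2f}%")
```

Under this reading, a negative Change means the PR branch completed fewer operations per second than HEAD for that benchmark.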

github-actions bot commented Aug 9, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}0$. Worsened: $\large\color{#d91a1a}4$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_single 0.1077s 0.1070s 9.3461 Ops/s 9.3625 Ops/s $\color{#d91a1a}-0.18\%$
test_sync 98.9522ms 95.1135ms 10.5138 Ops/s 10.8594 Ops/s $\color{#d91a1a}-3.18\%$
test_async 0.1759s 86.5381ms 11.5556 Ops/s 11.2126 Ops/s $\color{#35bf28}+3.06\%$
test_single_pixels 0.1176s 0.1171s 8.5403 Ops/s 8.5037 Ops/s $\color{#35bf28}+0.43\%$
test_sync_pixels 76.2439ms 75.1642ms 13.3042 Ops/s 12.8778 Ops/s $\color{#35bf28}+3.31\%$
test_async_pixels 0.1478s 70.8122ms 14.1219 Ops/s 14.1064 Ops/s $\color{#35bf28}+0.11\%$
test_simple 0.7605s 0.7592s 1.3171 Ops/s 1.2815 Ops/s $\color{#35bf28}+2.78\%$
test_transformed 1.0911s 1.0202s 0.9802 Ops/s 1.0060 Ops/s $\color{#d91a1a}-2.57\%$
test_serial 2.2563s 2.1779s 0.4592 Ops/s 0.4653 Ops/s $\color{#d91a1a}-1.31\%$
test_parallel 1.9512s 1.8862s 0.5302 Ops/s 0.5252 Ops/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-True-True-True-True] 0.1065ms 38.7312μs 25.8190 KOps/s 26.4738 KOps/s $\color{#d91a1a}-2.47\%$
test_step_mdp_speed[True-True-True-True-False] 51.3010μs 21.7543μs 45.9679 KOps/s 46.8031 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[True-True-True-False-True] 39.2510μs 21.5599μs 46.3823 KOps/s 46.6728 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[True-True-True-False-False] 36.5500μs 12.1013μs 82.6359 KOps/s 83.4489 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-True-False-True-True] 61.1310μs 41.0172μs 24.3800 KOps/s 25.0480 KOps/s $\color{#d91a1a}-2.67\%$
test_step_mdp_speed[True-True-False-True-False] 46.8310μs 23.7766μs 42.0582 KOps/s 42.8363 KOps/s $\color{#d91a1a}-1.82\%$
test_step_mdp_speed[True-True-False-False-True] 49.4510μs 23.7667μs 42.0757 KOps/s 42.8546 KOps/s $\color{#d91a1a}-1.82\%$
test_step_mdp_speed[True-True-False-False-False] 39.6010μs 14.4020μs 69.4346 KOps/s 69.6378 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-False-True-True-True] 63.1410μs 43.4102μs 23.0361 KOps/s 23.9222 KOps/s $\color{#d91a1a}-3.70\%$
test_step_mdp_speed[True-False-True-True-False] 44.3110μs 26.2143μs 38.1471 KOps/s 38.7954 KOps/s $\color{#d91a1a}-1.67\%$
test_step_mdp_speed[True-False-True-False-True] 73.0610μs 23.8158μs 41.9890 KOps/s 42.2464 KOps/s $\color{#d91a1a}-0.61\%$
test_step_mdp_speed[True-False-True-False-False] 37.3700μs 14.5429μs 68.7620 KOps/s 69.8257 KOps/s $\color{#d91a1a}-1.52\%$
test_step_mdp_speed[True-False-False-True-True] 67.4110μs 45.2871μs 22.0813 KOps/s 22.2331 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[True-False-False-True-False] 48.3400μs 28.4989μs 35.0891 KOps/s 35.7934 KOps/s $\color{#d91a1a}-1.97\%$
test_step_mdp_speed[True-False-False-False-True] 42.5510μs 26.0760μs 38.3494 KOps/s 38.4925 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[True-False-False-False-False] 38.9410μs 16.7283μs 59.7788 KOps/s 60.6036 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[False-True-True-True-True] 80.0610μs 43.3214μs 23.0833 KOps/s 23.6125 KOps/s $\color{#d91a1a}-2.24\%$
test_step_mdp_speed[False-True-True-True-False] 46.0010μs 26.0822μs 38.3404 KOps/s 38.6217 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[False-True-True-False-True] 51.9310μs 28.4978μs 35.0904 KOps/s 35.5543 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-True-True-False-False] 34.4300μs 16.8090μs 59.4920 KOps/s 61.9044 KOps/s $\color{#d91a1a}-3.90\%$
test_step_mdp_speed[False-True-False-True-True] 80.0010μs 45.7090μs 21.8775 KOps/s 22.7984 KOps/s $\color{#d91a1a}-4.04\%$
test_step_mdp_speed[False-True-False-True-False] 63.2110μs 28.5919μs 34.9749 KOps/s 35.8560 KOps/s $\color{#d91a1a}-2.46\%$
test_step_mdp_speed[False-True-False-False-True] 57.1800μs 30.5958μs 32.6842 KOps/s 33.6266 KOps/s $\color{#d91a1a}-2.80\%$
test_step_mdp_speed[False-True-False-False-False] 41.7100μs 18.9845μs 52.6746 KOps/s 53.8788 KOps/s $\color{#d91a1a}-2.24\%$
test_step_mdp_speed[False-False-True-True-True] 3.9473ms 48.4431μs 20.6428 KOps/s 21.2787 KOps/s $\color{#d91a1a}-2.99\%$
test_step_mdp_speed[False-False-True-True-False] 63.0300μs 30.8519μs 32.4129 KOps/s 32.7231 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-False-True-False-True] 56.3110μs 30.6154μs 32.6633 KOps/s 33.4772 KOps/s $\color{#d91a1a}-2.43\%$
test_step_mdp_speed[False-False-True-False-False] 44.0800μs 19.2414μs 51.9712 KOps/s 53.1553 KOps/s $\color{#d91a1a}-2.23\%$
test_step_mdp_speed[False-False-False-True-True] 75.3110μs 49.4812μs 20.2097 KOps/s 20.5773 KOps/s $\color{#d91a1a}-1.79\%$
test_step_mdp_speed[False-False-False-True-False] 55.0910μs 33.3213μs 30.0108 KOps/s 30.6319 KOps/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[False-False-False-False-True] 51.2020μs 32.9885μs 30.3136 KOps/s 31.6413 KOps/s $\color{#d91a1a}-4.20\%$
test_step_mdp_speed[False-False-False-False-False] 37.7200μs 21.0361μs 47.5374 KOps/s 47.9975 KOps/s $\color{#d91a1a}-0.96\%$
test_values[generalized_advantage_estimate-True-True] 25.3269ms 24.8158ms 40.2968 Ops/s 40.4718 Ops/s $\color{#d91a1a}-0.43\%$
test_values[vec_generalized_advantage_estimate-True-True] 95.4059ms 2.8095ms 355.9384 Ops/s 372.1798 Ops/s $\color{#d91a1a}-4.36\%$
test_values[td0_return_estimate-False-False] 90.6620μs 66.4143μs 15.0570 KOps/s 15.0288 KOps/s $\color{#35bf28}+0.19\%$
test_values[td1_return_estimate-False-False] 55.8183ms 55.0422ms 18.1679 Ops/s 17.8553 Ops/s $\color{#35bf28}+1.75\%$
test_values[vec_td1_return_estimate-False-False] 1.3100ms 1.0886ms 918.6111 Ops/s 911.5455 Ops/s $\color{#35bf28}+0.78\%$
test_values[td_lambda_return_estimate-True-False] 88.3793ms 87.9032ms 11.3762 Ops/s 11.2769 Ops/s $\color{#35bf28}+0.88\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3756ms 1.0912ms 916.4495 Ops/s 914.8911 Ops/s $\color{#35bf28}+0.17\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.9450ms 24.6135ms 40.6282 Ops/s 40.3879 Ops/s $\color{#35bf28}+0.59\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9655ms 0.7264ms 1.3767 KOps/s 1.3525 KOps/s $\color{#35bf28}+1.79\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7366ms 0.6744ms 1.4828 KOps/s 1.4705 KOps/s $\color{#35bf28}+0.84\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5284ms 1.4764ms 677.3317 Ops/s 675.6364 Ops/s $\color{#35bf28}+0.25\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7498ms 0.6890ms 1.4513 KOps/s 1.4180 KOps/s $\color{#35bf28}+2.35\%$
test_dqn_speed 7.2020ms 1.3958ms 716.4278 Ops/s 730.2958 Ops/s $\color{#d91a1a}-1.90\%$
test_ddpg_speed 3.0417ms 2.7874ms 358.7584 Ops/s 363.1098 Ops/s $\color{#d91a1a}-1.20\%$
test_sac_speed 0.1010s 8.7937ms 113.7183 Ops/s 125.6362 Ops/s $\textbf{\color{#d91a1a}-9.49\%}$
test_redq_speed 11.5049ms 10.2209ms 97.8386 Ops/s 99.3357 Ops/s $\color{#d91a1a}-1.51\%$
test_redq_deprec_speed 11.3640ms 11.0668ms 90.3605 Ops/s 91.0146 Ops/s $\color{#d91a1a}-0.72\%$
test_td3_speed 8.3707ms 7.9901ms 125.1546 Ops/s 127.5516 Ops/s $\color{#d91a1a}-1.88\%$
test_cql_speed 26.5539ms 25.3258ms 39.4854 Ops/s 39.8755 Ops/s $\color{#d91a1a}-0.98\%$
test_a2c_speed 5.6563ms 5.4416ms 183.7698 Ops/s 180.5604 Ops/s $\color{#35bf28}+1.78\%$
test_ppo_speed 6.5577ms 5.7975ms 172.4871 Ops/s 170.7528 Ops/s $\color{#35bf28}+1.02\%$
test_reinforce_speed 4.7671ms 4.4903ms 222.7014 Ops/s 224.0064 Ops/s $\color{#d91a1a}-0.58\%$
test_iql_speed 20.2792ms 19.4934ms 51.2993 Ops/s 51.9445 Ops/s $\color{#d91a1a}-1.24\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.9100ms 6.7151ms 148.9191 Ops/s 149.9944 Ops/s $\color{#d91a1a}-0.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1095ms 0.5210ms 1.9193 KOps/s 1.9346 KOps/s $\color{#d91a1a}-0.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6699ms 0.4981ms 2.0077 KOps/s 2.0176 KOps/s $\color{#d91a1a}-0.49\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.9282ms 6.5847ms 151.8666 Ops/s 151.5296 Ops/s $\color{#35bf28}+0.22\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8500ms 0.5118ms 1.9537 KOps/s 1.9612 KOps/s $\color{#d91a1a}-0.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6773ms 0.4923ms 2.0315 KOps/s 2.0457 KOps/s $\color{#d91a1a}-0.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2301ms 2.0059ms 498.5335 Ops/s 504.7533 Ops/s $\color{#d91a1a}-1.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.0424ms 1.8766ms 532.8846 Ops/s 517.4919 Ops/s $\color{#35bf28}+2.97\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.9309ms 6.7742ms 147.6192 Ops/s 146.8710 Ops/s $\color{#35bf28}+0.51\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.1441s 0.8047ms 1.2427 KOps/s 1.4972 KOps/s $\textbf{\color{#d91a1a}-17.00\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8224ms 0.6485ms 1.5420 KOps/s 1.5517 KOps/s $\color{#d91a1a}-0.63\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8633ms 6.6973ms 149.3134 Ops/s 149.8483 Ops/s $\color{#d91a1a}-0.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2152ms 0.5211ms 1.9189 KOps/s 1.9215 KOps/s $\color{#d91a1a}-0.14\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6957ms 0.4951ms 2.0196 KOps/s 2.0134 KOps/s $\color{#35bf28}+0.31\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.9107ms 6.5727ms 152.1439 Ops/s 151.2116 Ops/s $\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.1258s 0.6881ms 1.4532 KOps/s 1.9443 KOps/s $\textbf{\color{#d91a1a}-25.26\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6515ms 0.4908ms 2.0374 KOps/s 2.0085 KOps/s $\color{#35bf28}+1.44\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.0387ms 6.8489ms 146.0080 Ops/s 146.2359 Ops/s $\color{#d91a1a}-0.16\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7656ms 0.6718ms 1.4886 KOps/s 1.5045 KOps/s $\color{#d91a1a}-1.05\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 4.4065ms 0.6552ms 1.5263 KOps/s 1.5575 KOps/s $\color{#d91a1a}-2.00\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1315s 7.7320ms 129.3321 Ops/s 127.6578 Ops/s $\color{#35bf28}+1.31\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 19.0373ms 16.4593ms 60.7560 Ops/s 62.2541 Ops/s $\color{#d91a1a}-2.41\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.4510ms 1.3027ms 767.6133 Ops/s 775.0076 Ops/s $\color{#d91a1a}-0.95\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1266s 10.0863ms 99.1444 Ops/s 99.2870 Ops/s $\color{#d91a1a}-0.14\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 19.2160ms 16.2911ms 61.3832 Ops/s 61.5192 Ops/s $\color{#d91a1a}-0.22\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.4095ms 1.2917ms 774.1564 Ops/s 801.6241 Ops/s $\color{#d91a1a}-3.43\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1263s 7.8394ms 127.5611 Ops/s 128.5814 Ops/s $\color{#d91a1a}-0.79\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 19.6366ms 16.8408ms 59.3797 Ops/s 61.4922 Ops/s $\color{#d91a1a}-3.44\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.6283ms 1.5697ms 637.0658 Ops/s 697.6879 Ops/s $\textbf{\color{#d91a1a}-8.69\%}$
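
Because the results are posted as space-separated rows with the change wrapped in LaTeX color markup, it can help to parse them before comparing runs. The sketch below is one way to do that, assuming the column order Name, Max, Mean, Ops, Ops on Repo HEAD, Change. `parse_row` is a hypothetical helper, and the 5% cutoff for "significant" is an assumption that merely matches which rows appear in bold here.

```python
import re
from typing import Dict, Union

# Unit scales; both the Greek mu and the micro sign are accepted.
_TIME_SCALE = {"s": 1.0, "ms": 1e-3, "μs": 1e-6, "µs": 1e-6}
_OPS_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}

_ROW = re.compile(
    r"^(?P<name>\S+)\s+"
    r"(?P<max>[\d.]+)(?P<max_unit>[mμµ]?s)\s+"
    r"(?P<mean>[\d.]+)(?P<mean_unit>[mμµ]?s)\s+"
    r"(?P<ops>[\d.]+)\s(?P<ops_unit>K?Ops/s)\s+"
    r"(?P<head>[\d.]+)\s(?P<head_unit>K?Ops/s)\s+"
    r".*?(?P<change>[+-]\d+(?:\.\d+)?)\\%"
)


def parse_row(line: str, threshold_pct: float = 5.0) -> Dict[str, Union[str, float, bool]]:
    """Parse one benchmark row into seconds, ops/s and percent change."""
    match = _ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized benchmark row: {line!r}")
    change = float(match["change"])
    return {
        "name": match["name"],
        "max_s": float(match["max"]) * _TIME_SCALE[match["max_unit"]],
        "mean_s": float(match["mean"]) * _TIME_SCALE[match["mean_unit"]],
        "ops": float(match["ops"]) * _OPS_SCALE[match["ops_unit"]],
        "head_ops": float(match["head"]) * _OPS_SCALE[match["head_unit"]],
        "change_pct": change,
        # Assumed cutoff, chosen only because it matches the bold rows.
        "significant": abs(change) >= threshold_pct,
    }


sac = parse_row(
    r"test_sac_speed 0.1010s 8.7937ms 113.7183 Ops/s 125.6362 Ops/s "
    r"$\textbf{\color{#d91a1a}-9.49\%}$"
)
td0 = parse_row(
    r"test_values[td0_return_estimate-False-False] 90.6620μs 66.4143μs "
    r"15.0570 KOps/s 15.0288 KOps/s $\color{#35bf28}+0.19\%$"
)
print(sac["name"], sac["change_pct"], sac["significant"])
print(td0["name"], td0["change_pct"], td0["significant"])
```

Normalizing times to seconds and throughput to ops/s makes rows reported in different units (ms vs μs, Ops/s vs KOps/s) directly comparable.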

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 07885ddb54263059daa4e65f3e83fd3668500b9e
Pull Request resolved: #2384
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 10, 2024
ghstack-source-id: 1ced2f68e4026e36a395b43c560fd45391f4587a
Pull Request resolved: #2384
@vmoens vmoens added the enhancement New feature or request label Aug 10, 2024
vmoens added 3 commits August 10, 2024 14:26
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@vmoens vmoens mentioned this pull request Aug 10, 2024
vmoens added 8 commits August 11, 2024 11:44
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@vmoens vmoens merged commit e917d9b into gh/vmoens/13/base Aug 13, 2024
9 of 14 checks passed
vmoens added a commit that referenced this pull request Aug 13, 2024
ghstack-source-id: 452d429b153284ebc06e89225eed0f6a7b6ad37b
Pull Request resolved: #2384
@vmoens vmoens deleted the gh/vmoens/13/head branch August 13, 2024 19:58