[Feature] Pass replay buffers to SyncDataCollector #2384

vmoens · 2024-08-09T15:08:24Z

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

ghstack-source-id: d4949410af9604e64c4d179608ebec7377710758 Pull Request resolved: #2384

pytorch-bot · 2024-08-09T15:08:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2384

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 21 Unrelated Failures

As of commit e917d9b with merge base 2b975da:

NEW FAILURE - The following job has failed:

Unit-tests on Windows / unittests-cpu / windows-job (gh)
The process 'C:\Program Files\Git\cmd\git.exe' failed with exit code 128

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cpu (gh) (detected as infra flaky with no log or failing log classifier)
Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda12_4 (gh) (similar failure)
Unable to find any artifacts for the associated workflow
Examples Tests on Linux / tests (3.9, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Habitat Tests on Linux / tests (3.9, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Libs Tests on Linux / unittests-gym (3.9, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Libs Tests on Linux / unittests-sklearn (3.9, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Lint / c-source / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Lint / python-source-and-configs / linux-job (gh) (similar failure)
RLHF Tests on Linux / unittests (3.9, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-cpu (3.10) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-cpu (3.11) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-cpu (3.12) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-cpu (3.8) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-cpu (3.9) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-cpu-oldget (3.12) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-gpu (3.11, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-olddeps (3.8, 11.6) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-optdeps (3.10, 12.1) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128
Unit-tests on Linux / tests-stable-gpu (3.10, 11.8) / linux-job (gh) (matched linux rule in flaky-rules.json)
The process '/usr/bin/git' failed with exit code 128

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda11_8 (gh) (trunk failure)
Unable to find any artifacts for the associated workflow
Build Windows Wheels / pytorch/rl (pytorch/rl, python packaging/wheel/relocate.py, test/smoke_test.py, torchrl) / upload / wheel-py3_9-cuda12_1 (gh) (trunk failure)
Unable to find any artifacts for the associated workflow

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens · 2024-08-09T15:08:44Z

Test:

import time
import tempfile
from torchrl.collectors import SyncDataCollector, RandomPolicy
from torchrl.envs import GymEnv, ParallelEnv
from torchrl.data import ReplayBuffer, LazyMemmapStorage

if __name__ == "__main__":
    # Case 1: the replay buffer is passed to the collector, which fills it directly.
    with tempfile.TemporaryDirectory() as tmpdir:
        env = ParallelEnv(8, lambda: GymEnv("CartPole-v1"), mp_start_method="fork")
        env.reset()
        rb = ReplayBuffer(storage=LazyMemmapStorage(1000, scratch_dir=tmpdir, ndim=2), batch_size=100)
        collector = SyncDataCollector(env, RandomPolicy(env.action_spec), replay_buffer=rb, total_frames=20_000, frames_per_batch=100)

        for i, c in enumerate(collector):
            if i == 1:
                # Start timing at the second batch so the first one acts as warm-up.
                t0 = time.time()
            # With a replay buffer attached, the collector yields None.
            assert c is None
            rb.sample()
        # The first batch is excluded from the frame count to match the timer.
        print(f"{(collector.total_frames-collector.frames_per_batch) / (time.time() - t0):4.4f}fps")
        collector.shutdown()
        if not env.is_closed:
            env.close()
        del collector, env

    # Case 2: the collector yields batches and the caller extends the buffer.
    with tempfile.TemporaryDirectory() as tmpdir:
        env = ParallelEnv(8, lambda: GymEnv("CartPole-v1"), mp_start_method="fork")
        env.reset()
        rb = ReplayBuffer(storage=LazyMemmapStorage(1000, scratch_dir=tmpdir, ndim=2), batch_size=100)
        collector = SyncDataCollector(env, RandomPolicy(env.action_spec), total_frames=20_000, frames_per_batch=100)

        for i, c in enumerate(collector):
            if i == 1:
                t0 = time.time()
            rb.extend(c)
            rb.sample()
        print(f"{(collector.total_frames-collector.frames_per_batch) / (time.time() - t0):4.4f}fps")
        collector.shutdown()
        if not env.is_closed:
            env.close()
        del collector, env

I get about 4700 fps when the replay buffer is passed to the collector and about 4800 fps when the collector and the replay buffer are kept separate. RAM usage will be lower with the first option, though.
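The two patterns compared in the test can be illustrated without torchrl. The sketch below is a toy stand-in, not the torchrl implementation: `ToyBuffer` and `toy_collector` are hypothetical names invented for this illustration. It only shows the control flow: when a buffer is passed in, the collector writes to it and yields `None`; otherwise it yields the batch and the caller extends the buffer.

```python
from typing import Iterator, List, Optional


class ToyBuffer:
    """Hypothetical fixed-capacity ring buffer standing in for a replay buffer."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: List[int] = []
        self.cursor = 0

    def extend(self, batch: List[int]) -> None:
        # Append until full, then overwrite the oldest entries.
        for item in batch:
            if len(self.items) < self.capacity:
                self.items.append(item)
            else:
                self.items[self.cursor] = item
            self.cursor = (self.cursor + 1) % self.capacity

    def __len__(self) -> int:
        return len(self.items)


def toy_collector(
    total_frames: int,
    frames_per_batch: int,
    buffer: Optional[ToyBuffer] = None,
) -> Iterator[Optional[List[int]]]:
    """Hypothetical collector: frames are just consecutive integers."""
    for start in range(0, total_frames, frames_per_batch):
        batch = list(range(start, min(start + frames_per_batch, total_frames)))
        if buffer is not None:
            # Pattern 1: write into the buffer and hand nothing back.
            buffer.extend(batch)
            yield None
        else:
            # Pattern 2: hand the batch back to the caller.
            yield batch


# Pattern 1: buffer passed to the collector.
rb_inside = ToyBuffer(capacity=1000)
outputs_inside = list(toy_collector(2000, 100, buffer=rb_inside))

# Pattern 2: collector and buffer kept separate.
rb_outside = ToyBuffer(capacity=1000)
for batch in toy_collector(2000, 100):
    rb_outside.extend(batch)
```

In this toy version both patterns leave the buffer in the same state; the difference is only who performs the write and what the iteration yields.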

github-actions · 2024-08-09T15:16:09Z

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

Name	Max	Mean	Ops	Ops on Repo `HEAD`	Change
test_single	62.5186ms	60.9024ms	16.4197 Ops/s	16.9765 Ops/s	$\color{#d91a1a}-3.28\%$
test_sync	41.6958ms	34.8739ms	28.6748 Ops/s	30.2512 Ops/s	$\textbf{\color{#d91a1a}-5.21\%}$
test_async	56.4582ms	31.6238ms	31.6217 Ops/s	31.6605 Ops/s	$\color{#d91a1a}-0.12\%$
test_simple	0.5079s	0.4280s	2.3363 Ops/s	2.3291 Ops/s	$\color{#35bf28}+0.31\%$
test_transformed	0.6482s	0.5833s	1.7145 Ops/s	1.7203 Ops/s	$\color{#d91a1a}-0.34\%$
test_serial	1.3482s	1.2664s	0.7896 Ops/s	0.7906 Ops/s	$\color{#d91a1a}-0.13\%$
test_parallel	1.2132s	1.1373s	0.8793 Ops/s	0.8883 Ops/s	$\color{#d91a1a}-1.02\%$
test_step_mdp_speed[True-True-True-True-True]	0.1533ms	24.8888μs	40.1788 KOps/s	39.7002 KOps/s	$\color{#35bf28}+1.21\%$
test_step_mdp_speed[True-True-True-True-False]	65.4220μs	14.3886μs	69.4996 KOps/s	69.5977 KOps/s	$\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-True-True-False-True]	58.1380μs	14.3609μs	69.6337 KOps/s	68.5476 KOps/s	$\color{#35bf28}+1.58\%$
test_step_mdp_speed[True-True-True-False-False]	73.9580μs	8.2323μs	121.4722 KOps/s	121.4476 KOps/s	$\color{#35bf28}+0.02\%$
test_step_mdp_speed[True-True-False-True-True]	59.4410μs	26.4662μs	37.7841 KOps/s	37.5699 KOps/s	$\color{#35bf28}+0.57\%$
test_step_mdp_speed[True-True-False-True-False]	48.9310μs	15.9177μs	62.8232 KOps/s	63.0319 KOps/s	$\color{#d91a1a}-0.33\%$
test_step_mdp_speed[True-True-False-False-True]	59.6010μs	15.8985μs	62.8989 KOps/s	62.9777 KOps/s	$\color{#d91a1a}-0.13\%$
test_step_mdp_speed[True-True-False-False-False]	39.1030μs	9.8158μs	101.8765 KOps/s	102.5355 KOps/s	$\color{#d91a1a}-0.64\%$
test_step_mdp_speed[True-False-True-True-True]	85.8440μs	28.4376μs	35.1647 KOps/s	34.9470 KOps/s	$\color{#35bf28}+0.62\%$
test_step_mdp_speed[True-False-True-True-False]	76.4220μs	17.5615μs	56.9428 KOps/s	57.2959 KOps/s	$\color{#d91a1a}-0.62\%$
test_step_mdp_speed[True-False-True-False-True]	51.7470μs	16.0235μs	62.4083 KOps/s	62.6481 KOps/s	$\color{#d91a1a}-0.38\%$
test_step_mdp_speed[True-False-True-False-False]	33.5830μs	9.9267μs	100.7386 KOps/s	100.7535 KOps/s	$\color{#d91a1a}-0.01\%$
test_step_mdp_speed[True-False-False-True-True]	93.7460μs	29.8644μs	33.4847 KOps/s	33.4685 KOps/s	$\color{#35bf28}+0.05\%$
test_step_mdp_speed[True-False-False-True-False]	74.6690μs	19.0064μs	52.6139 KOps/s	52.6725 KOps/s	$\color{#d91a1a}-0.11\%$
test_step_mdp_speed[True-False-False-False-True]	72.1040μs	17.3879μs	57.5113 KOps/s	57.1287 KOps/s	$\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-False-False-False]	64.4200μs	11.3559μs	88.0601 KOps/s	88.1349 KOps/s	$\color{#d91a1a}-0.08\%$
test_step_mdp_speed[False-True-True-True-True]	69.2790μs	28.1947μs	35.4677 KOps/s	35.1358 KOps/s	$\color{#35bf28}+0.94\%$
test_step_mdp_speed[False-True-True-True-False]	79.3080μs	17.4698μs	57.2416 KOps/s	56.7906 KOps/s	$\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-True-True-False-True]	52.0970μs	18.6332μs	53.6676 KOps/s	53.1117 KOps/s	$\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-True-True-False-False]	51.0650μs	11.0832μs	90.2266 KOps/s	90.0927 KOps/s	$\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-True-False-True-True]	69.8710μs	29.9212μs	33.4212 KOps/s	33.5655 KOps/s	$\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-True-False-True-False]	67.6160μs	18.9954μs	52.6444 KOps/s	52.9163 KOps/s	$\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-False-False-True]	77.7950μs	19.5313μs	51.1997 KOps/s	49.6615 KOps/s	$\color{#35bf28}+3.10\%$
test_step_mdp_speed[False-True-False-False-False]	68.0870μs	12.3682μs	80.8528 KOps/s	80.0259 KOps/s	$\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-False-True-True-True]	3.4287ms	31.6484μs	31.5972 KOps/s	30.5036 KOps/s	$\color{#35bf28}+3.58\%$
test_step_mdp_speed[False-False-True-True-False]	74.1390μs	20.4671μs	48.8589 KOps/s	48.2773 KOps/s	$\color{#35bf28}+1.20\%$
test_step_mdp_speed[False-False-True-False-True]	70.4610μs	19.7928μs	50.5233 KOps/s	47.6435 KOps/s	$\textbf{\color{#35bf28}+6.04\%}$
test_step_mdp_speed[False-False-True-False-False]	43.6420μs	12.4049μs	80.6130 KOps/s	79.8317 KOps/s	$\color{#35bf28}+0.98\%$
test_step_mdp_speed[False-False-False-True-True]	99.0240μs	32.3465μs	30.9153 KOps/s	30.4769 KOps/s	$\color{#35bf28}+1.44\%$
test_step_mdp_speed[False-False-False-True-False]	71.2420μs	21.9294μs	45.6008 KOps/s	45.3094 KOps/s	$\color{#35bf28}+0.64\%$
test_step_mdp_speed[False-False-False-False-True]	75.8720μs	20.8738μs	47.9070 KOps/s	46.2766 KOps/s	$\color{#35bf28}+3.52\%$
test_step_mdp_speed[False-False-False-False-False]	77.2540μs	13.8462μs	72.2218 KOps/s	71.7349 KOps/s	$\color{#35bf28}+0.68\%$
test_values[generalized_advantage_estimate-True-True]	10.7101ms	9.7443ms	102.6236 Ops/s	103.2798 Ops/s	$\color{#d91a1a}-0.64\%$
test_values[vec_generalized_advantage_estimate-True-True]	39.2164ms	36.2044ms	27.6209 Ops/s	27.9889 Ops/s	$\color{#d91a1a}-1.31\%$
test_values[td0_return_estimate-False-False]	0.2138ms	0.1811ms	5.5213 KOps/s	5.9110 KOps/s	$\textbf{\color{#d91a1a}-6.59\%}$
test_values[td1_return_estimate-False-False]	29.1629ms	24.4086ms	40.9692 Ops/s	41.3088 Ops/s	$\color{#d91a1a}-0.82\%$
test_values[vec_td1_return_estimate-False-False]	38.7179ms	36.1425ms	27.6683 Ops/s	27.9579 Ops/s	$\color{#d91a1a}-1.04\%$
test_values[td_lambda_return_estimate-True-False]	39.7602ms	35.2662ms	28.3557 Ops/s	28.9975 Ops/s	$\color{#d91a1a}-2.21\%$
test_values[vec_td_lambda_return_estimate-True-False]	38.7522ms	36.1649ms	27.6511 Ops/s	27.9281 Ops/s	$\color{#d91a1a}-0.99\%$
test_gae_speed[generalized_advantage_estimate-False-1-512]	9.1246ms	8.3451ms	119.8309 Ops/s	120.8908 Ops/s	$\color{#d91a1a}-0.88\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512]	2.6667ms	2.0179ms	495.5568 Ops/s	508.4449 Ops/s	$\color{#d91a1a}-2.53\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512]	0.5187ms	0.3614ms	2.7670 KOps/s	2.7775 KOps/s	$\color{#d91a1a}-0.38\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512]	46.3480ms	45.0103ms	22.2171 Ops/s	22.5972 Ops/s	$\color{#d91a1a}-1.68\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512]	3.5400ms	3.1020ms	322.3704 Ops/s	323.2781 Ops/s	$\color{#d91a1a}-0.28\%$
test_dqn_speed	7.0050ms	1.3006ms	768.8471 Ops/s	774.9026 Ops/s	$\color{#d91a1a}-0.78\%$
test_ddpg_speed	3.5905ms	2.7224ms	367.3261 Ops/s	369.3114 Ops/s	$\color{#d91a1a}-0.54\%$
test_sac_speed	8.7999ms	8.1440ms	122.7892 Ops/s	124.2335 Ops/s	$\color{#d91a1a}-1.16\%$
test_redq_speed	15.6912ms	12.9670ms	77.1186 Ops/s	76.9316 Ops/s	$\color{#35bf28}+0.24\%$
test_redq_deprec_speed	14.7597ms	13.4132ms	74.5532 Ops/s	73.9116 Ops/s	$\color{#35bf28}+0.87\%$
test_td3_speed	10.8072ms	8.2791ms	120.7867 Ops/s	121.4797 Ops/s	$\color{#d91a1a}-0.57\%$
test_cql_speed	38.6821ms	36.4797ms	27.4125 Ops/s	25.1806 Ops/s	$\textbf{\color{#35bf28}+8.86\%}$
test_a2c_speed	9.6194ms	7.7071ms	129.7497 Ops/s	127.5641 Ops/s	$\color{#35bf28}+1.71\%$
test_ppo_speed	8.7600ms	8.0730ms	123.8702 Ops/s	123.5889 Ops/s	$\color{#35bf28}+0.23\%$
test_reinforce_speed	8.0524ms	6.7561ms	148.0140 Ops/s	151.1997 Ops/s	$\color{#d91a1a}-2.11\%$
test_iql_speed	35.0586ms	32.7877ms	30.4993 Ops/s	30.2712 Ops/s	$\color{#35bf28}+0.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	5.6589ms	5.0909ms	196.4285 Ops/s	192.3303 Ops/s	$\color{#35bf28}+2.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	0.8371ms	0.4912ms	2.0360 KOps/s	2.0135 KOps/s	$\color{#35bf28}+1.12\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	0.6747ms	0.4687ms	2.1337 KOps/s	2.0907 KOps/s	$\color{#35bf28}+2.06\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	5.3755ms	4.9491ms	202.0565 Ops/s	196.9878 Ops/s	$\color{#35bf28}+2.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	0.1216s	0.5676ms	1.7618 KOps/s	2.0440 KOps/s	$\textbf{\color{#d91a1a}-13.81\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	0.6381ms	0.4608ms	2.1700 KOps/s	2.1578 KOps/s	$\color{#35bf28}+0.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000]	2.3471ms	1.7005ms	588.0644 Ops/s	595.2886 Ops/s	$\color{#d91a1a}-1.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000]	1.9361ms	1.6054ms	622.8949 Ops/s	627.4118 Ops/s	$\color{#d91a1a}-0.72\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	8.7123ms	5.3916ms	185.4729 Ops/s	190.0682 Ops/s	$\color{#d91a1a}-2.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	1.6277ms	0.6411ms	1.5598 KOps/s	1.6025 KOps/s	$\color{#d91a1a}-2.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	0.7838ms	0.6117ms	1.6348 KOps/s	1.6692 KOps/s	$\color{#d91a1a}-2.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	5.3386ms	5.0591ms	197.6653 Ops/s	192.4336 Ops/s	$\color{#35bf28}+2.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	1.4146ms	0.4993ms	2.0026 KOps/s	2.0031 KOps/s	$\color{#d91a1a}-0.02\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	0.6459ms	0.4662ms	2.1449 KOps/s	2.1302 KOps/s	$\color{#35bf28}+0.69\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	5.7353ms	5.2208ms	191.5433 Ops/s	199.9401 Ops/s	$\color{#d91a1a}-4.20\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	0.7153ms	0.4970ms	2.0122 KOps/s	1.9348 KOps/s	$\color{#35bf28}+4.00\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	0.6432ms	0.4668ms	2.1424 KOps/s	2.0963 KOps/s	$\color{#35bf28}+2.20\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	8.0165ms	5.2374ms	190.9350 Ops/s	190.1828 Ops/s	$\color{#35bf28}+0.40\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	1.5003ms	0.6429ms	1.5555 KOps/s	1.5901 KOps/s	$\color{#d91a1a}-2.17\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	0.7394ms	0.6092ms	1.6414 KOps/s	1.6767 KOps/s	$\color{#d91a1a}-2.10\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]	0.1393s	6.5725ms	152.1485 Ops/s	159.7288 Ops/s	$\color{#d91a1a}-4.75\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400]	17.6582ms	13.1226ms	76.2041 Ops/s	76.6153 Ops/s	$\color{#d91a1a}-0.54\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]	1.9840ms	1.1721ms	853.1826 Ops/s	850.6005 Ops/s	$\color{#35bf28}+0.30\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]	0.1263s	6.2940ms	158.8805 Ops/s	117.3738 Ops/s	$\textbf{\color{#35bf28}+35.36\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]	0.1326s	15.7308ms	63.5694 Ops/s	75.6031 Ops/s	$\textbf{\color{#d91a1a}-15.92\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]	1.6972ms	1.1217ms	891.5206 Ops/s	881.1268 Ops/s	$\color{#35bf28}+1.18\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]	0.1255s	6.4573ms	154.8635 Ops/s	155.6585 Ops/s	$\color{#d91a1a}-0.51\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400]	17.3844ms	13.2166ms	75.6625 Ops/s	73.9693 Ops/s	$\color{#35bf28}+2.29\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400]	1.8453ms	1.2817ms	780.2150 Ops/s	767.7556 Ops/s	$\color{#35bf28}+1.62\%$
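The Change column appears consistent with the relative difference between the Ops column and the Ops on Repo `HEAD` column. The sketch below assumes that formula (it is inferred from the numbers, not taken from the benchmark tooling) and checks it against two rows of the CPU table; `relative_change` is a hypothetical helper name.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput relative to the HEAD baseline (assumed formula)."""
    return (ops_pr - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the CPU benchmark table.
rows = {
    "test_single": (16.4197, 16.9765),
    "test_cql_speed": (27.4125, 25.1806),
}

changes = {name: relative_change(pr, head) for name, (pr, head) in rows.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Under this assumption a negative value means the PR run was slower than `HEAD` for that benchmark, and a positive value means it was faster.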

github-actions · 2024-08-09T15:24:19Z

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}0$. Worsened: $\large\color{#d91a1a}4$.

Name	Max	Mean	Ops	Ops on Repo `HEAD`	Change
test_single	0.1077s	0.1070s	9.3461 Ops/s	9.3625 Ops/s	$\color{#d91a1a}-0.18\%$
test_sync	98.9522ms	95.1135ms	10.5138 Ops/s	10.8594 Ops/s	$\color{#d91a1a}-3.18\%$
test_async	0.1759s	86.5381ms	11.5556 Ops/s	11.2126 Ops/s	$\color{#35bf28}+3.06\%$
test_single_pixels	0.1176s	0.1171s	8.5403 Ops/s	8.5037 Ops/s	$\color{#35bf28}+0.43\%$
test_sync_pixels	76.2439ms	75.1642ms	13.3042 Ops/s	12.8778 Ops/s	$\color{#35bf28}+3.31\%$
test_async_pixels	0.1478s	70.8122ms	14.1219 Ops/s	14.1064 Ops/s	$\color{#35bf28}+0.11\%$
test_simple	0.7605s	0.7592s	1.3171 Ops/s	1.2815 Ops/s	$\color{#35bf28}+2.78\%$
test_transformed	1.0911s	1.0202s	0.9802 Ops/s	1.0060 Ops/s	$\color{#d91a1a}-2.57\%$
test_serial	2.2563s	2.1779s	0.4592 Ops/s	0.4653 Ops/s	$\color{#d91a1a}-1.31\%$
test_parallel	1.9512s	1.8862s	0.5302 Ops/s	0.5252 Ops/s	$\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-True-True-True-True]	0.1065ms	38.7312μs	25.8190 KOps/s	26.4738 KOps/s	$\color{#d91a1a}-2.47\%$
test_step_mdp_speed[True-True-True-True-False]	51.3010μs	21.7543μs	45.9679 KOps/s	46.8031 KOps/s	$\color{#d91a1a}-1.78\%$
test_step_mdp_speed[True-True-True-False-True]	39.2510μs	21.5599μs	46.3823 KOps/s	46.6728 KOps/s	$\color{#d91a1a}-0.62\%$
test_step_mdp_speed[True-True-True-False-False]	36.5500μs	12.1013μs	82.6359 KOps/s	83.4489 KOps/s	$\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-True-False-True-True]	61.1310μs	41.0172μs	24.3800 KOps/s	25.0480 KOps/s	$\color{#d91a1a}-2.67\%$
test_step_mdp_speed[True-True-False-True-False]	46.8310μs	23.7766μs	42.0582 KOps/s	42.8363 KOps/s	$\color{#d91a1a}-1.82\%$
test_step_mdp_speed[True-True-False-False-True]	49.4510μs	23.7667μs	42.0757 KOps/s	42.8546 KOps/s	$\color{#d91a1a}-1.82\%$
test_step_mdp_speed[True-True-False-False-False]	39.6010μs	14.4020μs	69.4346 KOps/s	69.6378 KOps/s	$\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-False-True-True-True]	63.1410μs	43.4102μs	23.0361 KOps/s	23.9222 KOps/s	$\color{#d91a1a}-3.70\%$
test_step_mdp_speed[True-False-True-True-False]	44.3110μs	26.2143μs	38.1471 KOps/s	38.7954 KOps/s	$\color{#d91a1a}-1.67\%$
test_step_mdp_speed[True-False-True-False-True]	73.0610μs	23.8158μs	41.9890 KOps/s	42.2464 KOps/s	$\color{#d91a1a}-0.61\%$
test_step_mdp_speed[True-False-True-False-False]	37.3700μs	14.5429μs	68.7620 KOps/s	69.8257 KOps/s	$\color{#d91a1a}-1.52\%$
test_step_mdp_speed[True-False-False-True-True]	67.4110μs	45.2871μs	22.0813 KOps/s	22.2331 KOps/s	$\color{#d91a1a}-0.68\%$
test_step_mdp_speed[True-False-False-True-False]	48.3400μs	28.4989μs	35.0891 KOps/s	35.7934 KOps/s	$\color{#d91a1a}-1.97\%$
test_step_mdp_speed[True-False-False-False-True]	42.5510μs	26.0760μs	38.3494 KOps/s	38.4925 KOps/s	$\color{#d91a1a}-0.37\%$
test_step_mdp_speed[True-False-False-False-False]	38.9410μs	16.7283μs	59.7788 KOps/s	60.6036 KOps/s	$\color{#d91a1a}-1.36\%$
test_step_mdp_speed[False-True-True-True-True]	80.0610μs	43.3214μs	23.0833 KOps/s	23.6125 KOps/s	$\color{#d91a1a}-2.24\%$
test_step_mdp_speed[False-True-True-True-False]	46.0010μs	26.0822μs	38.3404 KOps/s	38.6217 KOps/s	$\color{#d91a1a}-0.73\%$
test_step_mdp_speed[False-True-True-False-True]	51.9310μs	28.4978μs	35.0904 KOps/s	35.5543 KOps/s	$\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-True-True-False-False]	34.4300μs	16.8090μs	59.4920 KOps/s	61.9044 KOps/s	$\color{#d91a1a}-3.90\%$
test_step_mdp_speed[False-True-False-True-True]	80.0010μs	45.7090μs	21.8775 KOps/s	22.7984 KOps/s	$\color{#d91a1a}-4.04\%$
test_step_mdp_speed[False-True-False-True-False]	63.2110μs	28.5919μs	34.9749 KOps/s	35.8560 KOps/s	$\color{#d91a1a}-2.46\%$
test_step_mdp_speed[False-True-False-False-True]	57.1800μs	30.5958μs	32.6842 KOps/s	33.6266 KOps/s	$\color{#d91a1a}-2.80\%$
test_step_mdp_speed[False-True-False-False-False]	41.7100μs	18.9845μs	52.6746 KOps/s	53.8788 KOps/s	$\color{#d91a1a}-2.24\%$
test_step_mdp_speed[False-False-True-True-True]	3.9473ms	48.4431μs	20.6428 KOps/s	21.2787 KOps/s	$\color{#d91a1a}-2.99\%$
test_step_mdp_speed[False-False-True-True-False]	63.0300μs	30.8519μs	32.4129 KOps/s	32.7231 KOps/s	$\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-False-True-False-True]	56.3110μs	30.6154μs	32.6633 KOps/s	33.4772 KOps/s	$\color{#d91a1a}-2.43\%$
test_step_mdp_speed[False-False-True-False-False]	44.0800μs	19.2414μs	51.9712 KOps/s	53.1553 KOps/s	$\color{#d91a1a}-2.23\%$
test_step_mdp_speed[False-False-False-True-True]	75.3110μs	49.4812μs	20.2097 KOps/s	20.5773 KOps/s	$\color{#d91a1a}-1.79\%$
test_step_mdp_speed[False-False-False-True-False]	55.0910μs	33.3213μs	30.0108 KOps/s	30.6319 KOps/s	$\color{#d91a1a}-2.03\%$
test_step_mdp_speed[False-False-False-False-True]	51.2020μs	32.9885μs	30.3136 KOps/s	31.6413 KOps/s	$\color{#d91a1a}-4.20\%$
test_step_mdp_speed[False-False-False-False-False]	37.7200μs	21.0361μs	47.5374 KOps/s	47.9975 KOps/s	$\color{#d91a1a}-0.96\%$
test_values[generalized_advantage_estimate-True-True]	25.3269ms	24.8158ms	40.2968 Ops/s	40.4718 Ops/s	$\color{#d91a1a}-0.43\%$
test_values[vec_generalized_advantage_estimate-True-True]	95.4059ms	2.8095ms	355.9384 Ops/s	372.1798 Ops/s	$\color{#d91a1a}-4.36\%$
test_values[td0_return_estimate-False-False]	90.6620μs	66.4143μs	15.0570 KOps/s	15.0288 KOps/s	$\color{#35bf28}+0.19\%$
test_values[td1_return_estimate-False-False]	55.8183ms	55.0422ms	18.1679 Ops/s	17.8553 Ops/s	$\color{#35bf28}+1.75\%$
test_values[vec_td1_return_estimate-False-False]	1.3100ms	1.0886ms	918.6111 Ops/s	911.5455 Ops/s	$\color{#35bf28}+0.78\%$
test_values[td_lambda_return_estimate-True-False]	88.3793ms	87.9032ms	11.3762 Ops/s	11.2769 Ops/s	$\color{#35bf28}+0.88\%$
test_values[vec_td_lambda_return_estimate-True-False]	1.3756ms	1.0912ms	916.4495 Ops/s	914.8911 Ops/s	$\color{#35bf28}+0.17\%$
test_gae_speed[generalized_advantage_estimate-False-1-512]	24.9450ms	24.6135ms	40.6282 Ops/s	40.3879 Ops/s	$\color{#35bf28}+0.59\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512]	0.9655ms	0.7264ms	1.3767 KOps/s	1.3525 KOps/s	$\color{#35bf28}+1.79\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512]	0.7366ms	0.6744ms	1.4828 KOps/s	1.4705 KOps/s	$\color{#35bf28}+0.84\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512]	1.5284ms	1.4764ms	677.3317 Ops/s	675.6364 Ops/s	$\color{#35bf28}+0.25\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512]	0.7498ms	0.6890ms	1.4513 KOps/s	1.4180 KOps/s	$\color{#35bf28}+2.35\%$
test_dqn_speed	7.2020ms	1.3958ms	716.4278 Ops/s	730.2958 Ops/s	$\color{#d91a1a}-1.90\%$
test_ddpg_speed	3.0417ms	2.7874ms	358.7584 Ops/s	363.1098 Ops/s	$\color{#d91a1a}-1.20\%$
test_sac_speed	0.1010s	8.7937ms	113.7183 Ops/s	125.6362 Ops/s	$\textbf{\color{#d91a1a}-9.49\%}$
test_redq_speed	11.5049ms	10.2209ms	97.8386 Ops/s	99.3357 Ops/s	$\color{#d91a1a}-1.51\%$
test_redq_deprec_speed	11.3640ms	11.0668ms	90.3605 Ops/s	91.0146 Ops/s	$\color{#d91a1a}-0.72\%$
test_td3_speed	8.3707ms	7.9901ms	125.1546 Ops/s	127.5516 Ops/s	$\color{#d91a1a}-1.88\%$
test_cql_speed	26.5539ms	25.3258ms	39.4854 Ops/s	39.8755 Ops/s	$\color{#d91a1a}-0.98\%$
test_a2c_speed	5.6563ms	5.4416ms	183.7698 Ops/s	180.5604 Ops/s	$\color{#35bf28}+1.78\%$
test_ppo_speed	6.5577ms	5.7975ms	172.4871 Ops/s	170.7528 Ops/s	$\color{#35bf28}+1.02\%$
test_reinforce_speed	4.7671ms	4.4903ms	222.7014 Ops/s	224.0064 Ops/s	$\color{#d91a1a}-0.58\%$
test_iql_speed	20.2792ms	19.4934ms	51.2993 Ops/s	51.9445 Ops/s	$\color{#d91a1a}-1.24\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	6.9100ms	6.7151ms	148.9191 Ops/s	149.9944 Ops/s	$\color{#d91a1a}-0.72\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	1.1095ms	0.5210ms	1.9193 KOps/s	1.9346 KOps/s	$\color{#d91a1a}-0.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	0.6699ms	0.4981ms	2.0077 KOps/s	2.0176 KOps/s	$\color{#d91a1a}-0.49\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	6.9282ms	6.5847ms	151.8666 Ops/s	151.5296 Ops/s	$\color{#35bf28}+0.22\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	1.8500ms	0.5118ms	1.9537 KOps/s	1.9612 KOps/s	$\color{#d91a1a}-0.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	0.6773ms	0.4923ms	2.0315 KOps/s	2.0457 KOps/s	$\color{#d91a1a}-0.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000]	2.2301ms	2.0059ms	498.5335 Ops/s	504.7533 Ops/s	$\color{#d91a1a}-1.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000]	2.0424ms	1.8766ms	532.8846 Ops/s	517.4919 Ops/s	$\color{#35bf28}+2.97\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	6.9309ms	6.7742ms	147.6192 Ops/s	146.8710 Ops/s	$\color{#35bf28}+0.51\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	0.1441s	0.8047ms	1.2427 KOps/s	1.4972 KOps/s	$\textbf{\color{#d91a1a}-17.00\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	0.8224ms	0.6485ms	1.5420 KOps/s	1.5517 KOps/s	$\color{#d91a1a}-0.63\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]	6.8633ms	6.6973ms	149.3134 Ops/s	149.8483 Ops/s	$\color{#d91a1a}-0.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]	1.2152ms	0.5211ms	1.9189 KOps/s	1.9215 KOps/s	$\color{#d91a1a}-0.14\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]	0.6957ms	0.4951ms	2.0196 KOps/s	2.0134 KOps/s	$\color{#35bf28}+0.31\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]	6.9107ms	6.5727ms	152.1439 Ops/s	151.2116 Ops/s	$\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]	0.1258s	0.6881ms	1.4532 KOps/s	1.9443 KOps/s	$\textbf{\color{#d91a1a}-25.26\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]	0.6515ms	0.4908ms	2.0374 KOps/s	2.0085 KOps/s	$\color{#35bf28}+1.44\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]	7.0387ms	6.8489ms	146.0080 Ops/s	146.2359 Ops/s	$\color{#d91a1a}-0.16\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]	0.7656ms	0.6718ms	1.4886 KOps/s	1.5045 KOps/s	$\color{#d91a1a}-1.05\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]	4.4065ms	0.6552ms	1.5263 KOps/s	1.5575 KOps/s	$\color{#d91a1a}-2.00\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]	0.1315s	7.7320ms	129.3321 Ops/s	127.6578 Ops/s	$\color{#35bf28}+1.31\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400]	19.0373ms	16.4593ms	60.7560 Ops/s	62.2541 Ops/s	$\color{#d91a1a}-2.41\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]	2.4510ms	1.3027ms	767.6133 Ops/s	775.0076 Ops/s	$\color{#d91a1a}-0.95\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]	0.1266s	10.0863ms	99.1444 Ops/s	99.2870 Ops/s	$\color{#d91a1a}-0.14\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]	19.2160ms	16.2911ms	61.3832 Ops/s	61.5192 Ops/s	$\color{#d91a1a}-0.22\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]	2.4095ms	1.2917ms	774.1564 Ops/s	801.6241 Ops/s	$\color{#d91a1a}-3.43\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]	0.1263s	7.8394ms	127.5611 Ops/s	128.5814 Ops/s	$\color{#d91a1a}-0.79\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400]	19.6366ms	16.8408ms	59.3797 Ops/s	61.4922 Ops/s	$\color{#d91a1a}-3.44\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400]	7.6283ms	1.5697ms	637.0658 Ops/s	697.6879 Ops/s	$\textbf{\color{#d91a1a}-8.69\%}$

Update

7ccffe3

[ghstack-poisoned]

vmoens added a commit that referenced this pull request Aug 9, 2024

[Feature] Pass replay buffers to SyncDataCollector

6d3421b

ghstack-source-id: d4949410af9604e64c4d179608ebec7377710758 Pull Request resolved: #2384

facebook-github-bot added the CLA Signed label Aug 9, 2024

Update

869da16

[ghstack-poisoned]

vmoens added a commit that referenced this pull request Aug 9, 2024

[Feature] Pass replay buffers to SyncDataCollector

5e87d7b

ghstack-source-id: 07885ddb54263059daa4e65f3e83fd3668500b9e Pull Request resolved: #2384

Update

ce0e314

[ghstack-poisoned]

vmoens added a commit that referenced this pull request Aug 10, 2024

[Feature] Pass replay buffers to SyncDataCollector

8e1cbc9

ghstack-source-id: 1ced2f68e4026e36a395b43c560fd45391f4587a Pull Request resolved: #2384

vmoens mentioned this pull request Aug 10, 2024

[Feature] Pass replay buffers to MultiSyncDataCollector #2386

Closed

vmoens added the enhancement label Aug 10, 2024

This was referenced Aug 10, 2024

[Feature] Pass replay buffers to MultiaSyncDataCollector #2387

Merged

[Feature] replay_buffer_chunk #2388

Merged

vmoens added 3 commits August 10, 2024 14:26

Update

7fe1bc5

[ghstack-poisoned]

Update

5e90f20

[ghstack-poisoned]

Update

0373067

[ghstack-poisoned]

vmoens mentioned this pull request Aug 10, 2024

[Algorithm] TD3 fast #2389

Open

vmoens added 8 commits August 11, 2024 11:44

Update

5858f24

[ghstack-poisoned]

Update

c9ce795

[ghstack-poisoned]

Update

acedf32

[ghstack-poisoned]

Update

e4a6936

[ghstack-poisoned]

Update

bf116e1

[ghstack-poisoned]

Update

b8a8d85

[ghstack-poisoned]

Update

529ccbd

[ghstack-poisoned]

Update

e917d9b

[ghstack-poisoned]

vmoens merged commit e917d9b into gh/vmoens/13/base Aug 13, 2024
9 of 14 checks passed

vmoens added a commit that referenced this pull request Aug 13, 2024

[Feature] Pass replay buffers to SyncDataCollector

9627e8a

ghstack-source-id: 452d429b153284ebc06e89225eed0f6a7b6ad37b Pull Request resolved: #2384

vmoens deleted the gh/vmoens/13/head branch August 13, 2024 19:58
