[Refactor] Use empty_like in storage construction #2455

Merged
merged 1 commit into from
Sep 26, 2024

vmoens (Contributor) commented Sep 26, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: 28cd569bd4abf472991b82b3eba9fe333b5cd68f
Pull Request resolved: #2455

pytorch-bot bot commented Sep 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2455

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 New Failures, 1 Unrelated Failure

As of commit cc96914 with merge base ca3a595:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Sep 26, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 146. Improved: $\large\color{#35bf28}48$. Worsened: $\large\color{#d91a1a}5$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_single 61.4402ms 60.5926ms 16.5037 Ops/s 16.6435 Ops/s $\color{#d91a1a}-0.84\%$
test_sync 41.6898ms 36.9132ms 27.0906 Ops/s 30.7543 Ops/s $\textbf{\color{#d91a1a}-11.91\%}$
test_async 64.0095ms 32.5018ms 30.7675 Ops/s 31.9604 Ops/s $\color{#d91a1a}-3.73\%$
test_simple 0.5525s 0.4504s 2.2200 Ops/s 2.4815 Ops/s $\textbf{\color{#d91a1a}-10.54\%}$
test_transformed 0.6915s 0.6063s 1.6494 Ops/s 1.7178 Ops/s $\color{#d91a1a}-3.98\%$
test_serial 1.4013s 1.3190s 0.7581 Ops/s 0.7728 Ops/s $\color{#d91a1a}-1.90\%$
test_parallel 1.2567s 1.1740s 0.8518 Ops/s 0.8879 Ops/s $\color{#d91a1a}-4.07\%$
test_step_mdp_speed[True-True-True-True-True] 0.2499ms 26.7386μs 37.3991 KOps/s 36.4002 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[True-True-True-True-False] 47.7590μs 15.7806μs 63.3691 KOps/s 62.2083 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[True-True-True-False-True] 44.9940μs 15.3404μs 65.1875 KOps/s 61.1981 KOps/s $\textbf{\color{#35bf28}+6.52\%}$
test_step_mdp_speed[True-True-True-False-False] 40.9460μs 9.0523μs 110.4688 KOps/s 108.5999 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[True-True-False-True-True] 94.2160μs 28.6594μs 34.8926 KOps/s 34.5434 KOps/s $\color{#35bf28}+1.01\%$
test_step_mdp_speed[True-True-False-True-False] 56.9760μs 17.2247μs 58.0561 KOps/s 56.0857 KOps/s $\color{#35bf28}+3.51\%$
test_step_mdp_speed[True-True-False-False-True] 47.5390μs 17.0161μs 58.7677 KOps/s 57.7940 KOps/s $\color{#35bf28}+1.68\%$
test_step_mdp_speed[True-True-False-False-False] 40.5760μs 10.6719μs 93.7036 KOps/s 91.7040 KOps/s $\color{#35bf28}+2.18\%$
test_step_mdp_speed[True-False-True-True-True] 82.0930μs 30.0668μs 33.2593 KOps/s 32.3117 KOps/s $\color{#35bf28}+2.93\%$
test_step_mdp_speed[True-False-True-True-False] 50.1630μs 19.0187μs 52.5797 KOps/s 50.7783 KOps/s $\color{#35bf28}+3.55\%$
test_step_mdp_speed[True-False-True-False-True] 55.9040μs 17.1852μs 58.1897 KOps/s 57.6241 KOps/s $\color{#35bf28}+0.98\%$
test_step_mdp_speed[True-False-True-False-False] 0.1374ms 10.8237μs 92.3896 KOps/s 91.4527 KOps/s $\color{#35bf28}+1.02\%$
test_step_mdp_speed[True-False-False-True-True] 63.3490μs 31.7261μs 31.5198 KOps/s 31.0761 KOps/s $\color{#35bf28}+1.43\%$
test_step_mdp_speed[True-False-False-True-False] 50.4840μs 20.5062μs 48.7658 KOps/s 47.2346 KOps/s $\color{#35bf28}+3.24\%$
test_step_mdp_speed[True-False-False-False-True] 46.0960μs 18.4769μs 54.1218 KOps/s 52.8951 KOps/s $\color{#35bf28}+2.32\%$
test_step_mdp_speed[True-False-False-False-False] 41.3570μs 12.1751μs 82.1347 KOps/s 79.4026 KOps/s $\color{#35bf28}+3.44\%$
test_step_mdp_speed[False-True-True-True-True] 65.2320μs 30.2938μs 33.0101 KOps/s 31.9888 KOps/s $\color{#35bf28}+3.19\%$
test_step_mdp_speed[False-True-True-True-False] 59.6710μs 19.1282μs 52.2789 KOps/s 51.2645 KOps/s $\color{#35bf28}+1.98\%$
test_step_mdp_speed[False-True-True-False-True] 0.1541ms 19.3314μs 51.7294 KOps/s 51.0227 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-True-False-False] 36.1980μs 11.9393μs 83.7573 KOps/s 82.6367 KOps/s $\color{#35bf28}+1.36\%$
test_step_mdp_speed[False-True-False-True-True] 60.8740μs 31.8948μs 31.3531 KOps/s 30.7328 KOps/s $\color{#35bf28}+2.02\%$
test_step_mdp_speed[False-True-False-True-False] 55.8540μs 20.5662μs 48.6234 KOps/s 47.2772 KOps/s $\color{#35bf28}+2.85\%$
test_step_mdp_speed[False-True-False-False-True] 3.3586ms 20.8249μs 48.0195 KOps/s 46.6254 KOps/s $\color{#35bf28}+2.99\%$
test_step_mdp_speed[False-True-False-False-False] 40.3250μs 13.3423μs 74.9496 KOps/s 73.3605 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[False-False-True-True-True] 0.1658ms 33.8064μs 29.5802 KOps/s 29.6048 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[False-False-True-True-False] 0.1549ms 22.9113μs 43.6466 KOps/s 43.9631 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[False-False-True-False-True] 50.4840μs 20.9656μs 47.6972 KOps/s 47.6010 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[False-False-True-False-False] 53.1490μs 13.3441μs 74.9397 KOps/s 71.8281 KOps/s $\color{#35bf28}+4.33\%$
test_step_mdp_speed[False-False-False-True-True] 78.6360μs 34.8016μs 28.7343 KOps/s 28.3517 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[False-False-False-True-False] 54.9730μs 23.6335μs 42.3129 KOps/s 41.4247 KOps/s $\color{#35bf28}+2.14\%$
test_step_mdp_speed[False-False-False-False-True] 65.6030μs 22.2360μs 44.9721 KOps/s 44.2595 KOps/s $\color{#35bf28}+1.61\%$
test_step_mdp_speed[False-False-False-False-False] 38.2720μs 14.8446μs 67.3645 KOps/s 66.8260 KOps/s $\color{#35bf28}+0.81\%$
test_values[generalized_advantage_estimate-True-True] 11.8093ms 9.6803ms 103.3028 Ops/s 105.0364 Ops/s $\color{#d91a1a}-1.65\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.5854ms 35.7765ms 27.9513 Ops/s 27.5959 Ops/s $\color{#35bf28}+1.29\%$
test_values[td0_return_estimate-False-False] 0.2308ms 0.1677ms 5.9633 KOps/s 5.6438 KOps/s $\textbf{\color{#35bf28}+5.66\%}$
test_values[td1_return_estimate-False-False] 26.3958ms 24.0627ms 41.5581 Ops/s 41.2907 Ops/s $\color{#35bf28}+0.65\%$
test_values[vec_td1_return_estimate-False-False] 38.9863ms 35.9652ms 27.8046 Ops/s 27.7785 Ops/s $\color{#35bf28}+0.09\%$
test_values[td_lambda_return_estimate-True-False] 38.3742ms 34.8030ms 28.7331 Ops/s 28.6873 Ops/s $\color{#35bf28}+0.16\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.0054ms 36.0077ms 27.7718 Ops/s 27.8265 Ops/s $\color{#d91a1a}-0.20\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.1437ms 8.3654ms 119.5401 Ops/s 122.9721 Ops/s $\color{#d91a1a}-2.79\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.6592ms 1.9128ms 522.8055 Ops/s 499.2928 Ops/s $\color{#35bf28}+4.71\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5073ms 0.3558ms 2.8105 KOps/s 2.7424 KOps/s $\color{#35bf28}+2.48\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 50.7593ms 48.4704ms 20.6312 Ops/s 22.7359 Ops/s $\textbf{\color{#d91a1a}-9.26\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9466ms 3.0369ms 329.2780 Ops/s 326.4831 Ops/s $\color{#35bf28}+0.86\%$
test_dqn_speed[False-None] 5.9260ms 1.3549ms 738.0725 Ops/s 723.7033 Ops/s $\color{#35bf28}+1.99\%$
test_dqn_speed[False-backward] 1.9378ms 1.8484ms 541.0109 Ops/s 541.6467 Ops/s $\color{#d91a1a}-0.12\%$
test_dqn_speed[True-None] 0.7317ms 0.4612ms 2.1682 KOps/s 2.1622 KOps/s $\color{#35bf28}+0.28\%$
test_dqn_speed[True-backward] 1.0470ms 0.8753ms 1.1425 KOps/s 1.1161 KOps/s $\color{#35bf28}+2.36\%$
test_dqn_speed[reduce-overhead-None] 0.7706ms 0.4631ms 2.1596 KOps/s 2.1229 KOps/s $\color{#35bf28}+1.73\%$
test_dqn_speed[reduce-overhead-backward] 0.9443ms 0.8789ms 1.1378 KOps/s 1.1261 KOps/s $\color{#35bf28}+1.04\%$
test_ddpg_speed[False-None] 4.0798ms 2.7796ms 359.7679 Ops/s 351.9648 Ops/s $\color{#35bf28}+2.22\%$
test_ddpg_speed[False-backward] 4.0414ms 3.9056ms 256.0426 Ops/s 252.7225 Ops/s $\color{#35bf28}+1.31\%$
test_ddpg_speed[True-None] 1.1872ms 0.9982ms 1.0018 KOps/s 968.6870 Ops/s $\color{#35bf28}+3.41\%$
test_ddpg_speed[True-backward] 1.9795ms 1.8829ms 531.1005 Ops/s 453.5710 Ops/s $\textbf{\color{#35bf28}+17.09\%}$
test_ddpg_speed[reduce-overhead-None] 1.4312ms 1.0003ms 999.6799 Ops/s 970.8195 Ops/s $\color{#35bf28}+2.97\%$
test_ddpg_speed[reduce-overhead-backward] 1.9654ms 1.8882ms 529.6100 Ops/s 518.9996 Ops/s $\color{#35bf28}+2.04\%$
test_sac_speed[False-None] 9.6458ms 7.8588ms 127.2460 Ops/s 97.7024 Ops/s $\textbf{\color{#35bf28}+30.24\%}$
test_sac_speed[False-backward] 11.0232ms 10.5071ms 95.1734 Ops/s 89.9078 Ops/s $\textbf{\color{#35bf28}+5.86\%}$
test_sac_speed[True-None] 2.1380ms 1.8363ms 544.5870 Ops/s 529.7666 Ops/s $\color{#35bf28}+2.80\%$
test_sac_speed[True-backward] 5.4934ms 3.5776ms 279.5181 Ops/s 266.2574 Ops/s $\color{#35bf28}+4.98\%$
test_sac_speed[reduce-overhead-None] 2.0780ms 1.8344ms 545.1497 Ops/s 525.4598 Ops/s $\color{#35bf28}+3.75\%$
test_sac_speed[reduce-overhead-backward] 3.7086ms 3.5185ms 284.2139 Ops/s 280.5586 Ops/s $\color{#35bf28}+1.30\%$
test_redq_speed[False-None] 14.6212ms 13.2219ms 75.6319 Ops/s 77.0314 Ops/s $\color{#d91a1a}-1.82\%$
test_redq_speed[False-backward] 25.3261ms 22.7486ms 43.9588 Ops/s 43.9407 Ops/s $\color{#35bf28}+0.04\%$
test_redq_speed[True-None] 6.1158ms 4.9888ms 200.4501 Ops/s 202.7981 Ops/s $\color{#d91a1a}-1.16\%$
test_redq_speed[True-backward] 13.2549ms 11.8817ms 84.1628 Ops/s 75.0694 Ops/s $\textbf{\color{#35bf28}+12.11\%}$
test_redq_speed[reduce-overhead-None] 5.7882ms 4.7138ms 212.1443 Ops/s 183.3342 Ops/s $\textbf{\color{#35bf28}+15.71\%}$
test_redq_speed[reduce-overhead-backward] 13.6244ms 12.3661ms 80.8661 Ops/s 75.3982 Ops/s $\textbf{\color{#35bf28}+7.25\%}$
test_redq_deprec_speed[False-None] 14.2644ms 12.7718ms 78.2973 Ops/s 71.3157 Ops/s $\textbf{\color{#35bf28}+9.79\%}$
test_redq_deprec_speed[False-backward] 27.7587ms 19.6371ms 50.9239 Ops/s 49.7286 Ops/s $\color{#35bf28}+2.40\%$
test_redq_deprec_speed[True-None] 4.2815ms 3.5331ms 283.0340 Ops/s 252.1527 Ops/s $\textbf{\color{#35bf28}+12.25\%}$
test_redq_deprec_speed[True-backward] 13.6980ms 8.3792ms 119.3425 Ops/s 115.5143 Ops/s $\color{#35bf28}+3.31\%$
test_redq_deprec_speed[reduce-overhead-None] 7.0486ms 3.6233ms 275.9904 Ops/s 245.0731 Ops/s $\textbf{\color{#35bf28}+12.62\%}$
test_redq_deprec_speed[reduce-overhead-backward] 9.1900ms 7.9906ms 125.1466 Ops/s 107.6858 Ops/s $\textbf{\color{#35bf28}+16.21\%}$
test_td3_speed[False-None] 8.1579ms 7.8009ms 128.1905 Ops/s 117.7546 Ops/s $\textbf{\color{#35bf28}+8.86\%}$
test_td3_speed[False-backward] 10.6788ms 10.1398ms 98.6209 Ops/s 88.3751 Ops/s $\textbf{\color{#35bf28}+11.59\%}$
test_td3_speed[True-None] 2.2352ms 1.9316ms 517.7081 Ops/s 490.3295 Ops/s $\textbf{\color{#35bf28}+5.58\%}$
test_td3_speed[True-backward] 3.5702ms 3.5037ms 285.4087 Ops/s 240.0236 Ops/s $\textbf{\color{#35bf28}+18.91\%}$
test_td3_speed[reduce-overhead-None] 2.1384ms 1.9170ms 521.6375 Ops/s 466.2988 Ops/s $\textbf{\color{#35bf28}+11.87\%}$
test_td3_speed[reduce-overhead-backward] 5.1335ms 3.5382ms 282.6283 Ops/s 256.3990 Ops/s $\textbf{\color{#35bf28}+10.23\%}$
test_cql_speed[False-None] 39.9963ms 36.0571ms 27.7338 Ops/s 26.2111 Ops/s $\textbf{\color{#35bf28}+5.81\%}$
test_cql_speed[False-backward] 53.0787ms 45.8405ms 21.8147 Ops/s 20.1261 Ops/s $\textbf{\color{#35bf28}+8.39\%}$
test_cql_speed[True-None] 16.9734ms 15.5423ms 64.3404 Ops/s 59.9825 Ops/s $\textbf{\color{#35bf28}+7.27\%}$
test_cql_speed[True-backward] 24.5775ms 22.4825ms 44.4791 Ops/s 42.5524 Ops/s $\color{#35bf28}+4.53\%$
test_cql_speed[reduce-overhead-None] 17.0681ms 15.7385ms 63.5385 Ops/s 60.9204 Ops/s $\color{#35bf28}+4.30\%$
test_cql_speed[reduce-overhead-backward] 23.3542ms 22.1349ms 45.1774 Ops/s 41.9294 Ops/s $\textbf{\color{#35bf28}+7.75\%}$
test_a2c_speed[False-None] 8.0773ms 7.1161ms 140.5259 Ops/s 128.7552 Ops/s $\textbf{\color{#35bf28}+9.14\%}$
test_a2c_speed[False-backward] 16.5273ms 14.0272ms 71.2901 Ops/s 65.4575 Ops/s $\textbf{\color{#35bf28}+8.91\%}$
test_a2c_speed[True-None] 3.9317ms 3.2998ms 303.0503 Ops/s 277.9939 Ops/s $\textbf{\color{#35bf28}+9.01\%}$
test_a2c_speed[True-backward] 10.9062ms 9.7237ms 102.8415 Ops/s 94.1891 Ops/s $\textbf{\color{#35bf28}+9.19\%}$
test_a2c_speed[reduce-overhead-None] 3.8258ms 3.2996ms 303.0659 Ops/s 291.9143 Ops/s $\color{#35bf28}+3.82\%$
test_a2c_speed[reduce-overhead-backward] 10.4456ms 9.6752ms 103.3570 Ops/s 94.2334 Ops/s $\textbf{\color{#35bf28}+9.68\%}$
test_ppo_speed[False-None] 8.9861ms 7.3852ms 135.4053 Ops/s 122.1412 Ops/s $\textbf{\color{#35bf28}+10.86\%}$
test_ppo_speed[False-backward] 16.2891ms 14.4637ms 69.1384 Ops/s 63.0782 Ops/s $\textbf{\color{#35bf28}+9.61\%}$
test_ppo_speed[True-None] 4.3739ms 3.6824ms 271.5611 Ops/s 250.7244 Ops/s $\textbf{\color{#35bf28}+8.31\%}$
test_ppo_speed[True-backward] 10.3895ms 9.5982ms 104.1860 Ops/s 94.7177 Ops/s $\textbf{\color{#35bf28}+10.00\%}$
test_ppo_speed[reduce-overhead-None] 3.9846ms 3.6715ms 272.3682 Ops/s 256.8403 Ops/s $\textbf{\color{#35bf28}+6.05\%}$
test_ppo_speed[reduce-overhead-backward] 9.9234ms 9.5023ms 105.2373 Ops/s 100.4962 Ops/s $\color{#35bf28}+4.72\%$
test_reinforce_speed[False-None] 7.9963ms 6.5191ms 153.3950 Ops/s 149.8890 Ops/s $\color{#35bf28}+2.34\%$
test_reinforce_speed[False-backward] 10.5070ms 9.8552ms 101.4692 Ops/s 96.9863 Ops/s $\color{#35bf28}+4.62\%$
test_reinforce_speed[True-None] 3.2683ms 2.6761ms 373.6766 Ops/s 358.4678 Ops/s $\color{#35bf28}+4.24\%$
test_reinforce_speed[True-backward] 8.9212ms 8.5045ms 117.5847 Ops/s 103.1703 Ops/s $\textbf{\color{#35bf28}+13.97\%}$
test_reinforce_speed[reduce-overhead-None] 3.9417ms 2.6327ms 379.8315 Ops/s 348.2645 Ops/s $\textbf{\color{#35bf28}+9.06\%}$
test_reinforce_speed[reduce-overhead-backward] 8.8631ms 8.4901ms 117.7842 Ops/s 106.9962 Ops/s $\textbf{\color{#35bf28}+10.08\%}$
test_iql_speed[False-None] 33.9397ms 32.2800ms 30.9790 Ops/s 30.3987 Ops/s $\color{#35bf28}+1.91\%$
test_iql_speed[False-backward] 46.8326ms 45.0308ms 22.2070 Ops/s 21.5917 Ops/s $\color{#35bf28}+2.85\%$
test_iql_speed[True-None] 14.3072ms 13.2869ms 75.2621 Ops/s 69.7919 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_iql_speed[True-backward] 25.2724ms 24.2115ms 41.3026 Ops/s 38.5257 Ops/s $\textbf{\color{#35bf28}+7.21\%}$
test_iql_speed[reduce-overhead-None] 14.6608ms 13.2093ms 75.7045 Ops/s 70.1365 Ops/s $\textbf{\color{#35bf28}+7.94\%}$
test_iql_speed[reduce-overhead-backward] 25.9705ms 24.4423ms 40.9127 Ops/s 39.4723 Ops/s $\color{#35bf28}+3.65\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.6401ms 5.0574ms 197.7288 Ops/s 188.8697 Ops/s $\color{#35bf28}+4.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9741ms 0.4732ms 2.1135 KOps/s 2.0616 KOps/s $\color{#35bf28}+2.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8000ms 0.4481ms 2.2318 KOps/s 2.1394 KOps/s $\color{#35bf28}+4.32\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.4940ms 5.0231ms 199.0796 Ops/s 191.8773 Ops/s $\color{#35bf28}+3.75\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9767ms 0.4694ms 2.1306 KOps/s 2.0572 KOps/s $\color{#35bf28}+3.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6936ms 0.4410ms 2.2674 KOps/s 2.2126 KOps/s $\color{#35bf28}+2.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2304ms 1.5690ms 637.3346 Ops/s 611.1936 Ops/s $\color{#35bf28}+4.28\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2535ms 1.5190ms 658.3212 Ops/s 650.3187 Ops/s $\color{#35bf28}+1.23\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5027ms 5.1835ms 192.9190 Ops/s 171.6366 Ops/s $\textbf{\color{#35bf28}+12.40\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.3262ms 0.6116ms 1.6349 KOps/s 1.5754 KOps/s $\color{#35bf28}+3.78\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8668ms 0.5771ms 1.7327 KOps/s 1.6451 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3798ms 5.1625ms 193.7046 Ops/s 180.5524 Ops/s $\textbf{\color{#35bf28}+7.28\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.9334ms 0.4813ms 2.0776 KOps/s 2.0417 KOps/s $\color{#35bf28}+1.76\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6748ms 0.4489ms 2.2278 KOps/s 2.1548 KOps/s $\color{#35bf28}+3.39\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.9318ms 5.0278ms 198.8935 Ops/s 190.9473 Ops/s $\color{#35bf28}+4.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1212ms 0.4679ms 2.1373 KOps/s 2.0107 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6553ms 0.4419ms 2.2630 KOps/s 2.1698 KOps/s $\color{#35bf28}+4.29\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 8.1014ms 5.2061ms 192.0834 Ops/s 182.2038 Ops/s $\textbf{\color{#35bf28}+5.42\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.1095ms 0.6094ms 1.6410 KOps/s 1.5923 KOps/s $\color{#35bf28}+3.06\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9089ms 0.5880ms 1.7007 KOps/s 1.6145 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4193s 12.4670ms 80.2117 Ops/s 225.2641 Ops/s $\textbf{\color{#d91a1a}-64.39\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.4877ms 1.9354ms 516.6971 Ops/s 495.0018 Ops/s $\color{#35bf28}+4.38\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.4339ms 1.2528ms 798.1872 Ops/s 701.1188 Ops/s $\textbf{\color{#35bf28}+13.84\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.6434ms 4.1331ms 241.9478 Ops/s 226.3027 Ops/s $\textbf{\color{#35bf28}+6.91\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.8353ms 2.0040ms 499.0099 Ops/s 448.4430 Ops/s $\textbf{\color{#35bf28}+11.28\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.6686ms 1.1556ms 865.3522 Ops/s 756.7047 Ops/s $\textbf{\color{#35bf28}+14.36\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 6.5408ms 4.3207ms 231.4437 Ops/s 235.5782 Ops/s $\color{#d91a1a}-1.76\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.4266ms 2.1573ms 463.5426 Ops/s 490.9334 Ops/s $\textbf{\color{#d91a1a}-5.58\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.4660ms 1.4269ms 700.8239 Ops/s 679.9831 Ops/s $\color{#35bf28}+3.06\%$
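
For readers checking the table above, here is a minimal sketch of how the columns appear to relate. It is inferred from the reported numbers, not taken from the benchmark tooling. It assumes that Ops is the reciprocal of the mean runtime and that Change is the relative difference between Ops and Ops on Repo HEAD. The function names and the 5% bolding threshold are hypothetical, chosen because the bold rows seem to coincide with changes larger than 5% in magnitude.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR throughput against repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def is_highlighted(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Assumed rule: a row is bolded when |change| exceeds the threshold."""
    return abs(change_pct) > threshold_pct


# test_single (CPU): mean 60.5926 ms, 16.5037 Ops/s vs 16.6435 Ops/s on HEAD
single_ops = ops_per_second(60.5926e-3)
single_change = percent_change(16.5037, 16.6435)

# test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] (CPU),
# the largest regression listed in the table
populate_change = percent_change(80.2117, 225.2641)

print(f"{single_ops:.4f} Ops/s, {single_change:+.2f}%, bold={is_highlighted(single_change)}")
print(f"{populate_change:+.2f}%, bold={is_highlighted(populate_change)}")
```

In the ListStorage populate row, Max is 0.4193s against a mean of 12.4670ms. This may point to a single slow iteration rather than a uniform slowdown, so that figure is worth re-running before it is read as a regression caused by this change.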


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}13$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_single 0.1020s 0.1017s 9.8324 Ops/s 9.9742 Ops/s $\color{#d91a1a}-1.42\%$
test_sync 93.2310ms 90.0824ms 11.1009 Ops/s 11.2470 Ops/s $\color{#d91a1a}-1.30\%$
test_async 0.1671s 84.7588ms 11.7982 Ops/s 11.8302 Ops/s $\color{#d91a1a}-0.27\%$
test_single_pixels 0.1099s 0.1088s 9.1885 Ops/s 9.3141 Ops/s $\color{#d91a1a}-1.35\%$
test_sync_pixels 72.3216ms 71.7618ms 13.9350 Ops/s 14.2502 Ops/s $\color{#d91a1a}-2.21\%$
test_async_pixels 0.1255s 66.5838ms 15.0187 Ops/s 15.0048 Ops/s $\color{#35bf28}+0.09\%$
test_simple 0.7375s 0.7249s 1.3795 Ops/s 1.3594 Ops/s $\color{#35bf28}+1.47\%$
test_transformed 0.9513s 0.9482s 1.0546 Ops/s 1.0502 Ops/s $\color{#35bf28}+0.42\%$
test_serial 2.1412s 2.0638s 0.4845 Ops/s 0.4913 Ops/s $\color{#d91a1a}-1.37\%$
test_parallel 1.9666s 1.8951s 0.5277 Ops/s 0.5420 Ops/s $\color{#d91a1a}-2.64\%$
test_step_mdp_speed[True-True-True-True-True] 0.1502ms 37.3566μs 26.7690 KOps/s 27.5821 KOps/s $\color{#d91a1a}-2.95\%$
test_step_mdp_speed[True-True-True-True-False] 0.1311ms 20.9003μs 47.8462 KOps/s 47.7377 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[True-True-True-False-True] 0.1205ms 20.9865μs 47.6496 KOps/s 48.0129 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[True-True-True-False-False] 41.5310μs 11.7568μs 85.0569 KOps/s 85.1601 KOps/s $\color{#d91a1a}-0.12\%$
test_step_mdp_speed[True-True-False-True-True] 73.3320μs 38.9373μs 25.6823 KOps/s 25.9726 KOps/s $\color{#d91a1a}-1.12\%$
test_step_mdp_speed[True-True-False-True-False] 0.2197ms 22.9224μs 43.6254 KOps/s 44.0945 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[True-True-False-False-True] 74.8920μs 23.1213μs 43.2502 KOps/s 44.2795 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[True-True-False-False-False] 43.5910μs 13.8873μs 72.0080 KOps/s 75.0086 KOps/s $\color{#d91a1a}-4.00\%$
test_step_mdp_speed[True-False-True-True-True] 70.1710μs 40.8693μs 24.4682 KOps/s 24.9931 KOps/s $\color{#d91a1a}-2.10\%$
test_step_mdp_speed[True-False-True-True-False] 54.1010μs 25.1157μs 39.8157 KOps/s 40.1991 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[True-False-True-False-True] 53.4210μs 23.1253μs 43.2427 KOps/s 43.8133 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-False-True-False-False] 43.7410μs 13.8829μs 72.0308 KOps/s 72.3033 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[True-False-False-True-True] 72.6920μs 43.1037μs 23.1999 KOps/s 23.6277 KOps/s $\color{#d91a1a}-1.81\%$
test_step_mdp_speed[True-False-False-True-False] 65.4010μs 26.9635μs 37.0872 KOps/s 37.6062 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[True-False-False-False-True] 52.4310μs 24.4581μs 40.8863 KOps/s 40.7114 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-False-False-False-False] 0.1189ms 15.8866μs 62.9462 KOps/s 63.6157 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[False-True-True-True-True] 74.8520μs 41.3111μs 24.2066 KOps/s 24.7636 KOps/s $\color{#d91a1a}-2.25\%$
test_step_mdp_speed[False-True-True-True-False] 0.2042ms 25.1788μs 39.7160 KOps/s 40.4084 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[False-True-True-False-True] 0.2015ms 26.0090μs 38.4482 KOps/s 40.0699 KOps/s $\color{#d91a1a}-4.05\%$
test_step_mdp_speed[False-True-True-False-False] 0.2129ms 15.5552μs 64.2870 KOps/s 65.1937 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[False-True-False-True-True] 0.2281ms 43.0604μs 23.2232 KOps/s 23.7253 KOps/s $\color{#d91a1a}-2.12\%$
test_step_mdp_speed[False-True-False-True-False] 55.8310μs 27.0579μs 36.9577 KOps/s 37.3883 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[False-True-False-False-True] 3.8697ms 28.2971μs 35.3393 KOps/s 36.3690 KOps/s $\color{#d91a1a}-2.83\%$
test_step_mdp_speed[False-True-False-False-False] 0.1067ms 17.5159μs 57.0908 KOps/s 57.3922 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[False-False-True-True-True] 73.8520μs 44.5470μs 22.4482 KOps/s 22.4978 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[False-False-True-True-False] 57.4210μs 29.2852μs 34.1469 KOps/s 34.5274 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-False-True-False-True] 62.1920μs 27.7902μs 35.9839 KOps/s 36.4495 KOps/s $\color{#d91a1a}-1.28\%$
test_step_mdp_speed[False-False-True-False-False] 44.7110μs 17.5056μs 57.1247 KOps/s 57.4639 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-False-True-True] 77.1910μs 46.0712μs 21.7055 KOps/s 22.0762 KOps/s $\color{#d91a1a}-1.68\%$
test_step_mdp_speed[False-False-False-True-False] 63.6410μs 30.8768μs 32.3868 KOps/s 32.6106 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[False-False-False-False-True] 54.1610μs 28.8090μs 34.7113 KOps/s 35.3732 KOps/s $\color{#d91a1a}-1.87\%$
test_step_mdp_speed[False-False-False-False-False] 0.1044ms 19.4754μs 51.3469 KOps/s 52.1278 KOps/s $\color{#d91a1a}-1.50\%$
test_values[generalized_advantage_estimate-True-True] 24.4218ms 23.7356ms 42.1308 Ops/s 42.6581 Ops/s $\color{#d91a1a}-1.24\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1046s 2.9649ms 337.2777 Ops/s 339.4438 Ops/s $\color{#d91a1a}-0.64\%$
test_values[td0_return_estimate-False-False] 90.4120μs 63.5565μs 15.7340 KOps/s 15.8876 KOps/s $\color{#d91a1a}-0.97\%$
test_values[td1_return_estimate-False-False] 54.1015ms 53.6641ms 18.6344 Ops/s 18.9001 Ops/s $\color{#d91a1a}-1.41\%$
test_values[vec_td1_return_estimate-False-False] 1.4029ms 1.0547ms 948.1618 Ops/s 951.1933 Ops/s $\color{#d91a1a}-0.32\%$
test_values[td_lambda_return_estimate-True-False] 88.4094ms 85.2316ms 11.7327 Ops/s 11.9017 Ops/s $\color{#d91a1a}-1.42\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3763ms 1.0527ms 949.9156 Ops/s 957.1064 Ops/s $\color{#d91a1a}-0.75\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.0895ms 23.8379ms 41.9500 Ops/s 42.2757 Ops/s $\color{#d91a1a}-0.77\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9497ms 0.6998ms 1.4291 KOps/s 1.4479 KOps/s $\color{#d91a1a}-1.30\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7731ms 0.6396ms 1.5634 KOps/s 1.5712 KOps/s $\color{#d91a1a}-0.49\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6348ms 1.4442ms 692.4402 Ops/s 695.1367 Ops/s $\color{#d91a1a}-0.39\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8450ms 0.6535ms 1.5302 KOps/s 1.5356 KOps/s $\color{#d91a1a}-0.35\%$
test_dqn_speed[False-None] 7.0704ms 1.3348ms 749.1641 Ops/s 773.6957 Ops/s $\color{#d91a1a}-3.17\%$
test_dqn_speed[False-backward] 2.0298ms 1.8838ms 530.8405 Ops/s 557.7128 Ops/s $\color{#d91a1a}-4.82\%$
test_dqn_speed[True-None] 0.7714ms 0.5461ms 1.8310 KOps/s 1.8206 KOps/s $\color{#35bf28}+0.57\%$
test_dqn_speed[True-backward] 1.0299ms 0.9935ms 1.0065 KOps/s 863.8787 Ops/s $\textbf{\color{#35bf28}+16.51\%}$
test_dqn_speed[reduce-overhead-None] 1.6503ms 0.5541ms 1.8047 KOps/s 1.8218 KOps/s $\color{#d91a1a}-0.94\%$
test_dqn_speed[reduce-overhead-backward] 1.2552ms 1.0187ms 981.6466 Ops/s 1.0023 KOps/s $\color{#d91a1a}-2.06\%$
test_ddpg_speed[False-None] 3.1081ms 2.7158ms 368.2186 Ops/s 371.9729 Ops/s $\color{#d91a1a}-1.01\%$
test_ddpg_speed[False-backward] 4.1561ms 3.9609ms 252.4656 Ops/s 255.2583 Ops/s $\color{#d91a1a}-1.09\%$
test_ddpg_speed[True-None] 1.4391ms 1.2354ms 809.4865 Ops/s 788.7470 Ops/s $\color{#35bf28}+2.63\%$
test_ddpg_speed[True-backward] 2.2665ms 2.2211ms 450.2350 Ops/s 445.1337 Ops/s $\color{#35bf28}+1.15\%$
test_ddpg_speed[reduce-overhead-None] 1.6373ms 1.2397ms 806.6677 Ops/s 803.6082 Ops/s $\color{#35bf28}+0.38\%$
test_ddpg_speed[reduce-overhead-backward] 2.3688ms 2.2161ms 451.2482 Ops/s 450.8926 Ops/s $\color{#35bf28}+0.08\%$
test_sac_speed[False-None] 8.8824ms 7.5558ms 132.3493 Ops/s 132.5564 Ops/s $\color{#d91a1a}-0.16\%$
test_sac_speed[False-backward] 11.2247ms 10.7573ms 92.9605 Ops/s 93.2798 Ops/s $\color{#d91a1a}-0.34\%$
test_sac_speed[True-None] 2.4049ms 2.0204ms 494.9477 Ops/s 472.5176 Ops/s $\color{#35bf28}+4.75\%$
test_sac_speed[True-backward] 4.1377ms 3.9656ms 252.1698 Ops/s 215.4807 Ops/s $\textbf{\color{#35bf28}+17.03\%}$
test_sac_speed[reduce-overhead-None] 2.3838ms 2.0239ms 494.0914 Ops/s 477.5050 Ops/s $\color{#35bf28}+3.47\%$
test_sac_speed[reduce-overhead-backward] 4.2036ms 3.9804ms 251.2299 Ops/s 248.2112 Ops/s $\color{#35bf28}+1.22\%$
test_redq_speed[False-None] 0.2636s 12.9970ms 76.9410 Ops/s 96.2167 Ops/s $\textbf{\color{#d91a1a}-20.03\%}$
test_redq_speed[False-backward] 19.1561ms 18.1706ms 55.0338 Ops/s 55.9997 Ops/s $\color{#d91a1a}-1.72\%$
test_redq_speed[True-None] 3.8195ms 3.4107ms 293.1950 Ops/s 277.9729 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_redq_speed[True-backward] 8.8910ms 8.4617ms 118.1792 Ops/s 112.0008 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_redq_speed[reduce-overhead-None] 3.8729ms 3.4413ms 290.5908 Ops/s 277.1663 Ops/s $\color{#35bf28}+4.84\%$
test_redq_speed[reduce-overhead-backward] 8.8952ms 8.3718ms 119.4489 Ops/s 114.1252 Ops/s $\color{#35bf28}+4.66\%$
test_redq_deprec_speed[False-None] 11.4080ms 10.6770ms 93.6594 Ops/s 94.4316 Ops/s $\color{#d91a1a}-0.82\%$
test_redq_deprec_speed[False-backward] 16.1798ms 15.4793ms 64.6025 Ops/s 63.9520 Ops/s $\color{#35bf28}+1.02\%$
test_redq_deprec_speed[True-None] 3.5236ms 3.2024ms 312.2641 Ops/s 298.1805 Ops/s $\color{#35bf28}+4.72\%$
test_redq_deprec_speed[True-backward] 7.1938ms 6.9672ms 143.5304 Ops/s 137.2569 Ops/s $\color{#35bf28}+4.57\%$
test_redq_deprec_speed[reduce-overhead-None] 3.5974ms 3.1971ms 312.7867 Ops/s 302.0743 Ops/s $\color{#35bf28}+3.55\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.2700ms 6.9555ms 143.7707 Ops/s 138.6337 Ops/s $\color{#35bf28}+3.71\%$
test_td3_speed[False-None] 7.7203ms 7.5611ms 132.2556 Ops/s 135.0848 Ops/s $\color{#d91a1a}-2.09\%$
test_td3_speed[False-backward] 10.8044ms 10.4316ms 95.8623 Ops/s 97.3131 Ops/s $\color{#d91a1a}-1.49\%$
test_td3_speed[True-None] 2.1243ms 2.0566ms 486.2408 Ops/s 472.1214 Ops/s $\color{#35bf28}+2.99\%$
test_td3_speed[True-backward] 4.0423ms 3.8820ms 257.6017 Ops/s 252.0508 Ops/s $\color{#35bf28}+2.20\%$
test_td3_speed[reduce-overhead-None] 2.1337ms 2.0623ms 484.8965 Ops/s 477.1535 Ops/s $\color{#35bf28}+1.62\%$
test_td3_speed[reduce-overhead-backward] 4.0678ms 3.9172ms 255.2854 Ops/s 254.8065 Ops/s $\color{#35bf28}+0.19\%$
test_cql_speed[False-None] 27.5910ms 24.6945ms 40.4949 Ops/s 41.1080 Ops/s $\color{#d91a1a}-1.49\%$
test_cql_speed[False-backward] 34.5717ms 33.5142ms 29.8381 Ops/s 29.8415 Ops/s $\color{#d91a1a}-0.01\%$
test_cql_speed[True-None] 11.7510ms 11.0947ms 90.1327 Ops/s 90.3386 Ops/s $\color{#d91a1a}-0.23\%$
test_cql_speed[True-backward] 17.3627ms 16.9247ms 59.0852 Ops/s 59.6355 Ops/s $\color{#d91a1a}-0.92\%$
test_cql_speed[reduce-overhead-None] 11.4661ms 10.9814ms 91.0632 Ops/s 90.7396 Ops/s $\color{#35bf28}+0.36\%$
test_cql_speed[reduce-overhead-backward] 17.5641ms 16.9285ms 59.0718 Ops/s 58.4824 Ops/s $\color{#35bf28}+1.01\%$
test_a2c_speed[False-None] 5.7002ms 5.3505ms 186.8990 Ops/s 187.2834 Ops/s $\color{#d91a1a}-0.21\%$
test_a2c_speed[False-backward] 12.1722ms 11.7371ms 85.2001 Ops/s 84.1013 Ops/s $\color{#35bf28}+1.31\%$
test_a2c_speed[True-None] 3.4511ms 3.0791ms 324.7704 Ops/s 322.5904 Ops/s $\color{#35bf28}+0.68\%$
test_a2c_speed[True-backward] 8.9007ms 8.6515ms 115.5870 Ops/s 113.0466 Ops/s $\color{#35bf28}+2.25\%$
test_a2c_speed[reduce-overhead-None] 3.2873ms 3.0651ms 326.2487 Ops/s 320.6905 Ops/s $\color{#35bf28}+1.73\%$
test_a2c_speed[reduce-overhead-backward] 9.3726ms 8.5539ms 116.9063 Ops/s 115.7260 Ops/s $\color{#35bf28}+1.02\%$
test_ppo_speed[False-None] 6.0689ms 5.7497ms 173.9213 Ops/s 177.1664 Ops/s $\color{#d91a1a}-1.83\%$
test_ppo_speed[False-backward] 12.5412ms 12.2601ms 81.5655 Ops/s 81.0500 Ops/s $\color{#35bf28}+0.64\%$
test_ppo_speed[True-None] 3.8564ms 3.4548ms 289.4515 Ops/s 286.7493 Ops/s $\color{#35bf28}+0.94\%$
test_ppo_speed[True-backward] 9.1438ms 8.4649ms 118.1349 Ops/s 118.9911 Ops/s $\color{#d91a1a}-0.72\%$
test_ppo_speed[reduce-overhead-None] 3.7286ms 3.4313ms 291.4390 Ops/s 288.3524 Ops/s $\color{#35bf28}+1.07\%$
test_ppo_speed[reduce-overhead-backward] 8.8717ms 8.3603ms 119.6124 Ops/s 118.7052 Ops/s $\color{#35bf28}+0.76\%$
test_reinforce_speed[False-None] 4.8408ms 4.4622ms 224.1044 Ops/s 223.5201 Ops/s $\color{#35bf28}+0.26\%$
test_reinforce_speed[False-backward] 7.6124ms 7.3178ms 136.6539 Ops/s 135.9993 Ops/s $\color{#35bf28}+0.48\%$
test_reinforce_speed[True-None] 2.6804ms 2.2486ms 444.7236 Ops/s 442.2393 Ops/s $\color{#35bf28}+0.56\%$
test_reinforce_speed[True-backward] 8.0607ms 7.1660ms 139.5478 Ops/s 139.6145 Ops/s $\color{#d91a1a}-0.05\%$
test_reinforce_speed[reduce-overhead-None] 2.6931ms 2.2489ms 444.6674 Ops/s 441.7395 Ops/s $\color{#35bf28}+0.66\%$
test_reinforce_speed[reduce-overhead-backward] 7.4163ms 7.1147ms 140.5533 Ops/s 138.7752 Ops/s $\color{#35bf28}+1.28\%$
test_iql_speed[False-None] 21.5925ms 19.8090ms 50.4821 Ops/s 50.6749 Ops/s $\color{#d91a1a}-0.38\%$
test_iql_speed[False-backward] 30.7846ms 29.8258ms 33.5280 Ops/s 33.4722 Ops/s $\color{#35bf28}+0.17\%$
test_iql_speed[True-None] 8.3444ms 7.9044ms 126.5115 Ops/s 124.2283 Ops/s $\color{#35bf28}+1.84\%$
test_iql_speed[True-backward] 17.2787ms 16.8357ms 59.3976 Ops/s 59.1617 Ops/s $\color{#35bf28}+0.40\%$
test_iql_speed[reduce-overhead-None] 8.3419ms 7.9376ms 125.9823 Ops/s 124.7918 Ops/s $\color{#35bf28}+0.95\%$
test_iql_speed[reduce-overhead-backward] 17.0352ms 16.6621ms 60.0164 Ops/s 59.2901 Ops/s $\color{#35bf28}+1.23\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.7739ms 6.6001ms 151.5136 Ops/s 152.9555 Ops/s $\color{#d91a1a}-0.94\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7964ms 0.3624ms 2.7591 KOps/s 3.1955 KOps/s $\textbf{\color{#d91a1a}-13.66\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5532ms 0.3530ms 2.8328 KOps/s 2.9991 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.8396ms 6.4754ms 154.4307 Ops/s 154.8559 Ops/s $\color{#d91a1a}-0.27\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7161ms 0.3565ms 2.8054 KOps/s 2.9580 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5922ms 0.3442ms 2.9050 KOps/s 3.1299 KOps/s $\textbf{\color{#d91a1a}-7.19\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6473ms 1.4174ms 705.5178 Ops/s 682.0692 Ops/s $\color{#35bf28}+3.44\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5707ms 1.3431ms 744.5655 Ops/s 733.7415 Ops/s $\color{#35bf28}+1.48\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8225ms 6.6517ms 150.3379 Ops/s 151.2081 Ops/s $\color{#d91a1a}-0.58\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.6992ms 0.3885ms 2.5739 KOps/s 2.0887 KOps/s $\textbf{\color{#35bf28}+23.23\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7030ms 0.3633ms 2.7522 KOps/s 2.7988 KOps/s $\color{#d91a1a}-1.66\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.7848ms 6.5475ms 152.7310 Ops/s 155.6988 Ops/s $\color{#d91a1a}-1.91\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7841ms 0.3549ms 2.8174 KOps/s 2.9590 KOps/s $\color{#d91a1a}-4.79\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6063ms 0.3380ms 2.9587 KOps/s 3.1542 KOps/s $\textbf{\color{#d91a1a}-6.20\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.8800ms 6.4845ms 154.2148 Ops/s 157.6612 Ops/s $\color{#d91a1a}-2.19\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3785ms 0.3506ms 2.8520 KOps/s 3.2493 KOps/s $\textbf{\color{#d91a1a}-12.23\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5793ms 0.3324ms 3.0088 KOps/s 3.4157 KOps/s $\textbf{\color{#d91a1a}-11.91\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.9736ms 6.7084ms 149.0671 Ops/s 152.2936 Ops/s $\color{#d91a1a}-2.12\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1980ms 0.5287ms 1.8915 KOps/s 2.3379 KOps/s $\textbf{\color{#d91a1a}-19.09\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7760ms 0.5066ms 1.9739 KOps/s 2.4144 KOps/s $\textbf{\color{#d91a1a}-18.24\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4154s 13.5733ms 73.6739 Ops/s 34.0887 Ops/s $\textbf{\color{#35bf28}+116.12\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.7977ms 1.4378ms 695.5263 Ops/s 668.2730 Ops/s $\color{#35bf28}+4.08\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 3.7100ms 1.1353ms 880.8625 Ops/s 709.2971 Ops/s $\textbf{\color{#35bf28}+24.19\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 9.5576ms 5.3930ms 185.4271 Ops/s 184.0541 Ops/s $\color{#35bf28}+0.75\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.4540ms 1.9324ms 517.4928 Ops/s 567.5373 Ops/s $\textbf{\color{#d91a1a}-8.82\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.0472ms 1.0463ms 955.7103 Ops/s 774.1924 Ops/s $\textbf{\color{#35bf28}+23.45\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3683s 12.9516ms 77.2105 Ops/s 176.8238 Ops/s $\textbf{\color{#d91a1a}-56.33\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.4700ms 2.0984ms 476.5525 Ops/s 595.6343 Ops/s $\textbf{\color{#d91a1a}-19.99\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.9152ms 1.3981ms 715.2485 Ops/s 706.0683 Ops/s $\color{#35bf28}+1.30\%$

@vmoens vmoens merged commit cc96914 into gh/vmoens/30/base Sep 26, 2024
64 of 67 checks passed
@vmoens vmoens deleted the gh/vmoens/30/head branch September 26, 2024 15:42
@vmoens vmoens added the Refactoring Refactoring of an existing feature label Sep 26, 2024