[Tutorial] Beam search with GPT models #2623

Open: vmoens wants to merge 7 commits into base: gh/vmoens/47/base
@vmoens (Contributor) commented Dec 2, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot bot commented Dec 2, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2623

Note: Links to docs will display an error until the docs builds have been completed.

❌ 11 New Failures, 8 Unrelated Failures

As of commit c69bf38 with merge base 2511c04:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed, likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Dec 2, 2024
ghstack-source-id: b37305f2d8c42a070c1113435cabf46926a4fa12
Pull Request resolved: #2623
@facebook-github-bot added the `CLA Signed` label Dec 2, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 3, 2024
ghstack-source-id: 62f96bf1965a65ca35485de6ee66260abe33f117
Pull Request resolved: #2623

github-actions bot commented Dec 3, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 10. Worsened: 7.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4345s 0.4314s 2.3182 Ops/s 2.1749 Ops/s $\textbf{\color{#35bf28}+6.59\%}$
test_transformed 0.6148s 0.6085s 1.6434 Ops/s 1.6044 Ops/s $\color{#35bf28}+2.43\%$
test_serial 1.4732s 1.3766s 0.7264 Ops/s 0.7323 Ops/s $\color{#d91a1a}-0.81\%$
test_parallel 1.4166s 1.3265s 0.7538 Ops/s 0.7494 Ops/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[True-True-True-True-True] 0.3440ms 29.9592μs 33.3787 KOps/s 33.6647 KOps/s $\color{#d91a1a}-0.85\%$
test_step_mdp_speed[True-True-True-True-False] 50.5950μs 17.8045μs 56.1656 KOps/s 57.4156 KOps/s $\color{#d91a1a}-2.18\%$
test_step_mdp_speed[True-True-True-False-True] 48.2010μs 16.9470μs 59.0075 KOps/s 59.3348 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[True-True-True-False-False] 52.2110μs 9.8582μs 101.4380 KOps/s 101.1296 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[True-True-False-True-True] 81.2120μs 32.0174μs 31.2330 KOps/s 31.5374 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-True-False-True-False] 52.7690μs 19.5390μs 51.1798 KOps/s 51.6877 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-True-False-False-True] 84.1980μs 18.8224μs 53.1281 KOps/s 53.8697 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[True-True-False-False-False] 39.1430μs 11.7869μs 84.8397 KOps/s 86.0111 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[True-False-True-True-True] 78.6780μs 33.8627μs 29.5310 KOps/s 30.0666 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[True-False-True-True-False] 50.7960μs 21.6405μs 46.2097 KOps/s 48.0096 KOps/s $\color{#d91a1a}-3.75\%$
test_step_mdp_speed[True-False-True-False-True] 70.8340μs 18.7448μs 53.3481 KOps/s 54.2520 KOps/s $\color{#d91a1a}-1.67\%$
test_step_mdp_speed[True-False-True-False-False] 42.4900μs 11.8817μs 84.1629 KOps/s 86.7464 KOps/s $\color{#d91a1a}-2.98\%$
test_step_mdp_speed[True-False-False-True-True] 0.1091ms 35.4203μs 28.2324 KOps/s 28.8986 KOps/s $\color{#d91a1a}-2.31\%$
test_step_mdp_speed[True-False-False-True-False] 81.1020μs 23.1262μs 43.2410 KOps/s 43.6050 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-False-False-False-True] 51.4270μs 20.5783μs 48.5949 KOps/s 49.7799 KOps/s $\color{#d91a1a}-2.38\%$
test_step_mdp_speed[True-False-False-False-False] 54.4120μs 13.4697μs 74.2407 KOps/s 75.9674 KOps/s $\color{#d91a1a}-2.27\%$
test_step_mdp_speed[False-True-True-True-True] 79.5800μs 33.8427μs 29.5485 KOps/s 29.8331 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-True-True-False] 70.0120μs 21.2490μs 47.0611 KOps/s 47.5303 KOps/s $\color{#d91a1a}-0.99\%$
test_step_mdp_speed[False-True-True-False-True] 59.3920μs 21.2762μs 47.0009 KOps/s 46.6459 KOps/s $\color{#35bf28}+0.76\%$
test_step_mdp_speed[False-True-True-False-False] 59.2910μs 12.9745μs 77.0745 KOps/s 77.6984 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[False-True-False-True-True] 91.3220μs 35.4229μs 28.2304 KOps/s 28.5612 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[False-True-False-True-False] 63.7500μs 23.1375μs 43.2199 KOps/s 44.3836 KOps/s $\color{#d91a1a}-2.62\%$
test_step_mdp_speed[False-True-False-False-True] 2.7536ms 22.7694μs 43.9186 KOps/s 44.4243 KOps/s $\color{#d91a1a}-1.14\%$
test_step_mdp_speed[False-True-False-False-False] 41.0770μs 14.8195μs 67.4785 KOps/s 68.6994 KOps/s $\color{#d91a1a}-1.78\%$
test_step_mdp_speed[False-False-True-True-True] 94.0660μs 37.0720μs 26.9745 KOps/s 27.5934 KOps/s $\color{#d91a1a}-2.24\%$
test_step_mdp_speed[False-False-True-True-False] 73.7480μs 24.6922μs 40.4986 KOps/s 41.0635 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[False-False-True-False-True] 63.9400μs 22.4728μs 44.4981 KOps/s 44.3687 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-False-True-False-False] 0.1984ms 15.2805μs 65.4429 KOps/s 68.5943 KOps/s $\color{#d91a1a}-4.59\%$
test_step_mdp_speed[False-False-False-True-True] 0.1297ms 38.6353μs 25.8831 KOps/s 25.9457 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-False-False-True-False] 78.8280μs 26.3974μs 37.8825 KOps/s 39.0642 KOps/s $\color{#d91a1a}-3.02\%$
test_step_mdp_speed[False-False-False-False-True] 71.6350μs 23.9250μs 41.7973 KOps/s 41.6967 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[False-False-False-False-False] 42.4090μs 16.3647μs 61.1070 KOps/s 62.8033 KOps/s $\color{#d91a1a}-2.70\%$
test_values[generalized_advantage_estimate-True-True] 9.8748ms 9.2839ms 107.7132 Ops/s 104.2447 Ops/s $\color{#35bf28}+3.33\%$
test_values[vec_generalized_advantage_estimate-True-True] 38.3374ms 35.7064ms 28.0061 Ops/s 29.5967 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_values[td0_return_estimate-False-False] 0.2394ms 0.1830ms 5.4637 KOps/s 5.4721 KOps/s $\color{#d91a1a}-0.15\%$
test_values[td1_return_estimate-False-False] 26.2394ms 23.5729ms 42.4216 Ops/s 40.8017 Ops/s $\color{#35bf28}+3.97\%$
test_values[vec_td1_return_estimate-False-False] 37.7322ms 35.8776ms 27.8725 Ops/s 29.5504 Ops/s $\textbf{\color{#d91a1a}-5.68\%}$
test_values[td_lambda_return_estimate-True-False] 34.1311ms 33.4753ms 29.8728 Ops/s 28.4392 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_values[vec_td_lambda_return_estimate-True-False] 38.2848ms 36.0749ms 27.7201 Ops/s 29.2124 Ops/s $\textbf{\color{#d91a1a}-5.11\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.1043ms 8.1672ms 122.4407 Ops/s 120.5113 Ops/s $\color{#35bf28}+1.60\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.1815ms 1.9420ms 514.9427 Ops/s 486.2732 Ops/s $\textbf{\color{#35bf28}+5.90\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4486ms 0.3658ms 2.7340 KOps/s 2.7061 KOps/s $\color{#35bf28}+1.03\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.7049ms 47.5905ms 21.0126 Ops/s 23.0080 Ops/s $\textbf{\color{#d91a1a}-8.67\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9769ms 3.0527ms 327.5790 Ops/s 323.8297 Ops/s $\color{#35bf28}+1.16\%$
test_dqn_speed[False-None] 5.8918ms 1.3935ms 717.6379 Ops/s 696.5583 Ops/s $\color{#35bf28}+3.03\%$
test_dqn_speed[False-backward] 1.9469ms 1.8670ms 535.6089 Ops/s 527.5847 Ops/s $\color{#35bf28}+1.52\%$
test_dqn_speed[True-None] 0.6604ms 0.4701ms 2.1274 KOps/s 2.1273 KOps/s $+0.01\%$
test_dqn_speed[True-backward] 1.4535ms 1.0194ms 980.9436 Ops/s 763.5371 Ops/s $\textbf{\color{#35bf28}+28.47\%}$
test_dqn_speed[reduce-overhead-None] 0.7186ms 0.4788ms 2.0887 KOps/s 2.1217 KOps/s $\color{#d91a1a}-1.56\%$
test_dqn_speed[reduce-overhead-backward] 0.9971ms 0.9352ms 1.0693 KOps/s 1.0991 KOps/s $\color{#d91a1a}-2.72\%$
test_ddpg_speed[False-None] 3.7517ms 2.8705ms 348.3670 Ops/s 341.4539 Ops/s $\color{#35bf28}+2.02\%$
test_ddpg_speed[False-backward] 4.1671ms 3.9969ms 250.1948 Ops/s 244.0636 Ops/s $\color{#35bf28}+2.51\%$
test_ddpg_speed[True-None] 1.3713ms 1.0089ms 991.1896 Ops/s 986.3833 Ops/s $\color{#35bf28}+0.49\%$
test_ddpg_speed[True-backward] 2.0120ms 1.9521ms 512.2762 Ops/s 512.1454 Ops/s $\color{#35bf28}+0.03\%$
test_ddpg_speed[reduce-overhead-None] 1.4794ms 1.0102ms 989.9024 Ops/s 986.8611 Ops/s $\color{#35bf28}+0.31\%$
test_ddpg_speed[reduce-overhead-backward] 2.0348ms 1.9718ms 507.1509 Ops/s 519.0107 Ops/s $\color{#d91a1a}-2.29\%$
test_sac_speed[False-None] 0.2006s 9.6035ms 104.1284 Ops/s 121.5943 Ops/s $\textbf{\color{#d91a1a}-14.36\%}$
test_sac_speed[False-backward] 11.2400ms 10.8286ms 92.3481 Ops/s 90.4625 Ops/s $\color{#35bf28}+2.08\%$
test_sac_speed[True-None] 2.4127ms 1.8371ms 544.3267 Ops/s 540.1161 Ops/s $\color{#35bf28}+0.78\%$
test_sac_speed[True-backward] 3.7747ms 3.6268ms 275.7269 Ops/s 279.9640 Ops/s $\color{#d91a1a}-1.51\%$
test_sac_speed[reduce-overhead-None] 3.0781ms 1.8655ms 536.0538 Ops/s 533.5094 Ops/s $\color{#35bf28}+0.48\%$
test_sac_speed[reduce-overhead-backward] 3.6178ms 3.5261ms 283.6007 Ops/s 270.2031 Ops/s $\color{#35bf28}+4.96\%$
test_redq_speed[False-None] 14.9207ms 12.9604ms 77.1582 Ops/s 73.3930 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_redq_speed[False-backward] 24.2765ms 22.2921ms 44.8590 Ops/s 43.4848 Ops/s $\color{#35bf28}+3.16\%$
test_redq_speed[True-None] 5.1787ms 4.5718ms 218.7325 Ops/s 210.4936 Ops/s $\color{#35bf28}+3.91\%$
test_redq_speed[True-backward] 13.2011ms 12.2322ms 81.7512 Ops/s 81.0050 Ops/s $\color{#35bf28}+0.92\%$
test_redq_speed[reduce-overhead-None] 5.9781ms 4.7773ms 209.3216 Ops/s 210.3179 Ops/s $\color{#d91a1a}-0.47\%$
test_redq_speed[reduce-overhead-backward] 13.1570ms 12.1684ms 82.1799 Ops/s 81.6085 Ops/s $\color{#35bf28}+0.70\%$
test_redq_deprec_speed[False-None] 14.0035ms 12.9276ms 77.3539 Ops/s 76.6692 Ops/s $\color{#35bf28}+0.89\%$
test_redq_deprec_speed[False-backward] 19.7874ms 18.7385ms 53.3660 Ops/s 53.3292 Ops/s $\color{#35bf28}+0.07\%$
test_redq_deprec_speed[True-None] 4.2108ms 3.5694ms 280.1558 Ops/s 278.7820 Ops/s $\color{#35bf28}+0.49\%$
test_redq_deprec_speed[True-backward] 9.5291ms 8.2249ms 121.5815 Ops/s 121.5748 Ops/s $+0.01\%$
test_redq_deprec_speed[reduce-overhead-None] 5.2933ms 3.6366ms 274.9809 Ops/s 277.9088 Ops/s $\color{#d91a1a}-1.05\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.8347ms 8.1340ms 122.9402 Ops/s 123.3043 Ops/s $\color{#d91a1a}-0.30\%$
test_td3_speed[False-None] 8.4323ms 7.9736ms 125.4137 Ops/s 121.0255 Ops/s $\color{#35bf28}+3.63\%$
test_td3_speed[False-backward] 12.3040ms 10.5920ms 94.4111 Ops/s 93.4725 Ops/s $\color{#35bf28}+1.00\%$
test_td3_speed[True-None] 1.8141ms 1.7112ms 584.3740 Ops/s 570.5654 Ops/s $\color{#35bf28}+2.42\%$
test_td3_speed[True-backward] 3.4867ms 3.3873ms 295.2173 Ops/s 293.9239 Ops/s $\color{#35bf28}+0.44\%$
test_td3_speed[reduce-overhead-None] 1.8825ms 1.7028ms 587.2727 Ops/s 571.6280 Ops/s $\color{#35bf28}+2.74\%$
test_td3_speed[reduce-overhead-backward] 3.4664ms 3.3780ms 296.0292 Ops/s 295.3112 Ops/s $\color{#35bf28}+0.24\%$
test_cql_speed[False-None] 39.1488ms 36.0800ms 27.7162 Ops/s 27.0820 Ops/s $\color{#35bf28}+2.34\%$
test_cql_speed[False-backward] 50.1890ms 46.7330ms 21.3981 Ops/s 21.1778 Ops/s $\color{#35bf28}+1.04\%$
test_cql_speed[True-None] 16.6227ms 15.7490ms 63.4963 Ops/s 63.2839 Ops/s $\color{#35bf28}+0.34\%$
test_cql_speed[True-backward] 23.6404ms 22.3416ms 44.7595 Ops/s 44.8839 Ops/s $\color{#d91a1a}-0.28\%$
test_cql_speed[reduce-overhead-None] 17.1572ms 15.6571ms 63.8687 Ops/s 61.7973 Ops/s $\color{#35bf28}+3.35\%$
test_cql_speed[reduce-overhead-backward] 23.7867ms 22.4499ms 44.5436 Ops/s 44.3991 Ops/s $\color{#35bf28}+0.33\%$
test_a2c_speed[False-None] 9.4386ms 7.1809ms 139.2588 Ops/s 137.3457 Ops/s $\color{#35bf28}+1.39\%$
test_a2c_speed[False-backward] 15.1356ms 14.4263ms 69.3179 Ops/s 68.7855 Ops/s $\color{#35bf28}+0.77\%$
test_a2c_speed[True-None] 4.5756ms 4.2035ms 237.8985 Ops/s 236.3763 Ops/s $\color{#35bf28}+0.64\%$
test_a2c_speed[True-backward] 11.6721ms 10.8947ms 91.7874 Ops/s 92.2110 Ops/s $\color{#d91a1a}-0.46\%$
test_a2c_speed[reduce-overhead-None] 4.9201ms 4.1943ms 238.4193 Ops/s 238.0226 Ops/s $\color{#35bf28}+0.17\%$
test_a2c_speed[reduce-overhead-backward] 11.5193ms 10.9188ms 91.5853 Ops/s 92.6933 Ops/s $\color{#d91a1a}-1.20\%$
test_ppo_speed[False-None] 8.3097ms 7.4735ms 133.8061 Ops/s 132.9190 Ops/s $\color{#35bf28}+0.67\%$
test_ppo_speed[False-backward] 15.9405ms 14.9848ms 66.7344 Ops/s 67.6344 Ops/s $\color{#d91a1a}-1.33\%$
test_ppo_speed[True-None] 4.4934ms 3.7134ms 269.2973 Ops/s 267.6125 Ops/s $\color{#35bf28}+0.63\%$
test_ppo_speed[True-backward] 10.4267ms 9.7053ms 103.0364 Ops/s 103.0500 Ops/s $\color{#d91a1a}-0.01\%$
test_ppo_speed[reduce-overhead-None] 4.0253ms 3.6806ms 271.6960 Ops/s 269.6264 Ops/s $\color{#35bf28}+0.77\%$
test_ppo_speed[reduce-overhead-backward] 10.0208ms 9.7326ms 102.7479 Ops/s 103.0153 Ops/s $\color{#d91a1a}-0.26\%$
test_reinforce_speed[False-None] 7.4551ms 6.5177ms 153.4293 Ops/s 151.7970 Ops/s $\color{#35bf28}+1.08\%$
test_reinforce_speed[False-backward] 11.8501ms 9.8507ms 101.5159 Ops/s 100.8680 Ops/s $\color{#35bf28}+0.64\%$
test_reinforce_speed[True-None] 2.9874ms 2.6535ms 376.8555 Ops/s 375.8401 Ops/s $\color{#35bf28}+0.27\%$
test_reinforce_speed[True-backward] 9.2780ms 8.7132ms 114.7687 Ops/s 114.8192 Ops/s $\color{#d91a1a}-0.04\%$
test_reinforce_speed[reduce-overhead-None] 2.9916ms 2.6430ms 378.3621 Ops/s 375.1138 Ops/s $\color{#35bf28}+0.87\%$
test_reinforce_speed[reduce-overhead-backward] 9.1060ms 8.7014ms 114.9243 Ops/s 114.7030 Ops/s $\color{#35bf28}+0.19\%$
test_iql_speed[False-None] 33.8693ms 32.1106ms 31.1423 Ops/s 29.9672 Ops/s $\color{#35bf28}+3.92\%$
test_iql_speed[False-backward] 46.4252ms 44.7800ms 22.3314 Ops/s 21.4235 Ops/s $\color{#35bf28}+4.24\%$
test_iql_speed[True-None] 11.8256ms 10.7656ms 92.8881 Ops/s 92.9611 Ops/s $\color{#d91a1a}-0.08\%$
test_iql_speed[True-backward] 24.4532ms 21.9809ms 45.4940 Ops/s 45.6249 Ops/s $\color{#d91a1a}-0.29\%$
test_iql_speed[reduce-overhead-None] 11.5838ms 10.6759ms 93.6693 Ops/s 92.7667 Ops/s $\color{#35bf28}+0.97\%$
test_iql_speed[reduce-overhead-backward] 23.8376ms 21.8615ms 45.7426 Ops/s 46.1200 Ops/s $\color{#d91a1a}-0.82\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.2630ms 4.9710ms 201.1672 Ops/s 195.7532 Ops/s $\color{#35bf28}+2.77\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9828ms 0.5104ms 1.9594 KOps/s 1.9131 KOps/s $\color{#35bf28}+2.42\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.9399ms 0.4846ms 2.0636 KOps/s 2.0586 KOps/s $\color{#35bf28}+0.24\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1547ms 4.7329ms 211.2889 Ops/s 209.5170 Ops/s $\color{#35bf28}+0.85\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.5006ms 0.5043ms 1.9829 KOps/s 737.8500 Ops/s $\textbf{\color{#35bf28}+168.74\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7221ms 0.4810ms 2.0792 KOps/s 2.0247 KOps/s $\color{#35bf28}+2.69\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3452ms 1.6298ms 613.5632 Ops/s 602.3632 Ops/s $\color{#35bf28}+1.86\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.1416ms 1.5779ms 633.7621 Ops/s 620.3758 Ops/s $\color{#35bf28}+2.16\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.3057ms 4.8763ms 205.0714 Ops/s 191.4517 Ops/s $\textbf{\color{#35bf28}+7.11\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1659ms 0.6451ms 1.5502 KOps/s 1.4791 KOps/s $\color{#35bf28}+4.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0371ms 0.6158ms 1.6238 KOps/s 1.5575 KOps/s $\color{#35bf28}+4.25\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.0861ms 4.8555ms 205.9525 Ops/s 198.1026 Ops/s $\color{#35bf28}+3.96\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9145ms 0.5138ms 1.9462 KOps/s 1.8971 KOps/s $\color{#35bf28}+2.59\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6596ms 0.4835ms 2.0684 KOps/s 2.0040 KOps/s $\color{#35bf28}+3.22\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6511ms 4.6171ms 216.5867 Ops/s 199.1733 Ops/s $\textbf{\color{#35bf28}+8.74\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1357ms 0.4954ms 2.0186 KOps/s 1.9369 KOps/s $\color{#35bf28}+4.22\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6842ms 0.4714ms 2.1213 KOps/s 2.0833 KOps/s $\color{#35bf28}+1.83\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.0630ms 4.8426ms 206.5026 Ops/s 193.9739 Ops/s $\textbf{\color{#35bf28}+6.46\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0334ms 0.6435ms 1.5539 KOps/s 1.5103 KOps/s $\color{#35bf28}+2.88\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 8.1374ms 0.6304ms 1.5864 KOps/s 1.5403 KOps/s $\color{#35bf28}+2.99\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5077ms 4.2279ms 236.5252 Ops/s 239.1660 Ops/s $\color{#d91a1a}-1.10\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.3924ms 2.3871ms 418.9098 Ops/s 423.2294 Ops/s $\color{#d91a1a}-1.02\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.6790ms 1.3204ms 757.3381 Ops/s 731.3154 Ops/s $\color{#35bf28}+3.56\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4126s 12.4256ms 80.4790 Ops/s 244.2261 Ops/s $\textbf{\color{#d91a1a}-67.05\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.5263ms 2.3390ms 427.5392 Ops/s 391.3893 Ops/s $\textbf{\color{#35bf28}+9.24\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.0142ms 1.3468ms 742.5128 Ops/s 790.9743 Ops/s $\textbf{\color{#d91a1a}-6.13\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.1849ms 4.4486ms 224.7922 Ops/s 233.1388 Ops/s $\color{#d91a1a}-3.58\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 5.2752ms 2.4245ms 412.4533 Ops/s 408.2112 Ops/s $\color{#35bf28}+1.04\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.3280ms 1.5047ms 664.5667 Ops/s 639.5613 Ops/s $\color{#35bf28}+3.91\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.0182ms 11.1742ms 89.4916 Ops/s 86.9415 Ops/s $\color{#35bf28}+2.93\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.0840ms 14.3599ms 69.6384 Ops/s 68.2629 Ops/s $\color{#35bf28}+2.01\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.9432ms 20.2148ms 49.4688 Ops/s 49.9213 Ops/s $\color{#d91a1a}-0.91\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.3396ms 14.7796ms 67.6611 Ops/s 68.3790 Ops/s $\color{#d91a1a}-1.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.7260ms 19.9704ms 50.0740 Ops/s 49.4362 Ops/s $\color{#35bf28}+1.29\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 16.9506ms 15.7990ms 63.2953 Ops/s 63.1118 Ops/s $\color{#35bf28}+0.29\%$
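
For readers checking the table by hand, the Change column appears to be the relative difference between the two throughput columns (Ops for this PR versus Ops on Repo HEAD). The sketch below reproduces a few CPU rows under that assumption. The function names and the 5% cutoff are illustrative, not taken from the benchmark tooling: the cutoff is inferred from the bolded rows, which number exactly 10 positive and 7 negative in the CPU table, matching the "Improved" and "Worsened" counts.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) copied from the CPU table above, with units already matched.
rows = {
    "test_simple": (2.3182, 2.1749),
    "test_serial": (0.7264, 0.7323),
    "test_dqn_speed[True-backward]": (980.9436, 763.5371),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (80.4790, 244.2261),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Run as written, this gives +6.59%, -0.81%, +28.47% and -67.05% for the four rows, agreeing with the reported values. Note that some rows mix units (for example KOps/s against Ops/s), so both values need converting to the same unit before comparing.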


github-actions bot commented Dec 3, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 11. Worsened: 12.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7522s 0.7492s 1.3348 Ops/s 1.3100 Ops/s $\color{#35bf28}+1.90\%$
test_transformed 1.1101s 1.0186s 0.9818 Ops/s 1.0134 Ops/s $\color{#d91a1a}-3.12\%$
test_serial 2.2459s 2.1615s 0.4626 Ops/s 0.4757 Ops/s $\color{#d91a1a}-2.75\%$
test_parallel 2.1364s 1.9933s 0.5017 Ops/s 0.5006 Ops/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[True-True-True-True-True] 0.2504ms 39.4159μs 25.3704 KOps/s 25.1527 KOps/s $\color{#35bf28}+0.87\%$
test_step_mdp_speed[True-True-True-True-False] 0.1520ms 22.5255μs 44.3942 KOps/s 43.4887 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[True-True-True-False-True] 53.7430μs 21.1540μs 47.2723 KOps/s 46.4113 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[True-True-True-False-False] 37.5720μs 12.7279μs 78.5677 KOps/s 78.2778 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-True-False-True-True] 74.7240μs 41.8116μs 23.9168 KOps/s 23.6887 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[True-True-False-True-False] 54.4730μs 24.2996μs 41.1529 KOps/s 40.2240 KOps/s $\color{#35bf28}+2.31\%$
test_step_mdp_speed[True-True-False-False-True] 48.7530μs 23.9089μs 41.8254 KOps/s 41.3898 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[True-True-False-False-False] 52.3930μs 14.4489μs 69.2093 KOps/s 67.0915 KOps/s $\color{#35bf28}+3.16\%$
test_step_mdp_speed[True-False-True-True-True] 0.1612ms 44.0635μs 22.6945 KOps/s 22.5118 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-False-True-True-False] 0.1899ms 26.7104μs 37.4385 KOps/s 37.2369 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[True-False-True-False-True] 0.2218ms 23.7598μs 42.0879 KOps/s 42.0798 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[True-False-True-False-False] 40.6120μs 14.7787μs 67.6651 KOps/s 67.6370 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[True-False-False-True-True] 96.2950μs 45.8213μs 21.8239 KOps/s 21.7169 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-False-False-True-False] 62.0230μs 28.6975μs 34.8462 KOps/s 34.7863 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[True-False-False-False-True] 53.6530μs 25.8438μs 38.6940 KOps/s 38.6316 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[True-False-False-False-False] 0.1450ms 16.6026μs 60.2316 KOps/s 59.1598 KOps/s $\color{#35bf28}+1.81\%$
test_step_mdp_speed[False-True-True-True-True] 78.5440μs 43.5062μs 22.9852 KOps/s 22.8447 KOps/s $\color{#35bf28}+0.62\%$
test_step_mdp_speed[False-True-True-True-False] 51.8230μs 27.1533μs 36.8279 KOps/s 36.8877 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-True-True-False-True] 62.7930μs 27.1064μs 36.8917 KOps/s 36.4736 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[False-True-True-False-False] 33.6420μs 16.6503μs 60.0592 KOps/s 60.9822 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[False-True-False-True-True] 73.7140μs 46.5543μs 21.4803 KOps/s 21.8825 KOps/s $\color{#d91a1a}-1.84\%$
test_step_mdp_speed[False-True-False-True-False] 64.1640μs 29.0955μs 34.3696 KOps/s 34.7695 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[False-True-False-False-True] 3.5347ms 29.8701μs 33.4783 KOps/s 34.3187 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[False-True-False-False-False] 46.0220μs 18.8319μs 53.1013 KOps/s 54.4724 KOps/s $\color{#d91a1a}-2.52\%$
test_step_mdp_speed[False-False-True-True-True] 85.0040μs 49.0724μs 20.3780 KOps/s 21.0535 KOps/s $\color{#d91a1a}-3.21\%$
test_step_mdp_speed[False-False-True-True-False] 59.8940μs 31.5580μs 31.6877 KOps/s 32.6864 KOps/s $\color{#d91a1a}-3.06\%$
test_step_mdp_speed[False-False-True-False-True] 89.5150μs 29.8043μs 33.5522 KOps/s 34.5889 KOps/s $\color{#d91a1a}-3.00\%$
test_step_mdp_speed[False-False-True-False-False] 43.6820μs 18.8416μs 53.0739 KOps/s 55.3394 KOps/s $\color{#d91a1a}-4.09\%$
test_step_mdp_speed[False-False-False-True-True] 93.6950μs 50.7758μs 19.6944 KOps/s 20.2372 KOps/s $\color{#d91a1a}-2.68\%$
test_step_mdp_speed[False-False-False-True-False] 0.1613ms 33.5388μs 29.8163 KOps/s 30.2509 KOps/s $\color{#d91a1a}-1.44\%$
test_step_mdp_speed[False-False-False-False-True] 56.7730μs 31.7918μs 31.4547 KOps/s 32.2748 KOps/s $\color{#d91a1a}-2.54\%$
test_step_mdp_speed[False-False-False-False-False] 81.4340μs 20.7284μs 48.2430 KOps/s 49.4123 KOps/s $\color{#d91a1a}-2.37\%$
test_values[generalized_advantage_estimate-True-True] 25.1801ms 24.3704ms 41.0333 Ops/s 41.3580 Ops/s $\color{#d91a1a}-0.79\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1036s 2.9674ms 337.0007 Ops/s 332.4482 Ops/s $\color{#35bf28}+1.37\%$
test_values[td0_return_estimate-False-False] 0.1066ms 79.3208μs 12.6070 KOps/s 12.6525 KOps/s $\color{#d91a1a}-0.36\%$
test_values[td1_return_estimate-False-False] 54.7995ms 54.1944ms 18.4521 Ops/s 18.8118 Ops/s $\color{#d91a1a}-1.91\%$
test_values[vec_td1_return_estimate-False-False] 1.3926ms 1.0784ms 927.2907 Ops/s 932.7444 Ops/s $\color{#d91a1a}-0.58\%$
test_values[td_lambda_return_estimate-True-False] 86.5557ms 85.7133ms 11.6668 Ops/s 11.8234 Ops/s $\color{#d91a1a}-1.32\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4197ms 1.0767ms 928.7570 Ops/s 932.6088 Ops/s $\color{#d91a1a}-0.41\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.8486ms 24.2308ms 41.2698 Ops/s 42.5017 Ops/s $\color{#d91a1a}-2.90\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0822ms 0.7526ms 1.3288 KOps/s 1.3375 KOps/s $\color{#d91a1a}-0.65\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7990ms 0.6599ms 1.5155 KOps/s 1.5183 KOps/s $\color{#d91a1a}-0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6546ms 1.4766ms 677.2279 Ops/s 678.5764 Ops/s $\color{#d91a1a}-0.20\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8352ms 0.6784ms 1.4741 KOps/s 1.4930 KOps/s $\color{#d91a1a}-1.27\%$
test_dqn_speed[False-None] 7.4650ms 1.4586ms 685.6051 Ops/s 689.1732 Ops/s $\color{#d91a1a}-0.52\%$
test_dqn_speed[False-backward] 2.2351ms 2.0448ms 489.0540 Ops/s 488.7467 Ops/s $\color{#35bf28}+0.06\%$
test_dqn_speed[True-None] 0.7067ms 0.5267ms 1.8987 KOps/s 1.8821 KOps/s $\color{#35bf28}+0.88\%$
test_dqn_speed[True-backward] 1.2390ms 1.1847ms 844.0892 Ops/s 831.1939 Ops/s $\color{#35bf28}+1.55\%$
test_dqn_speed[reduce-overhead-None] 0.7301ms 0.5434ms 1.8402 KOps/s 1.8450 KOps/s $\color{#d91a1a}-0.26\%$
test_dqn_speed[reduce-overhead-backward] 1.2358ms 1.0834ms 922.9982 Ops/s 953.0856 Ops/s $\color{#d91a1a}-3.16\%$
test_ddpg_speed[False-None] 3.3347ms 2.8436ms 351.6613 Ops/s 363.1027 Ops/s $\color{#d91a1a}-3.15\%$
test_ddpg_speed[False-backward] 4.6691ms 4.1226ms 242.5637 Ops/s 246.3766 Ops/s $\color{#d91a1a}-1.55\%$
test_ddpg_speed[True-None] 1.3262ms 1.0872ms 919.8121 Ops/s 929.1851 Ops/s $\color{#d91a1a}-1.01\%$
test_ddpg_speed[True-backward] 2.4347ms 2.2622ms 442.0469 Ops/s 436.0366 Ops/s $\color{#35bf28}+1.38\%$
test_ddpg_speed[reduce-overhead-None] 1.2439ms 1.0717ms 933.0704 Ops/s 927.7057 Ops/s $\color{#35bf28}+0.58\%$
test_ddpg_speed[reduce-overhead-backward] 1.9473ms 1.7560ms 569.4809 Ops/s 573.7950 Ops/s $\color{#d91a1a}-0.75\%$
test_sac_speed[False-None] 8.2930ms 7.8289ms 127.7311 Ops/s 128.2630 Ops/s $\color{#d91a1a}-0.41\%$
test_sac_speed[False-backward] 11.3684ms 10.9065ms 91.6882 Ops/s 91.9576 Ops/s $\color{#d91a1a}-0.29\%$
test_sac_speed[True-None] 1.7135ms 1.5137ms 660.6222 Ops/s 653.0876 Ops/s $\color{#35bf28}+1.15\%$
test_sac_speed[True-backward] 3.6667ms 3.3587ms 297.7320 Ops/s 313.1767 Ops/s $\color{#d91a1a}-4.93\%$
test_sac_speed[reduce-overhead-None] 22.8586ms 12.5784ms 79.5012 Ops/s 79.4327 Ops/s $\color{#35bf28}+0.09\%$
test_sac_speed[reduce-overhead-backward] 1.4920ms 1.3224ms 756.1915 Ops/s 667.5145 Ops/s $\textbf{\color{#35bf28}+13.28\%}$
test_redq_speed[False-None] 8.3592ms 7.3832ms 135.4420 Ops/s 135.4652 Ops/s $\color{#d91a1a}-0.02\%$
test_redq_speed[False-backward] 11.7296ms 10.9994ms 90.9139 Ops/s 88.1992 Ops/s $\color{#35bf28}+3.08\%$
test_redq_speed[True-None] 2.1282ms 1.9589ms 510.4925 Ops/s 508.4049 Ops/s $\color{#35bf28}+0.41\%$
test_redq_speed[True-backward] 3.8743ms 3.6004ms 277.7436 Ops/s 277.0605 Ops/s $\color{#35bf28}+0.25\%$
test_redq_speed[reduce-overhead-None] 2.2020ms 1.9657ms 508.7370 Ops/s 509.1268 Ops/s $\color{#d91a1a}-0.08\%$
test_redq_speed[reduce-overhead-backward] 3.9812ms 3.7855ms 264.1683 Ops/s 261.1035 Ops/s $\color{#35bf28}+1.17\%$
test_redq_deprec_speed[False-None] 9.2715ms 8.8017ms 113.6150 Ops/s 113.3933 Ops/s $\color{#35bf28}+0.20\%$
test_redq_deprec_speed[False-backward] 12.6579ms 12.0240ms 83.1667 Ops/s 83.4662 Ops/s $\color{#d91a1a}-0.36\%$
test_redq_deprec_speed[True-None] 2.5467ms 2.2773ms 439.1241 Ops/s 437.0573 Ops/s $\color{#35bf28}+0.47\%$
test_redq_deprec_speed[True-backward] 4.3449ms 4.0902ms 244.4872 Ops/s 244.7317 Ops/s $\color{#d91a1a}-0.10\%$
test_redq_deprec_speed[reduce-overhead-None] 2.6457ms 2.4158ms 413.9373 Ops/s 439.7798 Ops/s $\textbf{\color{#d91a1a}-5.88\%}$
test_redq_deprec_speed[reduce-overhead-backward] 4.1432ms 3.9472ms 253.3414 Ops/s 243.8892 Ops/s $\color{#35bf28}+3.88\%$
test_td3_speed[False-None] 7.7695ms 7.7059ms 129.7710 Ops/s 131.4333 Ops/s $\color{#d91a1a}-1.26\%$
test_td3_speed[False-backward] 0.2919s 15.6198ms 64.0212 Ops/s 99.9219 Ops/s $\textbf{\color{#d91a1a}-35.93\%}$
test_td3_speed[True-None] 1.6876ms 1.5546ms 643.2598 Ops/s 642.4739 Ops/s $\color{#35bf28}+0.12\%$
test_td3_speed[True-backward] 3.2723ms 3.0767ms 325.0193 Ops/s 309.1049 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_td3_speed[reduce-overhead-None] 51.2740ms 26.0725ms 38.3546 Ops/s 37.5100 Ops/s $\color{#35bf28}+2.25\%$
test_td3_speed[reduce-overhead-backward] 1.5998ms 1.4369ms 695.9200 Ops/s 697.9605 Ops/s $\color{#d91a1a}-0.29\%$
test_cql_speed[False-None] 16.2616ms 15.7932ms 63.3185 Ops/s 63.3821 Ops/s $\color{#d91a1a}-0.10\%$
test_cql_speed[False-backward] 22.2045ms 21.0235ms 47.5658 Ops/s 47.2474 Ops/s $\color{#35bf28}+0.67\%$
test_cql_speed[True-None] 3.1290ms 2.8934ms 345.6167 Ops/s 333.6171 Ops/s $\color{#35bf28}+3.60\%$
test_cql_speed[True-backward] 5.6168ms 5.1551ms 193.9832 Ops/s 197.0301 Ops/s $\color{#d91a1a}-1.55\%$
test_cql_speed[reduce-overhead-None] 21.9225ms 13.1810ms 75.8670 Ops/s 76.8270 Ops/s $\color{#d91a1a}-1.25\%$
test_cql_speed[reduce-overhead-backward] 1.8266ms 1.6601ms 602.3656 Ops/s 601.7932 Ops/s $\color{#35bf28}+0.10\%$
test_a2c_speed[False-None] 3.3587ms 3.1089ms 321.6610 Ops/s 320.7211 Ops/s $\color{#35bf28}+0.29\%$
test_a2c_speed[False-backward] 6.6961ms 6.1295ms 163.1458 Ops/s 161.9086 Ops/s $\color{#35bf28}+0.76\%$
test_a2c_speed[True-None] 1.1530ms 0.9725ms 1.0282 KOps/s 998.3136 Ops/s $\color{#35bf28}+3.00\%$
test_a2c_speed[True-backward] 3.2051ms 2.7757ms 360.2691 Ops/s 367.7239 Ops/s $\color{#d91a1a}-2.03\%$
test_a2c_speed[reduce-overhead-None] 0.4458s 12.5764ms 79.5143 Ops/s 86.0203 Ops/s $\textbf{\color{#d91a1a}-7.56\%}$
test_a2c_speed[reduce-overhead-backward] 1.2069ms 1.0843ms 922.2845 Ops/s 891.6673 Ops/s $\color{#35bf28}+3.43\%$
test_ppo_speed[False-None] 3.9001ms 3.5736ms 279.8319 Ops/s 277.7179 Ops/s $\color{#35bf28}+0.76\%$
test_ppo_speed[False-backward] 6.9942ms 6.7392ms 148.3847 Ops/s 147.1463 Ops/s $\color{#35bf28}+0.84\%$
test_ppo_speed[True-None] 1.0904ms 0.9254ms 1.0807 KOps/s 1.0720 KOps/s $\color{#35bf28}+0.81\%$
test_ppo_speed[True-backward] 2.8567ms 2.6742ms 373.9409 Ops/s 379.7995 Ops/s $\color{#d91a1a}-1.54\%$
test_ppo_speed[reduce-overhead-None] 0.6484ms 0.4859ms 2.0579 KOps/s 1.9328 KOps/s $\textbf{\color{#35bf28}+6.47\%}$
test_ppo_speed[reduce-overhead-backward] 1.0291ms 0.9690ms 1.0320 KOps/s 1.0243 KOps/s $\color{#35bf28}+0.75\%$
test_reinforce_speed[False-None] 2.3458ms 2.1804ms 458.6317 Ops/s 455.8666 Ops/s $\color{#35bf28}+0.61\%$
test_reinforce_speed[False-backward] 3.5880ms 3.1844ms 314.0282 Ops/s 319.1884 Ops/s $\color{#d91a1a}-1.62\%$
test_reinforce_speed[True-None] 1.0389ms 0.8375ms 1.1941 KOps/s 1.2345 KOps/s $\color{#d91a1a}-3.27\%$
test_reinforce_speed[True-backward] 2.6088ms 2.4101ms 414.9205 Ops/s 415.1577 Ops/s $\color{#d91a1a}-0.06\%$
test_reinforce_speed[reduce-overhead-None] 23.6464ms 12.0448ms 83.0230 Ops/s 87.2972 Ops/s $\color{#d91a1a}-4.90\%$
test_reinforce_speed[reduce-overhead-backward] 1.1703ms 1.0574ms 945.7543 Ops/s 950.3705 Ops/s $\color{#d91a1a}-0.49\%$
test_iql_speed[False-None] 9.6143ms 9.0200ms 110.8643 Ops/s 111.5158 Ops/s $\color{#d91a1a}-0.58\%$
test_iql_speed[False-backward] 13.1728ms 12.6795ms 78.8675 Ops/s 79.9914 Ops/s $\color{#d91a1a}-1.41\%$
test_iql_speed[True-None] 2.1022ms 1.7681ms 565.5724 Ops/s 588.1024 Ops/s $\color{#d91a1a}-3.83\%$
test_iql_speed[True-backward] 4.7593ms 4.3643ms 229.1325 Ops/s 230.4951 Ops/s $\color{#d91a1a}-0.59\%$
test_iql_speed[reduce-overhead-None] 21.9393ms 11.8658ms 84.2757 Ops/s 86.8738 Ops/s $\color{#d91a1a}-2.99\%$
test_iql_speed[reduce-overhead-backward] 1.6349ms 1.5901ms 628.8931 Ops/s 641.2082 Ops/s $\color{#d91a1a}-1.92\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0420ms 6.4841ms 154.2242 Ops/s 151.2094 Ops/s $\color{#35bf28}+1.99\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6290ms 0.3709ms 2.6958 KOps/s 3.4958 KOps/s $\textbf{\color{#d91a1a}-22.88\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5403ms 0.3154ms 3.1708 KOps/s 3.4054 KOps/s $\textbf{\color{#d91a1a}-6.89\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.7457ms 6.2493ms 160.0173 Ops/s 158.8627 Ops/s $\color{#35bf28}+0.73\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3089ms 0.3354ms 2.9812 KOps/s 3.1008 KOps/s $\color{#d91a1a}-3.86\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5781ms 0.3187ms 3.1382 KOps/s 3.7030 KOps/s $\textbf{\color{#d91a1a}-15.25\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7491ms 1.3837ms 722.7134 Ops/s 784.2591 Ops/s $\textbf{\color{#d91a1a}-7.85\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5821ms 1.3182ms 758.6350 Ops/s 735.8418 Ops/s $\color{#35bf28}+3.10\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7123ms 6.4306ms 155.5073 Ops/s 154.0163 Ops/s $\color{#35bf28}+0.97\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2706ms 0.4811ms 2.0787 KOps/s 2.3937 KOps/s $\textbf{\color{#d91a1a}-13.16\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7038ms 0.4488ms 2.2283 KOps/s 2.2160 KOps/s $\color{#35bf28}+0.56\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4593ms 6.2515ms 159.9623 Ops/s 158.9758 Ops/s $\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1695ms 0.3430ms 2.9158 KOps/s 2.5111 KOps/s $\textbf{\color{#35bf28}+16.12\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4668ms 0.2581ms 3.8751 KOps/s 3.0200 KOps/s $\textbf{\color{#35bf28}+28.31\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6796ms 6.2592ms 159.7657 Ops/s 158.4366 Ops/s $\color{#35bf28}+0.84\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8142ms 0.3061ms 3.2668 KOps/s 3.0969 KOps/s $\textbf{\color{#35bf28}+5.49\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4884ms 0.2583ms 3.8709 KOps/s 2.8647 KOps/s $\textbf{\color{#35bf28}+35.12\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.8551ms 6.4979ms 153.8963 Ops/s 154.5403 Ops/s $\color{#d91a1a}-0.42\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.8346ms 0.4677ms 2.1383 KOps/s 2.4175 KOps/s $\textbf{\color{#d91a1a}-11.55\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6418ms 0.4051ms 2.4686 KOps/s 2.4865 KOps/s $\color{#d91a1a}-0.72\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.1850ms 5.3754ms 186.0311 Ops/s 188.4023 Ops/s $\color{#d91a1a}-1.26\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.6065ms 2.0963ms 477.0313 Ops/s 443.6279 Ops/s $\textbf{\color{#35bf28}+7.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.4356ms 1.2317ms 811.8779 Ops/s 798.9778 Ops/s $\color{#35bf28}+1.61\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5297s 15.8409ms 63.1278 Ops/s 190.6253 Ops/s $\textbf{\color{#d91a1a}-66.88\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.5661ms 2.0310ms 492.3704 Ops/s 438.9996 Ops/s $\textbf{\color{#35bf28}+12.16\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.7542ms 1.2402ms 806.2977 Ops/s 871.8545 Ops/s $\textbf{\color{#d91a1a}-7.52\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.3875ms 5.6127ms 178.1682 Ops/s 30.9831 Ops/s $\textbf{\color{#35bf28}+475.05\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.5000ms 2.2896ms 436.7567 Ops/s 474.7798 Ops/s $\textbf{\color{#d91a1a}-8.01\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.7523ms 1.4114ms 708.5327 Ops/s 730.3010 Ops/s $\color{#d91a1a}-2.98\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.6959ms 13.3767ms 74.7569 Ops/s 74.8477 Ops/s $\color{#d91a1a}-0.12\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.6643ms 16.3745ms 61.0705 Ops/s 57.8380 Ops/s $\textbf{\color{#35bf28}+5.59\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.5481ms 18.1136ms 55.2072 Ops/s 55.0729 Ops/s $\color{#35bf28}+0.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.9687ms 16.6419ms 60.0892 Ops/s 58.8124 Ops/s $\color{#35bf28}+2.17\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.2631ms 17.9373ms 55.7498 Ops/s 54.5691 Ops/s $\color{#35bf28}+2.16\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.0677ms 17.6920ms 56.5228 Ops/s 54.6035 Ops/s $\color{#35bf28}+3.51\%$

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 3, 2024
ghstack-source-id: aff610c34d130f62b2a7a4cc859b36d1c6e6bed9
Pull Request resolved: #2623
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 4, 2024
ghstack-source-id: 45e29fe6418e57ceb5997f9547d9e52a356302c0
Pull Request resolved: #2623
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 4, 2024
ghstack-source-id: f7257a61ce2443b6edbbcc064da8b3efc0483a95
Pull Request resolved: #2623
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 4, 2024
ghstack-source-id: 578b2b0d8e5278dae37a56b8cb04ec6549822ae6
Pull Request resolved: #2623
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 4, 2024
ghstack-source-id: 396baef4490d010cf55171280d6382257a25577f
Pull Request resolved: #2623
Labels: CLA Signed, tutorials