[CI] Fix windows build legacy #2450

Merged
merged 2 commits into from
Sep 23, 2024
@vmoens vmoens commented Sep 23, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 23, 2024
ghstack-source-id: 093eb6706d95cccde1cd26be6317b6234f08788c
Pull Request resolved: #2450

pytorch-bot bot commented Sep 23, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2450

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 19 Unrelated Failures

As of commit 8ae94e0 with merge base e294c68:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Sep 23, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 23, 2024
ghstack-source-id: 5931159572a19a8c25ce774d050d2054cdbabae5
Pull Request resolved: #2450
@vmoens vmoens merged commit 8ae94e0 into gh/vmoens/29/base Sep 23, 2024
9 of 14 checks passed
vmoens added a commit that referenced this pull request Sep 23, 2024
ghstack-source-id: 5931159572a19a8c25ce774d050d2054cdbabae5
Pull Request resolved: #2450
@vmoens vmoens deleted the gh/vmoens/29/head branch September 23, 2024 15:25

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 146. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}4$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_single 60.0849ms 59.6355ms 16.7685 Ops/s 17.2531 Ops/s $\color{#d91a1a}-2.81\%$
test_sync 38.7007ms 33.0702ms 30.2387 Ops/s 29.0259 Ops/s $\color{#35bf28}+4.18\%$
test_async 73.0718ms 31.4069ms 31.8401 Ops/s 31.9930 Ops/s $\color{#d91a1a}-0.48\%$
test_simple 0.4999s 0.4226s 2.3662 Ops/s 2.5420 Ops/s $\textbf{\color{#d91a1a}-6.92\%}$
test_transformed 0.5683s 0.5668s 1.7643 Ops/s 1.7891 Ops/s $\color{#d91a1a}-1.39\%$
test_serial 1.2656s 1.2614s 0.7928 Ops/s 0.7918 Ops/s $\color{#35bf28}+0.13\%$
test_parallel 1.2084s 1.1319s 0.8835 Ops/s 0.8882 Ops/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-True-True-True-True] 0.2154ms 27.1185μs 36.8752 KOps/s 36.9825 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-True-True-True-False] 53.6900μs 15.8454μs 63.1097 KOps/s 63.7028 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[True-True-True-False-True] 53.6390μs 15.6693μs 63.8192 KOps/s 64.4978 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[True-True-True-False-False] 44.3920μs 9.1266μs 109.5695 KOps/s 110.5476 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[True-True-False-True-True] 77.2020μs 29.1366μs 34.3211 KOps/s 34.6846 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[True-True-False-True-False] 56.4440μs 17.5977μs 56.8256 KOps/s 57.3279 KOps/s $\color{#d91a1a}-0.88\%$
test_step_mdp_speed[True-True-False-False-True] 52.4470μs 17.2633μs 57.9265 KOps/s 58.3478 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[True-True-False-False-False] 37.0890μs 10.8757μs 91.9481 KOps/s 93.8906 KOps/s $\color{#d91a1a}-2.07\%$
test_step_mdp_speed[True-False-True-True-True] 76.5820μs 31.0429μs 32.2135 KOps/s 32.9623 KOps/s $\color{#d91a1a}-2.27\%$
test_step_mdp_speed[True-False-True-True-False] 47.3490μs 19.4270μs 51.4748 KOps/s 52.7898 KOps/s $\color{#d91a1a}-2.49\%$
test_step_mdp_speed[True-False-True-False-True] 49.3910μs 17.4577μs 57.2812 KOps/s 58.7044 KOps/s $\color{#d91a1a}-2.42\%$
test_step_mdp_speed[True-False-True-False-False] 34.6550μs 10.8855μs 91.8653 KOps/s 92.7565 KOps/s $\color{#d91a1a}-0.96\%$
test_step_mdp_speed[True-False-False-True-True] 0.2386ms 33.3120μs 30.0192 KOps/s 31.2038 KOps/s $\color{#d91a1a}-3.80\%$
test_step_mdp_speed[True-False-False-True-False] 71.2010μs 20.8679μs 47.9206 KOps/s 48.1138 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[True-False-False-False-True] 49.6320μs 18.9609μs 52.7401 KOps/s 53.4353 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-False-False-False-False] 58.9800μs 12.3760μs 80.8017 KOps/s 81.5735 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-True-True-True] 66.2830μs 31.0695μs 32.1859 KOps/s 32.7044 KOps/s $\color{#d91a1a}-1.59\%$
test_step_mdp_speed[False-True-True-True-False] 44.4430μs 19.5286μs 51.2068 KOps/s 51.8751 KOps/s $\color{#d91a1a}-1.29\%$
test_step_mdp_speed[False-True-True-False-True] 0.2049ms 20.1509μs 49.6255 KOps/s 51.9460 KOps/s $\color{#d91a1a}-4.47\%$
test_step_mdp_speed[False-True-True-False-False] 76.3430μs 12.2837μs 81.4089 KOps/s 83.8194 KOps/s $\color{#d91a1a}-2.88\%$
test_step_mdp_speed[False-True-False-True-True] 74.7380μs 32.1671μs 31.0877 KOps/s 30.7992 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[False-True-False-True-False] 56.3860μs 21.1102μs 47.3705 KOps/s 48.4961 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[False-True-False-False-True] 3.1115ms 21.2588μs 47.0393 KOps/s 46.6306 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-True-False-False-False] 42.0980μs 13.6643μs 73.1835 KOps/s 73.9971 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-False-True-True-True] 97.7930μs 34.1836μs 29.2538 KOps/s 29.4393 KOps/s $\color{#d91a1a}-0.63\%$
test_step_mdp_speed[False-False-True-True-False] 72.6250μs 22.6334μs 44.1825 KOps/s 44.9037 KOps/s $\color{#d91a1a}-1.61\%$
test_step_mdp_speed[False-False-True-False-True] 50.3630μs 21.3914μs 46.7478 KOps/s 47.7102 KOps/s $\color{#d91a1a}-2.02\%$
test_step_mdp_speed[False-False-True-False-False] 36.3580μs 13.7927μs 72.5021 KOps/s 74.7412 KOps/s $\color{#d91a1a}-3.00\%$
test_step_mdp_speed[False-False-False-True-True] 75.8820μs 35.4503μs 28.2085 KOps/s 28.8302 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[False-False-False-True-False] 57.7480μs 24.1129μs 41.4716 KOps/s 42.1710 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-False-False-False-True] 50.9550μs 22.5190μs 44.4069 KOps/s 44.9780 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[False-False-False-False-False] 44.9140μs 15.0305μs 66.5312 KOps/s 67.7909 KOps/s $\color{#d91a1a}-1.86\%$
test_values[generalized_advantage_estimate-True-True] 10.7191ms 9.4749ms 105.5422 Ops/s 105.6514 Ops/s $\color{#d91a1a}-0.10\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.3788ms 33.6075ms 29.7553 Ops/s 29.6646 Ops/s $\color{#35bf28}+0.31\%$
test_values[td0_return_estimate-False-False] 0.2471ms 0.1833ms 5.4551 KOps/s 5.5002 KOps/s $\color{#d91a1a}-0.82\%$
test_values[td1_return_estimate-False-False] 27.0656ms 23.7843ms 42.0445 Ops/s 42.3871 Ops/s $\color{#d91a1a}-0.81\%$
test_values[vec_td1_return_estimate-False-False] 35.5831ms 33.5933ms 29.7678 Ops/s 29.6757 Ops/s $\color{#35bf28}+0.31\%$
test_values[td_lambda_return_estimate-True-False] 38.3665ms 34.5567ms 28.9380 Ops/s 29.3630 Ops/s $\color{#d91a1a}-1.45\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.0369ms 33.7055ms 29.6688 Ops/s 29.6671 Ops/s $+0.01\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.6872ms 8.3231ms 120.1477 Ops/s 122.1251 Ops/s $\color{#d91a1a}-1.62\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3955ms 1.9998ms 500.0578 Ops/s 528.6198 Ops/s $\textbf{\color{#d91a1a}-5.40\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4694ms 0.3637ms 2.7499 KOps/s 2.8480 KOps/s $\color{#d91a1a}-3.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 58.4560ms 42.3661ms 23.6038 Ops/s 21.8059 Ops/s $\textbf{\color{#35bf28}+8.24\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9582ms 3.0309ms 329.9387 Ops/s 331.1327 Ops/s $\color{#d91a1a}-0.36\%$
test_dqn_speed[False-None] 6.2048ms 1.3487ms 741.4433 Ops/s 760.5638 Ops/s $\color{#d91a1a}-2.51\%$
test_dqn_speed[False-backward] 1.8988ms 1.8168ms 550.4115 Ops/s 557.9044 Ops/s $\color{#d91a1a}-1.34\%$
test_dqn_speed[True-None] 1.1770ms 0.4611ms 2.1685 KOps/s 2.1471 KOps/s $\color{#35bf28}+1.00\%$
test_dqn_speed[True-backward] 0.9578ms 0.8940ms 1.1186 KOps/s 1.1228 KOps/s $\color{#d91a1a}-0.38\%$
test_dqn_speed[reduce-overhead-None] 0.7383ms 0.4664ms 2.1440 KOps/s 2.1795 KOps/s $\color{#d91a1a}-1.63\%$
test_dqn_speed[reduce-overhead-backward] 1.0429ms 0.8898ms 1.1238 KOps/s 1.1590 KOps/s $\color{#d91a1a}-3.04\%$
test_ddpg_speed[False-None] 3.4983ms 2.8260ms 353.8526 Ops/s 364.8428 Ops/s $\color{#d91a1a}-3.01\%$
test_ddpg_speed[False-backward] 4.9485ms 3.9529ms 252.9775 Ops/s 259.4566 Ops/s $\color{#d91a1a}-2.50\%$
test_ddpg_speed[True-None] 1.7211ms 1.0198ms 980.5980 Ops/s 980.3869 Ops/s $\color{#35bf28}+0.02\%$
test_ddpg_speed[True-backward] 1.9359ms 1.8881ms 529.6449 Ops/s 528.0387 Ops/s $\color{#35bf28}+0.30\%$
test_ddpg_speed[reduce-overhead-None] 1.4119ms 1.0122ms 987.9536 Ops/s 984.4529 Ops/s $\color{#35bf28}+0.36\%$
test_ddpg_speed[reduce-overhead-backward] 1.9582ms 1.8899ms 529.1407 Ops/s 517.5037 Ops/s $\color{#35bf28}+2.25\%$
test_sac_speed[False-None] 8.9768ms 7.9570ms 125.6759 Ops/s 98.1630 Ops/s $\textbf{\color{#35bf28}+28.03\%}$
test_sac_speed[False-backward] 11.9783ms 10.6894ms 93.5502 Ops/s 95.4331 Ops/s $\color{#d91a1a}-1.97\%$
test_sac_speed[True-None] 2.3667ms 1.8549ms 539.0992 Ops/s 527.9436 Ops/s $\color{#35bf28}+2.11\%$
test_sac_speed[True-backward] 3.6010ms 3.5190ms 284.1715 Ops/s 282.0358 Ops/s $\color{#35bf28}+0.76\%$
test_sac_speed[reduce-overhead-None] 2.1936ms 1.8530ms 539.6745 Ops/s 524.3812 Ops/s $\color{#35bf28}+2.92\%$
test_sac_speed[reduce-overhead-backward] 3.5835ms 3.5173ms 284.3063 Ops/s 282.4123 Ops/s $\color{#35bf28}+0.67\%$
test_redq_speed[False-None] 15.3610ms 13.0138ms 76.8417 Ops/s 76.7927 Ops/s $\color{#35bf28}+0.06\%$
test_redq_speed[False-backward] 24.7574ms 22.4801ms 44.4838 Ops/s 44.4780 Ops/s $\color{#35bf28}+0.01\%$
test_redq_speed[True-None] 5.2210ms 4.5204ms 221.2192 Ops/s 160.6096 Ops/s $\textbf{\color{#35bf28}+37.74\%}$
test_redq_speed[True-backward] 13.1558ms 12.3478ms 80.9858 Ops/s 78.0807 Ops/s $\color{#35bf28}+3.72\%$
test_redq_speed[reduce-overhead-None] 5.5156ms 4.4904ms 222.6969 Ops/s 197.9590 Ops/s $\textbf{\color{#35bf28}+12.50\%}$
test_redq_speed[reduce-overhead-backward] 12.6828ms 11.7761ms 84.9174 Ops/s 77.8337 Ops/s $\textbf{\color{#35bf28}+9.10\%}$
test_redq_deprec_speed[False-None] 15.1724ms 12.4306ms 80.4467 Ops/s 76.8273 Ops/s $\color{#35bf28}+4.71\%$
test_redq_deprec_speed[False-backward] 19.8040ms 18.1054ms 55.2322 Ops/s 51.5913 Ops/s $\textbf{\color{#35bf28}+7.06\%}$
test_redq_deprec_speed[True-None] 4.6566ms 3.5787ms 279.4346 Ops/s 251.6181 Ops/s $\textbf{\color{#35bf28}+11.06\%}$
test_redq_deprec_speed[True-backward] 8.1773ms 7.9216ms 126.2378 Ops/s 92.5441 Ops/s $\textbf{\color{#35bf28}+36.41\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.9223ms 3.5400ms 282.4852 Ops/s 212.0907 Ops/s $\textbf{\color{#35bf28}+33.19\%}$
test_redq_deprec_speed[reduce-overhead-backward] 7.9570ms 7.8518ms 127.3586 Ops/s 116.3153 Ops/s $\textbf{\color{#35bf28}+9.49\%}$
test_td3_speed[False-None] 9.5306ms 7.8547ms 127.3119 Ops/s 124.5672 Ops/s $\color{#35bf28}+2.20\%$
test_td3_speed[False-backward] 12.4283ms 10.4496ms 95.6973 Ops/s 93.5565 Ops/s $\color{#35bf28}+2.29\%$
test_td3_speed[True-None] 2.1970ms 1.9567ms 511.0773 Ops/s 495.2565 Ops/s $\color{#35bf28}+3.19\%$
test_td3_speed[True-backward] 3.6602ms 3.5694ms 280.1564 Ops/s 279.1106 Ops/s $\color{#35bf28}+0.37\%$
test_td3_speed[reduce-overhead-None] 2.1037ms 1.9397ms 515.5321 Ops/s 497.1013 Ops/s $\color{#35bf28}+3.71\%$
test_td3_speed[reduce-overhead-backward] 4.0042ms 3.5236ms 283.8019 Ops/s 279.5558 Ops/s $\color{#35bf28}+1.52\%$
test_cql_speed[False-None] 36.9531ms 34.9892ms 28.5803 Ops/s 27.9087 Ops/s $\color{#35bf28}+2.41\%$
test_cql_speed[False-backward] 48.0173ms 44.9140ms 22.2648 Ops/s 21.7930 Ops/s $\color{#35bf28}+2.16\%$
test_cql_speed[True-None] 17.1037ms 15.5286ms 64.3974 Ops/s 64.1124 Ops/s $\color{#35bf28}+0.44\%$
test_cql_speed[True-backward] 24.5474ms 21.7204ms 46.0398 Ops/s 45.1304 Ops/s $\color{#35bf28}+2.01\%$
test_cql_speed[reduce-overhead-None] 17.3591ms 15.7469ms 63.5047 Ops/s 63.0499 Ops/s $\color{#35bf28}+0.72\%$
test_cql_speed[reduce-overhead-backward] 23.8896ms 21.8620ms 45.7415 Ops/s 44.7195 Ops/s $\color{#35bf28}+2.29\%$
test_a2c_speed[False-None] 9.0306ms 7.0518ms 141.8076 Ops/s 141.6563 Ops/s $\color{#35bf28}+0.11\%$
test_a2c_speed[False-backward] 15.1505ms 13.8451ms 72.2277 Ops/s 71.7605 Ops/s $\color{#35bf28}+0.65\%$
test_a2c_speed[True-None] 3.6231ms 3.2996ms 303.0697 Ops/s 298.7492 Ops/s $\color{#35bf28}+1.45\%$
test_a2c_speed[True-backward] 9.9505ms 9.6192ms 103.9582 Ops/s 102.5692 Ops/s $\color{#35bf28}+1.35\%$
test_a2c_speed[reduce-overhead-None] 3.7120ms 3.3030ms 302.7552 Ops/s 297.4239 Ops/s $\color{#35bf28}+1.79\%$
test_a2c_speed[reduce-overhead-backward] 10.0430ms 9.6460ms 103.6695 Ops/s 100.2578 Ops/s $\color{#35bf28}+3.40\%$
test_ppo_speed[False-None] 8.8737ms 7.2773ms 137.4131 Ops/s 134.8276 Ops/s $\color{#35bf28}+1.92\%$
test_ppo_speed[False-backward] 16.6380ms 14.2170ms 70.3384 Ops/s 68.6535 Ops/s $\color{#35bf28}+2.45\%$
test_ppo_speed[True-None] 4.0079ms 3.6886ms 271.1046 Ops/s 266.0233 Ops/s $\color{#35bf28}+1.91\%$
test_ppo_speed[True-backward] 10.4913ms 9.5000ms 105.2627 Ops/s 103.5950 Ops/s $\color{#35bf28}+1.61\%$
test_ppo_speed[reduce-overhead-None] 4.3018ms 3.6781ms 271.8790 Ops/s 261.8086 Ops/s $\color{#35bf28}+3.85\%$
test_ppo_speed[reduce-overhead-backward] 9.7989ms 9.4672ms 105.6283 Ops/s 104.2673 Ops/s $\color{#35bf28}+1.31\%$
test_reinforce_speed[False-None] 7.2947ms 6.3952ms 156.3678 Ops/s 155.3476 Ops/s $\color{#35bf28}+0.66\%$
test_reinforce_speed[False-backward] 9.9655ms 9.5604ms 104.5983 Ops/s 103.4048 Ops/s $\color{#35bf28}+1.15\%$
test_reinforce_speed[True-None] 7.0708ms 2.7190ms 367.7831 Ops/s 372.6236 Ops/s $\color{#d91a1a}-1.30\%$
test_reinforce_speed[True-backward] 10.5149ms 8.5792ms 116.5606 Ops/s 114.5248 Ops/s $\color{#35bf28}+1.78\%$
test_reinforce_speed[reduce-overhead-None] 3.2212ms 2.6243ms 381.0558 Ops/s 374.2026 Ops/s $\color{#35bf28}+1.83\%$
test_reinforce_speed[reduce-overhead-backward] 9.6791ms 8.5145ms 117.4471 Ops/s 115.9739 Ops/s $\color{#35bf28}+1.27\%$
test_iql_speed[False-None] 32.7480ms 31.1483ms 32.1045 Ops/s 30.9626 Ops/s $\color{#35bf28}+3.69\%$
test_iql_speed[False-backward] 45.5826ms 43.5845ms 22.9440 Ops/s 22.1363 Ops/s $\color{#35bf28}+3.65\%$
test_iql_speed[True-None] 14.2378ms 13.1325ms 76.1467 Ops/s 73.0330 Ops/s $\color{#35bf28}+4.26\%$
test_iql_speed[True-backward] 25.4154ms 23.7769ms 42.0576 Ops/s 41.1771 Ops/s $\color{#35bf28}+2.14\%$
test_iql_speed[reduce-overhead-None] 14.3092ms 13.0891ms 76.3995 Ops/s 74.3982 Ops/s $\color{#35bf28}+2.69\%$
test_iql_speed[reduce-overhead-backward] 24.9662ms 23.5955ms 42.3811 Ops/s 40.8796 Ops/s $\color{#35bf28}+3.67\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.2151ms 4.9678ms 201.2945 Ops/s 193.6836 Ops/s $\color{#35bf28}+3.93\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.7243ms 0.4748ms 2.1064 KOps/s 2.0987 KOps/s $\color{#35bf28}+0.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6578ms 0.4492ms 2.2262 KOps/s 2.2006 KOps/s $\color{#35bf28}+1.16\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.9112ms 4.9381ms 202.5065 Ops/s 194.3274 Ops/s $\color{#35bf28}+4.21\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3993ms 0.4715ms 2.1210 KOps/s 2.1063 KOps/s $\color{#35bf28}+0.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.0835ms 0.4464ms 2.2401 KOps/s 2.2471 KOps/s $\color{#d91a1a}-0.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 3.8643ms 1.6863ms 593.0004 Ops/s 615.5643 Ops/s $\color{#d91a1a}-3.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6910ms 1.5147ms 660.1802 Ops/s 658.2822 Ops/s $\color{#35bf28}+0.29\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.4949ms 5.3135ms 188.2000 Ops/s 189.5755 Ops/s $\color{#d91a1a}-0.73\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.9388ms 0.6079ms 1.6451 KOps/s 1.5824 KOps/s $\color{#35bf28}+3.96\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7770ms 0.5792ms 1.7264 KOps/s 1.7066 KOps/s $\color{#35bf28}+1.16\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.5802ms 5.0586ms 197.6838 Ops/s 196.8972 Ops/s $\color{#35bf28}+0.40\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7877ms 0.4766ms 2.0984 KOps/s 2.1006 KOps/s $\color{#d91a1a}-0.10\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 8.2680ms 0.4606ms 2.1710 KOps/s 2.1803 KOps/s $\color{#d91a1a}-0.43\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6098ms 5.0337ms 198.6600 Ops/s 195.9161 Ops/s $\color{#35bf28}+1.40\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.7423ms 0.4710ms 2.1231 KOps/s 2.0835 KOps/s $\color{#35bf28}+1.90\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7435ms 0.4507ms 2.2189 KOps/s 2.2388 KOps/s $\color{#d91a1a}-0.89\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2966ms 5.1749ms 193.2391 Ops/s 188.0450 Ops/s $\color{#35bf28}+2.76\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.4016ms 0.6099ms 1.6395 KOps/s 1.6364 KOps/s $\color{#35bf28}+0.19\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8796ms 0.5873ms 1.7026 KOps/s 1.6853 KOps/s $\color{#35bf28}+1.03\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.3921s 11.9738ms 83.5153 Ops/s 237.4303 Ops/s $\textbf{\color{#d91a1a}-64.83\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 19.7599ms 13.2263ms 75.6072 Ops/s 76.1390 Ops/s $\color{#d91a1a}-0.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 4.3594ms 1.3119ms 762.2379 Ops/s 763.6276 Ops/s $\color{#d91a1a}-0.18\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.5364ms 4.2082ms 237.6293 Ops/s 34.4692 Ops/s $\textbf{\color{#35bf28}+589.40\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 17.8192ms 13.0410ms 76.6815 Ops/s 74.3752 Ops/s $\color{#35bf28}+3.10\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.0541ms 1.3178ms 758.8381 Ops/s 766.8384 Ops/s $\color{#d91a1a}-1.04\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3450s 11.5988ms 86.2158 Ops/s 222.7883 Ops/s $\textbf{\color{#d91a1a}-61.30\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 17.5852ms 13.2360ms 75.5514 Ops/s 74.9399 Ops/s $\color{#35bf28}+0.82\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.1189ms 1.4404ms 694.2355 Ops/s 682.6887 Ops/s $\color{#35bf28}+1.69\%$
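
For anyone reading the Change column: the values in the table are consistent with a relative difference between the PR's throughput (Ops) and the throughput on repo HEAD. Below is a minimal sketch of that arithmetic, using three rows copied from the table. The function names are hypothetical, and the 5% cutoff for the bold "Improved"/"Worsened" rows is an assumption inferred from which rows are bold (for example -4.47% is not bold while -5.40% is); the benchmark tooling's actual rule is not shown in this thread.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Bucket a change; threshold_pct is an assumed cutoff, not the bot's documented rule."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": (2.3662, 2.5420),
    "test_sac_speed[False-None]": (125.6759, 98.1630),
    "test_step_mdp_speed[False-True-True-False-True]": (49.6255, 51.9460),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

With that assumed cutoff, the bold rows in the table add up to the 11 improved and 4 worsened benchmarks reported in the summary line.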

Result of GPU Benchmark Tests

Name Max Mean Ops
test_single 0.1041s 0.1038s 9.6356 Ops/s
test_sync 93.1390ms 89.7259ms 11.1451 Ops/s
test_async 0.2613s 87.9774ms 11.3666 Ops/s
test_single_pixels 0.1104s 0.1102s 9.0739 Ops/s
test_sync_pixels 72.3340ms 71.2646ms 14.0322 Ops/s
test_async_pixels 0.1354s 68.1880ms 14.6653 Ops/s
test_simple 0.7418s 0.7387s 1.3538 Ops/s
test_transformed 0.9732s 0.9712s 1.0297 Ops/s
test_serial 2.0885s 2.0863s 0.4793 Ops/s
test_parallel 2.0854s 1.9538s 0.5118 Ops/s
test_step_mdp_speed[True-True-True-True-True] 0.2433ms 37.7218μs 26.5099 KOps/s
test_step_mdp_speed[True-True-True-True-False] 51.5310μs 21.3511μs 46.8360 KOps/s
test_step_mdp_speed[True-True-True-False-True] 46.2700μs 21.2582μs 47.0407 KOps/s
test_step_mdp_speed[True-True-True-False-False] 0.2126ms 12.2322μs 81.7517 KOps/s
test_step_mdp_speed[True-True-False-True-True] 77.8710μs 40.0362μs 24.9774 KOps/s
test_step_mdp_speed[True-True-False-True-False] 76.1010μs 23.5147μs 42.5266 KOps/s
test_step_mdp_speed[True-True-False-False-True] 0.1247ms 23.3876μs 42.7577 KOps/s
test_step_mdp_speed[True-True-False-False-False] 37.5810μs 14.1802μs 70.5208 KOps/s
test_step_mdp_speed[True-False-True-True-True] 83.6810μs 41.7970μs 23.9251 KOps/s
test_step_mdp_speed[True-False-True-True-False] 59.5200μs 25.6166μs 39.0371 KOps/s
test_step_mdp_speed[True-False-True-False-True] 80.7610μs 23.4802μs 42.5892 KOps/s
test_step_mdp_speed[True-False-True-False-False] 0.3989ms 14.2638μs 70.1078 KOps/s
test_step_mdp_speed[True-False-False-True-True] 0.4245ms 44.0462μs 22.7034 KOps/s
test_step_mdp_speed[True-False-False-True-False] 51.4900μs 27.6761μs 36.1323 KOps/s
test_step_mdp_speed[True-False-False-False-True] 48.4110μs 25.2509μs 39.6026 KOps/s
test_step_mdp_speed[True-False-False-False-False] 0.3989ms 16.1455μs 61.9367 KOps/s
test_step_mdp_speed[False-True-True-True-True] 0.4469ms 41.3487μs 24.1846 KOps/s
test_step_mdp_speed[False-True-True-True-False] 54.8810μs 25.6851μs 38.9330 KOps/s
test_step_mdp_speed[False-True-True-False-True] 0.4151ms 26.4947μs 37.7434 KOps/s
test_step_mdp_speed[False-True-True-False-False] 0.3976ms 15.8417μs 63.1245 KOps/s
test_step_mdp_speed[False-True-False-True-True] 0.4277ms 43.5432μs 22.9657 KOps/s
test_step_mdp_speed[False-True-False-True-False] 59.0210μs 27.5652μs 36.2777 KOps/s
test_step_mdp_speed[False-True-False-False-True] 3.4640ms 28.7368μs 34.7985 KOps/s
test_step_mdp_speed[False-True-False-False-False] 0.4169ms 17.8628μs 55.9822 KOps/s
test_step_mdp_speed[False-False-True-True-True] 0.4323ms 46.0379μs 21.7212 KOps/s
test_step_mdp_speed[False-False-True-True-False] 55.5510μs 29.8492μs 33.5017 KOps/s
test_step_mdp_speed[False-False-True-False-True] 0.4159ms 28.3939μs 35.2188 KOps/s
test_step_mdp_speed[False-False-True-False-False] 0.4047ms 17.6584μs 56.6302 KOps/s
test_step_mdp_speed[False-False-False-True-True] 0.1211ms 46.6999μs 21.4133 KOps/s
test_step_mdp_speed[False-False-False-True-False] 56.9410μs 31.7129μs 31.5329 KOps/s
test_step_mdp_speed[False-False-False-False-True] 58.6010μs 29.6644μs 33.7104 KOps/s
test_step_mdp_speed[False-False-False-False-False] 43.8010μs 19.7292μs 50.6862 KOps/s
test_values[generalized_advantage_estimate-True-True] 24.7001ms 24.1694ms 41.3747 Ops/s
test_values[vec_generalized_advantage_estimate-True-True] 0.1080s 3.0436ms 328.5554 Ops/s
test_values[td0_return_estimate-False-False] 0.1050ms 66.6336μs 15.0074 KOps/s
test_values[td1_return_estimate-False-False] 54.8253ms 54.3609ms 18.3956 Ops/s
test_values[vec_td1_return_estimate-False-False] 1.4446ms 1.0711ms 933.5769 Ops/s
test_values[td_lambda_return_estimate-True-False] 87.5967ms 86.8244ms 11.5175 Ops/s
test_values[vec_td_lambda_return_estimate-True-False] 1.4123ms 1.0669ms 937.2627 Ops/s
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.4344ms 24.2484ms 41.2398 Ops/s
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9394ms 0.7073ms 1.4139 KOps/s
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8056ms 0.6543ms 1.5284 KOps/s
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6400ms 1.4697ms 680.4161 Ops/s
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8278ms 0.6698ms 1.4931 KOps/s
test_dqn_speed[False-None] 7.0170ms 1.3723ms 728.7276 Ops/s
test_dqn_speed[False-backward] 1.9726ms 1.8869ms 529.9585 Ops/s
test_dqn_speed[True-None] 1.0655ms 0.5769ms 1.7335 KOps/s
test_dqn_speed[True-backward] 1.0496ms 1.0188ms 981.5497 Ops/s
test_dqn_speed[reduce-overhead-None] 0.9679ms 0.5679ms 1.7607 KOps/s
test_dqn_speed[reduce-overhead-backward] 1.0340ms 1.0025ms 997.4927 Ops/s
test_ddpg_speed[False-None] 3.2207ms 2.7627ms 361.9609 Ops/s
test_ddpg_speed[False-backward] 4.2403ms 4.0449ms 247.2241 Ops/s
test_ddpg_speed[True-None] 1.6965ms 1.2737ms 785.1024 Ops/s
test_ddpg_speed[True-backward] 2.3129ms 2.2481ms 444.8245 Ops/s
test_ddpg_speed[reduce-overhead-None] 1.6840ms 1.2834ms 779.1650 Ops/s
test_ddpg_speed[reduce-overhead-backward] 2.4313ms 2.2491ms 444.6168 Ops/s
test_sac_speed[False-None] 8.1854ms 7.6841ms 130.1389 Ops/s
test_sac_speed[False-backward] 11.4413ms 11.0120ms 90.8102 Ops/s
test_sac_speed[True-None] 2.2975ms 2.0636ms 484.6005 Ops/s
test_sac_speed[True-backward] 4.1823ms 4.0128ms 249.2047 Ops/s
test_sac_speed[reduce-overhead-None] 2.4375ms 2.0687ms 483.3874 Ops/s
test_sac_speed[reduce-overhead-backward] 4.1695ms 3.9920ms 250.5022 Ops/s
test_redq_speed[False-None] 11.1717ms 10.2631ms 97.4367 Ops/s
test_redq_speed[False-backward] 18.7624ms 17.7239ms 56.4209 Ops/s
test_redq_speed[True-None] 4.0415ms 3.4976ms 285.9111 Ops/s
test_redq_speed[True-backward] 8.7849ms 8.4789ms 117.9392 Ops/s
test_redq_speed[reduce-overhead-None] 3.7194ms 3.4636ms 288.7207 Ops/s
test_redq_speed[reduce-overhead-backward] 8.6018ms 8.3689ms 119.4897 Ops/s
test_redq_deprec_speed[False-None] 11.1695ms 10.4999ms 95.2387 Ops/s
test_redq_deprec_speed[False-backward] 15.7080ms 15.1754ms 65.8960 Ops/s
test_redq_deprec_speed[True-None] 3.5659ms 3.1999ms 312.5125 Ops/s
test_redq_deprec_speed[True-backward] 7.0603ms 6.8389ms 146.2215 Ops/s
test_redq_deprec_speed[reduce-overhead-None] 3.5643ms 3.1868ms 313.7984 Ops/s
test_redq_deprec_speed[reduce-overhead-backward] 7.2325ms 6.8023ms 147.0097 Ops/s
test_td3_speed[False-None] 7.8379ms 7.6506ms 130.7088 Ops/s
test_td3_speed[False-backward] 10.9178ms 10.4927ms 95.3043 Ops/s
test_td3_speed[True-None] 2.1809ms 2.1303ms 469.4086 Ops/s
test_td3_speed[True-backward] 4.1050ms 3.9597ms 252.5430 Ops/s
test_td3_speed[reduce-overhead-None] 2.1433ms 2.0949ms 477.3558 Ops/s
test_td3_speed[reduce-overhead-backward] 4.3255ms 3.9508ms 253.1160 Ops/s
test_cql_speed[False-None] 29.2012ms 25.5198ms 39.1852 Ops/s
test_cql_speed[False-backward] 37.4733ms 34.0771ms 29.3452 Ops/s
test_cql_speed[True-None] 11.4886ms 10.9996ms 90.9127 Ops/s
test_cql_speed[True-backward] 17.0984ms 16.5783ms 60.3197 Ops/s
test_cql_speed[reduce-overhead-None] 11.3571ms 10.9874ms 91.0135 Ops/s
test_cql_speed[reduce-overhead-backward] 17.2828ms 16.7704ms 59.6288 Ops/s
test_a2c_speed[False-None] 5.8009ms 5.3944ms 185.3758 Ops/s
test_a2c_speed[False-backward] 12.1580ms 11.8279ms 84.5462 Ops/s
test_a2c_speed[True-None] 3.5272ms 3.1373ms 318.7415 Ops/s
test_a2c_speed[True-backward] 8.9667ms 8.5424ms 117.0635 Ops/s
test_a2c_speed[reduce-overhead-None] 3.2707ms 3.0959ms 323.0069 Ops/s
test_a2c_speed[reduce-overhead-backward] 8.9011ms 8.5737ms 116.6358 Ops/s
test_ppo_speed[False-None] 6.1858ms 5.7690ms 173.3395 Ops/s
test_ppo_speed[False-backward] 12.8116ms 12.3658ms 80.8679 Ops/s
test_ppo_speed[True-None] 3.8586ms 3.4580ms 289.1842 Ops/s
test_ppo_speed[True-backward] 8.6601ms 8.2721ms 120.8878 Ops/s
test_ppo_speed[reduce-overhead-None] 3.6324ms 3.4765ms 287.6476 Ops/s
test_ppo_speed[reduce-overhead-backward] 8.5584ms 8.3330ms 120.0045 Ops/s
test_reinforce_speed[False-None] 6.5053ms 4.5067ms 221.8901 Ops/s
test_reinforce_speed[False-backward] 7.5622ms 7.2506ms 137.9196 Ops/s
test_reinforce_speed[True-None] 2.5699ms 2.2151ms 451.4538 Ops/s
test_reinforce_speed[True-backward] 7.5269ms 7.1352ms 140.1503 Ops/s
test_reinforce_speed[reduce-overhead-None] 2.6399ms 2.2210ms 450.2570 Ops/s
test_reinforce_speed[reduce-overhead-backward] 7.4455ms 7.0743ms 141.3558 Ops/s
test_iql_speed[False-None] 25.1363ms 19.9261ms 50.1855 Ops/s
test_iql_speed[False-backward] 33.8951ms 29.8764ms 33.4713 Ops/s
test_iql_speed[True-None] 8.1948ms 7.8928ms 126.6973 Ops/s
test_iql_speed[True-backward] 16.9678ms 16.5788ms 60.3182 Ops/s
test_iql_speed[reduce-overhead-None] 10.5827ms 8.3385ms 119.9250 Ops/s
test_iql_speed[reduce-overhead-backward] 17.0730ms 16.5743ms 60.3343 Ops/s
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.0780ms 6.7276ms 148.6417 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0214ms 0.3257ms 3.0707 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6667ms 0.3068ms 3.2595 KOps/s
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.3074ms 6.6649ms 150.0399 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0694ms 0.3216ms 3.1098 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6733ms 0.3017ms 3.3143 KOps/s
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5513ms 1.3523ms 739.4562 Ops/s
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5010ms 1.2765ms 783.3931 Ops/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.1260ms 6.8765ms 145.4228 Ops/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9379ms 0.4614ms 2.1673 KOps/s
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8082ms 0.4343ms 2.3023 KOps/s
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.1368ms 6.7701ms 147.7076 Ops/s
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7621ms 0.3241ms 3.0853 KOps/s
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5809ms 0.2979ms 3.3570 KOps/s
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.1710ms 6.6748ms 149.8182 Ops/s
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3524ms 0.3394ms 2.9463 KOps/s
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7285ms 0.3200ms 3.1251 KOps/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.3710ms 6.9420ms 144.0513 Ops/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2341ms 0.4874ms 2.0516 KOps/s
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6789ms 0.4723ms 2.1171 KOps/s
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9916ms 5.3743ms 186.0693 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 20.8403ms 16.1519ms 61.9121 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.3914ms 1.2669ms 789.3491 Ops/s
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3934s 13.4044ms 74.6023 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 21.0819ms 15.9760ms 62.5940 Ops/s
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.0447ms 1.2485ms 800.9410 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3792s 13.0366ms 76.7070 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 21.6059ms 16.2010ms 61.7246 Ops/s
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.9248ms 1.4011ms 713.7186 Ops/s
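
The Ops column in both tables appears to be the reciprocal of the Mean column (operations per second = 1 / mean time per call), up to rounding of the displayed mean. A small sketch for converting the table's duration strings, assuming only the three unit suffixes that appear above (s, ms, μs); the helper names are hypothetical.

```python
# Seconds per unit for the suffixes used in the tables above.
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3, "μs": 1e-6}


def parse_duration(text: str) -> float:
    """Convert a string such as '89.7259ms' to seconds."""
    # Check the two-character suffixes first so 'ms' is not read as 's'.
    for suffix in ("ms", "μs", "s"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNIT_SECONDS[suffix]
    raise ValueError(f"unrecognized duration: {text!r}")


def ops_per_second(mean: str) -> float:
    """Throughput implied by a mean time per call."""
    return 1.0 / parse_duration(mean)


# Mean values copied from the GPU table above.
for name, mean in [
    ("test_sync", "89.7259ms"),
    ("test_step_mdp_speed[True-True-True-True-True]", "37.7218μs"),
]:
    print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```

Values of 1000 Ops/s or more are shown in the tables as KOps/s, so the second row corresponds to the 26.5099 KOps/s entry.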
