[BugFix] Fix export aoti_compile_and_package API change #2629

Merged: 1 commit into gh/vmoens/49/base from gh/vmoens/49/head on Dec 3, 2024

@vmoens (Contributor) commented Dec 3, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 3, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2629

Note: Links to docs will display an error until the docs builds have been completed.

❌ 18 New Failures

As of commit 41bdf97 with merge base aed03fd:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Dec 3, 2024
ghstack-source-id: 07a0f063f8955815157c2a3eac02c6460a82f672
Pull Request resolved: #2629
@facebook-github-bot added the "CLA Signed" label Dec 3, 2024. (Label description: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
@vmoens vmoens merged commit 41bdf97 into gh/vmoens/49/base Dec 3, 2024
28 of 46 checks passed
@vmoens vmoens deleted the gh/vmoens/49/head branch December 3, 2024 15:06

github-actions bot commented Dec 3, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}9$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_simple 0.4252s 0.4233s 2.3625 Ops/s 2.2172 Ops/s $\textbf{\color{#35bf28}+6.55\%}$
test_transformed 0.6029s 0.6004s 1.6655 Ops/s 1.5967 Ops/s $\color{#35bf28}+4.31\%$
test_serial 1.3461s 1.3449s 0.7435 Ops/s 0.7304 Ops/s $\color{#35bf28}+1.79\%$
test_parallel 1.3733s 1.2920s 0.7740 Ops/s 0.7473 Ops/s $\color{#35bf28}+3.56\%$
test_step_mdp_speed[True-True-True-True-True] 0.1975ms 30.2385μs 33.0705 KOps/s 33.4731 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-True-True-True-False] 50.2930μs 17.6228μs 56.7446 KOps/s 55.7052 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[True-True-True-False-True] 53.7700μs 17.1074μs 58.4543 KOps/s 59.0876 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[True-True-True-False-False] 46.5370μs 9.9926μs 100.0738 KOps/s 99.0703 KOps/s $\color{#35bf28}+1.01\%$
test_step_mdp_speed[True-True-False-True-True] 74.5190μs 32.2110μs 31.0453 KOps/s 31.0269 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-True-False-True-False] 61.9960μs 19.8199μs 50.4544 KOps/s 50.1910 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[True-True-False-False-True] 62.8250μs 19.2346μs 51.9897 KOps/s 52.7408 KOps/s $\color{#d91a1a}-1.42\%$
test_step_mdp_speed[True-True-False-False-False] 47.3080μs 11.8506μs 84.3838 KOps/s 84.4573 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[True-False-True-True-True] 93.8350μs 34.2787μs 29.1726 KOps/s 29.3761 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[True-False-True-True-False] 52.1980μs 21.5977μs 46.3013 KOps/s 46.1259 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-False-True-False-True] 65.5830μs 19.2617μs 51.9164 KOps/s 53.2532 KOps/s $\color{#d91a1a}-2.51\%$
test_step_mdp_speed[True-False-True-False-False] 49.6130μs 11.9614μs 83.6021 KOps/s 83.9607 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[True-False-False-True-True] 88.9460μs 36.1447μs 27.6666 KOps/s 27.9653 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[True-False-False-True-False] 81.8950μs 23.1327μs 43.2289 KOps/s 43.3823 KOps/s $\color{#d91a1a}-0.35\%$
test_step_mdp_speed[True-False-False-False-True] 80.3680μs 20.6230μs 48.4896 KOps/s 49.2743 KOps/s $\color{#d91a1a}-1.59\%$
test_step_mdp_speed[True-False-False-False-False] 59.3910μs 13.7245μs 72.8623 KOps/s 73.7316 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-True-True-True-True] 79.2480μs 34.3912μs 29.0772 KOps/s 29.7712 KOps/s $\color{#d91a1a}-2.33\%$
test_step_mdp_speed[False-True-True-True-False] 0.1764ms 22.4314μs 44.5804 KOps/s 46.3921 KOps/s $\color{#d91a1a}-3.91\%$
test_step_mdp_speed[False-True-True-False-True] 67.4560μs 21.6505μs 46.1884 KOps/s 47.4468 KOps/s $\color{#d91a1a}-2.65\%$
test_step_mdp_speed[False-True-True-False-False] 42.6990μs 13.2705μs 75.3554 KOps/s 75.4086 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[False-True-False-True-True] 79.8190μs 35.9030μs 27.8528 KOps/s 28.0312 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[False-True-False-True-False] 75.7920μs 23.4764μs 42.5960 KOps/s 43.1932 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[False-True-False-False-True] 2.8161ms 23.3551μs 42.8173 KOps/s 42.7465 KOps/s $\color{#35bf28}+0.17\%$
test_step_mdp_speed[False-True-False-False-False] 70.0410μs 14.9762μs 66.7728 KOps/s 65.6519 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-True-True-True] 83.9370μs 37.6714μs 26.5454 KOps/s 26.0989 KOps/s $\color{#35bf28}+1.71\%$
test_step_mdp_speed[False-False-True-True-False] 71.5640μs 24.9643μs 40.0572 KOps/s 38.7307 KOps/s $\color{#35bf28}+3.42\%$
test_step_mdp_speed[False-False-True-False-True] 78.3160μs 23.2167μs 43.0724 KOps/s 44.3117 KOps/s $\color{#d91a1a}-2.80\%$
test_step_mdp_speed[False-False-True-False-False] 0.1538ms 15.1941μs 65.8152 KOps/s 66.1989 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[False-False-False-True-True] 93.0140μs 39.0122μs 25.6330 KOps/s 25.4133 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[False-False-False-True-False] 78.3760μs 26.7325μs 37.4076 KOps/s 37.3734 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[False-False-False-False-True] 74.4790μs 24.4270μs 40.9382 KOps/s 40.1636 KOps/s $\color{#35bf28}+1.93\%$
test_step_mdp_speed[False-False-False-False-False] 72.7560μs 16.5515μs 60.4175 KOps/s 59.4351 KOps/s $\color{#35bf28}+1.65\%$
test_values[generalized_advantage_estimate-True-True] 12.8366ms 9.7188ms 102.8931 Ops/s 104.1003 Ops/s $\color{#d91a1a}-1.16\%$
test_values[vec_generalized_advantage_estimate-True-True] 40.1057ms 36.3807ms 27.4871 Ops/s 29.6871 Ops/s $\textbf{\color{#d91a1a}-7.41\%}$
test_values[td0_return_estimate-False-False] 0.2420ms 0.1732ms 5.7753 KOps/s 5.5519 KOps/s $\color{#35bf28}+4.02\%$
test_values[td1_return_estimate-False-False] 27.7321ms 23.9035ms 41.8349 Ops/s 41.2515 Ops/s $\color{#35bf28}+1.41\%$
test_values[vec_td1_return_estimate-False-False] 39.5660ms 36.6573ms 27.2797 Ops/s 29.3591 Ops/s $\textbf{\color{#d91a1a}-7.08\%}$
test_values[td_lambda_return_estimate-True-False] 37.1436ms 34.2458ms 29.2006 Ops/s 28.6999 Ops/s $\color{#35bf28}+1.74\%$
test_values[vec_td_lambda_return_estimate-True-False] 41.5536ms 37.0012ms 27.0261 Ops/s 29.5206 Ops/s $\textbf{\color{#d91a1a}-8.45\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.7279ms 8.3333ms 119.9998 Ops/s 118.8980 Ops/s $\color{#35bf28}+0.93\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4304ms 1.9878ms 503.0767 Ops/s 512.5497 Ops/s $\color{#d91a1a}-1.85\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5134ms 0.3561ms 2.8079 KOps/s 2.7139 KOps/s $\color{#35bf28}+3.46\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.4385ms 47.0138ms 21.2704 Ops/s 24.1431 Ops/s $\textbf{\color{#d91a1a}-11.90\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9422ms 3.1303ms 319.4564 Ops/s 310.1004 Ops/s $\color{#35bf28}+3.02\%$
test_dqn_speed[False-None] 1.9891ms 1.3661ms 732.0310 Ops/s 710.7407 Ops/s $\color{#35bf28}+3.00\%$
test_dqn_speed[False-backward] 1.9306ms 1.8421ms 542.8466 Ops/s 532.1631 Ops/s $\color{#35bf28}+2.01\%$
test_dqn_speed[True-None] 0.7935ms 0.4752ms 2.1046 KOps/s 2.1104 KOps/s $\color{#d91a1a}-0.28\%$
test_dqn_speed[True-backward] 0.9477ms 0.8835ms 1.1319 KOps/s 1.0898 KOps/s $\color{#35bf28}+3.86\%$
test_dqn_speed[reduce-overhead-None] 1.8120ms 0.4665ms 2.1435 KOps/s 2.1329 KOps/s $\color{#35bf28}+0.50\%$
test_dqn_speed[reduce-overhead-backward] 0.9139ms 0.8724ms 1.1462 KOps/s 1.0852 KOps/s $\textbf{\color{#35bf28}+5.62\%}$
test_ddpg_speed[False-None] 3.5922ms 2.8393ms 352.1935 Ops/s 344.8666 Ops/s $\color{#35bf28}+2.12\%$
test_ddpg_speed[False-backward] 4.0910ms 3.9523ms 253.0160 Ops/s 244.5180 Ops/s $\color{#35bf28}+3.48\%$
test_ddpg_speed[True-None] 1.4281ms 1.0011ms 998.9435 Ops/s 974.7407 Ops/s $\color{#35bf28}+2.48\%$
test_ddpg_speed[True-backward] 2.0201ms 1.9290ms 518.3933 Ops/s 507.2098 Ops/s $\color{#35bf28}+2.20\%$
test_ddpg_speed[reduce-overhead-None] 1.4250ms 0.9967ms 1.0033 KOps/s 987.0283 Ops/s $\color{#35bf28}+1.65\%$
test_ddpg_speed[reduce-overhead-backward] 1.9635ms 1.8915ms 528.6721 Ops/s 520.7635 Ops/s $\color{#35bf28}+1.52\%$
test_sac_speed[False-None] 9.7088ms 7.9353ms 126.0196 Ops/s 123.7336 Ops/s $\color{#35bf28}+1.85\%$
test_sac_speed[False-backward] 11.5239ms 10.6204ms 94.1588 Ops/s 91.5484 Ops/s $\color{#35bf28}+2.85\%$
test_sac_speed[True-None] 2.2380ms 1.8166ms 550.4738 Ops/s 539.1936 Ops/s $\color{#35bf28}+2.09\%$
test_sac_speed[True-backward] 3.6675ms 3.5043ms 285.3632 Ops/s 284.6160 Ops/s $\color{#35bf28}+0.26\%$
test_sac_speed[reduce-overhead-None] 1.9314ms 1.8144ms 551.1549 Ops/s 541.7930 Ops/s $\color{#35bf28}+1.73\%$
test_sac_speed[reduce-overhead-backward] 3.7619ms 3.5141ms 284.5668 Ops/s 283.8475 Ops/s $\color{#35bf28}+0.25\%$
test_redq_speed[False-None] 20.6443ms 13.8780ms 72.0566 Ops/s 77.6291 Ops/s $\textbf{\color{#d91a1a}-7.18\%}$
test_redq_speed[False-backward] 24.0063ms 22.3922ms 44.6584 Ops/s 44.3618 Ops/s $\color{#35bf28}+0.67\%$
test_redq_speed[True-None] 5.3023ms 4.6050ms 217.1552 Ops/s 207.2594 Ops/s $\color{#35bf28}+4.77\%$
test_redq_speed[True-backward] 13.2173ms 12.0268ms 83.1476 Ops/s 80.9195 Ops/s $\color{#35bf28}+2.75\%$
test_redq_speed[reduce-overhead-None] 6.3624ms 5.1150ms 195.5030 Ops/s 218.7864 Ops/s $\textbf{\color{#d91a1a}-10.64\%}$
test_redq_speed[reduce-overhead-backward] 15.6583ms 13.0494ms 76.6321 Ops/s 83.5653 Ops/s $\textbf{\color{#d91a1a}-8.30\%}$
test_redq_deprec_speed[False-None] 15.3290ms 13.2333ms 75.5669 Ops/s 77.7763 Ops/s $\color{#d91a1a}-2.84\%$
test_redq_deprec_speed[False-backward] 20.0955ms 18.7844ms 53.2358 Ops/s 53.6669 Ops/s $\color{#d91a1a}-0.80\%$
test_redq_deprec_speed[True-None] 4.2649ms 3.7962ms 263.4192 Ops/s 276.7413 Ops/s $\color{#d91a1a}-4.81\%$
test_redq_deprec_speed[True-backward] 8.4198ms 8.1375ms 122.8875 Ops/s 115.8099 Ops/s $\textbf{\color{#35bf28}+6.11\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.5329ms 3.6526ms 273.7765 Ops/s 280.7095 Ops/s $\color{#d91a1a}-2.47\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.6519ms 8.1328ms 122.9593 Ops/s 126.1689 Ops/s $\color{#d91a1a}-2.54\%$
test_td3_speed[False-None] 8.4518ms 7.9551ms 125.7053 Ops/s 121.7080 Ops/s $\color{#35bf28}+3.28\%$
test_td3_speed[False-backward] 11.4424ms 10.4202ms 95.9678 Ops/s 93.0040 Ops/s $\color{#35bf28}+3.19\%$
test_td3_speed[True-None] 2.0397ms 1.7671ms 565.9143 Ops/s 582.6486 Ops/s $\color{#d91a1a}-2.87\%$
test_td3_speed[True-backward] 3.6169ms 3.4095ms 293.3019 Ops/s 302.5296 Ops/s $\color{#d91a1a}-3.05\%$
test_td3_speed[reduce-overhead-None] 2.0802ms 1.7604ms 568.0408 Ops/s 582.6335 Ops/s $\color{#d91a1a}-2.50\%$
test_td3_speed[reduce-overhead-backward] 3.7256ms 3.5225ms 283.8909 Ops/s 301.0399 Ops/s $\textbf{\color{#d91a1a}-5.70\%}$
test_cql_speed[False-None] 40.4111ms 36.9141ms 27.0899 Ops/s 27.2459 Ops/s $\color{#d91a1a}-0.57\%$
test_cql_speed[False-backward] 52.2204ms 48.4745ms 20.6294 Ops/s 21.4262 Ops/s $\color{#d91a1a}-3.72\%$
test_cql_speed[True-None] 16.7573ms 15.8231ms 63.1986 Ops/s 62.0651 Ops/s $\color{#35bf28}+1.83\%$
test_cql_speed[True-backward] 24.1275ms 22.7250ms 44.0045 Ops/s 43.0357 Ops/s $\color{#35bf28}+2.25\%$
test_cql_speed[reduce-overhead-None] 17.1796ms 15.7565ms 63.4657 Ops/s 62.8341 Ops/s $\color{#35bf28}+1.01\%$
test_cql_speed[reduce-overhead-backward] 24.3000ms 22.5437ms 44.3584 Ops/s 43.0647 Ops/s $\color{#35bf28}+3.00\%$
test_a2c_speed[False-None] 9.8461ms 7.4122ms 134.9123 Ops/s 130.0139 Ops/s $\color{#35bf28}+3.77\%$
test_a2c_speed[False-backward] 16.2030ms 14.7777ms 67.6697 Ops/s 66.4585 Ops/s $\color{#35bf28}+1.82\%$
test_a2c_speed[True-None] 4.9900ms 4.2666ms 234.3793 Ops/s 235.5868 Ops/s $\color{#d91a1a}-0.51\%$
test_a2c_speed[True-backward] 12.3302ms 11.5653ms 86.4654 Ops/s 85.9751 Ops/s $\color{#35bf28}+0.57\%$
test_a2c_speed[reduce-overhead-None] 5.1261ms 4.2604ms 234.7217 Ops/s 214.1638 Ops/s $\textbf{\color{#35bf28}+9.60\%}$
test_a2c_speed[reduce-overhead-backward] 11.2933ms 10.6838ms 93.6000 Ops/s 83.6331 Ops/s $\textbf{\color{#35bf28}+11.92\%}$
test_ppo_speed[False-None] 8.3975ms 7.5678ms 132.1383 Ops/s 119.2553 Ops/s $\textbf{\color{#35bf28}+10.80\%}$
test_ppo_speed[False-backward] 22.2671ms 15.2771ms 65.4575 Ops/s 60.9349 Ops/s $\textbf{\color{#35bf28}+7.42\%}$
test_ppo_speed[True-None] 4.4969ms 3.7215ms 268.7115 Ops/s 238.3463 Ops/s $\textbf{\color{#35bf28}+12.74\%}$
test_ppo_speed[True-backward] 11.0740ms 9.9532ms 100.4700 Ops/s 93.1024 Ops/s $\textbf{\color{#35bf28}+7.91\%}$
test_ppo_speed[reduce-overhead-None] 5.2315ms 3.7682ms 265.3803 Ops/s 246.1318 Ops/s $\textbf{\color{#35bf28}+7.82\%}$
test_ppo_speed[reduce-overhead-backward] 10.5854ms 9.9218ms 100.7882 Ops/s 92.3521 Ops/s $\textbf{\color{#35bf28}+9.13\%}$
test_reinforce_speed[False-None] 8.4310ms 6.6057ms 151.3834 Ops/s 137.7123 Ops/s $\textbf{\color{#35bf28}+9.93\%}$
test_reinforce_speed[False-backward] 10.3333ms 9.8341ms 101.6872 Ops/s 86.9947 Ops/s $\textbf{\color{#35bf28}+16.89\%}$
test_reinforce_speed[True-None] 3.1889ms 2.6784ms 373.3536 Ops/s 347.7087 Ops/s $\textbf{\color{#35bf28}+7.38\%}$
test_reinforce_speed[True-backward] 9.2855ms 8.5807ms 116.5402 Ops/s 110.9582 Ops/s $\textbf{\color{#35bf28}+5.03\%}$
test_reinforce_speed[reduce-overhead-None] 3.1261ms 2.6954ms 370.9966 Ops/s 360.5361 Ops/s $\color{#35bf28}+2.90\%$
test_reinforce_speed[reduce-overhead-backward] 9.3232ms 8.5750ms 116.6179 Ops/s 111.3711 Ops/s $\color{#35bf28}+4.71\%$
test_iql_speed[False-None] 34.6788ms 32.3784ms 30.8848 Ops/s 30.1394 Ops/s $\color{#35bf28}+2.47\%$
test_iql_speed[False-backward] 48.3226ms 45.7362ms 21.8645 Ops/s 21.3564 Ops/s $\color{#35bf28}+2.38\%$
test_iql_speed[True-None] 11.8277ms 10.9438ms 91.3758 Ops/s 87.8458 Ops/s $\color{#35bf28}+4.02\%$
test_iql_speed[True-backward] 23.4960ms 21.8970ms 45.6683 Ops/s 45.0818 Ops/s $\color{#35bf28}+1.30\%$
test_iql_speed[reduce-overhead-None] 12.7688ms 10.8989ms 91.7522 Ops/s 91.3999 Ops/s $\color{#35bf28}+0.39\%$
test_iql_speed[reduce-overhead-backward] 23.0896ms 21.6625ms 46.1628 Ops/s 44.8977 Ops/s $\color{#35bf28}+2.82\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3070ms 4.9482ms 202.0951 Ops/s 190.0241 Ops/s $\textbf{\color{#35bf28}+6.35\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.1703ms 0.5103ms 1.9598 KOps/s 1.9102 KOps/s $\color{#35bf28}+2.60\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6931ms 0.4855ms 2.0598 KOps/s 1.9919 KOps/s $\color{#35bf28}+3.41\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2374ms 4.7596ms 210.1033 Ops/s 195.3015 Ops/s $\textbf{\color{#35bf28}+7.58\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8653ms 0.4952ms 2.0192 KOps/s 1.9821 KOps/s $\color{#35bf28}+1.87\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7091ms 0.4734ms 2.1125 KOps/s 2.0819 KOps/s $\color{#35bf28}+1.47\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8799ms 1.6357ms 611.3476 Ops/s 603.4839 Ops/s $\color{#35bf28}+1.30\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3141ms 1.5943ms 627.2261 Ops/s 620.7076 Ops/s $\color{#35bf28}+1.05\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0099ms 4.9592ms 201.6462 Ops/s 197.0502 Ops/s $\color{#35bf28}+2.33\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4322ms 0.6481ms 1.5430 KOps/s 1.5307 KOps/s $\color{#35bf28}+0.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8805ms 0.6204ms 1.6119 KOps/s 1.6013 KOps/s $\color{#35bf28}+0.66\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6326ms 4.8408ms 206.5771 Ops/s 205.1241 Ops/s $\color{#35bf28}+0.71\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3112ms 0.5110ms 1.9568 KOps/s 1.8966 KOps/s $\color{#35bf28}+3.17\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7761ms 0.4907ms 2.0378 KOps/s 2.0163 KOps/s $\color{#35bf28}+1.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1279ms 4.7449ms 210.7529 Ops/s 201.7279 Ops/s $\color{#35bf28}+4.47\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1952ms 0.5050ms 1.9801 KOps/s 1.9754 KOps/s $\color{#35bf28}+0.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6772ms 0.4677ms 2.1382 KOps/s 2.0436 KOps/s $\color{#35bf28}+4.63\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.6535ms 4.8740ms 205.1685 Ops/s 198.1271 Ops/s $\color{#35bf28}+3.55\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9324ms 0.6463ms 1.5473 KOps/s 1.5022 KOps/s $\color{#35bf28}+3.00\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 7.4347ms 0.6222ms 1.6071 KOps/s 1.5314 KOps/s $\color{#35bf28}+4.94\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4683s 13.5729ms 73.6762 Ops/s 36.8112 Ops/s $\textbf{\color{#35bf28}+100.15\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.6146ms 2.3227ms 430.5330 Ops/s 421.4443 Ops/s $\color{#35bf28}+2.16\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.1809ms 1.3633ms 733.4992 Ops/s 763.1096 Ops/s $\color{#d91a1a}-3.88\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.5152ms 4.2300ms 236.4075 Ops/s 221.4859 Ops/s $\textbf{\color{#35bf28}+6.74\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.6048ms 2.2966ms 435.4171 Ops/s 409.6061 Ops/s $\textbf{\color{#35bf28}+6.30\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.5227ms 1.2587ms 794.4983 Ops/s 709.1914 Ops/s $\textbf{\color{#35bf28}+12.03\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4090s 12.5981ms 79.3767 Ops/s 230.0878 Ops/s $\textbf{\color{#d91a1a}-65.50\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.3634ms 2.2221ms 450.0177 Ops/s 404.3748 Ops/s $\textbf{\color{#35bf28}+11.29\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.0746ms 1.3586ms 736.0253 Ops/s 641.0993 Ops/s $\textbf{\color{#35bf28}+14.81\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.2438ms 11.7783ms 84.9021 Ops/s 82.8629 Ops/s $\color{#35bf28}+2.46\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.9435ms 14.4615ms 69.1492 Ops/s 67.8772 Ops/s $\color{#35bf28}+1.87\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.5103ms 19.9452ms 50.1374 Ops/s 49.7307 Ops/s $\color{#35bf28}+0.82\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.0493ms 14.5650ms 68.6580 Ops/s 68.4548 Ops/s $\color{#35bf28}+0.30\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.0586ms 20.0399ms 49.9004 Ops/s 50.3626 Ops/s $\color{#d91a1a}-0.92\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.1124ms 15.7361ms 63.5483 Ops/s 63.5131 Ops/s $\color{#35bf28}+0.06\%$
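
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops value of this run and the Ops value on repo HEAD. The sketch below reproduces that arithmetic for three rows copied from the table. The helper names (`parse_ops`, `relative_change`, `classify`) are hypothetical and not part of the benchmark tooling, and the 5% threshold is an assumption inferred from which rows are bold: counting rows beyond that threshold is consistent with the "Improved: 23" and "Worsened: 9" totals reported above, but the bot's actual rule is not stated here.

```python
def parse_ops(text: str) -> float:
    """Convert a throughput string such as '1.0033 KOps/s' to ops per second."""
    value, unit = text.split()
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}[unit]
    return float(value) * scale


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of this run's throughput relative to repo HEAD."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change. The 5% default is an assumption inferred from the bold rows."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within threshold"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": ("2.3625 Ops/s", "2.2172 Ops/s"),
    "test_ddpg_speed[reduce-overhead-None]": ("1.0033 KOps/s", "987.0283 Ops/s"),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (
        "79.3767 Ops/s",
        "230.0878 Ops/s",
    ),
}

for name, (ops, head) in rows.items():
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Note that the two Ops columns can use different units in the same row (KOps/s against Ops/s), so the values need to be normalized before the ratio is taken.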


github-actions bot commented Dec 3, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}11$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_simple 0.7364s 0.7357s 1.3593 Ops/s 1.3073 Ops/s $\color{#35bf28}+3.98\%$
test_transformed 1.0890s 1.0093s 0.9907 Ops/s 0.9849 Ops/s $\color{#35bf28}+0.59\%$
test_serial 2.2138s 2.1282s 0.4699 Ops/s 0.4641 Ops/s $\color{#35bf28}+1.26\%$
test_parallel 2.0648s 1.9566s 0.5111 Ops/s 0.5095 Ops/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[True-True-True-True-True] 0.2205ms 39.9396μs 25.0378 KOps/s 24.7278 KOps/s $\color{#35bf28}+1.25\%$
test_step_mdp_speed[True-True-True-True-False] 52.4910μs 23.0794μs 43.3287 KOps/s 44.0015 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[True-True-True-False-True] 0.1826ms 21.6715μs 46.1435 KOps/s 45.4228 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[True-True-True-False-False] 0.2013ms 12.8219μs 77.9919 KOps/s 77.5691 KOps/s $\color{#35bf28}+0.54\%$
test_step_mdp_speed[True-True-False-True-True] 0.2308ms 42.2500μs 23.6686 KOps/s 23.8870 KOps/s $\color{#d91a1a}-0.91\%$
test_step_mdp_speed[True-True-False-True-False] 64.3610μs 24.5620μs 40.7133 KOps/s 40.6104 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-True-False-False-True] 0.2278ms 24.0412μs 41.5953 KOps/s 42.6214 KOps/s $\color{#d91a1a}-2.41\%$
test_step_mdp_speed[True-True-False-False-False] 0.1740ms 14.7878μs 67.6232 KOps/s 67.3667 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[True-False-True-True-True] 80.1810μs 43.4479μs 23.0161 KOps/s 22.9345 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-False-True-True-False] 55.1510μs 26.9736μs 37.0733 KOps/s 37.6883 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[True-False-True-False-True] 84.2320μs 23.2089μs 43.0869 KOps/s 42.6880 KOps/s $\color{#35bf28}+0.93\%$
test_step_mdp_speed[True-False-True-False-False] 91.2010μs 14.8886μs 67.1655 KOps/s 67.1936 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[True-False-False-True-True] 90.4010μs 45.3374μs 22.0569 KOps/s 22.1178 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-False-True-False] 94.4910μs 28.0592μs 35.6389 KOps/s 34.7176 KOps/s $\color{#35bf28}+2.65\%$
test_step_mdp_speed[True-False-False-False-True] 0.2327ms 25.9656μs 38.5125 KOps/s 39.7263 KOps/s $\color{#d91a1a}-3.06\%$
test_step_mdp_speed[True-False-False-False-False] 41.2310μs 17.0389μs 58.6894 KOps/s 59.8546 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-True-True-True-True] 0.1093ms 43.2155μs 23.1398 KOps/s 23.2460 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-True-True-True-False] 2.7368ms 26.6551μs 37.5163 KOps/s 37.5981 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[False-True-True-False-True] 80.5110μs 27.8695μs 35.8815 KOps/s 36.3833 KOps/s $\color{#d91a1a}-1.38\%$
test_step_mdp_speed[False-True-True-False-False] 40.9710μs 16.5989μs 60.2448 KOps/s 60.3448 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-True-False-True-True] 80.7410μs 46.5834μs 21.4669 KOps/s 22.0261 KOps/s $\color{#d91a1a}-2.54\%$
test_step_mdp_speed[False-True-False-True-False] 67.1410μs 29.0642μs 34.4065 KOps/s 34.5374 KOps/s $\color{#d91a1a}-0.38\%$
test_step_mdp_speed[False-True-False-False-True] 3.2570ms 30.2295μs 33.0803 KOps/s 32.9110 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-True-False-False-False] 61.0100μs 18.6455μs 53.6321 KOps/s 54.4635 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[False-False-True-True-True] 90.1820μs 49.1086μs 20.3630 KOps/s 21.7157 KOps/s $\textbf{\color{#d91a1a}-6.23\%}$
test_step_mdp_speed[False-False-True-True-False] 72.7010μs 32.1775μs 31.0776 KOps/s 32.3812 KOps/s $\color{#d91a1a}-4.03\%$
test_step_mdp_speed[False-False-True-False-True] 64.4110μs 29.4512μs 33.9545 KOps/s 33.5644 KOps/s $\color{#35bf28}+1.16\%$
test_step_mdp_speed[False-False-True-False-False] 49.5710μs 18.6745μs 53.5489 KOps/s 54.6413 KOps/s $\color{#d91a1a}-2.00\%$
test_step_mdp_speed[False-False-False-True-True] 80.7510μs 50.0196μs 19.9921 KOps/s 20.0377 KOps/s $\color{#d91a1a}-0.23\%$
test_step_mdp_speed[False-False-False-True-False] 79.2210μs 33.4993μs 29.8514 KOps/s 30.2654 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[False-False-False-False-True] 60.6310μs 31.8294μs 31.4175 KOps/s 32.0550 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-False-False-False-False] 50.7910μs 20.5433μs 48.6777 KOps/s 48.9343 KOps/s $\color{#d91a1a}-0.52\%$
test_values[generalized_advantage_estimate-True-True] 24.8540ms 24.0269ms 41.6200 Ops/s 40.4440 Ops/s $\color{#35bf28}+2.91\%$
test_values[vec_generalized_advantage_estimate-True-True] 94.9558ms 2.7927ms 358.0818 Ops/s 331.3482 Ops/s $\textbf{\color{#35bf28}+8.07\%}$
test_values[td0_return_estimate-False-False] 0.1031ms 78.3789μs 12.7585 KOps/s 12.5481 KOps/s $\color{#35bf28}+1.68\%$
test_values[td1_return_estimate-False-False] 56.1978ms 55.0385ms 18.1691 Ops/s 18.3105 Ops/s $\color{#d91a1a}-0.77\%$
test_values[vec_td1_return_estimate-False-False] 1.2910ms 1.0713ms 933.4612 Ops/s 924.8099 Ops/s $\color{#35bf28}+0.94\%$
test_values[td_lambda_return_estimate-True-False] 88.7115ms 85.6681ms 11.6730 Ops/s 11.5822 Ops/s $\color{#35bf28}+0.78\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3711ms 1.0705ms 934.1169 Ops/s 933.4065 Ops/s $\color{#35bf28}+0.08\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.3924ms 23.8609ms 41.9095 Ops/s 41.2212 Ops/s $\color{#35bf28}+1.67\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0296ms 0.7493ms 1.3345 KOps/s 1.3212 KOps/s $\color{#35bf28}+1.01\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7869ms 0.6594ms 1.5165 KOps/s 1.5093 KOps/s $\color{#35bf28}+0.48\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6252ms 1.4702ms 680.1838 Ops/s 676.2392 Ops/s $\color{#35bf28}+0.58\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8762ms 0.6723ms 1.4874 KOps/s 1.4738 KOps/s $\color{#35bf28}+0.92\%$
test_dqn_speed[False-None] 7.1342ms 1.4622ms 683.9193 Ops/s 681.8988 Ops/s $\color{#35bf28}+0.30\%$
test_dqn_speed[False-backward] 2.2050ms 2.0620ms 484.9615 Ops/s 483.5748 Ops/s $\color{#35bf28}+0.29\%$
test_dqn_speed[True-None] 0.6936ms 0.5385ms 1.8570 KOps/s 1.8966 KOps/s $\color{#d91a1a}-2.09\%$
test_dqn_speed[True-backward] 1.2605ms 1.2080ms 827.7857 Ops/s 832.3896 Ops/s $\color{#d91a1a}-0.55\%$
test_dqn_speed[reduce-overhead-None] 0.7504ms 0.5454ms 1.8334 KOps/s 1.8439 KOps/s $\color{#d91a1a}-0.57\%$
test_dqn_speed[reduce-overhead-backward] 1.1205ms 1.0679ms 936.3968 Ops/s 1.0336 KOps/s $\textbf{\color{#d91a1a}-9.40\%}$
test_ddpg_speed[False-None] 3.2071ms 2.8389ms 352.2492 Ops/s 359.5450 Ops/s $\color{#d91a1a}-2.03\%$
test_ddpg_speed[False-backward] 4.6384ms 4.1477ms 241.0993 Ops/s 247.6017 Ops/s $\color{#d91a1a}-2.63\%$
test_ddpg_speed[True-None] 1.2289ms 1.0574ms 945.7194 Ops/s 949.5296 Ops/s $\color{#d91a1a}-0.40\%$
test_ddpg_speed[True-backward] 2.6306ms 2.2766ms 439.2488 Ops/s 468.6903 Ops/s $\textbf{\color{#d91a1a}-6.28\%}$
test_ddpg_speed[reduce-overhead-None] 1.2390ms 1.0721ms 932.7583 Ops/s 927.4361 Ops/s $\color{#35bf28}+0.57\%$
test_ddpg_speed[reduce-overhead-backward] 1.9176ms 1.7773ms 562.6654 Ops/s 610.2856 Ops/s $\textbf{\color{#d91a1a}-7.80\%}$
test_sac_speed[False-None] 8.4427ms 7.9044ms 126.5112 Ops/s 125.7720 Ops/s $\color{#35bf28}+0.59\%$
test_sac_speed[False-backward] 11.8762ms 11.1941ms 89.3326 Ops/s 91.1794 Ops/s $\color{#d91a1a}-2.03\%$
test_sac_speed[True-None] 1.7613ms 1.5801ms 632.8683 Ops/s 657.0066 Ops/s $\color{#d91a1a}-3.67\%$
test_sac_speed[True-backward] 3.5177ms 3.3510ms 298.4195 Ops/s 298.4603 Ops/s $\color{#d91a1a}-0.01\%$
test_sac_speed[reduce-overhead-None] 23.1636ms 12.7562ms 78.3932 Ops/s 81.5148 Ops/s $\color{#d91a1a}-3.83\%$
test_sac_speed[reduce-overhead-backward] 1.4956ms 1.3459ms 742.9878 Ops/s 670.3431 Ops/s $\textbf{\color{#35bf28}+10.84\%}$
test_redq_speed[False-None] 8.2816ms 7.4406ms 134.3985 Ops/s 133.2226 Ops/s $\color{#35bf28}+0.88\%$
test_redq_speed[False-backward] 12.4602ms 11.2483ms 88.9026 Ops/s 85.5374 Ops/s $\color{#35bf28}+3.93\%$
test_redq_speed[True-None] 2.1391ms 1.9572ms 510.9340 Ops/s 505.7424 Ops/s $\color{#35bf28}+1.03\%$
test_redq_speed[True-backward] 3.7399ms 3.5877ms 278.7269 Ops/s 261.2430 Ops/s $\textbf{\color{#35bf28}+6.69\%}$
test_redq_speed[reduce-overhead-None] 2.2058ms 1.9840ms 504.0409 Ops/s 504.9537 Ops/s $\color{#d91a1a}-0.18\%$
test_redq_speed[reduce-overhead-backward] 3.7373ms 3.5998ms 277.7965 Ops/s 265.1968 Ops/s $\color{#35bf28}+4.75\%$
test_redq_deprec_speed[False-None] 9.5141ms 8.9582ms 111.6298 Ops/s 111.1410 Ops/s $\color{#35bf28}+0.44\%$
test_redq_deprec_speed[False-backward] 12.4456ms 11.9051ms 83.9974 Ops/s 81.1741 Ops/s $\color{#35bf28}+3.48\%$
test_redq_deprec_speed[True-None] 2.5698ms 2.3642ms 422.9764 Ops/s 427.8540 Ops/s $\color{#d91a1a}-1.14\%$
test_redq_deprec_speed[True-backward] 4.0734ms 3.8968ms 256.6176 Ops/s 241.1442 Ops/s $\textbf{\color{#35bf28}+6.42\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.5888ms 2.3842ms 419.4329 Ops/s 417.6078 Ops/s $\color{#35bf28}+0.44\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.1153ms 3.9281ms 254.5739 Ops/s 241.5168 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_td3_speed[False-None] 8.1366ms 7.8885ms 126.7673 Ops/s 128.0408 Ops/s $\color{#d91a1a}-0.99\%$
test_td3_speed[False-backward] 10.9630ms 10.1989ms 98.0495 Ops/s 97.4239 Ops/s $\color{#35bf28}+0.64\%$
test_td3_speed[True-None] 1.5684ms 1.5411ms 648.8941 Ops/s 653.5241 Ops/s $\color{#d91a1a}-0.71\%$
test_td3_speed[True-backward] 3.2287ms 3.0725ms 325.4638 Ops/s 325.9555 Ops/s $\color{#d91a1a}-0.15\%$
test_td3_speed[reduce-overhead-None] 50.3859ms 25.5875ms 39.0816 Ops/s 37.3316 Ops/s $\color{#35bf28}+4.69\%$
test_td3_speed[reduce-overhead-backward] 1.3811ms 1.2963ms 771.3979 Ops/s 689.8302 Ops/s $\textbf{\color{#35bf28}+11.82\%}$
test_cql_speed[False-None] 16.2129ms 15.8664ms 63.0263 Ops/s 62.1173 Ops/s $\color{#35bf28}+1.46\%$
test_cql_speed[False-backward] 21.4154ms 21.0050ms 47.6076 Ops/s 46.0670 Ops/s $\color{#35bf28}+3.34\%$
test_cql_speed[True-None] 3.1108ms 2.8766ms 347.6332 Ops/s 347.9950 Ops/s $\color{#d91a1a}-0.10\%$
test_cql_speed[True-backward] 5.1167ms 4.9283ms 202.9109 Ops/s 193.4359 Ops/s $\color{#35bf28}+4.90\%$
test_cql_speed[reduce-overhead-None] 21.6873ms 13.1873ms 75.8306 Ops/s 75.6781 Ops/s $\color{#35bf28}+0.20\%$
test_cql_speed[reduce-overhead-backward] 1.6442ms 1.5613ms 640.4869 Ops/s 598.5184 Ops/s $\textbf{\color{#35bf28}+7.01\%}$
test_a2c_speed[False-None] 3.4084ms 3.1331ms 319.1764 Ops/s 314.5375 Ops/s $\color{#35bf28}+1.47\%$
test_a2c_speed[False-backward] 6.4604ms 6.2364ms 160.3478 Ops/s 158.3689 Ops/s $\color{#35bf28}+1.25\%$
test_a2c_speed[True-None] 1.1705ms 0.9832ms 1.0171 KOps/s 1.0135 KOps/s $\color{#35bf28}+0.36\%$
test_a2c_speed[True-backward] 2.8809ms 2.7271ms 366.6885 Ops/s 361.6352 Ops/s $\color{#35bf28}+1.40\%$
test_a2c_speed[reduce-overhead-None] 0.3937s 12.4319ms 80.4383 Ops/s 85.8922 Ops/s $\textbf{\color{#d91a1a}-6.35\%}$
test_a2c_speed[reduce-overhead-backward] 1.1471ms 1.0957ms 912.6381 Ops/s 877.2044 Ops/s $\color{#35bf28}+4.04\%$
test_ppo_speed[False-None] 4.6161ms 3.5949ms 278.1696 Ops/s 274.4141 Ops/s $\color{#35bf28}+1.37\%$
test_ppo_speed[False-backward] 7.1085ms 6.8527ms 145.9270 Ops/s 142.8284 Ops/s $\color{#35bf28}+2.17\%$
test_ppo_speed[True-None] 1.1296ms 0.9435ms 1.0599 KOps/s 1.0613 KOps/s $\color{#d91a1a}-0.13\%$
test_ppo_speed[True-backward] 2.8130ms 2.5845ms 386.9264 Ops/s 373.6000 Ops/s $\color{#35bf28}+3.57\%$
test_ppo_speed[reduce-overhead-None] 0.6806ms 0.4939ms 2.0246 KOps/s 1.8740 KOps/s $\textbf{\color{#35bf28}+8.04\%}$
test_ppo_speed[reduce-overhead-backward] 1.0397ms 0.9762ms 1.0244 KOps/s 873.1678 Ops/s $\textbf{\color{#35bf28}+17.32\%}$
test_reinforce_speed[False-None] 2.3909ms 2.1880ms 457.0380 Ops/s 448.5675 Ops/s $\color{#35bf28}+1.89\%$
test_reinforce_speed[False-backward] 3.4517ms 3.2017ms 312.3377 Ops/s 302.2500 Ops/s $\color{#35bf28}+3.34\%$
test_reinforce_speed[True-None] 0.9787ms 0.8170ms 1.2240 KOps/s 1.2129 KOps/s $\color{#35bf28}+0.91\%$
test_reinforce_speed[True-backward] 2.6169ms 2.4302ms 411.4909 Ops/s 391.4228 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_reinforce_speed[reduce-overhead-None] 22.5807ms 11.9917ms 83.3910 Ops/s 86.2135 Ops/s $\color{#d91a1a}-3.27\%$
test_reinforce_speed[reduce-overhead-backward] 1.2177ms 1.0599ms 943.4749 Ops/s 828.1735 Ops/s $\textbf{\color{#35bf28}+13.92\%}$
test_iql_speed[False-None] 9.5091ms 9.0718ms 110.2320 Ops/s 109.8657 Ops/s $\color{#35bf28}+0.33\%$
test_iql_speed[False-backward] 14.1900ms 12.8984ms 77.5288 Ops/s 76.6039 Ops/s $\color{#35bf28}+1.21\%$
test_iql_speed[True-None] 1.9135ms 1.7178ms 582.1340 Ops/s 583.3451 Ops/s $\color{#d91a1a}-0.21\%$
test_iql_speed[True-backward] 4.2973ms 4.1366ms 241.7469 Ops/s 229.8571 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_iql_speed[reduce-overhead-None] 20.6960ms 11.6363ms 85.9379 Ops/s 86.8607 Ops/s $\color{#d91a1a}-1.06\%$
test_iql_speed[reduce-overhead-backward] 1.5656ms 1.4367ms 696.0504 Ops/s 630.4913 Ops/s $\textbf{\color{#35bf28}+10.40\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9887ms 6.4982ms 153.8880 Ops/s 151.9907 Ops/s $\color{#35bf28}+1.25\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6702ms 0.3257ms 3.0705 KOps/s 2.6800 KOps/s $\textbf{\color{#35bf28}+14.57\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7533ms 0.2986ms 3.3486 KOps/s 3.9238 KOps/s $\textbf{\color{#d91a1a}-14.66\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.6041ms 6.2390ms 160.2831 Ops/s 159.3628 Ops/s $\color{#35bf28}+0.58\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9723ms 0.2994ms 3.3398 KOps/s 2.9663 KOps/s $\textbf{\color{#35bf28}+12.59\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5660ms 0.3043ms 3.2864 KOps/s 3.1101 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5551ms 1.3354ms 748.8190 Ops/s 714.9960 Ops/s $\color{#35bf28}+4.73\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5954ms 1.2598ms 793.7681 Ops/s 703.1842 Ops/s $\textbf{\color{#35bf28}+12.88\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6007ms 6.3980ms 156.2997 Ops/s 156.8312 Ops/s $\color{#d91a1a}-0.34\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3035ms 0.4395ms 2.2755 KOps/s 2.3934 KOps/s $\color{#d91a1a}-4.93\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8371ms 0.4358ms 2.2944 KOps/s 2.2767 KOps/s $\color{#35bf28}+0.78\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.6333ms 6.2558ms 159.8507 Ops/s 159.7725 Ops/s $\color{#35bf28}+0.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9431ms 0.3937ms 2.5400 KOps/s 3.1040 KOps/s $\textbf{\color{#d91a1a}-18.17\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8147ms 0.3561ms 2.8084 KOps/s 2.9903 KOps/s $\textbf{\color{#d91a1a}-6.08\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5943ms 6.2083ms 161.0754 Ops/s 160.9829 Ops/s $\color{#35bf28}+0.06\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7663ms 0.2914ms 3.4321 KOps/s 2.5824 KOps/s $\textbf{\color{#35bf28}+32.90\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5597ms 0.2928ms 3.4159 KOps/s 2.9887 KOps/s $\textbf{\color{#35bf28}+14.29\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5891ms 6.4113ms 155.9743 Ops/s 156.5780 Ops/s $\color{#d91a1a}-0.39\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.3355ms 0.5147ms 1.9428 KOps/s 1.9203 KOps/s $\color{#35bf28}+1.17\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7708ms 0.4835ms 2.0684 KOps/s 1.9557 KOps/s $\textbf{\color{#35bf28}+5.76\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0482ms 5.2671ms 189.8577 Ops/s 189.4674 Ops/s $\color{#35bf28}+0.21\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.9952ms 1.9453ms 514.0662 Ops/s 442.8788 Ops/s $\textbf{\color{#35bf28}+16.07\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.5369ms 1.2462ms 802.4422 Ops/s 864.7390 Ops/s $\textbf{\color{#d91a1a}-7.20\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4993s 15.1920ms 65.8239 Ops/s 191.8542 Ops/s $\textbf{\color{#d91a1a}-65.69\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.4748ms 2.0064ms 498.4175 Ops/s 431.8362 Ops/s $\textbf{\color{#35bf28}+15.42\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.8727ms 1.2113ms 825.5917 Ops/s 877.8145 Ops/s $\textbf{\color{#d91a1a}-5.95\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.2996ms 5.5590ms 179.8887 Ops/s 32.8260 Ops/s $\textbf{\color{#35bf28}+448.01\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.4647ms 2.2282ms 448.7839 Ops/s 467.9848 Ops/s $\color{#d91a1a}-4.10\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 3.5798ms 1.3362ms 748.3926 Ops/s 733.3541 Ops/s $\color{#35bf28}+2.05\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.7798ms 13.0836ms 76.4318 Ops/s 75.0957 Ops/s $\color{#35bf28}+1.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.8698ms 16.7425ms 59.7284 Ops/s 59.2376 Ops/s $\color{#35bf28}+0.83\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.1539ms 17.6223ms 56.7464 Ops/s 55.2926 Ops/s $\color{#35bf28}+2.63\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.1215ms 17.3700ms 57.5704 Ops/s 57.7224 Ops/s $\color{#d91a1a}-0.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7714ms 17.3367ms 57.6812 Ops/s 55.1423 Ops/s $\color{#35bf28}+4.60\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.9198ms 18.2969ms 54.6540 Ops/s 55.3558 Ops/s $\color{#d91a1a}-1.27\%$
