[Performance] Minor improvements to step_and_maybe_reset in batched envs #1807
Merged
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1807
Note: Links to docs will display an error until the docs builds have been completed.
⏳ 1 Pending, 3 Unrelated Failures
As of commit 6ba3c85 with merge base 3d7e49c: BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
facebook-github-bot added the CLA Signed label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) on Jan 16, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 63.9152ms | 62.9908ms | 15.8753 Ops/s | 15.1212 Ops/s | |
test_sync | 56.0164ms | 39.3707ms | 25.3996 Ops/s | 27.4694 Ops/s | |
test_async | 98.1210ms | 34.1165ms | 29.3113 Ops/s | 29.3916 Ops/s | |
test_simple | 0.4970s | 0.4377s | 2.2846 Ops/s | 2.2411 Ops/s | |
test_transformed | 0.6633s | 0.6152s | 1.6254 Ops/s | 1.6179 Ops/s | |
test_serial | 1.4135s | 1.3485s | 0.7416 Ops/s | 0.7063 Ops/s | |
test_parallel | 1.3381s | 1.2724s | 0.7859 Ops/s | 0.7586 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.1718ms | 22.0895μs | 45.2704 KOps/s | 45.3579 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 40.4160μs | 13.4504μs | 74.3470 KOps/s | 75.5643 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 33.4530μs | 13.0518μs | 76.6180 KOps/s | 77.0190 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 32.6110μs | 8.0565μs | 124.1230 KOps/s | 128.4810 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 54.1420μs | 23.5295μs | 42.4998 KOps/s | 42.2180 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 48.1910μs | 14.8405μs | 67.3831 KOps/s | 67.3093 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 32.4510μs | 14.4961μs | 68.9841 KOps/s | 69.5636 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 33.2730μs | 9.3269μs | 107.2162 KOps/s | 109.2014 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 47.8300μs | 25.0976μs | 39.8444 KOps/s | 40.0175 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 70.8130μs | 16.1198μs | 62.0355 KOps/s | 61.8902 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 41.3180μs | 14.3269μs | 69.7986 KOps/s | 69.0176 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 23.1040μs | 9.2714μs | 107.8591 KOps/s | 109.3479 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 65.4120μs | 26.3230μs | 37.9896 KOps/s | 38.8597 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 50.9650μs | 17.3304μs | 57.7020 KOps/s | 58.4371 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 47.9900μs | 15.4331μs | 64.7957 KOps/s | 64.3543 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 45.3450μs | 10.5179μs | 95.0758 KOps/s | 96.9731 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 0.1169ms | 24.9517μs | 40.0774 KOps/s | 37.8611 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 66.1530μs | 16.1758μs | 61.8208 KOps/s | 59.3138 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 41.2270μs | 16.7887μs | 59.5640 KOps/s | 60.4413 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 39.8850μs | 10.5665μs | 94.6391 KOps/s | 96.3671 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 69.7100μs | 25.9269μs | 38.5699 KOps/s | 38.0792 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 49.0820μs | 17.3653μs | 57.5863 KOps/s | 58.1388 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 41.1770μs | 17.9177μs | 55.8106 KOps/s | 56.0061 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 33.4720μs | 11.6743μs | 85.6579 KOps/s | 85.7198 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 59.8020μs | 27.5163μs | 36.3421 KOps/s | 36.1080 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 62.7680μs | 18.7389μs | 53.3650 KOps/s | 53.0926 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 56.7760μs | 17.8932μs | 55.8873 KOps/s | 54.9082 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 29.2850μs | 11.8050μs | 84.7102 KOps/s | 84.7530 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 0.1056ms | 28.5154μs | 35.0688 KOps/s | 35.0370 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 54.4210μs | 19.9268μs | 50.1837 KOps/s | 50.6287 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 49.8140μs | 18.8144μs | 53.1508 KOps/s | 52.4778 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 39.6940μs | 12.7823μs | 78.2332 KOps/s | 77.9398 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 12.5204ms | 11.8820ms | 84.1607 Ops/s | 82.4419 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 35.0684ms | 27.0910ms | 36.9126 Ops/s | 36.6057 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.2432ms | 0.1805ms | 5.5391 KOps/s | 5.7048 KOps/s | |
test_values[td1_return_estimate-False-False] | 25.1403ms | 24.9127ms | 40.1402 Ops/s | 38.5856 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 35.1975ms | 27.4797ms | 36.3906 Ops/s | 32.3297 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 35.3985ms | 34.9812ms | 28.5868 Ops/s | 27.7086 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 36.9679ms | 27.4770ms | 36.3940 Ops/s | 36.5363 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.0873ms | 7.8679ms | 127.0990 Ops/s | 121.8558 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3119ms | 1.9199ms | 520.8637 Ops/s | 514.6368 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 9.5685ms | 0.4283ms | 2.3346 KOps/s | 2.3264 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 48.0291ms | 39.1320ms | 25.5546 Ops/s | 26.0190 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 11.3014ms | 2.6328ms | 379.8217 Ops/s | 378.6601 Ops/s | |
test_dqn_speed | 18.0578ms | 7.7095ms | 129.7105 Ops/s | 118.4531 Ops/s | |
test_ddpg_speed | 22.5488ms | 14.5848ms | 68.5647 Ops/s | 65.5957 Ops/s | |
test_sac_speed | 35.9385ms | 29.4172ms | 33.9937 Ops/s | 33.2678 Ops/s | |
test_redq_speed | 50.3302ms | 46.9836ms | 21.2840 Ops/s | 21.5700 Ops/s | |
test_redq_deprec_speed | 27.2073ms | 25.6884ms | 38.9280 Ops/s | 38.6662 Ops/s | |
test_td3_speed | 30.4294ms | 20.4464ms | 48.9083 Ops/s | 48.2521 Ops/s | |
test_cql_speed | 91.6567ms | 87.8187ms | 11.3871 Ops/s | 11.2436 Ops/s | |
test_a2c_speed | 29.4090ms | 27.0472ms | 36.9724 Ops/s | 36.5002 Ops/s | |
test_ppo_speed | 36.9635ms | 27.9684ms | 35.7546 Ops/s | 36.3943 Ops/s | |
test_reinforce_speed | 31.4043ms | 26.3275ms | 37.9830 Ops/s | 37.9172 Ops/s | |
test_iql_speed | 69.5717ms | 64.2593ms | 15.5619 Ops/s | 15.6205 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 1.8549ms | 1.5114ms | 661.6343 Ops/s | 721.3631 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 8.8265ms | 0.5270ms | 1.8976 KOps/s | 1.8778 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 8.7894ms | 0.5154ms | 1.9403 KOps/s | 1.9460 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.0237ms | 1.4651ms | 682.5420 Ops/s | 738.5651 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 8.8731ms | 0.5204ms | 1.9218 KOps/s | 1.9237 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 2.7266ms | 0.5012ms | 1.9952 KOps/s | 2.0118 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.4686ms | 1.6809ms | 594.9208 Ops/s | 637.3991 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8764ms | 0.6520ms | 1.5336 KOps/s | 1.5398 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 8.9043ms | 0.6465ms | 1.5469 KOps/s | 1.5521 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.1982ms | 1.5029ms | 665.3882 Ops/s | 709.1735 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6252ms | 0.5174ms | 1.9327 KOps/s | 1.8846 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7115ms | 0.5049ms | 1.9805 KOps/s | 1.9651 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 1.6334ms | 1.4656ms | 682.3220 Ops/s | 725.1670 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6261ms | 0.5150ms | 1.9419 KOps/s | 1.8873 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 8.7501ms | 0.5141ms | 1.9453 KOps/s | 2.0021 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.6257ms | 1.7371ms | 575.6663 Ops/s | 632.2609 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8530ms | 0.6574ms | 1.5211 KOps/s | 1.5116 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 11.1851ms | 0.6652ms | 1.5034 KOps/s | 1.5633 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1296s | 20.2419ms | 49.4024 Ops/s | 58.7754 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 19.1477ms | 13.7929ms | 72.5009 Ops/s | 73.5584 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 15.9930ms | 3.5258ms | 283.6209 Ops/s | 305.3620 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1165s | 17.7246ms | 56.4188 Ops/s | 58.4469 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 15.9580ms | 13.7385ms | 72.7880 Ops/s | 73.4186 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 4.5733ms | 3.2594ms | 306.8091 Ops/s | 305.3517 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1224s | 18.2215ms | 54.8803 Ops/s | 58.1224 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 16.1390ms | 14.0083ms | 71.3862 Ops/s | 72.1933 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 6.1468ms | 3.5538ms | 281.3891 Ops/s | 271.4022 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1151s | 0.1144s | 8.7391 Ops/s | 8.5162 Ops/s | |
test_sync | 0.1733s | 0.1034s | 9.6675 Ops/s | 9.7200 Ops/s | |
test_async | 0.2445s | 91.4248ms | 10.9380 Ops/s | 10.9213 Ops/s | |
test_single_pixels | 0.1402s | 0.1395s | 7.1682 Ops/s | 7.1229 Ops/s | |
test_sync_pixels | 82.1640ms | 76.6023ms | 13.0544 Ops/s | 12.7278 Ops/s | |
test_async_pixels | 0.1481s | 72.8528ms | 13.7263 Ops/s | 13.3416 Ops/s | |
test_simple | 0.8940s | 0.8313s | 1.2030 Ops/s | 1.2172 Ops/s | |
test_transformed | 1.1321s | 1.0658s | 0.9383 Ops/s | 0.9277 Ops/s | |
test_serial | 2.3827s | 2.3247s | 0.4302 Ops/s | 0.4264 Ops/s | |
test_parallel | 1.9473s | 1.8568s | 0.5385 Ops/s | 0.5199 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 93.4720μs | 33.2630μs | 30.0634 KOps/s | 29.7825 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 40.2510μs | 19.6253μs | 50.9547 KOps/s | 51.0734 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 38.4610μs | 18.7720μs | 53.2709 KOps/s | 52.5865 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 36.8710μs | 11.2430μs | 88.9444 KOps/s | 87.9172 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 57.1510μs | 34.8608μs | 28.6855 KOps/s | 28.1700 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 65.5310μs | 21.4267μs | 46.6707 KOps/s | 46.0765 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 47.6710μs | 20.5737μs | 48.6058 KOps/s | 48.5042 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 28.6910μs | 13.2058μs | 75.7242 KOps/s | 74.9475 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 61.2110μs | 37.0624μs | 26.9815 KOps/s | 26.6039 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 49.8610μs | 23.2754μs | 42.9639 KOps/s | 41.9130 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 0.1066ms | 21.0004μs | 47.6181 KOps/s | 47.7207 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 66.6410μs | 13.1007μs | 76.3317 KOps/s | 75.1159 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 66.8910μs | 38.9133μs | 25.6982 KOps/s | 25.6088 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 44.2710μs | 25.6414μs | 38.9994 KOps/s | 39.1824 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 48.1010μs | 22.6043μs | 44.2394 KOps/s | 43.7317 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 30.0600μs | 15.0097μs | 66.6236 KOps/s | 66.0830 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 62.3810μs | 37.3265μs | 26.7906 KOps/s | 26.4747 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 43.4310μs | 23.7683μs | 42.0729 KOps/s | 42.2681 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 43.7610μs | 25.8589μs | 38.6714 KOps/s | 39.2771 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 33.5500μs | 14.9223μs | 67.0136 KOps/s | 66.1309 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 79.1520μs | 38.7307μs | 25.8193 KOps/s | 25.5240 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 42.5610μs | 25.5921μs | 39.0746 KOps/s | 39.1604 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 44.7410μs | 27.0875μs | 36.9174 KOps/s | 36.2890 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 58.6810μs | 16.9445μs | 59.0163 KOps/s | 59.0464 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 67.5420μs | 40.6907μs | 24.5756 KOps/s | 24.3037 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 53.7910μs | 27.3204μs | 36.6027 KOps/s | 36.1561 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 45.8210μs | 27.2826μs | 36.6533 KOps/s | 36.9297 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 33.6290μs | 16.8690μs | 59.2804 KOps/s | 58.9068 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 68.5110μs | 42.9859μs | 23.2634 KOps/s | 23.5327 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 48.6300μs | 29.4435μs | 33.9633 KOps/s | 34.1988 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 57.1220μs | 28.2469μs | 35.4021 KOps/s | 35.3453 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 42.3810μs | 18.6438μs | 53.6370 KOps/s | 53.2952 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 23.8751ms | 23.3359ms | 42.8525 Ops/s | 40.1834 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 88.4258ms | 3.3142ms | 301.7295 Ops/s | 301.3235 Ops/s | |
test_values[td0_return_estimate-False-False] | 91.6610μs | 60.6476μs | 16.4887 KOps/s | 16.3041 KOps/s | |
test_values[td1_return_estimate-False-False] | 50.6963ms | 50.0407ms | 19.9837 Ops/s | 18.4629 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 1.9780ms | 1.7374ms | 575.5814 Ops/s | 565.1371 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 82.8892ms | 80.7322ms | 12.3866 Ops/s | 12.2495 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 1.9918ms | 1.7277ms | 578.7946 Ops/s | 574.0583 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.3012ms | 21.6864ms | 46.1118 Ops/s | 43.2948 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8351ms | 0.6842ms | 1.4616 KOps/s | 1.3613 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7195ms | 0.6326ms | 1.5808 KOps/s | 1.5704 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.4937ms | 1.4405ms | 694.2207 Ops/s | 688.2112 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9057ms | 0.6528ms | 1.5318 KOps/s | 1.4357 KOps/s | |
test_dqn_speed | 14.0265ms | 7.3021ms | 136.9468 Ops/s | 132.7510 Ops/s | |
test_ddpg_speed | 14.9856ms | 14.2109ms | 70.3685 Ops/s | 67.7270 Ops/s | |
test_sac_speed | 29.2192ms | 28.4786ms | 35.1140 Ops/s | 30.9936 Ops/s | |
test_redq_speed | 48.5343ms | 47.6494ms | 20.9866 Ops/s | 20.2519 Ops/s | |
test_redq_deprec_speed | 24.7433ms | 23.5788ms | 42.4111 Ops/s | 41.0181 Ops/s | |
test_td3_speed | 28.6351ms | 19.2991ms | 51.8160 Ops/s | 49.9684 Ops/s | |
test_cql_speed | 82.2356ms | 81.3049ms | 12.2994 Ops/s | 11.7831 Ops/s | |
test_a2c_speed | 27.4745ms | 26.1570ms | 38.2307 Ops/s | 36.9817 Ops/s | |
test_ppo_speed | 27.7873ms | 26.5121ms | 37.7186 Ops/s | 36.5922 Ops/s | |
test_reinforce_speed | 26.2688ms | 25.4240ms | 39.3329 Ops/s | 38.0649 Ops/s | |
test_iql_speed | 57.5249ms | 56.3739ms | 17.7387 Ops/s | 17.2191 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.1745ms | 1.8105ms | 552.3207 Ops/s | 541.1532 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9473ms | 0.8312ms | 1.2031 KOps/s | 1.1887 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 1.0066ms | 0.8206ms | 1.2186 KOps/s | 1.2066 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.3847ms | 1.7898ms | 558.7282 Ops/s | 553.5186 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0434ms | 0.8218ms | 1.2168 KOps/s | 1.2062 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9443ms | 0.8105ms | 1.2338 KOps/s | 1.2221 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.9518ms | 2.0693ms | 483.2620 Ops/s | 476.3162 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1457ms | 0.9472ms | 1.0558 KOps/s | 1.0418 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.1077ms | 0.9398ms | 1.0640 KOps/s | 1.0509 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.1919ms | 1.8136ms | 551.3834 Ops/s | 542.6766 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9794ms | 0.8324ms | 1.2014 KOps/s | 1.1892 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.1143s | 0.9476ms | 1.0553 KOps/s | 1.2028 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.4233ms | 1.7923ms | 557.9456 Ops/s | 545.2023 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0318ms | 0.8225ms | 1.2158 KOps/s | 1.2044 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9540ms | 0.8133ms | 1.2295 KOps/s | 1.2191 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.0190ms | 2.0768ms | 481.4999 Ops/s | 478.7630 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1459ms | 0.9508ms | 1.0517 KOps/s | 1.0390 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.1302ms | 0.9438ms | 1.0596 KOps/s | 1.0492 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1217s | 17.9140ms | 55.8223 Ops/s | 53.0909 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.7014ms | 13.6323ms | 73.3553 Ops/s | 70.7254 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.6979ms | 3.2492ms | 307.7667 Ops/s | 296.1958 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1229s | 17.9769ms | 55.6270 Ops/s | 54.3223 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 16.0059ms | 13.5705ms | 73.6892 Ops/s | 70.8529 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.7493ms | 3.2553ms | 307.1939 Ops/s | 298.2148 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1244s | 18.1903ms | 54.9742 Ops/s | 53.5936 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.9455ms | 13.7461ms | 72.7480 Ops/s | 69.7971 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 5.6752ms | 3.4332ms | 291.2707 Ops/s | 281.5518 Ops/s |
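
The Change column is empty in both tables as captured here. As a rough aid for reading them, the sketch below shows one way the relative difference between "Ops" and "Ops on Repo HEAD" could be computed. The helper names `parse_ops` and `relative_change` are hypothetical, the formula (PR minus HEAD, divided by HEAD) is an assumption rather than the benchmark bot's documented method, and the sample rows are copied from the first table.

```python
import re

# Unit multipliers for the two throughput units that appear in the tables.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '45.2704 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(K?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Fractional change of the PR throughput relative to repo HEAD.

    Assumed definition: (pr - head) / head. Positive means the PR is faster.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head


# A few (Ops, Ops on Repo HEAD) pairs taken from the first table.
rows = {
    "test_single": ("15.8753 Ops/s", "15.1212 Ops/s"),
    "test_sync": ("25.3996 Ops/s", "27.4694 Ops/s"),
    "test_step_mdp_speed[True-True-True-True-True]": (
        "45.2704 KOps/s",
        "45.3579 KOps/s",
    ),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2%}")
```

Single-run differences of a few percent in either direction are common in these rows, so a value computed this way is best read alongside the Max and Mean columns rather than on its own.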

vmoens changed the title from "[Performance, WIP] faster step_mdp" to "[Performance, WIP] Minor improvements to step_and_maybe_reset in batched envs" on Jan 16, 2024
vmoens changed the title from "[Performance, WIP] Minor improvements to step_and_maybe_reset in batched envs" to "[Performance] Minor improvements to step_and_maybe_reset in batched envs" on Jan 16, 2024
Labels
CLA Signed
performance: Performance issue or suggestion for improvement
No description provided.