[Feature] Best intention stack #605
Merged
facebook-github-bot added the CLA Signed label on Jan 3, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 38.0710μs | 15.3500μs | 65.1465 KOps/s | 64.9928 KOps/s | |
test_plain_set_stack_nested | 0.1900ms | 0.1409ms | 7.0992 KOps/s | 7.0443 KOps/s | |
test_plain_set_nested_inplace | 66.5540μs | 17.7125μs | 56.4574 KOps/s | 55.9344 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2342ms | 0.1737ms | 5.7571 KOps/s | 5.6624 KOps/s | |
test_items | 42.4500μs | 2.4554μs | 407.2601 KOps/s | 403.5771 KOps/s | |
test_items_nested | 0.9264ms | 0.2774ms | 3.6043 KOps/s | 3.7259 KOps/s | |
test_items_nested_locked | 0.3360ms | 0.2814ms | 3.5539 KOps/s | 3.7123 KOps/s | |
test_items_nested_leaf | 0.5507ms | 0.1697ms | 5.8918 KOps/s | 6.0207 KOps/s | |
test_items_stack_nested | 1.4115ms | 1.3118ms | 762.2952 Ops/s | 653.8100 Ops/s | |
test_items_stack_nested_leaf | 1.2932ms | 1.1731ms | 852.4091 Ops/s | 718.9009 Ops/s | |
test_items_stack_nested_locked | 0.9303ms | 0.7626ms | 1.3112 KOps/s | 1.3094 KOps/s | |
test_keys | 15.3990μs | 4.1254μs | 242.3988 KOps/s | 260.4388 KOps/s | |
test_keys_nested | 58.7519ms | 0.1548ms | 6.4617 KOps/s | 6.8383 KOps/s | |
test_keys_nested_locked | 0.2108ms | 0.1433ms | 6.9797 KOps/s | 6.8435 KOps/s | |
test_keys_nested_leaf | 0.2073ms | 0.1264ms | 7.9138 KOps/s | 7.7185 KOps/s | |
test_keys_stack_nested | 1.4105ms | 1.2833ms | 779.2535 Ops/s | 667.1620 Ops/s | |
test_keys_stack_nested_leaf | 2.9077ms | 1.2916ms | 774.2608 Ops/s | 676.5470 Ops/s | |
test_keys_stack_nested_locked | 1.0680ms | 0.6853ms | 1.4592 KOps/s | 1.4426 KOps/s | |
test_values | 7.8245μs | 1.1579μs | 863.6371 KOps/s | 841.2603 KOps/s | |
test_values_nested | 0.1139ms | 52.1995μs | 19.1573 KOps/s | 19.2276 KOps/s | |
test_values_nested_locked | 0.1054ms | 52.4423μs | 19.0686 KOps/s | 19.1069 KOps/s | |
test_values_nested_leaf | 0.1067ms | 46.1768μs | 21.6559 KOps/s | 21.5934 KOps/s | |
test_values_stack_nested | 1.7035ms | 1.0647ms | 939.2343 Ops/s | 803.2234 Ops/s | |
test_values_stack_nested_leaf | 1.2275ms | 1.0217ms | 978.7632 Ops/s | 794.1083 Ops/s | |
test_values_stack_nested_locked | 0.9218ms | 0.5086ms | 1.9663 KOps/s | 1.9690 KOps/s | |
test_membership | 17.9930μs | 1.3634μs | 733.4550 KOps/s | 704.0183 KOps/s | |
test_membership_nested | 26.9610μs | 2.8686μs | 348.6003 KOps/s | 351.4403 KOps/s | |
test_membership_nested_leaf | 36.5590μs | 2.9332μs | 340.9296 KOps/s | 322.9385 KOps/s | |
test_membership_stacked_nested | 36.7090μs | 12.0287μs | 83.1345 KOps/s | 86.8252 KOps/s | |
test_membership_stacked_nested_leaf | 50.9360μs | 11.8878μs | 84.1199 KOps/s | 86.3282 KOps/s | |
test_membership_nested_last | 34.4140μs | 6.2125μs | 160.9655 KOps/s | 168.2317 KOps/s | |
test_membership_nested_leaf_last | 30.2170μs | 6.1145μs | 163.5457 KOps/s | 167.4071 KOps/s | |
test_membership_stacked_nested_last | 0.3199ms | 0.1676ms | 5.9676 KOps/s | 6.0002 KOps/s | |
test_membership_stacked_nested_leaf_last | 45.0040μs | 13.6947μs | 73.0208 KOps/s | 74.2207 KOps/s | |
test_nested_getleaf | 42.4700μs | 10.5256μs | 95.0066 KOps/s | 94.9275 KOps/s | |
test_nested_get | 30.5270μs | 9.9618μs | 100.3834 KOps/s | 99.4502 KOps/s | |
test_stacked_getleaf | 1.0050ms | 0.4746ms | 2.1068 KOps/s | 1.4741 KOps/s | |
test_stacked_get | 0.5319ms | 0.4406ms | 2.2697 KOps/s | 1.5468 KOps/s | |
test_nested_getitemleaf | 35.6670μs | 10.6005μs | 94.3356 KOps/s | 94.1845 KOps/s | |
test_nested_getitem | 38.2320μs | 10.0927μs | 99.0812 KOps/s | 99.3653 KOps/s | |
test_stacked_getitemleaf | 0.5644ms | 0.4762ms | 2.0998 KOps/s | 1.4807 KOps/s | |
test_stacked_getitem | 0.6791ms | 0.4418ms | 2.2636 KOps/s | 1.5441 KOps/s | |
test_lock_nested | 2.0547ms | 0.4211ms | 2.3746 KOps/s | 2.4171 KOps/s | |
test_lock_stack_nested | 83.3618ms | 6.9231ms | 144.4446 Ops/s | 148.6991 Ops/s | |
test_unlock_nested | 87.9306ms | 0.5156ms | 1.9394 KOps/s | 2.3686 KOps/s | |
test_unlock_stack_nested | 89.3231ms | 6.6747ms | 149.8191 Ops/s | 158.7116 Ops/s | |
test_flatten_speed | 0.7433ms | 0.3643ms | 2.7452 KOps/s | 2.6951 KOps/s | |
test_unflatten_speed | 0.5608ms | 0.4579ms | 2.1839 KOps/s | 2.2152 KOps/s | |
test_common_ops | 3.7087ms | 0.6749ms | 1.4817 KOps/s | 1.5469 KOps/s | |
test_creation | 19.5970μs | 2.0215μs | 494.6839 KOps/s | 482.4042 KOps/s | |
test_creation_empty | 36.5290μs | 7.9083μs | 126.4501 KOps/s | 129.2805 KOps/s | |
test_creation_nested_1 | 37.0290μs | 10.7329μs | 93.1715 KOps/s | 94.8619 KOps/s | |
test_creation_nested_2 | 61.5960μs | 16.0558μs | 62.2827 KOps/s | 62.7703 KOps/s | |
test_clone | 0.3205ms | 12.3811μs | 80.7682 KOps/s | 79.4243 KOps/s | |
test_getitem[int] | 41.0370μs | 12.2847μs | 81.4021 KOps/s | 84.1615 KOps/s | |
test_getitem[slice_int] | 79.9390μs | 24.1394μs | 41.4261 KOps/s | 40.6568 KOps/s | |
test_getitem[range] | 0.1656ms | 42.6411μs | 23.4516 KOps/s | 23.9761 KOps/s | |
test_getitem[tuple] | 51.4670μs | 19.5400μs | 51.1770 KOps/s | 52.2095 KOps/s | |
test_getitem[list] | 97.3930μs | 37.5515μs | 26.6301 KOps/s | 27.3438 KOps/s | |
test_setitem_dim[int] | 50.9450μs | 27.9881μs | 35.7295 KOps/s | 36.4471 KOps/s | |
test_setitem_dim[slice_int] | 0.1484ms | 53.1931μs | 18.7994 KOps/s | 19.0685 KOps/s | |
test_setitem_dim[range] | 0.1415ms | 71.1973μs | 14.0455 KOps/s | 14.2008 KOps/s | |
test_setitem_dim[tuple] | 71.8650μs | 41.9820μs | 23.8197 KOps/s | 24.4175 KOps/s | |
test_setitem | 0.2401ms | 17.6607μs | 56.6228 KOps/s | 57.9751 KOps/s | |
test_set | 0.2551ms | 17.1282μs | 58.3831 KOps/s | 59.0782 KOps/s | |
test_set_shared | 2.1774ms | 0.1382ms | 7.2336 KOps/s | 7.3638 KOps/s | |
test_update | 0.1823ms | 18.8046μs | 53.1784 KOps/s | 54.1414 KOps/s | |
test_update_nested | 60.3530μs | 25.6207μs | 39.0309 KOps/s | 38.8301 KOps/s | |
test_set_nested | 0.2256ms | 18.7088μs | 53.4509 KOps/s | 53.3464 KOps/s | |
test_set_nested_new | 0.1943ms | 22.7785μs | 43.9010 KOps/s | 44.5875 KOps/s | |
test_select | 0.1054ms | 46.0508μs | 21.7152 KOps/s | 21.5638 KOps/s | |
test_unbind_speed | 0.4902ms | 0.3430ms | 2.9152 KOps/s | 2.9452 KOps/s | |
test_unbind_speed_stack0 | 72.9809ms | 4.5104ms | 221.7121 Ops/s | 250.8482 Ops/s | |
test_unbind_speed_stack1 | 2.7151μs | 0.6338μs | 1.5778 MOps/s | 1.5590 MOps/s | |
test_split | 3.2736ms | 1.5504ms | 644.9797 Ops/s | 581.2285 Ops/s | |
test_chunk | 64.3080ms | 1.6447ms | 608.0242 Ops/s | 590.6739 Ops/s | |
test_creation[device0] | 3.4212ms | 0.2929ms | 3.4144 KOps/s | 3.4191 KOps/s | |
test_creation_from_tensor | 2.9263ms | 0.3294ms | 3.0357 KOps/s | 3.0491 KOps/s | |
test_add_one[memmap_tensor0] | 0.3874ms | 25.4119μs | 39.3517 KOps/s | 38.6460 KOps/s | |
test_contiguous[memmap_tensor0] | 46.3760μs | 5.9241μs | 168.8026 KOps/s | 167.4780 KOps/s | |
test_stack[memmap_tensor0] | 72.5360μs | 19.6211μs | 50.9656 KOps/s | 50.8983 KOps/s | |
test_memmaptd_index | 0.2810ms | 0.1999ms | 5.0028 KOps/s | 4.9361 KOps/s | |
test_memmaptd_index_astensor | 0.4211ms | 0.2583ms | 3.8722 KOps/s | 3.8219 KOps/s | |
test_memmaptd_index_op | 1.1640ms | 0.5085ms | 1.9667 KOps/s | 1.9260 KOps/s | |
test_serialize_model | 0.1655s | 0.1047s | 9.5495 Ops/s | 9.9410 Ops/s | |
test_serialize_model_filesystem | 98.6964ms | 91.5896ms | 10.9183 Ops/s | 10.7360 Ops/s | |
test_serialize_model_pickle | 0.4520s | 0.3816s | 2.6203 Ops/s | 2.5937 Ops/s | |
test_serialize_weights | 0.1039s | 95.9982ms | 10.4169 Ops/s | 9.4540 Ops/s | |
test_serialize_weights_filesystem | 0.1627s | 96.3534ms | 10.3785 Ops/s | 10.8085 Ops/s | |
test_serialize_weights_returnearly | 0.1265s | 0.1203s | 8.3105 Ops/s | 7.5560 Ops/s | |
test_serialize_weights_pickle | 1.1563s | 0.6685s | 1.4959 Ops/s | 1.3274 Ops/s | |
test_reshape_pytree | 75.9520μs | 23.4755μs | 42.5975 KOps/s | 42.6020 KOps/s | |
test_reshape_td | 73.3670μs | 30.4465μs | 32.8445 KOps/s | 32.4895 KOps/s | |
test_view_pytree | 69.0900μs | 23.3036μs | 42.9118 KOps/s | 43.0073 KOps/s | |
test_view_td | 27.1910μs | 4.8877μs | 204.5950 KOps/s | 201.1965 KOps/s | |
test_unbind_pytree | 58.9800μs | 26.6190μs | 37.5672 KOps/s | 37.2769 KOps/s | |
test_unbind_td | 87.4840μs | 55.4662μs | 18.0290 KOps/s | 18.0139 KOps/s | |
test_split_pytree | 54.5820μs | 26.2997μs | 38.0233 KOps/s | 38.0000 KOps/s | |
test_split_td | 0.5898ms | 43.7534μs | 22.8554 KOps/s | 22.9920 KOps/s | |
test_add_pytree | 0.1217ms | 32.2738μs | 30.9849 KOps/s | 31.2496 KOps/s | |
test_add_td | 94.3070μs | 44.7293μs | 22.3567 KOps/s | 22.7119 KOps/s | |
test_distributed | 24.2250μs | 6.0745μs | 164.6221 KOps/s | 163.0276 KOps/s | |
test_tdmodule | 0.3909ms | 21.3623μs | 46.8113 KOps/s | 46.5173 KOps/s | |
test_tdmodule_dispatch | 0.1953ms | 37.8466μs | 26.4225 KOps/s | 26.2346 KOps/s | |
test_tdseq | 42.6200μs | 24.2654μs | 41.2109 KOps/s | 41.4538 KOps/s | |
test_tdseq_dispatch | 0.4367ms | 43.5119μs | 22.9822 KOps/s | 23.6098 KOps/s | |
test_instantiation_functorch | 1.4698ms | 1.3096ms | 763.6160 Ops/s | 760.5597 Ops/s | |
test_instantiation_td | 74.1761ms | 1.0829ms | 923.4041 Ops/s | 982.1427 Ops/s | |
test_exec_functorch | 0.3533ms | 0.1574ms | 6.3537 KOps/s | 6.1597 KOps/s | |
test_exec_functional_call | 0.3191ms | 0.1449ms | 6.9033 KOps/s | 6.6778 KOps/s | |
test_exec_td | 0.3552ms | 0.1425ms | 7.0167 KOps/s | 6.9092 KOps/s | |
test_exec_td_decorator | 1.0449ms | 0.1737ms | 5.7580 KOps/s | 5.6540 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4223ms | 0.8992ms | 1.1121 KOps/s | 1.0957 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7336ms | 0.4754ms | 2.1033 KOps/s | 2.1246 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0163ms | 0.7803ms | 1.2816 KOps/s | 1.2542 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6017ms | 0.3925ms | 2.5476 KOps/s | 2.5976 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.6907ms | 1.7821ms | 561.1316 Ops/s | 550.9478 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8341ms | 0.5145ms | 1.9438 KOps/s | 1.9581 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.4271ms | 1.4927ms | 669.9204 Ops/s | 659.8723 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7060ms | 0.4002ms | 2.4985 KOps/s | 2.5354 KOps/s |
Labels: CLA Signed, enhancement
Description
Allows `torch.stack` to return a TensorDict whenever possible, and a LazyStackedTensorDict otherwise.
This is aimed at simplifying stacking tensordicts together while preserving the features of LazyStackedTDs.
Currently, this behaviour is only enabled when `set_lazy_legacy(False)` is called. In the future, `lazy_legacy()` will be `False` by default, making this behaviour the default in tensordict. In the PyTorch PR, this will be the only accepted behaviour of `torch.stack`.

cc @shagunsodhani @matteobettini
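
For readers unfamiliar with the idea, the "dense when possible, lazy otherwise" dispatch described above can be illustrated with a small pure-Python model. This is only a sketch and not the code in this PR: `DenseStack`, `LazyStack`, `can_stack_densely` and `best_intention_stack` are hypothetical names, plain dicts of lists stand in for tensordicts, and "possible" is assumed to mean that all items share the same keys and the same per-key lengths.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Stand-in for a tensordict: key -> 1-D "tensor" represented as a list of floats.
Entry = Dict[str, List[float]]


@dataclass
class DenseStack:
    """Hypothetical stand-in for a regular TensorDict holding stacked values."""
    data: Dict[str, List[List[float]]]


@dataclass
class LazyStack:
    """Hypothetical stand-in for a LazyStackedTensorDict: items are kept as-is."""
    items: List[Entry]


def can_stack_densely(items: List[Entry]) -> bool:
    """Assumed criterion: identical key sets and identical per-key lengths."""
    first = items[0]
    for item in items[1:]:
        if item.keys() != first.keys():
            return False
        if any(len(item[key]) != len(first[key]) for key in first):
            return False
    return True


def best_intention_stack(items: List[Entry]) -> Union[DenseStack, LazyStack]:
    """Return a dense stack when the items are compatible, a lazy one otherwise."""
    if not items:
        raise ValueError("cannot stack an empty list")
    if can_stack_densely(items):
        # Compatible items: materialise one stacked value per key.
        return DenseStack({key: [item[key] for item in items] for key in items[0]})
    # Incompatible items: fall back to a container that only references them.
    return LazyStack(list(items))


a = {"obs": [1.0, 2.0], "reward": [0.5]}
b = {"obs": [3.0, 4.0], "reward": [1.5]}
c = {"obs": [5.0, 6.0, 7.0]}  # different keys and a different length

dense = best_intention_stack([a, b])  # keys and lengths match
lazy = best_intention_stack([a, c])   # mismatch, so the lazy form is returned
print(type(dense).__name__, type(lazy).__name__)
```

The point of the fallback is that callers get one entry point for stacking and can check the returned type when they need to know whether the result was materialised.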