[BugFix] Better tensor allocation in memmap_like #543
Merged
facebook-github-bot added the CLA Signed label on Oct 11, 2023
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1253ms | 20.3075μs | 49.2428 KOps/s | 45.9230 KOps/s | |
test_plain_set_stack_nested | 1.5014ms | 0.2062ms | 4.8497 KOps/s | 4.3702 KOps/s | |
test_plain_set_nested_inplace | 2.0826ms | 29.1581μs | 34.2958 KOps/s | 33.7351 KOps/s | |
test_plain_set_stack_nested_inplace | 5.1725ms | 0.2842ms | 3.5192 KOps/s | 3.8326 KOps/s | |
test_items | 2.1288ms | 3.9267μs | 254.6648 KOps/s | 256.3579 KOps/s | |
test_items_nested | 2.6436ms | 0.3958ms | 2.5268 KOps/s | 2.5080 KOps/s | |
test_items_nested_locked | 6.5170ms | 0.3962ms | 2.5241 KOps/s | 2.4876 KOps/s | |
test_items_nested_leaf | 3.4541ms | 0.2445ms | 4.0900 KOps/s | 4.0189 KOps/s | |
test_items_stack_nested | 4.8019ms | 2.4640ms | 405.8515 Ops/s | 392.1884 Ops/s | |
test_items_stack_nested_leaf | 7.4461ms | 2.3652ms | 422.7905 Ops/s | 447.0910 Ops/s | |
test_items_stack_nested_locked | 6.9820ms | 1.3643ms | 732.9740 Ops/s | 782.3738 Ops/s | |
test_keys | 1.3312ms | 5.3127μs | 188.2290 KOps/s | 192.7224 KOps/s | |
test_keys_nested | 2.3374ms | 0.1892ms | 5.2854 KOps/s | 4.9317 KOps/s | |
test_keys_nested_locked | 9.1093ms | 0.1867ms | 5.3563 KOps/s | 5.2806 KOps/s | |
test_keys_nested_leaf | 2.8880ms | 0.1853ms | 5.3960 KOps/s | 5.3787 KOps/s | |
test_keys_stack_nested | 7.6089ms | 2.3127ms | 432.3881 Ops/s | 417.1650 Ops/s | |
test_keys_stack_nested_leaf | 5.3414ms | 2.4231ms | 412.6955 Ops/s | 417.2773 Ops/s | |
test_keys_stack_nested_locked | 2.6028ms | 1.1112ms | 899.8917 Ops/s | 883.0447 Ops/s | |
test_values | 0.3191ms | 1.4503μs | 689.5179 KOps/s | 694.7675 KOps/s | |
test_values_nested | 2.4649ms | 69.4485μs | 14.3992 KOps/s | 13.9199 KOps/s | |
test_values_nested_locked | 1.2254ms | 66.6422μs | 15.0055 KOps/s | 14.1805 KOps/s | |
test_values_nested_leaf | 4.9540ms | 60.7980μs | 16.4479 KOps/s | 15.6836 KOps/s | |
test_values_stack_nested | 6.2717ms | 1.9920ms | 502.0035 Ops/s | 508.0448 Ops/s | |
test_values_stack_nested_leaf | 3.9445ms | 1.9755ms | 506.2131 Ops/s | 494.4853 Ops/s | |
test_values_stack_nested_locked | 2.5155ms | 0.9026ms | 1.1079 KOps/s | 1.1423 KOps/s | |
test_membership | 1.4753ms | 2.1855μs | 457.5635 KOps/s | 558.9156 KOps/s | |
test_membership_nested | 1.2034ms | 4.3104μs | 231.9959 KOps/s | 248.5675 KOps/s | |
test_membership_nested_leaf | 0.9206ms | 4.3520μs | 229.7783 KOps/s | 235.5521 KOps/s | |
test_membership_stacked_nested | 1.1331ms | 16.2057μs | 61.7066 KOps/s | 61.8999 KOps/s | |
test_membership_stacked_nested_leaf | 0.4260ms | 15.9529μs | 62.6846 KOps/s | 58.8257 KOps/s | |
test_membership_nested_last | 4.5696ms | 8.7793μs | 113.9049 KOps/s | 118.6792 KOps/s | |
test_membership_nested_leaf_last | 1.2039ms | 8.7112μs | 114.7946 KOps/s | 114.0584 KOps/s | |
test_membership_stacked_nested_last | 2.9628ms | 0.2751ms | 3.6356 KOps/s | 3.6095 KOps/s | |
test_membership_stacked_nested_leaf_last | 1.2531ms | 20.0917μs | 49.7717 KOps/s | 49.3476 KOps/s | |
test_nested_getleaf | 1.4224ms | 17.0221μs | 58.7472 KOps/s | 55.0062 KOps/s | |
test_nested_get | 0.3420ms | 16.0486μs | 62.3105 KOps/s | 59.8196 KOps/s | |
test_stacked_getleaf | 3.2808ms | 1.0902ms | 917.2474 Ops/s | 924.4855 Ops/s | |
test_stacked_get | 6.9611ms | 1.0563ms | 946.6659 Ops/s | 973.6335 Ops/s | |
test_nested_getitemleaf | 1.4626ms | 17.1725μs | 58.2326 KOps/s | 56.3858 KOps/s | |
test_nested_getitem | 1.1629ms | 16.3148μs | 61.2942 KOps/s | 60.5595 KOps/s | |
test_stacked_getitemleaf | 6.8877ms | 1.0980ms | 910.7560 Ops/s | 893.2261 Ops/s | |
test_stacked_getitem | 3.1988ms | 1.0213ms | 979.1509 Ops/s | 1.0193 KOps/s | |
test_lock_nested | 83.5075ms | 1.9299ms | 518.1509 Ops/s | 541.1430 Ops/s | |
test_lock_stack_nested | 0.1116s | 24.4534ms | 40.8941 Ops/s | 39.1593 Ops/s | |
test_unlock_nested | 79.7144ms | 1.8756ms | 533.1647 Ops/s | 511.8164 Ops/s | |
test_unlock_stack_nested | 0.1217s | 25.2719ms | 39.5696 Ops/s | 38.7493 Ops/s | |
test_flatten_speed | 6.2838ms | 1.1203ms | 892.6516 Ops/s | 849.1616 Ops/s | |
test_unflatten_speed | 4.8611ms | 2.0570ms | 486.1485 Ops/s | 477.8297 Ops/s | |
test_common_ops | 3.7408ms | 1.6600ms | 602.4168 Ops/s | 627.6735 Ops/s | |
test_creation | 0.8265ms | 6.5975μs | 151.5727 KOps/s | 141.8989 KOps/s | |
test_creation_empty | 1.2935ms | 15.5652μs | 64.2458 KOps/s | 60.5570 KOps/s | |
test_creation_nested_1 | 1.5741ms | 28.8176μs | 34.7010 KOps/s | 30.7995 KOps/s | |
test_creation_nested_2 | 1.0136ms | 32.2518μs | 31.0060 KOps/s | 28.4411 KOps/s | |
test_clone | 0.9406ms | 33.1811μs | 30.1376 KOps/s | 30.4864 KOps/s | |
test_getitem[int] | 4.9104ms | 38.1325μs | 26.2243 KOps/s | 26.6930 KOps/s | |
test_getitem[slice_int] | 1.6649ms | 79.1019μs | 12.6419 KOps/s | 12.8225 KOps/s | |
test_getitem[range] | 1.2555ms | 0.1380ms | 7.2479 KOps/s | 8.0378 KOps/s | |
test_getitem[tuple] | 2.5602ms | 70.2189μs | 14.2412 KOps/s | 15.8705 KOps/s | |
test_getitem[list] | 1.4524ms | 0.1341ms | 7.4592 KOps/s | 7.7725 KOps/s | |
test_setitem_dim[int] | 0.8447ms | 61.4634μs | 16.2698 KOps/s | 16.5410 KOps/s | |
test_setitem_dim[slice_int] | 2.6030ms | 0.1170ms | 8.5487 KOps/s | 10.4188 KOps/s | |
test_setitem_dim[range] | 1.7557ms | 0.1504ms | 6.6478 KOps/s | 7.1310 KOps/s | |
test_setitem_dim[tuple] | 2.7720ms | 83.3356μs | 11.9997 KOps/s | 12.1379 KOps/s | |
test_setitem | 2.8571ms | 49.2693μs | 20.2966 KOps/s | 21.8481 KOps/s | |
test_set | 1.9870ms | 47.0715μs | 21.2443 KOps/s | 22.4526 KOps/s | |
test_set_shared | 2.9542ms | 0.3275ms | 3.0532 KOps/s | 2.9161 KOps/s | |
test_update | 4.8988ms | 55.7000μs | 17.9533 KOps/s | 19.2858 KOps/s | |
test_update_nested | 3.0663ms | 73.0738μs | 13.6848 KOps/s | 12.8469 KOps/s | |
test_set_nested | 8.3263ms | 53.5936μs | 18.6590 KOps/s | 20.1380 KOps/s | |
test_set_nested_new | 3.2019ms | 76.6758μs | 13.0419 KOps/s | 13.8633 KOps/s | |
test_select | 4.1147ms | 0.1399ms | 7.1479 KOps/s | 7.7206 KOps/s | |
test_unbind_speed | 3.7550ms | 0.7821ms | 1.2786 KOps/s | 1.3348 KOps/s | |
test_unbind_speed_stack0 | 88.3994ms | 10.4818ms | 95.4036 Ops/s | 96.8713 Ops/s | |
test_unbind_speed_stack1 | 4.6152ms | 1.2389μs | 807.1704 KOps/s | 1.0472 MOps/s | |
test_creation[device0] | 6.0431ms | 0.6073ms | 1.6467 KOps/s | 1.6446 KOps/s | |
test_creation_from_tensor | 7.9095ms | 0.6778ms | 1.4755 KOps/s | 1.6545 KOps/s | |
test_add_one[memmap_tensor0] | 3.9124ms | 72.7928μs | 13.7376 KOps/s | 15.8867 KOps/s | |
test_contiguous[memmap_tensor0] | 1.0459ms | 13.5892μs | 73.5880 KOps/s | 80.8446 KOps/s | |
test_stack[memmap_tensor0] | 2.2030ms | 47.8967μs | 20.8783 KOps/s | 21.9940 KOps/s | |
test_memmaptd_index | 5.1818ms | 0.4187ms | 2.3881 KOps/s | 2.5861 KOps/s | |
test_memmaptd_index_astensor | 7.4053ms | 1.9253ms | 519.3883 Ops/s | 488.3396 Ops/s | |
test_memmaptd_index_op | 7.4785ms | 5.1341ms | 194.7777 Ops/s | 190.9614 Ops/s | |
test_reshape_pytree | 2.6013ms | 41.5962μs | 24.0407 KOps/s | 25.0614 KOps/s | |
test_reshape_td | 1.4945ms | 52.9956μs | 18.8695 KOps/s | 18.6939 KOps/s | |
test_view_pytree | 2.6258ms | 38.8592μs | 25.7339 KOps/s | 26.5494 KOps/s | |
test_view_td | 0.8918ms | 10.0651μs | 99.3529 KOps/s | 101.9492 KOps/s | |
test_unbind_pytree | 1.0609ms | 48.0864μs | 20.7959 KOps/s | 21.6465 KOps/s | |
test_unbind_td | 2.3601ms | 0.1310ms | 7.6343 KOps/s | 8.3943 KOps/s | |
test_split_pytree | 2.3407ms | 51.1323μs | 19.5571 KOps/s | 22.8530 KOps/s | |
test_split_td | 3.0771ms | 0.1546ms | 6.4679 KOps/s | 6.8749 KOps/s | |
test_add_pytree | 2.8924ms | 76.7669μs | 13.0264 KOps/s | 14.6333 KOps/s | |
test_add_td | 2.6833ms | 0.1383ms | 7.2291 KOps/s | 7.2776 KOps/s | |
test_distributed | 0.7993ms | 9.4978μs | 105.2871 KOps/s | 111.5351 KOps/s | |
test_tdmodule | 0.5962ms | 40.5425μs | 24.6655 KOps/s | 25.6911 KOps/s | |
test_tdmodule_dispatch | 0.8099ms | 78.0050μs | 12.8197 KOps/s | 13.0100 KOps/s | |
test_tdseq | 0.4077ms | 48.3397μs | 20.6869 KOps/s | 22.2486 KOps/s | |
test_tdseq_dispatch | 0.3341ms | 95.1994μs | 10.5043 KOps/s | 10.0374 KOps/s | |
test_instantiation_functorch | 6.3699ms | 2.2143ms | 451.6078 Ops/s | 495.9035 Ops/s | |
test_instantiation_td | 4.9764ms | 1.6871ms | 592.7490 Ops/s | 612.6192 Ops/s | |
test_exec_functorch | 1.5201ms | 0.3077ms | 3.2501 KOps/s | 3.3403 KOps/s | |
test_exec_td | 2.7371ms | 0.3219ms | 3.1065 KOps/s | 3.2950 KOps/s | |
test_vmap_mlp_speed[True-True] | 16.3034ms | 2.0686ms | 483.4082 Ops/s | 495.9900 Ops/s | |
test_vmap_mlp_speed[True-False] | 5.8715ms | 1.1130ms | 898.5097 Ops/s | 911.0957 Ops/s | |
test_vmap_mlp_speed[False-True] | 13.7816ms | 1.8334ms | 545.4367 Ops/s | 553.5525 Ops/s | |
test_vmap_mlp_speed[False-False] | 12.0079ms | 0.8848ms | 1.1302 KOps/s | 1.1749 KOps/s |
matteobettini approved these changes on Oct 11, 2023
LGTM
Labels: bug (Something isn't working), CLA Signed
The `memmap_like` function does not use `_set_str`, so `to(...)` can be called on memmap tensors, turning them into regular tensors. This PR solves this issue.
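As a loose analogy for why losing the memory-mapped status matters, here is a minimal sketch using only Python's standard `mmap` module. It is not tensordict code and does not reproduce `memmap_like`, `_set_str`, or `to(...)`; the helper name `demo_mapping_vs_copy` is made up for illustration. It only shows that a buffer copied out of a memory map is a regular in-memory object, so later writes to it never reach the file.

```python
import mmap
import os
import tempfile


def demo_mapping_vs_copy():
    """Hypothetical helper: compare a write through a mapping with a write to a copy.

    Returns the file contents after each of the two writes.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\x00" * 4)
        os.close(fd)

        with open(path, "r+b") as f:
            mapped = mmap.mmap(f.fileno(), 4)

            # Writing through the mapping changes the file on disk.
            mapped[0] = 1
            mapped.flush()
            with open(path, "rb") as check:
                after_mapped_write = check.read()

            # A conversion that copies the data (here: bytearray) produces a
            # regular in-memory buffer that is no longer tied to the file.
            regular = bytearray(mapped)
            regular[1] = 2
            with open(path, "rb") as check:
                after_copy_write = check.read()

            mapped.close()
        return after_mapped_write, after_copy_write
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo_mapping_vs_copy())
```

The second value equals the first: the write to the copy is invisible to the file, which is the kind of silent downgrade the description above refers to.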