[BugFix] Allow inplace modification of non-tensor data in locked tds #694

Merged
merged 2 commits into from
Mar 1, 2024

Conversation

vmoens
Contributor

@vmoens vmoens commented Mar 1, 2024

No description provided.

@facebook-github-bot facebook-github-bot added the CLA Signed label Mar 1, 2024
@vmoens vmoens added the bug label Mar 1, 2024
@vmoens vmoens marked this pull request as ready for review March 1, 2024 17:28

github-actions bot commented Mar 1, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 126. Improved: 7. Worsened: 29.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 57.4670μs | 17.4777μs | 57.2157 KOps/s | 63.2192 KOps/s | **-9.50%** |
| test_plain_set_stack_nested | 64.9090μs | 17.4612μs | 57.2700 KOps/s | 61.1391 KOps/s | **-6.33%** |
| test_plain_set_nested_inplace | 44.0420μs | 19.6304μs | 50.9413 KOps/s | 53.9656 KOps/s | **-5.60%** |
| test_plain_set_stack_nested_inplace | 50.9750μs | 19.7257μs | 50.6954 KOps/s | 53.5100 KOps/s | **-5.26%** |
| test_items | 11.0010μs | 2.4121μs | 414.5782 KOps/s | 394.8771 KOps/s | +4.99% |
| test_items_nested | 0.3268ms | 0.2689ms | 3.7185 KOps/s | 3.7347 KOps/s | -0.43% |
| test_items_nested_locked | 0.9455ms | 0.2684ms | 3.7251 KOps/s | 3.7215 KOps/s | +0.10% |
| test_items_nested_leaf | 0.3745ms | 0.1665ms | 6.0063 KOps/s | 5.9837 KOps/s | +0.38% |
| test_items_stack_nested | 0.3687ms | 0.2682ms | 3.7289 KOps/s | 3.6771 KOps/s | +1.41% |
| test_items_stack_nested_leaf | 0.7189ms | 0.1678ms | 5.9597 KOps/s | 6.0517 KOps/s | -1.52% |
| test_items_stack_nested_locked | 0.3383ms | 0.2678ms | 3.7347 KOps/s | 3.6740 KOps/s | +1.65% |
| test_keys | 40.0450μs | 3.8084μs | 262.5801 KOps/s | 259.9959 KOps/s | +0.99% |
| test_keys_nested | 2.0494ms | 0.1515ms | 6.6018 KOps/s | 6.6101 KOps/s | -0.13% |
| test_keys_nested_locked | 0.2862ms | 0.1570ms | 6.3693 KOps/s | 6.4972 KOps/s | -1.97% |
| test_keys_nested_leaf | 35.8288ms | 0.1380ms | 7.2462 KOps/s | 7.6445 KOps/s | **-5.21%** |
| test_keys_stack_nested | 0.2913ms | 0.1514ms | 6.6046 KOps/s | 6.5163 KOps/s | +1.35% |
| test_keys_stack_nested_leaf | 0.2331ms | 0.1329ms | 7.5222 KOps/s | 7.4449 KOps/s | +1.04% |
| test_keys_stack_nested_locked | 0.2844ms | 0.1573ms | 6.3554 KOps/s | 6.3584 KOps/s | -0.05% |
| test_values | 10.0335μs | 1.1629μs | 859.9105 KOps/s | 855.5989 KOps/s | +0.50% |
| test_values_nested | 0.1041ms | 52.4439μs | 19.0680 KOps/s | 19.1779 KOps/s | -0.57% |
| test_values_nested_locked | 99.1450μs | 52.6904μs | 18.9788 KOps/s | 19.2171 KOps/s | -1.24% |
| test_values_nested_leaf | 0.1135ms | 46.5151μs | 21.4984 KOps/s | 21.3987 KOps/s | +0.47% |
| test_values_stack_nested | 0.1037ms | 52.1910μs | 19.1604 KOps/s | 18.9161 KOps/s | +1.29% |
| test_values_stack_nested_leaf | 95.4870μs | 46.0196μs | 21.7299 KOps/s | 21.4151 KOps/s | +1.47% |
| test_values_stack_nested_locked | 0.1009ms | 51.8194μs | 19.2978 KOps/s | 18.9326 KOps/s | +1.93% |
| test_membership | 19.5560μs | 1.3345μs | 749.3291 KOps/s | 733.9553 KOps/s | +2.09% |
| test_membership_nested | 31.3590μs | 3.4654μs | 288.5699 KOps/s | 285.2588 KOps/s | +1.16% |
| test_membership_nested_leaf | 19.7870μs | 3.4425μs | 290.4882 KOps/s | 285.5593 KOps/s | +1.73% |
| test_membership_stacked_nested | 21.7600μs | 3.4469μs | 290.1166 KOps/s | 270.9272 KOps/s | **+7.08%** |
| test_membership_stacked_nested_leaf | 20.2080μs | 3.4666μs | 288.4678 KOps/s | 283.3040 KOps/s | +1.82% |
| test_membership_nested_last | 23.8640μs | 4.3253μs | 231.1976 KOps/s | 234.0940 KOps/s | -1.24% |
| test_membership_nested_leaf_last | 32.8710μs | 4.3087μs | 232.0893 KOps/s | 231.2519 KOps/s | +0.36% |
| test_membership_stacked_nested_last | 31.1250μs | 4.2862μs | 233.3086 KOps/s | 206.3702 KOps/s | **+13.05%** |
| test_membership_stacked_nested_leaf_last | 33.8450μs | 4.3016μs | 232.4707 KOps/s | 204.1274 KOps/s | **+13.89%** |
| test_nested_getleaf | 52.1670μs | 10.8921μs | 91.8094 KOps/s | 95.6205 KOps/s | -3.99% |
| test_nested_get | 26.7900μs | 10.2102μs | 97.9413 KOps/s | 101.0829 KOps/s | -3.11% |
| test_stacked_getleaf | 52.3580μs | 10.7574μs | 92.9589 KOps/s | 95.5917 KOps/s | -2.75% |
| test_stacked_get | 51.0050μs | 10.1196μs | 98.8181 KOps/s | 100.8822 KOps/s | -2.05% |
| test_nested_getitemleaf | 35.0050μs | 11.2511μs | 88.8805 KOps/s | 91.0987 KOps/s | -2.43% |
| test_nested_getitem | 49.9030μs | 10.6345μs | 94.0333 KOps/s | 96.4795 KOps/s | -2.54% |
| test_stacked_getitemleaf | 51.2960μs | 11.0484μs | 90.5106 KOps/s | 91.4144 KOps/s | -0.99% |
| test_stacked_getitem | 36.0270μs | 10.6144μs | 94.2118 KOps/s | 96.2651 KOps/s | -2.13% |
| test_lock_nested | 0.6874ms | 0.3386ms | 2.9535 KOps/s | 3.0078 KOps/s | -1.81% |
| test_lock_stack_nested | 0.4492ms | 0.3052ms | 3.2761 KOps/s | 3.4611 KOps/s | **-5.35%** |
| test_unlock_nested | 80.8149ms | 0.4227ms | 2.3660 KOps/s | 2.4719 KOps/s | -4.28% |
| test_unlock_stack_nested | 0.4439ms | 0.3123ms | 3.2018 KOps/s | 3.3581 KOps/s | -4.65% |
| test_flatten_speed | 0.5560ms | 0.2941ms | 3.4007 KOps/s | 3.4748 KOps/s | -2.13% |
| test_unflatten_speed | 0.8738ms | 0.4083ms | 2.4492 KOps/s | 2.4265 KOps/s | +0.94% |
| test_common_ops | 1.1914ms | 0.7065ms | 1.4154 KOps/s | 1.5449 KOps/s | **-8.38%** |
| test_creation | 17.0820μs | 1.8519μs | 539.9743 KOps/s | 550.9766 KOps/s | -2.00% |
| test_creation_empty | 33.0720μs | 10.7768μs | 92.7915 KOps/s | 115.4827 KOps/s | **-19.65%** |
| test_creation_nested_1 | 43.1810μs | 13.2639μs | 75.3927 KOps/s | 88.7079 KOps/s | **-15.01%** |
| test_creation_nested_2 | 45.3140μs | 16.4847μs | 60.6622 KOps/s | 67.9661 KOps/s | **-10.75%** |
| test_clone | 57.6470μs | 13.8015μs | 72.4558 KOps/s | 78.0008 KOps/s | **-7.11%** |
| test_getitem[int] | 31.1070μs | 11.1270μs | 89.8716 KOps/s | 92.8926 KOps/s | -3.25% |
| test_getitem[slice_int] | 58.0380μs | 22.2034μs | 45.0381 KOps/s | 45.5767 KOps/s | -1.18% |
| test_getitem[range] | 0.1460ms | 41.9865μs | 23.8172 KOps/s | 24.9231 KOps/s | -4.44% |
| test_getitem[tuple] | 43.7420μs | 18.3896μs | 54.3786 KOps/s | 55.8824 KOps/s | -2.69% |
| test_getitem[list] | 0.1562ms | 36.5524μs | 27.3580 KOps/s | 28.0868 KOps/s | -2.59% |
| test_setitem_dim[int] | 84.8280μs | 36.5465μs | 27.3624 KOps/s | 30.3947 KOps/s | **-9.98%** |
| test_setitem_dim[slice_int] | 0.1182ms | 62.2858μs | 16.0550 KOps/s | 17.2092 KOps/s | **-6.71%** |
| test_setitem_dim[range] | 0.1094ms | 80.7378μs | 12.3858 KOps/s | 12.9526 KOps/s | -4.38% |
| test_setitem_dim[tuple] | 97.4720μs | 50.4738μs | 19.8122 KOps/s | 21.1772 KOps/s | **-6.45%** |
| test_setitem | 64.1990μs | 21.3242μs | 46.8951 KOps/s | 55.4291 KOps/s | **-15.40%** |
| test_set | 0.1350ms | 20.3004μs | 49.2602 KOps/s | 56.3363 KOps/s | **-12.56%** |
| test_set_shared | 1.7373ms | 0.1453ms | 6.8814 KOps/s | 7.1440 KOps/s | -3.68% |
| test_update | 0.1643ms | 24.3609μs | 41.0494 KOps/s | 49.4654 KOps/s | **-17.01%** |
| test_update_nested | 0.1027ms | 31.7515μs | 31.4946 KOps/s | 35.9248 KOps/s | **-12.33%** |
| test_set_nested | 76.4230μs | 22.8590μs | 43.7465 KOps/s | 49.9475 KOps/s | **-12.41%** |
| test_set_nested_new | 87.7440μs | 26.3799μs | 37.9076 KOps/s | 42.1637 KOps/s | **-10.09%** |
| test_select | 0.1331ms | 39.8902μs | 25.0688 KOps/s | 27.0747 KOps/s | **-7.41%** |
| test_select_nested | 0.1266ms | 58.1133μs | 17.2078 KOps/s | 17.2692 KOps/s | -0.36% |
| test_exclude_nested | 0.2290ms | 0.1161ms | 8.6167 KOps/s | 8.4839 KOps/s | +1.56% |
| test_empty[True] | 0.7250ms | 0.4128ms | 2.4225 KOps/s | 2.4267 KOps/s | -0.17% |
| test_empty[False] | 10.7600μs | 1.1754μs | 850.7762 KOps/s | 936.0900 KOps/s | **-9.11%** |
| test_unbind_speed | 0.3140ms | 0.2474ms | 4.0427 KOps/s | 4.1185 KOps/s | -1.84% |
| test_unbind_speed_stack0 | 0.4005ms | 0.2424ms | 4.1260 KOps/s | 4.2517 KOps/s | -2.96% |
| test_unbind_speed_stack1 | 1.0536ms | 0.6064ms | 1.6492 KOps/s | 1.5239 KOps/s | **+8.22%** |
| test_split | 0.1311s | 1.6361ms | 611.2145 Ops/s | 621.6905 Ops/s | -1.69% |
| test_chunk | 2.3485ms | 1.4470ms | 691.1032 Ops/s | 689.6406 Ops/s | +0.21% |
| test_creation[device0] | 0.1871ms | 0.1066ms | 9.3845 KOps/s | 9.5142 KOps/s | -1.36% |
| test_creation_from_tensor | 4.1622ms | 84.1172μs | 11.8882 KOps/s | 12.2334 KOps/s | -2.82% |
| test_add_one[memmap_tensor0] | 76.6930μs | 5.7950μs | 172.5627 KOps/s | 182.3371 KOps/s | **-5.36%** |
| test_contiguous[memmap_tensor0] | 23.2940μs | 0.6508μs | 1.5365 MOps/s | 1.5602 MOps/s | -1.52% |
| test_stack[memmap_tensor0] | 39.6440μs | 3.7253μs | 268.4338 KOps/s | 289.4264 KOps/s | **-7.25%** |
| test_memmaptd_index | 1.0280ms | 0.2440ms | 4.0976 KOps/s | 4.3002 KOps/s | -4.71% |
| test_memmaptd_index_astensor | 0.5307ms | 0.3056ms | 3.2720 KOps/s | 3.4075 KOps/s | -3.97% |
| test_memmaptd_index_op | 0.9669ms | 0.6173ms | 1.6201 KOps/s | 1.7912 KOps/s | **-9.56%** |
| test_serialize_model | 0.1044s | 0.1017s | 9.8355 Ops/s | 8.5795 Ops/s | **+14.64%** |
| test_serialize_model_pickle | 0.4633s | 0.3803s | 2.6297 Ops/s | 2.4774 Ops/s | **+6.15%** |
| test_serialize_weights | 0.1039s | 97.7790ms | 10.2271 Ops/s | 9.6858 Ops/s | **+5.59%** |
| test_serialize_weights_returnearly | 0.1237s | 0.1203s | 8.3160 Ops/s | 8.0454 Ops/s | +3.36% |
| test_serialize_weights_pickle | 0.5237s | 0.4176s | 2.3945 Ops/s | 2.4115 Ops/s | -0.70% |
| test_serialize_weights_filesystem | 0.1041s | 93.5203ms | 10.6929 Ops/s | 10.3786 Ops/s | +3.03% |
| test_serialize_model_filesystem | 0.1023s | 96.6344ms | 10.3483 Ops/s | 10.4439 Ops/s | -0.92% |
| test_reshape_pytree | 46.4970μs | 21.5883μs | 46.3214 KOps/s | 48.2686 KOps/s | -4.03% |
| test_reshape_td | 69.8000μs | 32.0536μs | 31.1978 KOps/s | 32.0769 KOps/s | -2.74% |
| test_view_pytree | 69.2600μs | 21.4154μs | 46.6955 KOps/s | 48.8656 KOps/s | -4.44% |
| test_view_td | 0.1129s | 61.0874μs | 16.3700 KOps/s | 17.3171 KOps/s | **-5.47%** |
| test_unbind_pytree | 78.5960μs | 24.6459μs | 40.5746 KOps/s | 41.4407 KOps/s | -2.09% |
| test_unbind_td | 0.1251ms | 36.0528μs | 27.7371 KOps/s | 27.8220 KOps/s | -0.31% |
| test_split_pytree | 69.3590μs | 24.7118μs | 40.4664 KOps/s | 42.0010 KOps/s | -3.65% |
| test_split_td | 0.5525ms | 39.7266μs | 25.1720 KOps/s | 25.6741 KOps/s | -1.96% |
| test_add_pytree | 63.5390μs | 30.8085μs | 32.4586 KOps/s | 33.7351 KOps/s | -3.78% |
| test_add_td | 0.1135ms | 57.6811μs | 17.3367 KOps/s | 19.0721 KOps/s | **-9.10%** |
| test_distributed | 0.1748ms | 99.2575μs | 10.0748 KOps/s | 9.7807 KOps/s | +3.01% |
| test_tdmodule | 33.0420μs | 18.0138μs | 55.5131 KOps/s | 59.3678 KOps/s | **-6.49%** |
| test_tdmodule_dispatch | 65.7630μs | 34.5864μs | 28.9131 KOps/s | 30.2017 KOps/s | -4.27% |
| test_tdseq | 36.8290μs | 21.2285μs | 47.1065 KOps/s | 49.4426 KOps/s | -4.72% |
| test_tdseq_dispatch | 61.6050μs | 39.6902μs | 25.1951 KOps/s | 25.2807 KOps/s | -0.34% |
| test_instantiation_functorch | 1.5171ms | 1.3052ms | 766.1768 Ops/s | 758.3561 Ops/s | +1.03% |
| test_instantiation_td | 2.0793ms | 1.0032ms | 996.7679 Ops/s | 1.0027 KOps/s | -0.59% |
| test_exec_functorch | 0.2490ms | 0.1626ms | 6.1496 KOps/s | 6.3526 KOps/s | -3.20% |
| test_exec_functional_call | 0.2446ms | 0.1520ms | 6.5778 KOps/s | 6.7401 KOps/s | -2.41% |
| test_exec_td | 0.2341ms | 0.1521ms | 6.5726 KOps/s | 6.9383 KOps/s | **-5.27%** |
| test_exec_td_decorator | 0.8087ms | 0.1972ms | 5.0712 KOps/s | 5.0032 KOps/s | +1.36% |
| test_vmap_mlp_speed[True-True] | 0.6145ms | 0.4819ms | 2.0749 KOps/s | 2.1197 KOps/s | -2.11% |
| test_vmap_mlp_speed[True-False] | 0.9662ms | 0.4835ms | 2.0684 KOps/s | 2.1146 KOps/s | -2.18% |
| test_vmap_mlp_speed[False-True] | 0.5964ms | 0.3958ms | 2.5267 KOps/s | 2.5594 KOps/s | -1.28% |
| test_vmap_mlp_speed[False-False] | 0.7711ms | 0.3978ms | 2.5138 KOps/s | 2.5745 KOps/s | -2.36% |
| test_vmap_mlp_speed_decorator[True-True] | 0.9839ms | 0.5141ms | 1.9450 KOps/s | 2.0232 KOps/s | -3.86% |
| test_vmap_mlp_speed_decorator[True-False] | 0.9441ms | 0.5074ms | 1.9708 KOps/s | 2.0212 KOps/s | -2.49% |
| test_vmap_mlp_speed_decorator[False-True] | 0.6183ms | 0.4109ms | 2.4337 KOps/s | 2.4526 KOps/s | -0.77% |
| test_vmap_mlp_speed_decorator[False-False] | 0.6940ms | 0.4155ms | 2.4069 KOps/s | 2.4456 KOps/s | -1.58% |
| test_to_module_speed[True] | 1.4650ms | 1.3759ms | 726.8085 Ops/s | 714.2215 Ops/s | +1.76% |
| test_to_module_speed[False] | 2.3948ms | 1.3731ms | 728.2641 Ops/s | 732.5290 Ops/s | -0.58% |
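
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below is an assumption inferred from the reported numbers, not the benchmark bot's actual code, and `relative_change` is a hypothetical helper name.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to repo HEAD.

    Assumed formula, inferred from the table: (PR - HEAD) / HEAD * 100.
    Both arguments must use the same unit (e.g. KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


# Values copied from the test_plain_set_nested row (both in KOps/s).
print(f"{relative_change(57.2157, 63.2192):+.2f}%")  # -9.50%
```

A negative value means the PR run completed fewer operations per second than HEAD for that benchmark.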

@vmoens vmoens merged commit 5f10cee into main Mar 1, 2024
45 of 48 checks passed
vmoens added a commit that referenced this pull request Mar 24, 2024
vmoens added a commit that referenced this pull request Mar 25, 2024
@vmoens vmoens deleted the ignore-lock-set-str branch October 21, 2024 14:02