[Refactor] Make unbind call tensor.unbind #628

Merged
merged 6 commits into from
Jan 18, 2024

@vmoens (Contributor) commented Jan 18, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jan 18, 2024
@vmoens added the Refactor label Jan 18, 2024

github-actions bot commented Jan 18, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 124. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}23$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 35.8780μs 17.6822μs 56.5541 KOps/s 61.6785 KOps/s $\textbf{\color{#d91a1a}-8.31\%}$
test_plain_set_stack_nested 0.2372ms 0.1450ms 6.8948 KOps/s 6.9377 KOps/s $\color{#d91a1a}-0.62\%$
test_plain_set_nested_inplace 47.0880μs 19.6283μs 50.9467 KOps/s 53.4100 KOps/s $\color{#d91a1a}-4.61\%$
test_plain_set_stack_nested_inplace 0.3233ms 0.1793ms 5.5785 KOps/s 5.7112 KOps/s $\color{#d91a1a}-2.32\%$
test_items 20.1880μs 2.4427μs 409.3913 KOps/s 399.7511 KOps/s $\color{#35bf28}+2.41\%$
test_items_nested 0.3532ms 0.2709ms 3.6911 KOps/s 3.6981 KOps/s $\color{#d91a1a}-0.19\%$
test_items_nested_locked 0.9439ms 0.2746ms 3.6417 KOps/s 3.7104 KOps/s $\color{#d91a1a}-1.85\%$
test_items_nested_leaf 0.3029ms 0.1664ms 6.0105 KOps/s 5.9988 KOps/s $\color{#35bf28}+0.19\%$
test_items_stack_nested 2.1908ms 1.3115ms 762.4621 Ops/s 758.3780 Ops/s $\color{#35bf28}+0.54\%$
test_items_stack_nested_leaf 1.3498ms 1.1788ms 848.3409 Ops/s 841.0779 Ops/s $\color{#35bf28}+0.86\%$
test_items_stack_nested_locked 1.5542ms 0.8814ms 1.1345 KOps/s 1.1085 KOps/s $\color{#35bf28}+2.34\%$
test_keys 20.7590μs 3.9542μs 252.8951 KOps/s 248.6039 KOps/s $\color{#35bf28}+1.73\%$
test_keys_nested 49.4265ms 0.1578ms 6.3389 KOps/s 6.7166 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_keys_nested_locked 0.2994ms 0.1517ms 6.5934 KOps/s 6.5110 KOps/s $\color{#35bf28}+1.27\%$
test_keys_nested_leaf 0.2328ms 0.1296ms 7.7169 KOps/s 7.6776 KOps/s $\color{#35bf28}+0.51\%$
test_keys_stack_nested 1.9150ms 1.2617ms 792.5816 Ops/s 787.4074 Ops/s $\color{#35bf28}+0.66\%$
test_keys_stack_nested_leaf 1.4669ms 1.2623ms 792.2133 Ops/s 786.2122 Ops/s $\color{#35bf28}+0.76\%$
test_keys_stack_nested_locked 0.9994ms 0.8082ms 1.2374 KOps/s 1.2231 KOps/s $\color{#35bf28}+1.17\%$
test_values 4.2960μs 1.1335μs 882.2059 KOps/s 874.0814 KOps/s $\color{#35bf28}+0.93\%$
test_values_nested 97.1920μs 51.6618μs 19.3567 KOps/s 18.9839 KOps/s $\color{#35bf28}+1.96\%$
test_values_nested_locked 0.1847ms 51.2543μs 19.5106 KOps/s 19.0974 KOps/s $\color{#35bf28}+2.16\%$
test_values_nested_leaf 91.3110μs 45.5728μs 21.9429 KOps/s 21.3055 KOps/s $\color{#35bf28}+2.99\%$
test_values_stack_nested 1.6904ms 1.0152ms 984.9869 Ops/s 959.7553 Ops/s $\color{#35bf28}+2.63\%$
test_values_stack_nested_leaf 1.2150ms 1.0157ms 984.5647 Ops/s 976.6440 Ops/s $\color{#35bf28}+0.81\%$
test_values_stack_nested_locked 0.7763ms 0.5995ms 1.6681 KOps/s 1.6366 KOps/s $\color{#35bf28}+1.92\%$
test_membership 17.5130μs 1.3521μs 739.6123 KOps/s 736.5882 KOps/s $\color{#35bf28}+0.41\%$
test_membership_nested 21.7300μs 3.4199μs 292.4038 KOps/s 289.2480 KOps/s $\color{#35bf28}+1.09\%$
test_membership_nested_leaf 29.0640μs 3.4512μs 289.7504 KOps/s 290.7296 KOps/s $\color{#d91a1a}-0.34\%$
test_membership_stacked_nested 41.2170μs 11.6471μs 85.8581 KOps/s 78.4803 KOps/s $\textbf{\color{#35bf28}+9.40\%}$
test_membership_stacked_nested_leaf 33.0810μs 11.6814μs 85.6064 KOps/s 83.1352 KOps/s $\color{#35bf28}+2.97\%$
test_membership_nested_last 39.7540μs 6.7197μs 148.8154 KOps/s 149.5617 KOps/s $\color{#d91a1a}-0.50\%$
test_membership_nested_leaf_last 29.5350μs 6.6208μs 151.0401 KOps/s 146.9178 KOps/s $\color{#35bf28}+2.81\%$
test_membership_stacked_nested_last 0.2653ms 0.1777ms 5.6281 KOps/s 5.7006 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_stacked_nested_leaf_last 45.4250μs 13.6648μs 73.1806 KOps/s 70.9543 KOps/s $\color{#35bf28}+3.14\%$
test_nested_getleaf 37.3000μs 11.0038μs 90.8779 KOps/s 91.9463 KOps/s $\color{#d91a1a}-1.16\%$
test_nested_get 30.7180μs 10.3540μs 96.5815 KOps/s 97.0607 KOps/s $\color{#d91a1a}-0.49\%$
test_stacked_getleaf 0.5721ms 0.3951ms 2.5312 KOps/s 2.5644 KOps/s $\color{#d91a1a}-1.29\%$
test_stacked_get 0.5495ms 0.3638ms 2.7485 KOps/s 2.7515 KOps/s $\color{#d91a1a}-0.11\%$
test_nested_getitemleaf 35.4660μs 11.0632μs 90.3895 KOps/s 92.8636 KOps/s $\color{#d91a1a}-2.66\%$
test_nested_getitem 30.9180μs 10.5367μs 94.9064 KOps/s 98.3446 KOps/s $\color{#d91a1a}-3.50\%$
test_stacked_getitemleaf 0.6706ms 0.3951ms 2.5311 KOps/s 2.5537 KOps/s $\color{#d91a1a}-0.88\%$
test_stacked_getitem 0.6608ms 0.3649ms 2.7403 KOps/s 2.7875 KOps/s $\color{#d91a1a}-1.69\%$
test_lock_nested 0.9078ms 0.3326ms 3.0063 KOps/s 2.5340 KOps/s $\textbf{\color{#35bf28}+18.64\%}$
test_lock_stack_nested 77.6143ms 5.5659ms 179.6664 Ops/s 159.4950 Ops/s $\textbf{\color{#35bf28}+12.65\%}$
test_unlock_nested 0.7565ms 0.3351ms 2.9839 KOps/s 2.5022 KOps/s $\textbf{\color{#35bf28}+19.25\%}$
test_unlock_stack_nested 77.2054ms 5.5328ms 180.7414 Ops/s 167.0352 Ops/s $\textbf{\color{#35bf28}+8.21\%}$
test_flatten_speed 0.7392ms 0.3685ms 2.7140 KOps/s 2.7608 KOps/s $\color{#d91a1a}-1.70\%$
test_unflatten_speed 0.6572ms 0.4666ms 2.1430 KOps/s 2.1590 KOps/s $\color{#d91a1a}-0.74\%$
test_common_ops 1.1890ms 0.7002ms 1.4281 KOps/s 1.5428 KOps/s $\textbf{\color{#d91a1a}-7.43\%}$
test_creation 17.5230μs 1.8709μs 534.4917 KOps/s 535.4395 KOps/s $\color{#d91a1a}-0.18\%$
test_creation_empty 52.2170μs 11.0062μs 90.8580 KOps/s 115.7096 KOps/s $\textbf{\color{#d91a1a}-21.48\%}$
test_creation_nested_1 48.8810μs 13.6139μs 73.4545 KOps/s 88.1741 KOps/s $\textbf{\color{#d91a1a}-16.69\%}$
test_creation_nested_2 41.7080μs 16.9065μs 59.1490 KOps/s 68.4916 KOps/s $\textbf{\color{#d91a1a}-13.64\%}$
test_clone 72.4350μs 12.9526μs 77.2049 KOps/s 75.1574 KOps/s $\color{#35bf28}+2.72\%$
test_getitem[int] 26.6900μs 11.2451μs 88.9279 KOps/s 87.9440 KOps/s $\color{#35bf28}+1.12\%$
test_getitem[slice_int] 59.6710μs 22.6901μs 44.0720 KOps/s 43.2444 KOps/s $\color{#35bf28}+1.91\%$
test_getitem[range] 0.1672ms 43.3559μs 23.0649 KOps/s 24.5353 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_getitem[tuple] 1.5238ms 18.0577μs 55.3780 KOps/s 53.8295 KOps/s $\color{#35bf28}+2.88\%$
test_getitem[list] 0.2282ms 37.7892μs 26.4626 KOps/s 27.9410 KOps/s $\textbf{\color{#d91a1a}-5.29\%}$
test_setitem_dim[int] 54.6420μs 28.8575μs 34.6530 KOps/s 35.8949 KOps/s $\color{#d91a1a}-3.46\%$
test_setitem_dim[slice_int] 0.1268ms 56.5533μs 17.6824 KOps/s 18.6760 KOps/s $\textbf{\color{#d91a1a}-5.32\%}$
test_setitem_dim[range] 0.1115ms 75.2160μs 13.2950 KOps/s 14.2526 KOps/s $\textbf{\color{#d91a1a}-6.72\%}$
test_setitem_dim[tuple] 73.0960μs 44.1040μs 22.6737 KOps/s 23.1422 KOps/s $\color{#d91a1a}-2.02\%$
test_setitem 98.4130μs 19.9682μs 50.0797 KOps/s 53.1550 KOps/s $\textbf{\color{#d91a1a}-5.79\%}$
test_set 84.2980μs 19.5621μs 51.1192 KOps/s 54.1922 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_set_shared 1.8691ms 0.1384ms 7.2237 KOps/s 7.3228 KOps/s $\color{#d91a1a}-1.35\%$
test_update 0.1085ms 22.9773μs 43.5212 KOps/s 49.8499 KOps/s $\textbf{\color{#d91a1a}-12.70\%}$
test_update_nested 0.6984ms 30.9135μs 32.3483 KOps/s 36.4331 KOps/s $\textbf{\color{#d91a1a}-11.21\%}$
test_set_nested 89.6680μs 21.5484μs 46.4072 KOps/s 49.5205 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_set_nested_new 0.1280ms 25.0894μs 39.8574 KOps/s 41.7317 KOps/s $\color{#d91a1a}-4.49\%$
test_select 0.1064ms 38.5271μs 25.9557 KOps/s 27.2821 KOps/s $\color{#d91a1a}-4.86\%$
test_select_nested 0.1641ms 58.2436μs 17.1693 KOps/s 17.4740 KOps/s $\color{#d91a1a}-1.74\%$
test_exclude_nested 0.2183ms 0.1089ms 9.1848 KOps/s 9.2576 KOps/s $\color{#d91a1a}-0.79\%$
test_empty[True] 0.5524ms 0.3259ms 3.0687 KOps/s 3.0480 KOps/s $\color{#35bf28}+0.68\%$
test_empty[False] 5.7728μs 1.0221μs 978.4185 KOps/s 963.3005 KOps/s $\color{#35bf28}+1.57\%$
test_unbind_speed 0.4507ms 0.2525ms 3.9607 KOps/s 3.1392 KOps/s $\textbf{\color{#35bf28}+26.17\%}$
test_unbind_speed_stack0 71.2747ms 3.4005ms 294.0733 Ops/s 253.7533 Ops/s $\textbf{\color{#35bf28}+15.89\%}$
test_unbind_speed_stack1 16.7010μs 1.9726μs 506.9337 KOps/s 1.5819 MOps/s $\textbf{\color{#d91a1a}-67.95\%}$
test_split 1.7669ms 1.4750ms 677.9890 Ops/s 621.5848 Ops/s $\textbf{\color{#35bf28}+9.07\%}$
test_chunk 63.6751ms 1.5648ms 639.0411 Ops/s 629.7014 Ops/s $\color{#35bf28}+1.48\%$
test_creation[device0] 0.1726ms 99.4403μs 10.0563 KOps/s 10.2905 KOps/s $\color{#d91a1a}-2.28\%$
test_creation_from_tensor 4.9978ms 81.4214μs 12.2818 KOps/s 12.8983 KOps/s $\color{#d91a1a}-4.78\%$
test_add_one[memmap_tensor0] 0.2259ms 5.1858μs 192.8357 KOps/s 188.0835 KOps/s $\color{#35bf28}+2.53\%$
test_contiguous[memmap_tensor0] 10.6310μs 0.6495μs 1.5396 MOps/s 1.5756 MOps/s $\color{#d91a1a}-2.29\%$
test_stack[memmap_tensor0] 65.2520μs 3.4755μs 287.7305 KOps/s 296.2416 KOps/s $\color{#d91a1a}-2.87\%$
test_memmaptd_index 0.9290ms 0.2169ms 4.6099 KOps/s 4.4633 KOps/s $\color{#35bf28}+3.28\%$
test_memmaptd_index_astensor 1.1095ms 0.2911ms 3.4350 KOps/s 3.5272 KOps/s $\color{#d91a1a}-2.61\%$
test_memmaptd_index_op 0.8744ms 0.5652ms 1.7694 KOps/s 1.8475 KOps/s $\color{#d91a1a}-4.23\%$
test_serialize_model 0.1638s 0.1060s 9.4352 Ops/s 9.1190 Ops/s $\color{#35bf28}+3.47\%$
test_serialize_model_pickle 0.4737s 0.3862s 2.5893 Ops/s 2.6281 Ops/s $\color{#d91a1a}-1.48\%$
test_serialize_weights 0.1619s 0.1033s 9.6844 Ops/s 9.5255 Ops/s $\color{#35bf28}+1.67\%$
test_serialize_weights_returnearly 0.3052s 0.1397s 7.1575 Ops/s 7.8830 Ops/s $\textbf{\color{#d91a1a}-9.20\%}$
test_serialize_weights_pickle 0.6923s 0.4881s 2.0489 Ops/s 2.4318 Ops/s $\textbf{\color{#d91a1a}-15.74\%}$
test_serialize_weights_filesystem 0.1022s 92.0335ms 10.8656 Ops/s 11.2282 Ops/s $\color{#d91a1a}-3.23\%$
test_serialize_model_filesystem 0.1562s 97.5587ms 10.2502 Ops/s 10.4584 Ops/s $\color{#d91a1a}-1.99\%$
test_reshape_pytree 64.2700μs 23.8415μs 41.9437 KOps/s 43.8841 KOps/s $\color{#d91a1a}-4.42\%$
test_reshape_td 59.9220μs 29.4727μs 33.9297 KOps/s 30.8061 KOps/s $\textbf{\color{#35bf28}+10.14\%}$
test_view_pytree 61.1640μs 23.3586μs 42.8108 KOps/s 44.6034 KOps/s $\color{#d91a1a}-4.02\%$
test_view_td 24.1450μs 4.8684μs 205.4081 KOps/s 207.8048 KOps/s $\color{#d91a1a}-1.15\%$
test_unbind_pytree 96.3400μs 26.8028μs 37.3095 KOps/s 38.2693 KOps/s $\color{#d91a1a}-2.51\%$
test_unbind_td 0.4470ms 35.8106μs 27.9247 KOps/s 19.8624 KOps/s $\textbf{\color{#35bf28}+40.59\%}$
test_split_pytree 54.9420μs 26.5147μs 37.7149 KOps/s 38.8427 KOps/s $\color{#d91a1a}-2.90\%$
test_split_td 0.5387ms 40.6243μs 24.6158 KOps/s 24.2912 KOps/s $\color{#35bf28}+1.34\%$
test_add_pytree 74.8500μs 32.4480μs 30.8186 KOps/s 31.2115 KOps/s $\color{#d91a1a}-1.26\%$
test_add_td 0.1404ms 49.8624μs 20.0552 KOps/s 21.4193 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_distributed 0.1973ms 98.9724μs 10.1038 KOps/s 9.9492 KOps/s $\color{#35bf28}+1.55\%$
test_tdmodule 0.2192ms 23.1069μs 43.2771 KOps/s 46.1336 KOps/s $\textbf{\color{#d91a1a}-6.19\%}$
test_tdmodule_dispatch 0.2229ms 42.1454μs 23.7274 KOps/s 26.2377 KOps/s $\textbf{\color{#d91a1a}-9.57\%}$
test_tdseq 52.8980μs 26.4610μs 37.7915 KOps/s 41.0283 KOps/s $\textbf{\color{#d91a1a}-7.89\%}$
test_tdseq_dispatch 0.1424ms 46.4996μs 21.5056 KOps/s 23.6174 KOps/s $\textbf{\color{#d91a1a}-8.94\%}$
test_instantiation_functorch 2.0592ms 1.3077ms 764.6833 Ops/s 767.7962 Ops/s $\color{#d91a1a}-0.41\%$
test_instantiation_td 1.4747ms 1.0074ms 992.6591 Ops/s 996.6348 Ops/s $\color{#d91a1a}-0.40\%$
test_exec_functorch 0.3035ms 0.1552ms 6.4429 KOps/s 6.4243 KOps/s $\color{#35bf28}+0.29\%$
test_exec_functional_call 0.2266ms 0.1452ms 6.8857 KOps/s 6.9684 KOps/s $\color{#d91a1a}-1.19\%$
test_exec_td 0.2618ms 0.1388ms 7.2069 KOps/s 7.1182 KOps/s $\color{#35bf28}+1.25\%$
test_exec_td_decorator 0.8537ms 0.1734ms 5.7683 KOps/s 5.7290 KOps/s $\color{#35bf28}+0.69\%$
test_vmap_mlp_speed[True-True] 1.2599ms 0.8819ms 1.1340 KOps/s 1.1698 KOps/s $\color{#d91a1a}-3.07\%$
test_vmap_mlp_speed[True-False] 0.7547ms 0.4692ms 2.1312 KOps/s 2.1819 KOps/s $\color{#d91a1a}-2.32\%$
test_vmap_mlp_speed[False-True] 1.3156ms 0.7647ms 1.3077 KOps/s 1.3399 KOps/s $\color{#d91a1a}-2.41\%$
test_vmap_mlp_speed[False-False] 0.5537ms 0.3806ms 2.6276 KOps/s 2.6461 KOps/s $\color{#d91a1a}-0.70\%$
test_vmap_mlp_speed_decorator[True-True] 2.7668ms 2.2919ms 436.3266 Ops/s 429.2535 Ops/s $\color{#35bf28}+1.65\%$
test_vmap_mlp_speed_decorator[True-False] 0.9430ms 0.5162ms 1.9374 KOps/s 1.9666 KOps/s $\color{#d91a1a}-1.48\%$
test_vmap_mlp_speed_decorator[False-True] 2.4399ms 1.8694ms 534.9360 Ops/s 530.1374 Ops/s $\color{#35bf28}+0.91\%$
test_vmap_mlp_speed_decorator[False-False] 0.6567ms 0.3947ms 2.5333 KOps/s 2.5563 KOps/s $\color{#d91a1a}-0.90\%$
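
The Change column is consistent with the relative difference between the Ops column and the Ops on Repo HEAD column, once both are converted to the same unit. The sketch below reproduces two rows of the CPU table under that assumption. The helper names `parse_ops` and `relative_change` are hypothetical and are not taken from the benchmark workflow.

```python
# Unit multipliers for the throughput strings that appear in the table.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '506.9337 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * _UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD.

    Assumption: Change = (Ops - Ops on Repo HEAD) / Ops on Repo HEAD.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# test_unbind_speed (CPU): both values are in KOps/s.
print(f"{relative_change('3.9607 KOps/s', '3.1392 KOps/s'):+.2f}%")    # +26.17%
# test_unbind_speed_stack1 (CPU): mixed units, KOps/s against MOps/s.
print(f"{relative_change('506.9337 KOps/s', '1.5819 MOps/s'):+.2f}%")  # -67.95%
```

The second row shows why the unit conversion matters: comparing 506.9337 with 1.5819 directly would hide the fact that HEAD is roughly three times faster on that benchmark.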

github-actions bot commented Jan 18, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 132. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}28$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 60.7685ms 20.1431μs 49.6447 KOps/s 77.7881 KOps/s $\textbf{\color{#d91a1a}-36.18\%}$
test_plain_set_stack_nested 0.1615ms 0.1212ms 8.2501 KOps/s 8.4887 KOps/s $\color{#d91a1a}-2.81\%$
test_plain_set_nested_inplace 57.8510μs 15.8022μs 63.2824 KOps/s 70.2172 KOps/s $\textbf{\color{#d91a1a}-9.88\%}$
test_plain_set_stack_nested_inplace 0.1796ms 0.1511ms 6.6193 KOps/s 6.7884 KOps/s $\color{#d91a1a}-2.49\%$
test_items 17.1800μs 4.7432μs 210.8291 KOps/s 209.8336 KOps/s $\color{#35bf28}+0.47\%$
test_items_nested 0.4124ms 0.3435ms 2.9113 KOps/s 2.9290 KOps/s $\color{#d91a1a}-0.60\%$
test_items_nested_locked 0.3893ms 0.3482ms 2.8722 KOps/s 2.9150 KOps/s $\color{#d91a1a}-1.47\%$
test_items_nested_leaf 0.2557ms 0.2042ms 4.8963 KOps/s 4.9487 KOps/s $\color{#d91a1a}-1.06\%$
test_items_stack_nested 1.3971ms 1.3524ms 739.4094 Ops/s 754.4822 Ops/s $\color{#d91a1a}-2.00\%$
test_items_stack_nested_leaf 1.2502ms 1.1881ms 841.6692 Ops/s 870.0972 Ops/s $\color{#d91a1a}-3.27\%$
test_items_stack_nested_locked 1.9269ms 0.9437ms 1.0597 KOps/s 1.0571 KOps/s $\color{#35bf28}+0.25\%$
test_keys 32.6800μs 4.6127μs 216.7907 KOps/s 209.8348 KOps/s $\color{#35bf28}+3.31\%$
test_keys_nested 0.5375ms 95.2860μs 10.4947 KOps/s 10.5183 KOps/s $\color{#d91a1a}-0.22\%$
test_keys_nested_locked 0.1285ms 98.1417μs 10.1894 KOps/s 10.2504 KOps/s $\color{#d91a1a}-0.60\%$
test_keys_nested_leaf 0.1851ms 78.2932μs 12.7725 KOps/s 12.7893 KOps/s $\color{#d91a1a}-0.13\%$
test_keys_stack_nested 1.3239ms 1.1888ms 841.2010 Ops/s 831.3156 Ops/s $\color{#35bf28}+1.19\%$
test_keys_stack_nested_leaf 1.3963ms 1.1854ms 843.5882 Ops/s 860.7572 Ops/s $\color{#d91a1a}-1.99\%$
test_keys_stack_nested_locked 0.8557ms 0.7597ms 1.3163 KOps/s 1.3025 KOps/s $\color{#35bf28}+1.07\%$
test_values 10.3570μs 1.8886μs 529.5064 KOps/s 527.6273 KOps/s $\color{#35bf28}+0.36\%$
test_values_nested 77.9410μs 45.6343μs 21.9133 KOps/s 22.1314 KOps/s $\color{#d91a1a}-0.99\%$
test_values_nested_locked 70.5210μs 48.0901μs 20.7943 KOps/s 21.0453 KOps/s $\color{#d91a1a}-1.19\%$
test_values_nested_leaf 54.4510μs 39.6754μs 25.2045 KOps/s 25.2557 KOps/s $\color{#d91a1a}-0.20\%$
test_values_stack_nested 1.0566ms 1.0004ms 999.6368 Ops/s 989.4230 Ops/s $\color{#35bf28}+1.03\%$
test_values_stack_nested_leaf 1.0713ms 0.9962ms 1.0038 KOps/s 1.0315 KOps/s $\color{#d91a1a}-2.68\%$
test_values_stack_nested_locked 0.6740ms 0.6126ms 1.6323 KOps/s 1.6316 KOps/s $\color{#35bf28}+0.04\%$
test_membership 5.5020μs 0.9247μs 1.0814 MOps/s 925.5830 KOps/s $\textbf{\color{#35bf28}+16.84\%}$
test_membership_nested 26.3100μs 2.8980μs 345.0668 KOps/s 341.9827 KOps/s $\color{#35bf28}+0.90\%$
test_membership_nested_leaf 71.1810μs 2.9015μs 344.6511 KOps/s 338.6692 KOps/s $\color{#35bf28}+1.77\%$
test_membership_stacked_nested 52.2310μs 11.4747μs 87.1481 KOps/s 89.5690 KOps/s $\color{#d91a1a}-2.70\%$
test_membership_stacked_nested_leaf 28.0310μs 11.5438μs 86.6263 KOps/s 89.4931 KOps/s $\color{#d91a1a}-3.20\%$
test_membership_nested_last 34.4710μs 5.3578μs 186.6435 KOps/s 186.2169 KOps/s $\color{#35bf28}+0.23\%$
test_membership_nested_leaf_last 34.7900μs 5.3432μs 187.1535 KOps/s 188.1541 KOps/s $\color{#d91a1a}-0.53\%$
test_membership_stacked_nested_last 0.1958ms 0.1580ms 6.3282 KOps/s 6.9635 KOps/s $\textbf{\color{#d91a1a}-9.12\%}$
test_membership_stacked_nested_leaf_last 48.1100μs 13.5545μs 73.7763 KOps/s 76.5699 KOps/s $\color{#d91a1a}-3.65\%$
test_nested_getleaf 32.1600μs 8.3642μs 119.5566 KOps/s 118.8189 KOps/s $\color{#35bf28}+0.62\%$
test_nested_get 31.3300μs 7.8909μs 126.7279 KOps/s 125.9295 KOps/s $\color{#35bf28}+0.63\%$
test_stacked_getleaf 0.3893ms 0.3341ms 2.9929 KOps/s 3.1247 KOps/s $\color{#d91a1a}-4.22\%$
test_stacked_get 0.3334ms 0.2985ms 3.3506 KOps/s 3.5124 KOps/s $\color{#d91a1a}-4.61\%$
test_nested_getitemleaf 29.6410μs 8.4151μs 118.8335 KOps/s 118.1277 KOps/s $\color{#35bf28}+0.60\%$
test_nested_getitem 33.9610μs 7.9647μs 125.5548 KOps/s 125.3636 KOps/s $\color{#35bf28}+0.15\%$
test_stacked_getitemleaf 0.3713ms 0.3346ms 2.9885 KOps/s 3.1132 KOps/s $\color{#d91a1a}-4.00\%$
test_stacked_getitem 0.3263ms 0.2991ms 3.3436 KOps/s 3.4693 KOps/s $\color{#d91a1a}-3.62\%$
test_lock_nested 0.7986ms 0.3624ms 2.7597 KOps/s 2.4632 KOps/s $\textbf{\color{#35bf28}+12.04\%}$
test_lock_stack_nested 82.7104ms 6.2264ms 160.6073 Ops/s 156.4421 Ops/s $\color{#35bf28}+2.66\%$
test_unlock_nested 0.7827ms 0.3576ms 2.7961 KOps/s 2.4671 KOps/s $\textbf{\color{#35bf28}+13.33\%}$
test_unlock_stack_nested 82.2634ms 6.3263ms 158.0711 Ops/s 145.1688 Ops/s $\textbf{\color{#35bf28}+8.89\%}$
test_flatten_speed 0.5222ms 0.2607ms 3.8360 KOps/s 3.7893 KOps/s $\color{#35bf28}+1.23\%$
test_unflatten_speed 0.4335ms 0.3607ms 2.7721 KOps/s 2.7461 KOps/s $\color{#35bf28}+0.95\%$
test_common_ops 1.1209ms 0.6579ms 1.5201 KOps/s 1.6489 KOps/s $\textbf{\color{#d91a1a}-7.81\%}$
test_creation 15.3800μs 1.5738μs 635.3987 KOps/s 631.0477 KOps/s $\color{#35bf28}+0.69\%$
test_creation_empty 29.1700μs 9.9704μs 100.2967 KOps/s 144.2341 KOps/s $\textbf{\color{#d91a1a}-30.46\%}$
test_creation_nested_1 45.4910μs 11.7300μs 85.2514 KOps/s 115.5669 KOps/s $\textbf{\color{#d91a1a}-26.23\%}$
test_creation_nested_2 28.1410μs 14.2644μs 70.1044 KOps/s 89.5521 KOps/s $\textbf{\color{#d91a1a}-21.72\%}$
test_clone 0.1467ms 14.5905μs 68.5376 KOps/s 67.7563 KOps/s $\color{#35bf28}+1.15\%$
test_getitem[int] 25.7800μs 10.8220μs 92.4045 KOps/s 91.7665 KOps/s $\color{#35bf28}+0.70\%$
test_getitem[slice_int] 57.8810μs 22.3470μs 44.7488 KOps/s 45.7016 KOps/s $\color{#d91a1a}-2.08\%$
test_getitem[range] 65.0910μs 37.6357μs 26.5705 KOps/s 24.9400 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_getitem[tuple] 41.9300μs 19.3261μs 51.7435 KOps/s 52.0760 KOps/s $\color{#d91a1a}-0.64\%$
test_getitem[list] 80.5210μs 35.5967μs 28.0925 KOps/s 28.2012 KOps/s $\color{#d91a1a}-0.39\%$
test_setitem_dim[int] 48.7310μs 31.7968μs 31.4497 KOps/s 37.7061 KOps/s $\textbf{\color{#d91a1a}-16.59\%}$
test_setitem_dim[slice_int] 72.4610μs 53.1028μs 18.8314 KOps/s 20.9467 KOps/s $\textbf{\color{#d91a1a}-10.10\%}$
test_setitem_dim[range] 0.1011ms 67.4266μs 14.8309 KOps/s 16.2088 KOps/s $\textbf{\color{#d91a1a}-8.50\%}$
test_setitem_dim[tuple] 64.3010μs 47.2003μs 21.1863 KOps/s 24.0195 KOps/s $\textbf{\color{#d91a1a}-11.80\%}$
test_setitem 0.1242ms 20.6327μs 48.4667 KOps/s 53.2867 KOps/s $\textbf{\color{#d91a1a}-9.05\%}$
test_set 0.1259ms 21.3271μs 46.8888 KOps/s 54.0806 KOps/s $\textbf{\color{#d91a1a}-13.30\%}$
test_set_shared 2.7030ms 0.1118ms 8.9423 KOps/s 9.3084 KOps/s $\color{#d91a1a}-3.93\%$
test_update 0.1225ms 24.0266μs 41.6205 KOps/s 50.0930 KOps/s $\textbf{\color{#d91a1a}-16.91\%}$
test_update_nested 0.1263ms 29.9173μs 33.4254 KOps/s 38.2477 KOps/s $\textbf{\color{#d91a1a}-12.61\%}$
test_set_nested 0.1225ms 21.2244μs 47.1155 KOps/s 50.7029 KOps/s $\textbf{\color{#d91a1a}-7.08\%}$
test_set_nested_new 0.1267ms 25.5402μs 39.1540 KOps/s 44.6805 KOps/s $\textbf{\color{#d91a1a}-12.37\%}$
test_select 66.6810μs 37.2866μs 26.8193 KOps/s 28.4869 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_select_nested 84.5710μs 53.8503μs 18.5700 KOps/s 18.7163 KOps/s $\color{#d91a1a}-0.78\%$
test_exclude_nested 0.1469ms 0.1063ms 9.4064 KOps/s 9.3106 KOps/s $\color{#35bf28}+1.03\%$
test_empty[True] 0.3573ms 0.3205ms 3.1198 KOps/s 3.1003 KOps/s $\color{#35bf28}+0.63\%$
test_empty[False] 3.2520μs 0.8767μs 1.1406 MOps/s 1.1591 MOps/s $\color{#d91a1a}-1.59\%$
test_to 73.1710μs 54.2230μs 18.4424 KOps/s 18.4689 KOps/s $\color{#d91a1a}-0.14\%$
test_to_nonblocking 64.8110μs 34.8088μs 28.7284 KOps/s 26.6106 KOps/s $\textbf{\color{#35bf28}+7.96\%}$
test_unbind_speed 0.3368ms 0.2755ms 3.6301 KOps/s 3.1201 KOps/s $\textbf{\color{#35bf28}+16.34\%}$
test_unbind_speed_stack0 80.2668ms 3.7775ms 264.7231 Ops/s 287.9540 Ops/s $\textbf{\color{#d91a1a}-8.07\%}$
test_unbind_speed_stack1 46.7900μs 1.8128μs 551.6295 KOps/s 1.8767 MOps/s $\textbf{\color{#d91a1a}-70.61\%}$
test_split 75.6947ms 1.7760ms 563.0669 Ops/s 572.9589 Ops/s $\color{#d91a1a}-1.73\%$
test_chunk 1.6680ms 1.5919ms 628.1675 Ops/s 589.2500 Ops/s $\textbf{\color{#35bf28}+6.60\%}$
test_creation[device0] 0.1422ms 76.1647μs 13.1294 KOps/s 12.6194 KOps/s $\color{#35bf28}+4.04\%$
test_creation_from_tensor 0.1314ms 57.1500μs 17.4978 KOps/s 16.8566 KOps/s $\color{#35bf28}+3.80\%$
test_add_one[memmap_tensor0] 0.1871ms 7.7814μs 128.5112 KOps/s 128.4423 KOps/s $\color{#35bf28}+0.05\%$
test_contiguous[memmap_tensor0] 14.9700μs 0.6508μs 1.5366 MOps/s 1.5751 MOps/s $\color{#d91a1a}-2.45\%$
test_stack[memmap_tensor0] 27.2420μs 4.7744μs 209.4514 KOps/s 206.3791 KOps/s $\color{#35bf28}+1.49\%$
test_memmaptd_index 1.2114ms 0.2707ms 3.6945 KOps/s 3.6852 KOps/s $\color{#35bf28}+0.25\%$
test_memmaptd_index_astensor 0.5402ms 0.3259ms 3.0689 KOps/s 3.0456 KOps/s $\color{#35bf28}+0.76\%$
test_memmaptd_index_op 0.9731ms 0.6724ms 1.4872 KOps/s 1.5993 KOps/s $\textbf{\color{#d91a1a}-7.01\%}$
test_serialize_model 93.6357ms 89.9989ms 11.1112 Ops/s 9.9195 Ops/s $\textbf{\color{#35bf28}+12.01\%}$
test_serialize_model_pickle 1.3493s 1.2370s 0.8084 Ops/s 0.8053 Ops/s $\color{#35bf28}+0.39\%$
test_serialize_weights 0.1681s 97.1389ms 10.2945 Ops/s 10.1956 Ops/s $\color{#35bf28}+0.97\%$
test_serialize_weights_returnearly 0.2642s 79.3706ms 12.5991 Ops/s 14.3150 Ops/s $\textbf{\color{#d91a1a}-11.99\%}$
test_serialize_weights_pickle 1.3486s 1.2483s 0.8011 Ops/s 0.8084 Ops/s $\color{#d91a1a}-0.90\%$
test_reshape_pytree 55.9100μs 25.0256μs 39.9591 KOps/s 39.4111 KOps/s $\color{#35bf28}+1.39\%$
test_reshape_td 0.1753ms 30.2503μs 33.0575 KOps/s 32.7727 KOps/s $\color{#35bf28}+0.87\%$
test_view_pytree 0.1721ms 25.9219μs 38.5775 KOps/s 40.3542 KOps/s $\color{#d91a1a}-4.40\%$
test_view_td 0.1283ms 4.3383μs 230.5043 KOps/s 239.2383 KOps/s $\color{#d91a1a}-3.65\%$
test_unbind_pytree 62.3710μs 31.6864μs 31.5592 KOps/s 32.4097 KOps/s $\color{#d91a1a}-2.62\%$
test_unbind_td 0.5292ms 41.7430μs 23.9561 KOps/s 19.5912 KOps/s $\textbf{\color{#35bf28}+22.28\%}$
test_split_pytree 69.4910μs 29.6784μs 33.6946 KOps/s 34.1895 KOps/s $\color{#d91a1a}-1.45\%$
test_split_td 0.1126ms 39.9147μs 25.0534 KOps/s 24.9196 KOps/s $\color{#35bf28}+0.54\%$
test_add_pytree 62.6910μs 38.6701μs 25.8598 KOps/s 25.5002 KOps/s $\color{#35bf28}+1.41\%$
test_add_td 93.7220μs 54.7550μs 18.2632 KOps/s 20.6202 KOps/s $\textbf{\color{#d91a1a}-11.43\%}$
test_distributed 2.8358ms 74.6153μs 13.4021 KOps/s 14.2403 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_tdmodule 34.8910μs 18.7273μs 53.3979 KOps/s 57.0984 KOps/s $\textbf{\color{#d91a1a}-6.48\%}$
test_tdmodule_dispatch 0.2579ms 36.0606μs 27.7311 KOps/s 31.2802 KOps/s $\textbf{\color{#d91a1a}-11.35\%}$
test_tdseq 40.7700μs 22.1984μs 45.0482 KOps/s 49.9092 KOps/s $\textbf{\color{#d91a1a}-9.74\%}$
test_tdseq_dispatch 59.0610μs 39.1643μs 25.5334 KOps/s 28.5348 KOps/s $\textbf{\color{#d91a1a}-10.52\%}$
test_instantiation_functorch 1.8161ms 1.7185ms 581.8952 Ops/s 582.5732 Ops/s $\color{#d91a1a}-0.12\%$
test_instantiation_td 1.7635ms 1.1887ms 841.2700 Ops/s 845.9915 Ops/s $\color{#d91a1a}-0.56\%$
test_exec_functorch 0.2066ms 0.1651ms 6.0581 KOps/s 6.0801 KOps/s $\color{#d91a1a}-0.36\%$
test_exec_functional_call 0.2157ms 0.1645ms 6.0807 KOps/s 6.0004 KOps/s $\color{#35bf28}+1.34\%$
test_exec_td 0.2238ms 0.1562ms 6.4002 KOps/s 6.3309 KOps/s $\color{#35bf28}+1.09\%$
test_exec_td_decorator 0.9185ms 0.1928ms 5.1857 KOps/s 5.1591 KOps/s $\color{#35bf28}+0.52\%$
test_vmap_mlp_speed[True-True] 1.2082ms 1.1132ms 898.2943 Ops/s 899.0528 Ops/s $\color{#d91a1a}-0.08\%$
test_vmap_mlp_speed[True-False] 0.7810ms 0.6844ms 1.4611 KOps/s 1.4976 KOps/s $\color{#d91a1a}-2.44\%$
test_vmap_mlp_speed[False-True] 1.1099ms 1.0623ms 941.3505 Ops/s 968.8352 Ops/s $\color{#d91a1a}-2.84\%$
test_vmap_mlp_speed[False-False] 0.7084ms 0.6194ms 1.6145 KOps/s 1.6663 KOps/s $\color{#d91a1a}-3.11\%$
test_vmap_mlp_speed_decorator[True-True] 3.1805ms 2.4561ms 407.1501 Ops/s 403.5341 Ops/s $\color{#35bf28}+0.90\%$
test_vmap_mlp_speed_decorator[True-False] 1.1009ms 0.7347ms 1.3610 KOps/s 1.4075 KOps/s $\color{#d91a1a}-3.30\%$
test_vmap_mlp_speed_decorator[False-True] 2.4570ms 2.0725ms 482.5195 Ops/s 483.1108 Ops/s $\color{#d91a1a}-0.12\%$
test_vmap_mlp_speed_decorator[False-False] 0.9804ms 0.6272ms 1.5944 KOps/s 1.6320 KOps/s $\color{#d91a1a}-2.31\%$
test_vmap_transformer_speed[True-True] 13.1145ms 12.5905ms 79.4251 Ops/s 79.7465 Ops/s $\color{#d91a1a}-0.40\%$
test_vmap_transformer_speed[True-False] 8.3995ms 8.2425ms 121.3228 Ops/s 121.2430 Ops/s $\color{#35bf28}+0.07\%$
test_vmap_transformer_speed[False-True] 12.9209ms 12.4066ms 80.6025 Ops/s 80.4808 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_transformer_speed[False-False] 8.4169ms 8.1473ms 122.7398 Ops/s 122.6294 Ops/s $\color{#35bf28}+0.09\%$
test_vmap_transformer_speed_decorator[True-True] 0.1643s 81.1360ms 12.3250 Ops/s 12.3360 Ops/s $\color{#d91a1a}-0.09\%$
test_vmap_transformer_speed_decorator[True-False] 21.6505ms 19.8009ms 50.5026 Ops/s 50.5371 Ops/s $\color{#d91a1a}-0.07\%$
test_vmap_transformer_speed_decorator[False-True] 68.8417ms 67.5782ms 14.7977 Ops/s 14.8337 Ops/s $\color{#d91a1a}-0.24\%$
test_vmap_transformer_speed_decorator[False-False] 21.1321ms 19.2829ms 51.8593 Ops/s 47.1394 Ops/s $\textbf{\color{#35bf28}+10.01\%}$
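
In both tables, the number of bolded green and red entries matches the Improved and Worsened totals, and every bolded change is larger than 5% in magnitude while no unbolded one is. The sketch below classifies a few GPU rows on that basis. The 5% threshold is inferred from the tables and is not confirmed by the benchmark workflow. `THRESHOLD_PCT` and `classify` are hypothetical names.

```python
from collections import Counter

# Assumption: a change counts as improved/worsened only beyond +/-5%.
# How a change of exactly 5% is treated cannot be determined from the tables.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a benchmark by its percentage change in throughput."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# A few rows copied from the GPU table (name, Change in percent).
sample = {
    "test_unbind_td": 22.28,
    "test_unbind_speed": 16.34,
    "test_unbind_speed_stack0": -8.07,
    "test_unbind_speed_stack1": -70.61,
    "test_split": -1.73,
    "test_creation[device0]": 4.04,
}

for name, change in sample.items():
    print(f"{name}: {classify(change)}")

# Tally per label for the sample above.
print(Counter(classify(change) for change in sample.values()))
```

Under this reading, most rows in both tables fall inside the threshold and are counted as neither improved nor worsened.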

@vmoens merged commit 4f2a602 into main Jan 18, 2024
43 of 45 checks passed
@vmoens deleted the unbind-unbind branch January 18, 2024 10:06
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Refactor: Refactoring code - not a new feature

2 participants