[Feature] Improve the empty method #622

Merged on Jan 17, 2024 (3 commits)

@vmoens (Contributor) commented on Jan 17, 2024

No description provided.

@facebook-github-bot added the CLA Signed label on Jan 17, 2024

github-actions bot commented Jan 17, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 124. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}17$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 37.5200μs 17.1593μs 58.2773 KOps/s 60.8448 KOps/s $\color{#d91a1a}-4.22\%$
test_plain_set_stack_nested 0.1851ms 0.1428ms 7.0008 KOps/s 7.0363 KOps/s $\color{#d91a1a}-0.50\%$
test_plain_set_nested_inplace 50.6550μs 19.1506μs 52.2178 KOps/s 53.5105 KOps/s $\color{#d91a1a}-2.42\%$
test_plain_set_stack_nested_inplace 0.2270ms 0.1751ms 5.7108 KOps/s 5.7081 KOps/s $\color{#35bf28}+0.05\%$
test_items 15.5990μs 2.3943μs 417.6624 KOps/s 422.5757 KOps/s $\color{#d91a1a}-1.16\%$
test_items_nested 0.5284ms 0.2723ms 3.6725 KOps/s 3.7193 KOps/s $\color{#d91a1a}-1.26\%$
test_items_nested_locked 0.8417ms 0.2694ms 3.7113 KOps/s 3.7547 KOps/s $\color{#d91a1a}-1.16\%$
test_items_nested_leaf 0.2091ms 0.1671ms 5.9857 KOps/s 6.0579 KOps/s $\color{#d91a1a}-1.19\%$
test_items_stack_nested 3.9058ms 1.3561ms 737.3939 Ops/s 748.9274 Ops/s $\color{#d91a1a}-1.54\%$
test_items_stack_nested_leaf 2.9854ms 1.2129ms 824.4609 Ops/s 836.4759 Ops/s $\color{#d91a1a}-1.44\%$
test_items_stack_nested_locked 1.5242ms 0.8666ms 1.1539 KOps/s 1.1203 KOps/s $\color{#35bf28}+3.00\%$
test_keys 21.3900μs 3.8987μs 256.4970 KOps/s 255.4939 KOps/s $\color{#35bf28}+0.39\%$
test_keys_nested 48.5944ms 0.1586ms 6.3053 KOps/s 6.7353 KOps/s $\textbf{\color{#d91a1a}-6.38\%}$
test_keys_nested_locked 0.2938ms 0.1547ms 6.4633 KOps/s 6.5397 KOps/s $\color{#d91a1a}-1.17\%$
test_keys_nested_leaf 0.2496ms 0.1311ms 7.6277 KOps/s 7.6862 KOps/s $\color{#d91a1a}-0.76\%$
test_keys_stack_nested 1.8825ms 1.2777ms 782.6772 Ops/s 787.9931 Ops/s $\color{#d91a1a}-0.67\%$
test_keys_stack_nested_leaf 2.4682ms 1.2664ms 789.6128 Ops/s 781.9132 Ops/s $\color{#35bf28}+0.98\%$
test_keys_stack_nested_locked 5.3490ms 0.8089ms 1.2362 KOps/s 1.2376 KOps/s $\color{#d91a1a}-0.11\%$
test_values 6.4320μs 1.1738μs 851.8977 KOps/s 860.5055 KOps/s $\color{#d91a1a}-1.00\%$
test_values_nested 89.3870μs 51.3099μs 19.4894 KOps/s 19.4325 KOps/s $\color{#35bf28}+0.29\%$
test_values_nested_locked 91.8120μs 51.0040μs 19.6063 KOps/s 19.4994 KOps/s $\color{#35bf28}+0.55\%$
test_values_nested_leaf 0.1004ms 46.5594μs 21.4779 KOps/s 21.8963 KOps/s $\color{#d91a1a}-1.91\%$
test_values_stack_nested 1.2911ms 1.0321ms 968.8762 Ops/s 964.4884 Ops/s $\color{#35bf28}+0.45\%$
test_values_stack_nested_leaf 1.6201ms 1.0254ms 975.2008 Ops/s 955.1745 Ops/s $\color{#35bf28}+2.10\%$
test_values_stack_nested_locked 1.0566ms 0.5976ms 1.6734 KOps/s 1.6289 KOps/s $\color{#35bf28}+2.73\%$
test_membership 17.3320μs 1.3279μs 753.0508 KOps/s 743.2282 KOps/s $\color{#35bf28}+1.32\%$
test_membership_nested 19.2660μs 3.3899μs 294.9950 KOps/s 292.5346 KOps/s $\color{#35bf28}+0.84\%$
test_membership_nested_leaf 15.8300μs 3.3952μs 294.5300 KOps/s 291.9106 KOps/s $\color{#35bf28}+0.90\%$
test_membership_stacked_nested 39.3640μs 11.5733μs 86.4061 KOps/s 86.3844 KOps/s $\color{#35bf28}+0.03\%$
test_membership_stacked_nested_leaf 55.9650μs 11.5264μs 86.7576 KOps/s 80.8043 KOps/s $\textbf{\color{#35bf28}+7.37\%}$
test_membership_nested_last 24.4350μs 6.5461μs 152.7634 KOps/s 153.7840 KOps/s $\color{#d91a1a}-0.66\%$
test_membership_nested_leaf_last 35.7370μs 6.5138μs 153.5211 KOps/s 151.6257 KOps/s $\color{#35bf28}+1.25\%$
test_membership_stacked_nested_last 0.2621ms 0.1714ms 5.8328 KOps/s 5.7245 KOps/s $\color{#35bf28}+1.89\%$
test_membership_stacked_nested_leaf_last 48.6810μs 13.5618μs 73.7366 KOps/s 70.9628 KOps/s $\color{#35bf28}+3.91\%$
test_nested_getleaf 37.5600μs 10.5905μs 94.4238 KOps/s 94.0129 KOps/s $\color{#35bf28}+0.44\%$
test_nested_get 35.9770μs 10.0118μs 99.8819 KOps/s 100.4828 KOps/s $\color{#d91a1a}-0.60\%$
test_stacked_getleaf 0.6230ms 0.4066ms 2.4592 KOps/s 2.4789 KOps/s $\color{#d91a1a}-0.79\%$
test_stacked_get 0.6052ms 0.3757ms 2.6615 KOps/s 2.6681 KOps/s $\color{#d91a1a}-0.25\%$
test_nested_getitemleaf 33.7030μs 10.6685μs 93.7343 KOps/s 93.1346 KOps/s $\color{#35bf28}+0.64\%$
test_nested_getitem 31.6990μs 10.1430μs 98.5901 KOps/s 100.1001 KOps/s $\color{#d91a1a}-1.51\%$
test_stacked_getitemleaf 0.5568ms 0.4097ms 2.4408 KOps/s 2.4452 KOps/s $\color{#d91a1a}-0.18\%$
test_stacked_getitem 0.6520ms 0.3783ms 2.6431 KOps/s 2.7137 KOps/s $\color{#d91a1a}-2.60\%$
test_lock_nested 1.2689ms 0.3867ms 2.5863 KOps/s 2.5146 KOps/s $\color{#35bf28}+2.85\%$
test_lock_stack_nested 82.4281ms 6.4271ms 155.5915 Ops/s 149.6562 Ops/s $\color{#35bf28}+3.97\%$
test_unlock_nested 66.0260ms 0.4523ms 2.2111 KOps/s 2.5675 KOps/s $\textbf{\color{#d91a1a}-13.88\%}$
test_unlock_stack_nested 76.2140ms 5.8730ms 170.2702 Ops/s 162.0955 Ops/s $\textbf{\color{#35bf28}+5.04\%}$
test_flatten_speed 0.7422ms 0.3699ms 2.7035 KOps/s 2.7089 KOps/s $\color{#d91a1a}-0.20\%$
test_unflatten_speed 0.5351ms 0.4560ms 2.1929 KOps/s 2.2151 KOps/s $\color{#d91a1a}-1.01\%$
test_common_ops 1.5450ms 0.7014ms 1.4257 KOps/s 1.4875 KOps/s $\color{#d91a1a}-4.16\%$
test_creation 17.4630μs 1.8641μs 536.4592 KOps/s 548.4845 KOps/s $\color{#d91a1a}-2.19\%$
test_creation_empty 33.2320μs 10.9304μs 91.4881 KOps/s 111.3814 KOps/s $\textbf{\color{#d91a1a}-17.86\%}$
test_creation_nested_1 42.3090μs 13.4742μs 74.2158 KOps/s 86.3304 KOps/s $\textbf{\color{#d91a1a}-14.03\%}$
test_creation_nested_2 44.7940μs 16.8022μs 59.5160 KOps/s 67.1123 KOps/s $\textbf{\color{#d91a1a}-11.32\%}$
test_clone 97.1510μs 12.6832μs 78.8443 KOps/s 78.7501 KOps/s $\color{#35bf28}+0.12\%$
test_getitem[int] 35.6660μs 10.9991μs 90.9165 KOps/s 90.0562 KOps/s $\color{#35bf28}+0.96\%$
test_getitem[slice_int] 58.7800μs 22.8999μs 43.6684 KOps/s 44.9777 KOps/s $\color{#d91a1a}-2.91\%$
test_getitem[range] 0.1044ms 38.8096μs 25.7668 KOps/s 23.3542 KOps/s $\textbf{\color{#35bf28}+10.33\%}$
test_getitem[tuple] 71.1330μs 17.9208μs 55.8011 KOps/s 54.5689 KOps/s $\color{#35bf28}+2.26\%$
test_getitem[list] 0.2658ms 34.2084μs 29.2326 KOps/s 26.3057 KOps/s $\textbf{\color{#35bf28}+11.13\%}$
test_setitem_dim[int] 67.3250μs 30.5365μs 32.7477 KOps/s 34.3916 KOps/s $\color{#d91a1a}-4.78\%$
test_setitem_dim[slice_int] 87.2030μs 56.2592μs 17.7749 KOps/s 17.9180 KOps/s $\color{#d91a1a}-0.80\%$
test_setitem_dim[range] 0.1158ms 74.0489μs 13.5046 KOps/s 13.4807 KOps/s $\color{#35bf28}+0.18\%$
test_setitem_dim[tuple] 75.5110μs 44.3596μs 22.5431 KOps/s 22.9673 KOps/s $\color{#d91a1a}-1.85\%$
test_setitem 0.1163ms 19.2535μs 51.9386 KOps/s 54.5241 KOps/s $\color{#d91a1a}-4.74\%$
test_set 0.1077ms 18.6316μs 53.6723 KOps/s 56.6648 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_set_shared 2.8117ms 0.1393ms 7.1794 KOps/s 7.0128 KOps/s $\color{#35bf28}+2.37\%$
test_update 0.1869ms 21.7480μs 45.9813 KOps/s 50.7088 KOps/s $\textbf{\color{#d91a1a}-9.32\%}$
test_update_nested 0.1428ms 29.1852μs 34.2640 KOps/s 37.3283 KOps/s $\textbf{\color{#d91a1a}-8.21\%}$
test_set_nested 0.1229ms 20.5094μs 48.7582 KOps/s 51.3316 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_set_nested_new 0.1114ms 24.4354μs 40.9242 KOps/s 42.4937 KOps/s $\color{#d91a1a}-3.69\%$
test_select 0.2570ms 38.4287μs 26.0222 KOps/s 27.4793 KOps/s $\textbf{\color{#d91a1a}-5.30\%}$
test_select_nested 0.1496ms 61.6054μs 16.2323 KOps/s 16.5466 KOps/s $\color{#d91a1a}-1.90\%$
test_exclude_nested 0.2553ms 0.1126ms 8.8799 KOps/s 9.0695 KOps/s $\color{#d91a1a}-2.09\%$
test_empty[True] 0.3916ms 0.3268ms 3.0599 KOps/s 2.7328 KOps/s $\textbf{\color{#35bf28}+11.97\%}$
test_empty[False] 8.9306μs 1.0441μs 957.7412 KOps/s 983.8182 KOps/s $\color{#d91a1a}-2.65\%$
test_unbind_speed 0.4834ms 0.3111ms 3.2142 KOps/s 3.1793 KOps/s $\color{#35bf28}+1.10\%$
test_unbind_speed_stack0 71.5992ms 4.0295ms 248.1691 Ops/s 257.6566 Ops/s $\color{#d91a1a}-3.68\%$
test_unbind_speed_stack1 2.0829μs 0.6274μs 1.5940 MOps/s 1.5948 MOps/s $\color{#d91a1a}-0.06\%$
test_split 1.7908ms 1.4637ms 683.1985 Ops/s 621.2877 Ops/s $\textbf{\color{#35bf28}+9.96\%}$
test_chunk 63.3131ms 1.5457ms 646.9395 Ops/s 638.4277 Ops/s $\color{#35bf28}+1.33\%$
test_creation[device0] 3.7589ms 0.1034ms 9.6699 KOps/s 9.9398 KOps/s $\color{#d91a1a}-2.71\%$
test_creation_from_tensor 0.1850ms 79.7376μs 12.5411 KOps/s 12.2174 KOps/s $\color{#35bf28}+2.65\%$
test_add_one[memmap_tensor0] 0.2408ms 5.1896μs 192.6948 KOps/s 192.6351 KOps/s $\color{#35bf28}+0.03\%$
test_contiguous[memmap_tensor0] 23.0030μs 0.6363μs 1.5716 MOps/s 1.5598 MOps/s $\color{#35bf28}+0.76\%$
test_stack[memmap_tensor0] 65.2020μs 3.4342μs 291.1895 KOps/s 288.1441 KOps/s $\color{#35bf28}+1.06\%$
test_memmaptd_index 0.9406ms 0.2142ms 4.6677 KOps/s 4.4949 KOps/s $\color{#35bf28}+3.85\%$
test_memmaptd_index_astensor 0.6467ms 0.2746ms 3.6423 KOps/s 3.5720 KOps/s $\color{#35bf28}+1.97\%$
test_memmaptd_index_op 0.8212ms 0.5720ms 1.7483 KOps/s 1.8182 KOps/s $\color{#d91a1a}-3.84\%$
test_serialize_model 0.1066s 97.3846ms 10.2686 Ops/s 8.8196 Ops/s $\textbf{\color{#35bf28}+16.43\%}$
test_serialize_model_pickle 0.4556s 0.3772s 2.6509 Ops/s 2.6032 Ops/s $\color{#35bf28}+1.83\%$
test_serialize_weights 0.1677s 0.1049s 9.5351 Ops/s 9.0569 Ops/s $\textbf{\color{#35bf28}+5.28\%}$
test_serialize_weights_returnearly 0.3127s 0.1476s 6.7749 Ops/s 7.3701 Ops/s $\textbf{\color{#d91a1a}-8.08\%}$
test_serialize_weights_pickle 1.0422s 0.5806s 1.7223 Ops/s 2.4107 Ops/s $\textbf{\color{#d91a1a}-28.56\%}$
test_serialize_weights_filesystem 95.2003ms 90.3034ms 11.0738 Ops/s 10.6431 Ops/s $\color{#35bf28}+4.05\%$
test_serialize_model_filesystem 94.9147ms 91.2955ms 10.9534 Ops/s 11.1680 Ops/s $\color{#d91a1a}-1.92\%$
test_reshape_pytree 65.5520μs 22.9564μs 43.5608 KOps/s 43.2934 KOps/s $\color{#35bf28}+0.62\%$
test_reshape_td 67.3060μs 29.6211μs 33.7598 KOps/s 33.4247 KOps/s $\color{#35bf28}+1.00\%$
test_view_pytree 58.3990μs 23.1008μs 43.2886 KOps/s 43.9327 KOps/s $\color{#d91a1a}-1.47\%$
test_view_td 22.7830μs 4.8583μs 205.8350 KOps/s 200.0038 KOps/s $\color{#35bf28}+2.92\%$
test_unbind_pytree 56.4150μs 26.4786μs 37.7664 KOps/s 37.8245 KOps/s $\color{#d91a1a}-0.15\%$
test_unbind_td 0.1186ms 49.6800μs 20.1288 KOps/s 20.2761 KOps/s $\color{#d91a1a}-0.73\%$
test_split_pytree 57.7680μs 26.1010μs 38.3127 KOps/s 38.7160 KOps/s $\color{#d91a1a}-1.04\%$
test_split_td 0.5252ms 42.8836μs 23.3189 KOps/s 24.7315 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_add_pytree 86.8130μs 32.1762μs 31.0788 KOps/s 31.7302 KOps/s $\color{#d91a1a}-2.05\%$
test_add_td 0.1293ms 51.7214μs 19.3344 KOps/s 20.4640 KOps/s $\textbf{\color{#d91a1a}-5.52\%}$
test_distributed 0.2023ms 97.9364μs 10.2107 KOps/s 9.6126 KOps/s $\textbf{\color{#35bf28}+6.22\%}$
test_tdmodule 0.3009ms 23.2674μs 42.9786 KOps/s 48.0317 KOps/s $\textbf{\color{#d91a1a}-10.52\%}$
test_tdmodule_dispatch 0.1878ms 41.6334μs 24.0192 KOps/s 26.6135 KOps/s $\textbf{\color{#d91a1a}-9.75\%}$
test_tdseq 51.5270μs 25.6097μs 39.0477 KOps/s 40.3233 KOps/s $\color{#d91a1a}-3.16\%$
test_tdseq_dispatch 0.4693ms 45.2680μs 22.0907 KOps/s 23.3545 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_instantiation_functorch 1.9206ms 1.3044ms 766.6604 Ops/s 778.3296 Ops/s $\color{#d91a1a}-1.50\%$
test_instantiation_td 1.5077ms 1.0056ms 994.4310 Ops/s 1.0095 KOps/s $\color{#d91a1a}-1.49\%$
test_exec_functorch 0.3513ms 0.1583ms 6.3175 KOps/s 5.8072 KOps/s $\textbf{\color{#35bf28}+8.79\%}$
test_exec_functional_call 0.2810ms 0.1456ms 6.8676 KOps/s 6.7147 KOps/s $\color{#35bf28}+2.28\%$
test_exec_td 0.2792ms 0.1402ms 7.1324 KOps/s 7.0468 KOps/s $\color{#35bf28}+1.21\%$
test_exec_td_decorator 0.7601ms 0.1755ms 5.6988 KOps/s 5.6015 KOps/s $\color{#35bf28}+1.74\%$
test_vmap_mlp_speed[True-True] 1.2513ms 0.8816ms 1.1343 KOps/s 1.1207 KOps/s $\color{#35bf28}+1.21\%$
test_vmap_mlp_speed[True-False] 0.8764ms 0.4715ms 2.1209 KOps/s 2.1484 KOps/s $\color{#d91a1a}-1.28\%$
test_vmap_mlp_speed[False-True] 0.9532ms 0.7715ms 1.2961 KOps/s 1.2897 KOps/s $\color{#35bf28}+0.50\%$
test_vmap_mlp_speed[False-False] 1.1324ms 0.3841ms 2.6034 KOps/s 2.5609 KOps/s $\color{#35bf28}+1.66\%$
test_vmap_mlp_speed_decorator[True-True] 3.1427ms 2.3897ms 418.4584 Ops/s 419.0680 Ops/s $\color{#d91a1a}-0.15\%$
test_vmap_mlp_speed_decorator[True-False] 1.1056ms 0.5219ms 1.9160 KOps/s 1.9046 KOps/s $\color{#35bf28}+0.60\%$
test_vmap_mlp_speed_decorator[False-True] 2.5893ms 1.9772ms 505.7582 Ops/s 517.3507 Ops/s $\color{#d91a1a}-2.24\%$
test_vmap_mlp_speed_decorator[False-False] 0.7384ms 0.4006ms 2.4961 KOps/s 2.4938 KOps/s $\color{#35bf28}+0.09\%$
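
The Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the bolded rows, whose count matches the Improved and Worsened totals above, appear to be those whose change is at least 5% in magnitude. Both readings are inferred from the numbers in this table, not from the benchmark workflow itself. A minimal Python sketch, where the helper names `relative_change` and `classify` are hypothetical and the 5% threshold is an assumption:

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD.

    Both values must be in the same unit (e.g. both in KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (name, Ops, Ops on Repo HEAD), all in KOps/s, copied from the CPU table.
rows = [
    ("test_plain_set_nested", 58.2773, 60.8448),
    ("test_creation_empty", 91.4881, 111.3814),
    ("test_empty[True]", 3.0599, 2.7328),
]

for name, ops_pr, ops_head in rows:
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Rows that mix units, such as `test_instantiation_td` (Ops/s against KOps/s), would need converting to a common unit before applying the same formula.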


github-actions bot commented Jan 17, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 132. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}10$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 0.5082ms 13.9537μs 71.6654 KOps/s 73.6715 KOps/s $\color{#d91a1a}-2.72\%$
test_plain_set_stack_nested 0.1602ms 0.1177ms 8.4940 KOps/s 8.4533 KOps/s $\color{#35bf28}+0.48\%$
test_plain_set_nested_inplace 0.1300ms 15.4420μs 64.7586 KOps/s 67.8773 KOps/s $\color{#d91a1a}-4.59\%$
test_plain_set_stack_nested_inplace 0.2971ms 0.1477ms 6.7716 KOps/s 6.8588 KOps/s $\color{#d91a1a}-1.27\%$
test_items 0.1342ms 4.9056μs 203.8481 KOps/s 212.7371 KOps/s $\color{#d91a1a}-4.18\%$
test_items_nested 0.4807ms 0.3433ms 2.9125 KOps/s 2.9143 KOps/s $\color{#d91a1a}-0.06\%$
test_items_nested_locked 0.5010ms 0.3454ms 2.8949 KOps/s 2.8777 KOps/s $\color{#35bf28}+0.60\%$
test_items_nested_leaf 0.2697ms 0.2017ms 4.9573 KOps/s 4.9115 KOps/s $\color{#35bf28}+0.93\%$
test_items_stack_nested 1.4019ms 1.3042ms 766.7718 Ops/s 770.8973 Ops/s $\color{#d91a1a}-0.54\%$
test_items_stack_nested_leaf 1.2237ms 1.1452ms 873.2414 Ops/s 879.8321 Ops/s $\color{#d91a1a}-0.75\%$
test_items_stack_nested_locked 1.8103ms 0.9096ms 1.0994 KOps/s 1.0860 KOps/s $\color{#35bf28}+1.24\%$
test_keys 29.9200μs 4.5911μs 217.8150 KOps/s 219.3117 KOps/s $\color{#d91a1a}-0.68\%$
test_keys_nested 0.4944ms 94.7938μs 10.5492 KOps/s 10.4678 KOps/s $\color{#35bf28}+0.78\%$
test_keys_nested_locked 0.1207ms 97.6491μs 10.2408 KOps/s 10.0725 KOps/s $\color{#35bf28}+1.67\%$
test_keys_nested_leaf 0.1815ms 78.3493μs 12.7634 KOps/s 12.6655 KOps/s $\color{#35bf28}+0.77\%$
test_keys_stack_nested 1.1871ms 1.1359ms 880.3268 Ops/s 889.7478 Ops/s $\color{#d91a1a}-1.06\%$
test_keys_stack_nested_leaf 1.1717ms 1.1257ms 888.3004 Ops/s 891.5068 Ops/s $\color{#d91a1a}-0.36\%$
test_keys_stack_nested_locked 0.8533ms 0.7274ms 1.3747 KOps/s 1.3731 KOps/s $\color{#35bf28}+0.12\%$
test_values 12.8837μs 1.8916μs 528.6499 KOps/s 532.2086 KOps/s $\color{#d91a1a}-0.67\%$
test_values_nested 67.8910μs 45.6275μs 21.9166 KOps/s 21.5624 KOps/s $\color{#35bf28}+1.64\%$
test_values_nested_locked 77.9100μs 47.7977μs 20.9215 KOps/s 20.6145 KOps/s $\color{#35bf28}+1.49\%$
test_values_nested_leaf 58.8210μs 39.7255μs 25.1727 KOps/s 24.8492 KOps/s $\color{#35bf28}+1.30\%$
test_values_stack_nested 1.0817ms 0.9550ms 1.0472 KOps/s 1.0411 KOps/s $\color{#35bf28}+0.58\%$
test_values_stack_nested_leaf 1.1218ms 0.9440ms 1.0593 KOps/s 1.0639 KOps/s $\color{#d91a1a}-0.44\%$
test_values_stack_nested_locked 0.7252ms 0.5838ms 1.7129 KOps/s 1.6785 KOps/s $\color{#35bf28}+2.05\%$
test_membership 3.8600μs 0.9467μs 1.0563 MOps/s 1.0694 MOps/s $\color{#d91a1a}-1.22\%$
test_membership_nested 20.7900μs 2.9104μs 343.5905 KOps/s 344.4666 KOps/s $\color{#d91a1a}-0.25\%$
test_membership_nested_leaf 17.6710μs 2.9228μs 342.1391 KOps/s 344.1185 KOps/s $\color{#d91a1a}-0.58\%$
test_membership_stacked_nested 33.5700μs 11.1501μs 89.6853 KOps/s 89.8412 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_stacked_nested_leaf 0.4463ms 11.2980μs 88.5110 KOps/s 90.3808 KOps/s $\color{#d91a1a}-2.07\%$
test_membership_nested_last 39.5710μs 5.2944μs 188.8796 KOps/s 185.9182 KOps/s $\color{#35bf28}+1.59\%$
test_membership_nested_leaf_last 0.6705ms 5.3714μs 186.1708 KOps/s 186.6228 KOps/s $\color{#d91a1a}-0.24\%$
test_membership_stacked_nested_last 0.1709ms 0.1440ms 6.9437 KOps/s 6.9725 KOps/s $\color{#d91a1a}-0.41\%$
test_membership_stacked_nested_leaf_last 32.7900μs 13.0938μs 76.3720 KOps/s 77.1613 KOps/s $\color{#d91a1a}-1.02\%$
test_nested_getleaf 22.9710μs 8.4459μs 118.4000 KOps/s 118.1296 KOps/s $\color{#35bf28}+0.23\%$
test_nested_get 23.0700μs 7.9656μs 125.5392 KOps/s 125.7676 KOps/s $\color{#d91a1a}-0.18\%$
test_stacked_getleaf 0.3695ms 0.3184ms 3.1409 KOps/s 3.1483 KOps/s $\color{#d91a1a}-0.23\%$
test_stacked_get 0.3355ms 0.2839ms 3.5218 KOps/s 3.5336 KOps/s $\color{#d91a1a}-0.33\%$
test_nested_getitemleaf 22.5110μs 8.4751μs 117.9922 KOps/s 118.9180 KOps/s $\color{#d91a1a}-0.78\%$
test_nested_getitem 22.0800μs 8.0301μs 124.5321 KOps/s 125.6116 KOps/s $\color{#d91a1a}-0.86\%$
test_stacked_getitemleaf 0.4010ms 0.3213ms 3.1128 KOps/s 3.1312 KOps/s $\color{#d91a1a}-0.59\%$
test_stacked_getitem 0.3305ms 0.2853ms 3.5053 KOps/s 3.4791 KOps/s $\color{#35bf28}+0.75\%$
test_lock_nested 0.8713ms 0.3979ms 2.5134 KOps/s 2.4737 KOps/s $\color{#35bf28}+1.61\%$
test_lock_stack_nested 83.8555ms 6.3075ms 158.5416 Ops/s 157.1341 Ops/s $\color{#35bf28}+0.90\%$
test_unlock_nested 1.0478ms 0.4026ms 2.4837 KOps/s 2.4845 KOps/s $\color{#d91a1a}-0.03\%$
test_unlock_stack_nested 82.4056ms 6.7011ms 149.2292 Ops/s 146.1825 Ops/s $\color{#35bf28}+2.08\%$
test_flatten_speed 0.4590ms 0.2640ms 3.7877 KOps/s 3.7832 KOps/s $\color{#35bf28}+0.12\%$
test_unflatten_speed 0.4110ms 0.3658ms 2.7338 KOps/s 2.7434 KOps/s $\color{#d91a1a}-0.35\%$
test_common_ops 1.0771ms 0.6106ms 1.6379 KOps/s 1.6399 KOps/s $\color{#d91a1a}-0.12\%$
test_creation 18.4510μs 1.5581μs 641.8083 KOps/s 636.4415 KOps/s $\color{#35bf28}+0.84\%$
test_creation_empty 30.3800μs 8.8638μs 112.8184 KOps/s 125.7648 KOps/s $\textbf{\color{#d91a1a}-10.29\%}$
test_creation_nested_1 35.3100μs 10.6286μs 94.0860 KOps/s 103.0510 KOps/s $\textbf{\color{#d91a1a}-8.70\%}$
test_creation_nested_2 37.6700μs 13.0949μs 76.3657 KOps/s 82.2624 KOps/s $\textbf{\color{#d91a1a}-7.17\%}$
test_clone 0.1043ms 13.7407μs 72.7765 KOps/s 71.5006 KOps/s $\color{#35bf28}+1.78\%$
test_getitem[int] 26.0300μs 10.8275μs 92.3572 KOps/s 91.7005 KOps/s $\color{#35bf28}+0.72\%$
test_getitem[slice_int] 38.8600μs 21.0086μs 47.5996 KOps/s 46.4101 KOps/s $\color{#35bf28}+2.56\%$
test_getitem[range] 62.7900μs 39.3513μs 25.4121 KOps/s 27.1290 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_getitem[tuple] 38.9300μs 18.4127μs 54.3104 KOps/s 52.5027 KOps/s $\color{#35bf28}+3.44\%$
test_getitem[list] 0.3518ms 35.0319μs 28.5454 KOps/s 29.5535 KOps/s $\color{#d91a1a}-3.41\%$
test_setitem_dim[int] 45.3000μs 28.5697μs 35.0021 KOps/s 36.8307 KOps/s $\color{#d91a1a}-4.96\%$
test_setitem_dim[slice_int] 68.9600μs 51.0143μs 19.6023 KOps/s 21.2515 KOps/s $\textbf{\color{#d91a1a}-7.76\%}$
test_setitem_dim[range] 86.1300μs 64.8194μs 15.4275 KOps/s 16.0424 KOps/s $\color{#d91a1a}-3.83\%$
test_setitem_dim[tuple] 63.3000μs 43.3683μs 23.0583 KOps/s 23.5229 KOps/s $\color{#d91a1a}-1.97\%$
test_setitem 0.1247ms 18.9591μs 52.7451 KOps/s 49.1590 KOps/s $\textbf{\color{#35bf28}+7.29\%}$
test_set 0.1213ms 18.5651μs 53.8644 KOps/s 53.6557 KOps/s $\color{#35bf28}+0.39\%$
test_set_shared 0.5757ms 0.1020ms 9.8011 KOps/s 9.5285 KOps/s $\color{#35bf28}+2.86\%$
test_update 0.1240ms 21.4568μs 46.6052 KOps/s 47.1180 KOps/s $\color{#d91a1a}-1.09\%$
test_update_nested 0.1472ms 29.5144μs 33.8818 KOps/s 34.3989 KOps/s $\color{#d91a1a}-1.50\%$
test_set_nested 44.7000μs 20.7783μs 48.1271 KOps/s 48.2844 KOps/s $\color{#d91a1a}-0.33\%$
test_set_nested_new 0.1196ms 24.1815μs 41.3539 KOps/s 41.1870 KOps/s $\color{#35bf28}+0.41\%$
test_select 0.1247ms 38.0753μs 26.2637 KOps/s 26.2378 KOps/s $\color{#35bf28}+0.10\%$
test_select_nested 76.8210μs 55.9072μs 17.8868 KOps/s 18.3682 KOps/s $\color{#d91a1a}-2.62\%$
test_exclude_nested 0.1377ms 0.1093ms 9.1507 KOps/s 9.2333 KOps/s $\color{#d91a1a}-0.89\%$
test_empty[True] 1.1716ms 0.3265ms 3.0631 KOps/s 2.7568 KOps/s $\textbf{\color{#35bf28}+11.11\%}$
test_empty[False] 2.9020μs 0.8649μs 1.1562 MOps/s 1.1486 MOps/s $\color{#35bf28}+0.66\%$
test_to 74.3910μs 52.6446μs 18.9953 KOps/s 16.5913 KOps/s $\textbf{\color{#35bf28}+14.49\%}$
test_to_nonblocking 62.5410μs 34.3763μs 29.0898 KOps/s 28.4035 KOps/s $\color{#35bf28}+2.42\%$
test_unbind_speed 0.3838ms 0.3156ms 3.1687 KOps/s 3.1281 KOps/s $\color{#35bf28}+1.30\%$
test_unbind_speed_stack0 3.4174ms 3.3588ms 297.7253 Ops/s 250.4015 Ops/s $\textbf{\color{#35bf28}+18.90\%}$
test_unbind_speed_stack1 1.4421μs 0.5358μs 1.8665 MOps/s 1.8178 MOps/s $\color{#35bf28}+2.68\%$
test_split 78.2772ms 1.6805ms 595.0567 Ops/s 646.2567 Ops/s $\textbf{\color{#d91a1a}-7.92\%}$
test_chunk 74.9886ms 1.6469ms 607.2110 Ops/s 598.3187 Ops/s $\color{#35bf28}+1.49\%$
test_creation[device0] 0.1251ms 74.7862μs 13.3714 KOps/s 13.8399 KOps/s $\color{#d91a1a}-3.39\%$
test_creation_from_tensor 0.1413ms 55.9483μs 17.8736 KOps/s 18.8048 KOps/s $\color{#d91a1a}-4.95\%$
test_add_one[memmap_tensor0] 0.1587ms 6.5537μs 152.5866 KOps/s 143.6305 KOps/s $\textbf{\color{#35bf28}+6.24\%}$
test_contiguous[memmap_tensor0] 12.0700μs 0.6193μs 1.6147 MOps/s 1.4947 MOps/s $\textbf{\color{#35bf28}+8.03\%}$
test_stack[memmap_tensor0] 29.1400μs 4.4803μs 223.1975 KOps/s 212.1727 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_memmaptd_index 1.0708ms 0.2559ms 3.9084 KOps/s 3.8349 KOps/s $\color{#35bf28}+1.92\%$
test_memmaptd_index_astensor 0.6733ms 0.3115ms 3.2103 KOps/s 3.1321 KOps/s $\color{#35bf28}+2.50\%$
test_memmaptd_index_op 0.9864ms 0.6055ms 1.6514 KOps/s 1.6311 KOps/s $\color{#35bf28}+1.24\%$
test_serialize_model 0.1677s 96.0363ms 10.4127 Ops/s 9.7883 Ops/s $\textbf{\color{#35bf28}+6.38\%}$
test_serialize_model_pickle 1.3652s 1.2391s 0.8070 Ops/s 0.8072 Ops/s $\color{#d91a1a}-0.02\%$
test_serialize_weights 0.1633s 93.7392ms 10.6679 Ops/s 9.9297 Ops/s $\textbf{\color{#35bf28}+7.43\%}$
test_serialize_weights_returnearly 0.2532s 78.8999ms 12.6743 Ops/s 13.4401 Ops/s $\textbf{\color{#d91a1a}-5.70\%}$
test_serialize_weights_pickle 1.3500s 1.2360s 0.8090 Ops/s 0.8013 Ops/s $\color{#35bf28}+0.97\%$
test_reshape_pytree 52.3800μs 24.6657μs 40.5421 KOps/s 39.9897 KOps/s $\color{#35bf28}+1.38\%$
test_reshape_td 47.2000μs 29.1714μs 34.2801 KOps/s 33.6600 KOps/s $\color{#35bf28}+1.84\%$
test_view_pytree 56.2400μs 24.1724μs 41.3694 KOps/s 40.8503 KOps/s $\color{#35bf28}+1.27\%$
test_view_td 18.6300μs 4.1834μs 239.0386 KOps/s 237.2401 KOps/s $\color{#35bf28}+0.76\%$
test_unbind_pytree 46.9510μs 30.6395μs 32.6376 KOps/s 32.6801 KOps/s $\color{#d91a1a}-0.13\%$
test_unbind_td 0.1164ms 49.1982μs 20.3260 KOps/s 20.0170 KOps/s $\color{#35bf28}+1.54\%$
test_split_pytree 44.4600μs 28.5817μs 34.9875 KOps/s 33.8349 KOps/s $\color{#35bf28}+3.41\%$
test_split_td 0.7029ms 38.3125μs 26.1012 KOps/s 24.7325 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_add_pytree 53.3400μs 35.5864μs 28.1006 KOps/s 28.0129 KOps/s $\color{#35bf28}+0.31\%$
test_add_td 96.6700μs 47.6320μs 20.9943 KOps/s 20.3524 KOps/s $\color{#35bf28}+3.15\%$
test_distributed 0.2565ms 71.0995μs 14.0648 KOps/s 13.8261 KOps/s $\color{#35bf28}+1.73\%$
test_tdmodule 0.1613ms 18.9205μs 52.8528 KOps/s 58.2799 KOps/s $\textbf{\color{#d91a1a}-9.31\%}$
test_tdmodule_dispatch 0.2901ms 36.8854μs 27.1110 KOps/s 30.2183 KOps/s $\textbf{\color{#d91a1a}-10.28\%}$
test_tdseq 47.0400μs 20.8964μs 47.8551 KOps/s 49.6521 KOps/s $\color{#d91a1a}-3.62\%$
test_tdseq_dispatch 74.9910μs 36.8442μs 27.1413 KOps/s 27.8528 KOps/s $\color{#d91a1a}-2.55\%$
test_instantiation_functorch 1.7477ms 1.6740ms 597.3818 Ops/s 598.7491 Ops/s $\color{#d91a1a}-0.23\%$
test_instantiation_td 1.7605ms 1.1756ms 850.6623 Ops/s 856.9602 Ops/s $\color{#d91a1a}-0.73\%$
test_exec_functorch 0.2065ms 0.1557ms 6.4236 KOps/s 6.3221 KOps/s $\color{#35bf28}+1.61\%$
test_exec_functional_call 0.3072ms 0.1556ms 6.4264 KOps/s 6.4886 KOps/s $\color{#d91a1a}-0.96\%$
test_exec_td 0.1834ms 0.1439ms 6.9478 KOps/s 6.7622 KOps/s $\color{#35bf28}+2.74\%$
test_exec_td_decorator 0.9986ms 0.1848ms 5.4127 KOps/s 4.8690 KOps/s $\textbf{\color{#35bf28}+11.17\%}$
test_vmap_mlp_speed[True-True] 1.2156ms 1.0590ms 944.2491 Ops/s 936.8475 Ops/s $\color{#35bf28}+0.79\%$
test_vmap_mlp_speed[True-False] 0.7918ms 0.6313ms 1.5842 KOps/s 1.5748 KOps/s $\color{#35bf28}+0.60\%$
test_vmap_mlp_speed[False-True] 1.0403ms 0.9731ms 1.0276 KOps/s 1.0213 KOps/s $\color{#35bf28}+0.62\%$
test_vmap_mlp_speed[False-False] 0.7128ms 0.5588ms 1.7897 KOps/s 1.7677 KOps/s $\color{#35bf28}+1.25\%$
test_vmap_mlp_speed_decorator[True-True] 3.2275ms 2.4928ms 401.1477 Ops/s 400.5926 Ops/s $\color{#35bf28}+0.14\%$
test_vmap_mlp_speed_decorator[True-False] 1.1729ms 0.6975ms 1.4336 KOps/s 1.4633 KOps/s $\color{#d91a1a}-2.03\%$
test_vmap_mlp_speed_decorator[False-True] 2.6299ms 2.1482ms 465.5018 Ops/s 489.6379 Ops/s $\color{#d91a1a}-4.93\%$
test_vmap_mlp_speed_decorator[False-False] 0.9452ms 0.5963ms 1.6771 KOps/s 1.7052 KOps/s $\color{#d91a1a}-1.65\%$
test_vmap_transformer_speed[True-True] 12.6093ms 12.1489ms 82.3117 Ops/s 83.8421 Ops/s $\color{#d91a1a}-1.83\%$
test_vmap_transformer_speed[True-False] 8.6734ms 7.9704ms 125.4642 Ops/s 127.8504 Ops/s $\color{#d91a1a}-1.87\%$
test_vmap_transformer_speed[False-True] 13.2342ms 12.1866ms 82.0574 Ops/s 84.4522 Ops/s $\color{#d91a1a}-2.84\%$
test_vmap_transformer_speed[False-False] 7.8935ms 7.6570ms 130.6000 Ops/s 129.1424 Ops/s $\color{#35bf28}+1.13\%$
test_vmap_transformer_speed_decorator[True-True] 0.1628s 80.3498ms 12.4456 Ops/s 13.6108 Ops/s $\textbf{\color{#d91a1a}-8.56\%}$
test_vmap_transformer_speed_decorator[True-False] 20.4497ms 18.7014ms 53.4720 Ops/s 48.8269 Ops/s $\textbf{\color{#35bf28}+9.51\%}$
test_vmap_transformer_speed_decorator[False-True] 67.5171ms 66.5142ms 15.0344 Ops/s 15.0577 Ops/s $\color{#d91a1a}-0.15\%$
test_vmap_transformer_speed_decorator[False-False] 19.9256ms 18.2921ms 54.6684 Ops/s 54.3682 Ops/s $\color{#35bf28}+0.55\%$

@vmoens added the bug, enhancement, and Performance labels on Jan 17, 2024
@vmoens merged commit 617a449 into main on Jan 17, 2024
30 of 40 checks passed
@vmoens deleted the test_empty branch on January 17, 2024 at 15:29
Labels: bug, CLA Signed, enhancement, Performance
2 participants