[Feature] from and to_pytree #832

Merged
vmoens merged 1 commit into main from from-pytree
Jun 25, 2024

@vmoens commented Jun 24, 2024

@facebook-github-bot added the CLA Signed label Jun 24, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}8$.

Detailed results (one benchmark per row; columns, separated by spaces: Name | Max | Mean | Ops | Ops on Repo HEAD | Change)
test_plain_set_nested 46.8680μs 17.3649μs 57.5873 KOps/s 58.8722 KOps/s $\color{#d91a1a}-2.18\%$
test_plain_set_stack_nested 47.5790μs 17.3395μs 57.6718 KOps/s 58.6131 KOps/s $\color{#d91a1a}-1.61\%$
test_plain_set_nested_inplace 50.3140μs 19.7734μs 50.5730 KOps/s 52.3476 KOps/s $\color{#d91a1a}-3.39\%$
test_plain_set_stack_nested_inplace 54.5610μs 19.6520μs 50.8853 KOps/s 52.1597 KOps/s $\color{#d91a1a}-2.44\%$
test_items 22.4220μs 2.5598μs 390.6567 KOps/s 384.6676 KOps/s $\color{#35bf28}+1.56\%$
test_items_nested 0.4202ms 0.2659ms 3.7605 KOps/s 3.7739 KOps/s $\color{#d91a1a}-0.36\%$
test_items_nested_locked 1.8783ms 0.2655ms 3.7661 KOps/s 3.7366 KOps/s $\color{#35bf28}+0.79\%$
test_items_nested_leaf 0.1837ms 77.8405μs 12.8468 KOps/s 12.7595 KOps/s $\color{#35bf28}+0.68\%$
test_items_stack_nested 0.5495ms 0.2644ms 3.7822 KOps/s 3.6863 KOps/s $\color{#35bf28}+2.60\%$
test_items_stack_nested_leaf 0.1414ms 77.8298μs 12.8485 KOps/s 12.9830 KOps/s $\color{#d91a1a}-1.04\%$
test_items_stack_nested_locked 1.3143ms 0.2660ms 3.7595 KOps/s 3.7323 KOps/s $\color{#35bf28}+0.73\%$
test_keys 43.9220μs 3.8732μs 258.1842 KOps/s 260.2943 KOps/s $\color{#d91a1a}-0.81\%$
test_keys_nested 0.2255ms 0.1376ms 7.2699 KOps/s 7.2957 KOps/s $\color{#d91a1a}-0.35\%$
test_keys_nested_locked 0.8065ms 0.1409ms 7.0948 KOps/s 7.0426 KOps/s $\color{#35bf28}+0.74\%$
test_keys_nested_leaf 0.2090ms 0.1162ms 8.6063 KOps/s 8.5828 KOps/s $\color{#35bf28}+0.27\%$
test_keys_stack_nested 0.2986ms 0.1372ms 7.2865 KOps/s 7.2266 KOps/s $\color{#35bf28}+0.83\%$
test_keys_stack_nested_leaf 0.2204ms 0.1162ms 8.6037 KOps/s 8.6784 KOps/s $\color{#d91a1a}-0.86\%$
test_keys_stack_nested_locked 0.2513ms 0.1415ms 7.0650 KOps/s 7.0975 KOps/s $\color{#d91a1a}-0.46\%$
test_values 8.7860μs 1.1634μs 859.5451 KOps/s 860.2094 KOps/s $\color{#d91a1a}-0.08\%$
test_values_nested 0.1151ms 50.0011μs 19.9996 KOps/s 19.7639 KOps/s $\color{#35bf28}+1.19\%$
test_values_nested_locked 0.1139ms 50.4137μs 19.8359 KOps/s 19.7727 KOps/s $\color{#35bf28}+0.32\%$
test_values_nested_leaf 0.1131ms 45.4677μs 21.9936 KOps/s 21.7678 KOps/s $\color{#35bf28}+1.04\%$
test_values_stack_nested 0.1415ms 51.3943μs 19.4574 KOps/s 19.6300 KOps/s $\color{#d91a1a}-0.88\%$
test_values_stack_nested_leaf 0.1019ms 45.3862μs 22.0331 KOps/s 21.6956 KOps/s $\color{#35bf28}+1.56\%$
test_values_stack_nested_locked 0.1163ms 51.3109μs 19.4890 KOps/s 19.3879 KOps/s $\color{#35bf28}+0.52\%$
test_membership 40.4850μs 1.3237μs 755.4485 KOps/s 734.8286 KOps/s $\color{#35bf28}+2.81\%$
test_membership_nested 52.0020μs 3.3718μs 296.5759 KOps/s 290.1778 KOps/s $\color{#35bf28}+2.20\%$
test_membership_nested_leaf 27.7720μs 3.3735μs 296.4260 KOps/s 290.6177 KOps/s $\color{#35bf28}+2.00\%$
test_membership_stacked_nested 39.7940μs 3.3309μs 300.2209 KOps/s 293.3169 KOps/s $\color{#35bf28}+2.35\%$
test_membership_stacked_nested_leaf 32.4600μs 3.3979μs 294.2951 KOps/s 292.9431 KOps/s $\color{#35bf28}+0.46\%$
test_membership_nested_last 27.7020μs 4.1001μs 243.8974 KOps/s 240.1980 KOps/s $\color{#35bf28}+1.54\%$
test_membership_nested_leaf_last 33.2820μs 4.1363μs 241.7612 KOps/s 238.6479 KOps/s $\color{#35bf28}+1.30\%$
test_membership_stacked_nested_last 27.9820μs 4.1158μs 242.9682 KOps/s 239.4970 KOps/s $\color{#35bf28}+1.45\%$
test_membership_stacked_nested_leaf_last 31.0770μs 4.0903μs 244.4802 KOps/s 238.0975 KOps/s $\color{#35bf28}+2.68\%$
test_nested_getleaf 54.3110μs 10.4552μs 95.6458 KOps/s 95.7692 KOps/s $\color{#d91a1a}-0.13\%$
test_nested_get 34.6450μs 9.8008μs 102.0320 KOps/s 99.2221 KOps/s $\color{#35bf28}+2.83\%$
test_stacked_getleaf 35.6760μs 10.4270μs 95.9053 KOps/s 94.9256 KOps/s $\color{#35bf28}+1.03\%$
test_stacked_get 39.9540μs 9.9105μs 100.9027 KOps/s 100.8840 KOps/s $\color{#35bf28}+0.02\%$
test_nested_getitemleaf 46.0960μs 10.9871μs 91.0161 KOps/s 89.7884 KOps/s $\color{#35bf28}+1.37\%$
test_nested_getitem 34.4140μs 10.1404μs 98.6159 KOps/s 96.6412 KOps/s $\color{#35bf28}+2.04\%$
test_stacked_getitemleaf 44.1720μs 10.8921μs 91.8101 KOps/s 90.5799 KOps/s $\color{#35bf28}+1.36\%$
test_stacked_getitem 41.2370μs 10.1029μs 98.9810 KOps/s 97.5472 KOps/s $\color{#35bf28}+1.47\%$
test_lock_nested 56.0475ms 0.4023ms 2.4856 KOps/s 2.9206 KOps/s $\textbf{\color{#d91a1a}-14.89\%}$
test_lock_stack_nested 0.4634ms 0.3090ms 3.2366 KOps/s 3.2237 KOps/s $\color{#35bf28}+0.40\%$
test_unlock_nested 0.8162ms 0.3484ms 2.8703 KOps/s 2.8541 KOps/s $\color{#35bf28}+0.57\%$
test_unlock_stack_nested 0.6893ms 0.3151ms 3.1734 KOps/s 3.1401 KOps/s $\color{#35bf28}+1.06\%$
test_flatten_speed 0.2338ms 94.2946μs 10.6051 KOps/s 10.5185 KOps/s $\color{#35bf28}+0.82\%$
test_unflatten_speed 0.5630ms 0.4062ms 2.4618 KOps/s 2.4484 KOps/s $\color{#35bf28}+0.55\%$
test_common_ops 1.5888ms 0.7484ms 1.3361 KOps/s 1.3630 KOps/s $\color{#d91a1a}-1.97\%$
test_creation 36.3880μs 1.8416μs 543.0150 KOps/s 483.8258 KOps/s $\textbf{\color{#35bf28}+12.23\%}$
test_creation_empty 39.1130μs 11.1871μs 89.3888 KOps/s 93.1496 KOps/s $\color{#d91a1a}-4.04\%$
test_creation_nested_1 46.1260μs 13.9795μs 71.5334 KOps/s 73.8488 KOps/s $\color{#d91a1a}-3.14\%$
test_creation_nested_2 47.7890μs 17.1076μs 58.4534 KOps/s 59.0807 KOps/s $\color{#d91a1a}-1.06\%$
test_clone 0.1971ms 13.5891μs 73.5885 KOps/s 74.4981 KOps/s $\color{#d91a1a}-1.22\%$
test_getitem[int] 40.2240μs 11.3142μs 88.3848 KOps/s 86.3388 KOps/s $\color{#35bf28}+2.37\%$
test_getitem[slice_int] 58.5390μs 22.2191μs 45.0064 KOps/s 42.9088 KOps/s $\color{#35bf28}+4.89\%$
test_getitem[range] 81.5520μs 58.4150μs 17.1189 KOps/s 15.7659 KOps/s $\textbf{\color{#35bf28}+8.58\%}$
test_getitem[tuple] 66.0740μs 18.7746μs 53.2634 KOps/s 51.5198 KOps/s $\color{#35bf28}+3.38\%$
test_getitem[list] 0.1754ms 41.5138μs 24.0884 KOps/s 24.0147 KOps/s $\color{#35bf28}+0.31\%$
test_setitem_dim[int] 77.7760μs 35.7671μs 27.9587 KOps/s 28.2971 KOps/s $\color{#d91a1a}-1.20\%$
test_setitem_dim[slice_int] 93.3230μs 62.7700μs 15.9312 KOps/s 16.3556 KOps/s $\color{#d91a1a}-2.60\%$
test_setitem_dim[range] 0.1856ms 87.9021μs 11.3763 KOps/s 11.9943 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_setitem_dim[tuple] 96.0690μs 52.3834μs 19.0900 KOps/s 20.0551 KOps/s $\color{#d91a1a}-4.81\%$
test_setitem 0.1163ms 20.7685μs 48.1500 KOps/s 49.1766 KOps/s $\color{#d91a1a}-2.09\%$
test_set 91.0690μs 20.6117μs 48.5161 KOps/s 50.4132 KOps/s $\color{#d91a1a}-3.76\%$
test_set_shared 3.9655ms 0.1478ms 6.7657 KOps/s 6.7141 KOps/s $\color{#35bf28}+0.77\%$
test_update 0.2116ms 23.3152μs 42.8905 KOps/s 44.3620 KOps/s $\color{#d91a1a}-3.32\%$
test_update_nested 0.1095ms 32.3674μs 30.8953 KOps/s 32.0623 KOps/s $\color{#d91a1a}-3.64\%$
test_update__nested 73.8170μs 25.6123μs 39.0438 KOps/s 39.4261 KOps/s $\color{#d91a1a}-0.97\%$
test_set_nested 87.0020μs 22.2506μs 44.9427 KOps/s 45.7120 KOps/s $\color{#d91a1a}-1.68\%$
test_set_nested_new 99.0940μs 26.1276μs 38.2737 KOps/s 38.8967 KOps/s $\color{#d91a1a}-1.60\%$
test_select 0.1373ms 44.2296μs 22.6093 KOps/s 24.4325 KOps/s $\textbf{\color{#d91a1a}-7.46\%}$
test_select_nested 0.1040ms 59.1435μs 16.9080 KOps/s 16.4877 KOps/s $\color{#35bf28}+2.55\%$
test_exclude_nested 1.0406ms 0.1210ms 8.2675 KOps/s 8.2037 KOps/s $\color{#35bf28}+0.78\%$
test_empty[True] 0.6331ms 0.3897ms 2.5659 KOps/s 2.5020 KOps/s $\color{#35bf28}+2.55\%$
test_empty[False] 9.4100μs 1.1436μs 874.4244 KOps/s 871.5584 KOps/s $\color{#35bf28}+0.33\%$
test_unbind_speed 1.9583ms 0.2582ms 3.8730 KOps/s 3.8055 KOps/s $\color{#35bf28}+1.78\%$
test_unbind_speed_stack0 0.3259ms 0.2488ms 4.0199 KOps/s 3.9501 KOps/s $\color{#35bf28}+1.77\%$
test_unbind_speed_stack1 74.0289ms 0.7354ms 1.3598 KOps/s 1.3535 KOps/s $\color{#35bf28}+0.47\%$
test_split 74.4434ms 1.6086ms 621.6578 Ops/s 620.8294 Ops/s $\color{#35bf28}+0.13\%$
test_chunk 75.0378ms 1.6051ms 623.0076 Ops/s 617.8499 Ops/s $\color{#35bf28}+0.83\%$
test_creation[device0] 0.2355ms 86.2272μs 11.5973 KOps/s 11.6034 KOps/s $\color{#d91a1a}-0.05\%$
test_creation_from_tensor 4.9700ms 85.8902μs 11.6428 KOps/s 11.2317 KOps/s $\color{#35bf28}+3.66\%$
test_add_one[memmap_tensor0] 83.7360μs 5.3743μs 186.0698 KOps/s 172.4423 KOps/s $\textbf{\color{#35bf28}+7.90\%}$
test_contiguous[memmap_tensor0] 13.1950μs 0.6426μs 1.5563 MOps/s 1.5761 MOps/s $\color{#d91a1a}-1.26\%$
test_stack[memmap_tensor0] 38.3120μs 3.5691μs 280.1855 KOps/s 271.2719 KOps/s $\color{#35bf28}+3.29\%$
test_memmaptd_index 0.8833ms 0.2513ms 3.9800 KOps/s 3.8754 KOps/s $\color{#35bf28}+2.70\%$
test_memmaptd_index_astensor 0.5694ms 0.3242ms 3.0849 KOps/s 3.0150 KOps/s $\color{#35bf28}+2.32\%$
test_memmaptd_index_op 1.0626ms 0.6205ms 1.6116 KOps/s 1.5883 KOps/s $\color{#35bf28}+1.47\%$
test_serialize_model 0.1849s 0.1183s 8.4551 Ops/s 8.3128 Ops/s $\color{#35bf28}+1.71\%$
test_serialize_model_pickle 0.4493s 0.3795s 2.6349 Ops/s 2.6233 Ops/s $\color{#35bf28}+0.44\%$
test_serialize_weights 0.1826s 0.1164s 8.5900 Ops/s 8.4286 Ops/s $\color{#35bf28}+1.91\%$
test_serialize_weights_returnearly 0.1988s 0.1381s 7.2437 Ops/s 6.9799 Ops/s $\color{#35bf28}+3.78\%$
test_serialize_weights_pickle 1.2016s 0.6270s 1.5949 Ops/s 2.2766 Ops/s $\textbf{\color{#d91a1a}-29.94\%}$
test_serialize_weights_filesystem 99.4304ms 93.7517ms 10.6665 Ops/s 10.1973 Ops/s $\color{#35bf28}+4.60\%$
test_serialize_model_filesystem 0.1755s 0.1023s 9.7760 Ops/s 9.8472 Ops/s $\color{#d91a1a}-0.72\%$
test_reshape_pytree 63.7890μs 25.9531μs 38.5310 KOps/s 38.6515 KOps/s $\color{#d91a1a}-0.31\%$
test_reshape_td 0.1095ms 33.9418μs 29.4622 KOps/s 29.1946 KOps/s $\color{#35bf28}+0.92\%$
test_view_pytree 65.3720μs 25.4413μs 39.3061 KOps/s 39.0724 KOps/s $\color{#35bf28}+0.60\%$
test_view_td 81.7320μs 37.6643μs 26.5504 KOps/s 25.7792 KOps/s $\color{#35bf28}+2.99\%$
test_unbind_pytree 88.4050μs 29.4309μs 33.9779 KOps/s 33.9219 KOps/s $\color{#35bf28}+0.17\%$
test_unbind_td 0.4340ms 37.2258μs 26.8631 KOps/s 26.6066 KOps/s $\color{#35bf28}+0.96\%$
test_split_pytree 78.3160μs 29.0129μs 34.4675 KOps/s 34.0856 KOps/s $\color{#35bf28}+1.12\%$
test_split_td 0.1320ms 40.1703μs 24.8940 KOps/s 24.2651 KOps/s $\color{#35bf28}+2.59\%$
test_add_pytree 79.7790μs 35.6835μs 28.0242 KOps/s 27.2198 KOps/s $\color{#35bf28}+2.96\%$
test_add_td 0.1338ms 58.1535μs 17.1959 KOps/s 17.7136 KOps/s $\color{#d91a1a}-2.92\%$
test_distributed 0.1981ms 0.1027ms 9.7330 KOps/s 9.5410 KOps/s $\color{#35bf28}+2.01\%$
test_tdmodule 34.5040μs 18.3843μs 54.3942 KOps/s 56.4117 KOps/s $\color{#d91a1a}-3.58\%$
test_tdmodule_dispatch 60.1120μs 36.4801μs 27.4122 KOps/s 28.4114 KOps/s $\color{#d91a1a}-3.52\%$
test_tdseq 43.5410μs 21.4490μs 46.6223 KOps/s 48.1715 KOps/s $\color{#d91a1a}-3.22\%$
test_tdseq_dispatch 80.3900μs 42.0495μs 23.7815 KOps/s 24.7253 KOps/s $\color{#d91a1a}-3.82\%$
test_instantiation_functorch 1.6410ms 1.3242ms 755.1982 Ops/s 756.0503 Ops/s $\color{#d91a1a}-0.11\%$
test_instantiation_td 1.8292ms 1.0062ms 993.8135 Ops/s 979.6522 Ops/s $\color{#35bf28}+1.45\%$
test_exec_functorch 0.2944ms 0.1673ms 5.9758 KOps/s 6.1972 KOps/s $\color{#d91a1a}-3.57\%$
test_exec_functional_call 0.2566ms 0.1491ms 6.7091 KOps/s 6.7301 KOps/s $\color{#d91a1a}-0.31\%$
test_exec_td 0.2695ms 0.1452ms 6.8882 KOps/s 6.7794 KOps/s $\color{#35bf28}+1.60\%$
test_exec_td_decorator 0.8989ms 0.2205ms 4.5361 KOps/s 4.5112 KOps/s $\color{#35bf28}+0.55\%$
test_vmap_mlp_speed[True-True] 0.9308ms 0.4791ms 2.0872 KOps/s 1.9398 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_vmap_mlp_speed[True-False] 0.8439ms 0.4976ms 2.0096 KOps/s 2.0485 KOps/s $\color{#d91a1a}-1.90\%$
test_vmap_mlp_speed[False-True] 0.6510ms 0.3875ms 2.5807 KOps/s 2.4829 KOps/s $\color{#35bf28}+3.94\%$
test_vmap_mlp_speed[False-False] 0.7061ms 0.3899ms 2.5645 KOps/s 2.5077 KOps/s $\color{#35bf28}+2.27\%$
test_vmap_mlp_speed_decorator[True-True] 1.1551ms 0.5515ms 1.8133 KOps/s 1.7910 KOps/s $\color{#35bf28}+1.25\%$
test_vmap_mlp_speed_decorator[True-False] 0.7994ms 0.5531ms 1.8081 KOps/s 1.7989 KOps/s $\color{#35bf28}+0.51\%$
test_vmap_mlp_speed_decorator[False-True] 0.6854ms 0.4480ms 2.2323 KOps/s 2.1655 KOps/s $\color{#35bf28}+3.08\%$
test_vmap_mlp_speed_decorator[False-False] 0.7760ms 0.4532ms 2.2066 KOps/s 2.1779 KOps/s $\color{#35bf28}+1.32\%$
test_to_module_speed[True] 1.8172ms 1.6850ms 593.4742 Ops/s 586.1043 Ops/s $\color{#35bf28}+1.26\%$
test_to_module_speed[False] 1.9383ms 1.6655ms 600.4362 Ops/s 596.3363 Ops/s $\color{#35bf28}+0.69\%$
test_tc_init 83.8960μs 30.7927μs 32.4753 KOps/s 34.6038 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_tc_init_nested 0.1217ms 63.6780μs 15.7040 KOps/s 16.2771 KOps/s $\color{#d91a1a}-3.52\%$
test_tc_first_layer_tensor 3.0843μs 0.6786μs 1.4736 MOps/s 1.4243 MOps/s $\color{#35bf28}+3.46\%$
test_tc_first_layer_nontensor 2.0533μs 0.6556μs 1.5254 MOps/s 1.5041 MOps/s $\color{#35bf28}+1.41\%$
test_tc_second_layer_tensor 18.5750μs 1.8591μs 537.8809 KOps/s 536.4444 KOps/s $\color{#35bf28}+0.27\%$
test_tc_second_layer_nontensor 20.6280μs 1.6307μs 613.2440 KOps/s 646.6277 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_unbind 87.2740ms 8.3345ms 119.9829 Ops/s 135.0702 Ops/s $\textbf{\color{#d91a1a}-11.17\%}$
test_full_like 9.7174ms 8.5749ms 116.6197 Ops/s 91.1095 Ops/s $\textbf{\color{#35bf28}+28.00\%}$
test_zeros_like 13.0400ms 5.9141ms 169.0880 Ops/s 172.2763 Ops/s $\color{#d91a1a}-1.85\%$
test_ones_like 13.4721ms 6.5849ms 151.8618 Ops/s 162.5658 Ops/s $\textbf{\color{#d91a1a}-6.58\%}$
test_clone 15.5725ms 8.4737ms 118.0125 Ops/s 123.9021 Ops/s $\color{#d91a1a}-4.75\%$
test_squeeze 75.1900μs 14.3808μs 69.5373 KOps/s 72.4339 KOps/s $\color{#d91a1a}-4.00\%$
test_unsqueeze 0.1554ms 61.5916μs 16.2360 KOps/s 16.5434 KOps/s $\color{#d91a1a}-1.86\%$
test_split 0.1980ms 0.1135ms 8.8130 KOps/s 8.8950 KOps/s $\color{#d91a1a}-0.92\%$
test_permute 0.2702ms 0.1303ms 7.6758 KOps/s 7.7320 KOps/s $\color{#d91a1a}-0.73\%$
test_stack 28.6560ms 23.4084ms 42.7197 Ops/s 42.0647 Ops/s $\color{#35bf28}+1.56\%$
test_cat 31.4585ms 23.5111ms 42.5330 Ops/s 42.5470 Ops/s $\color{#d91a1a}-0.03\%$
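
Reading the rows above, the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the bold entries appear to be those whose change exceeds 5% in magnitude (their counts match the Improved and Worsened totals). Neither rule is stated by the benchmark bot, so the sketch below is an assumption inferred from the numbers, and the function names and the `threshold_pct` parameter are hypothetical:

```python
def relative_change(ops: float, head_ops: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD.

    Both arguments must be in the same unit (e.g. both in KOps/s).
    """
    return (ops - head_ops) / head_ops * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred, not documented."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "within threshold"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_nested": (57.5873, 58.8722),
    "test_creation": (543.0150, 483.8258),
    "test_serialize_weights_pickle": (1.5949, 2.2766),
}

for name, (ops, head_ops) in rows.items():
    change = relative_change(ops, head_ops)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three rows reproduce the reported -2.18%, +12.23% and -29.94%. Several of the flagged rows also show a Max far above their Mean (for example `test_lock_nested`), so a single outlier run may account for part of the reported change.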

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}25$.

Detailed results (one benchmark per row; columns, separated by spaces: Name | Max | Mean | Ops | Ops on Repo HEAD | Change)
test_plain_set_nested 39.6710μs 13.2829μs 75.2845 KOps/s 83.4024 KOps/s $\textbf{\color{#d91a1a}-9.73\%}$
test_plain_set_stack_nested 29.5810μs 13.4222μs 74.5035 KOps/s 82.0593 KOps/s $\textbf{\color{#d91a1a}-9.21\%}$
test_plain_set_nested_inplace 39.1310μs 14.5752μs 68.6099 KOps/s 74.9553 KOps/s $\textbf{\color{#d91a1a}-8.47\%}$
test_plain_set_stack_nested_inplace 62.9410μs 14.7845μs 67.6386 KOps/s 74.1021 KOps/s $\textbf{\color{#d91a1a}-8.72\%}$
test_items 20.8900μs 4.6996μs 212.7822 KOps/s 210.3834 KOps/s $\color{#35bf28}+1.14\%$
test_items_nested 0.3741ms 0.3389ms 2.9506 KOps/s 2.9618 KOps/s $\color{#d91a1a}-0.38\%$
test_items_nested_locked 0.5413ms 0.3410ms 2.9330 KOps/s 2.9091 KOps/s $\color{#35bf28}+0.82\%$
test_items_nested_leaf 0.1042ms 82.3531μs 12.1428 KOps/s 12.1318 KOps/s $\color{#35bf28}+0.09\%$
test_items_stack_nested 0.3971ms 0.3403ms 2.9387 KOps/s 2.9707 KOps/s $\color{#d91a1a}-1.08\%$
test_items_stack_nested_leaf 0.1050ms 85.0420μs 11.7589 KOps/s 12.1620 KOps/s $\color{#d91a1a}-3.31\%$
test_items_stack_nested_locked 0.3865ms 0.3418ms 2.9257 KOps/s 2.9571 KOps/s $\color{#d91a1a}-1.06\%$
test_keys 18.2600μs 4.3894μs 227.8220 KOps/s 231.6239 KOps/s $\color{#d91a1a}-1.64\%$
test_keys_nested 94.9520μs 67.3203μs 14.8544 KOps/s 14.8134 KOps/s $\color{#35bf28}+0.28\%$
test_keys_nested_locked 2.3100ms 73.1378μs 13.6728 KOps/s 13.8755 KOps/s $\color{#d91a1a}-1.46\%$
test_keys_nested_leaf 86.6520μs 57.7346μs 17.3206 KOps/s 17.3618 KOps/s $\color{#d91a1a}-0.24\%$
test_keys_stack_nested 0.1689ms 67.5226μs 14.8099 KOps/s 14.9219 KOps/s $\color{#d91a1a}-0.75\%$
test_keys_stack_nested_leaf 81.3920μs 58.0856μs 17.2160 KOps/s 17.3589 KOps/s $\color{#d91a1a}-0.82\%$
test_keys_stack_nested_locked 94.2310μs 72.4176μs 13.8088 KOps/s 14.0421 KOps/s $\color{#d91a1a}-1.66\%$
test_values 7.8667μs 1.8106μs 552.2918 KOps/s 553.2134 KOps/s $\color{#d91a1a}-0.17\%$
test_values_nested 58.2810μs 35.2589μs 28.3616 KOps/s 28.4321 KOps/s $\color{#d91a1a}-0.25\%$
test_values_nested_locked 84.9220μs 36.8924μs 27.1058 KOps/s 26.8228 KOps/s $\color{#35bf28}+1.06\%$
test_values_nested_leaf 53.8500μs 31.4746μs 31.7716 KOps/s 31.8075 KOps/s $\color{#d91a1a}-0.11\%$
test_values_stack_nested 56.7600μs 36.0206μs 27.7619 KOps/s 28.0864 KOps/s $\color{#d91a1a}-1.16\%$
test_values_stack_nested_leaf 58.0120μs 31.8406μs 31.4065 KOps/s 31.2126 KOps/s $\color{#35bf28}+0.62\%$
test_values_stack_nested_locked 62.9810μs 37.7554μs 26.4863 KOps/s 26.7516 KOps/s $\color{#d91a1a}-0.99\%$
test_membership 1.6840μs 0.7147μs 1.3992 MOps/s 1.3979 MOps/s $\color{#35bf28}+0.09\%$
test_membership_nested 32.9510μs 2.5547μs 391.4283 KOps/s 394.1520 KOps/s $\color{#d91a1a}-0.69\%$
test_membership_nested_leaf 16.4610μs 2.5416μs 393.4568 KOps/s 391.5794 KOps/s $\color{#35bf28}+0.48\%$
test_membership_stacked_nested 19.1100μs 2.5676μs 389.4690 KOps/s 393.2234 KOps/s $\color{#d91a1a}-0.95\%$
test_membership_stacked_nested_leaf 32.3800μs 2.5507μs 392.0542 KOps/s 391.2258 KOps/s $\color{#35bf28}+0.21\%$
test_membership_nested_last 20.0200μs 3.0898μs 323.6484 KOps/s 326.0658 KOps/s $\color{#d91a1a}-0.74\%$
test_membership_nested_leaf_last 33.7410μs 3.1052μs 322.0357 KOps/s 324.8510 KOps/s $\color{#d91a1a}-0.87\%$
test_membership_stacked_nested_last 24.7190μs 3.5271μs 283.5225 KOps/s 323.8538 KOps/s $\textbf{\color{#d91a1a}-12.45\%}$
test_membership_stacked_nested_leaf_last 34.0210μs 3.5121μs 284.7273 KOps/s 321.8622 KOps/s $\textbf{\color{#d91a1a}-11.54\%}$
test_nested_getleaf 29.5310μs 8.3522μs 119.7295 KOps/s 119.6951 KOps/s $\color{#35bf28}+0.03\%$
test_nested_get 34.3190μs 7.8885μs 126.7664 KOps/s 127.1603 KOps/s $\color{#d91a1a}-0.31\%$
test_stacked_getleaf 28.1800μs 8.3933μs 119.1425 KOps/s 118.9664 KOps/s $\color{#35bf28}+0.15\%$
test_stacked_get 23.4810μs 7.8864μs 126.8012 KOps/s 127.0350 KOps/s $\color{#d91a1a}-0.18\%$
test_nested_getitemleaf 32.5500μs 8.5716μs 116.6650 KOps/s 117.6850 KOps/s $\color{#d91a1a}-0.87\%$
test_nested_getitem 38.6800μs 8.0518μs 124.1962 KOps/s 124.7877 KOps/s $\color{#d91a1a}-0.47\%$
test_stacked_getitemleaf 30.5710μs 8.5629μs 116.7824 KOps/s 116.7160 KOps/s $\color{#35bf28}+0.06\%$
test_stacked_getitem 34.1010μs 8.0560μs 124.1305 KOps/s 123.6617 KOps/s $\color{#35bf28}+0.38\%$
test_lock_nested 58.0742ms 0.4008ms 2.4949 KOps/s 2.4887 KOps/s $\color{#35bf28}+0.25\%$
test_lock_stack_nested 0.3281ms 0.2966ms 3.3719 KOps/s 3.3467 KOps/s $\color{#35bf28}+0.75\%$
test_unlock_nested 60.9061ms 0.4039ms 2.4758 KOps/s 2.4713 KOps/s $\color{#35bf28}+0.18\%$
test_unlock_stack_nested 0.3449ms 0.3048ms 3.2809 KOps/s 3.2508 KOps/s $\color{#35bf28}+0.92\%$
test_flatten_speed 0.3047ms 0.1014ms 9.8600 KOps/s 9.8694 KOps/s $\color{#d91a1a}-0.09\%$
test_unflatten_speed 0.3181ms 0.2895ms 3.4538 KOps/s 3.4168 KOps/s $\color{#35bf28}+1.08\%$
test_common_ops 1.0697ms 0.5914ms 1.6908 KOps/s 1.7720 KOps/s $\color{#d91a1a}-4.58\%$
test_creation 20.2500μs 1.6048μs 623.1476 KOps/s 615.5521 KOps/s $\color{#35bf28}+1.23\%$
test_creation_empty 25.5500μs 9.5270μs 104.9653 KOps/s 140.2046 KOps/s $\textbf{\color{#d91a1a}-25.13\%}$
test_creation_nested_1 41.0310μs 11.3338μs 88.2320 KOps/s 112.2285 KOps/s $\textbf{\color{#d91a1a}-21.38\%}$
test_creation_nested_2 31.8010μs 13.4673μs 74.2538 KOps/s 90.4597 KOps/s $\textbf{\color{#d91a1a}-17.92\%}$
test_clone 87.5120μs 11.6071μs 86.1545 KOps/s 84.5383 KOps/s $\color{#35bf28}+1.91\%$
test_getitem[int] 25.1400μs 10.5349μs 94.9226 KOps/s 94.1044 KOps/s $\color{#35bf28}+0.87\%$
test_getitem[slice_int] 46.5410μs 20.3402μs 49.1638 KOps/s 48.5571 KOps/s $\color{#35bf28}+1.25\%$
test_getitem[range] 67.2410μs 49.3468μs 20.2647 KOps/s 19.8141 KOps/s $\color{#35bf28}+2.27\%$
test_getitem[tuple] 54.8700μs 17.9848μs 55.6024 KOps/s 54.6961 KOps/s $\color{#35bf28}+1.66\%$
test_getitem[list] 0.1037ms 33.4408μs 29.9036 KOps/s 29.7380 KOps/s $\color{#35bf28}+0.56\%$
test_setitem_dim[int] 45.5210μs 29.1318μs 34.3267 KOps/s 35.9165 KOps/s $\color{#d91a1a}-4.43\%$
test_setitem_dim[slice_int] 68.4810μs 49.6779μs 20.1297 KOps/s 20.8218 KOps/s $\color{#d91a1a}-3.32\%$
test_setitem_dim[range] 85.7520μs 67.0416μs 14.9161 KOps/s 15.4677 KOps/s $\color{#d91a1a}-3.57\%$
test_setitem_dim[tuple] 63.8620μs 43.4948μs 22.9913 KOps/s 23.1779 KOps/s $\color{#d91a1a}-0.81\%$
test_setitem 63.0300μs 16.9589μs 58.9660 KOps/s 64.1700 KOps/s $\textbf{\color{#d91a1a}-8.11\%}$
test_set 45.4000μs 16.2796μs 61.4264 KOps/s 65.5202 KOps/s $\textbf{\color{#d91a1a}-6.25\%}$
test_set_shared 1.4831ms 98.0151μs 10.2025 KOps/s 9.9723 KOps/s $\color{#35bf28}+2.31\%$
test_update 0.1728ms 19.5497μs 51.1516 KOps/s 59.6506 KOps/s $\textbf{\color{#d91a1a}-14.25\%}$
test_update_nested 68.4320μs 25.2658μs 39.5793 KOps/s 44.4780 KOps/s $\textbf{\color{#d91a1a}-11.01\%}$
test_update__nested 0.1362ms 22.9328μs 43.6057 KOps/s 44.8882 KOps/s $\color{#d91a1a}-2.86\%$
test_set_nested 67.0620μs 17.7476μs 56.3458 KOps/s 61.8662 KOps/s $\textbf{\color{#d91a1a}-8.92\%}$
test_set_nested_new 59.8200μs 20.4515μs 48.8961 KOps/s 52.2789 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_select 66.0710μs 34.3636μs 29.1006 KOps/s 31.1021 KOps/s $\textbf{\color{#d91a1a}-6.44\%}$
test_select_nested 0.7163ms 53.8914μs 18.5558 KOps/s 18.4162 KOps/s $\color{#35bf28}+0.76\%$
test_exclude_nested 0.1417ms 0.1108ms 9.0292 KOps/s 9.1340 KOps/s $\color{#d91a1a}-1.15\%$
test_empty[True] 0.3755ms 0.3468ms 2.8835 KOps/s 2.9256 KOps/s $\color{#d91a1a}-1.44\%$
test_empty[False] 2.8420μs 0.9341μs 1.0705 MOps/s 1.0727 MOps/s $\color{#d91a1a}-0.21\%$
test_to 0.1038ms 75.1500μs 13.3067 KOps/s 13.3129 KOps/s $\color{#d91a1a}-0.05\%$
test_to_nonblocking 0.1003ms 60.2981μs 16.5843 KOps/s 16.6898 KOps/s $\color{#d91a1a}-0.63\%$
test_unbind_speed 1.6480ms 0.2600ms 3.8464 KOps/s 3.8510 KOps/s $\color{#d91a1a}-0.12\%$
test_unbind_speed_stack0 0.2917ms 0.2603ms 3.8416 KOps/s 3.8690 KOps/s $\color{#d91a1a}-0.71\%$
test_unbind_speed_stack1 76.2882ms 0.8040ms 1.2438 KOps/s 1.2405 KOps/s $\color{#35bf28}+0.27\%$
test_split 75.9427ms 1.6588ms 602.8575 Ops/s 586.4042 Ops/s $\color{#35bf28}+2.81\%$
test_chunk 76.2678ms 1.6648ms 600.6582 Ops/s 586.6961 Ops/s $\color{#35bf28}+2.38\%$
test_creation[device0] 0.1141ms 59.4966μs 16.8077 KOps/s 17.1147 KOps/s $\color{#d91a1a}-1.79\%$
test_creation_from_tensor 0.1290ms 55.6530μs 17.9685 KOps/s 18.5285 KOps/s $\color{#d91a1a}-3.02\%$
test_add_one[memmap_tensor0] 62.1420μs 6.9331μs 144.2351 KOps/s 143.1787 KOps/s $\color{#35bf28}+0.74\%$
test_contiguous[memmap_tensor0] 24.4810μs 0.6577μs 1.5204 MOps/s 1.5081 MOps/s $\color{#35bf28}+0.81\%$
test_stack[memmap_tensor0] 40.4900μs 4.8623μs 205.6655 KOps/s 211.3186 KOps/s $\color{#d91a1a}-2.68\%$
test_memmaptd_index 1.1458ms 0.2863ms 3.4928 KOps/s 3.4943 KOps/s $\color{#d91a1a}-0.04\%$
test_memmaptd_index_astensor 0.6524ms 0.3544ms 2.8216 KOps/s 2.8089 KOps/s $\color{#35bf28}+0.45\%$
test_memmaptd_index_op 1.2317ms 0.6682ms 1.4966 KOps/s 1.5989 KOps/s $\textbf{\color{#d91a1a}-6.40\%}$
test_serialize_model 0.1826s 0.1108s 9.0258 Ops/s 8.6787 Ops/s $\color{#35bf28}+4.00\%$
test_serialize_model_pickle 1.3468s 1.2356s 0.8093 Ops/s 0.7935 Ops/s $\color{#35bf28}+1.99\%$
test_serialize_weights 0.1799s 0.1080s 9.2630 Ops/s 8.8307 Ops/s $\color{#35bf28}+4.89\%$
test_serialize_weights_returnearly 0.2880s 99.0665ms 10.0942 Ops/s 9.9866 Ops/s $\color{#35bf28}+1.08\%$
test_serialize_weights_pickle 1.3577s 1.2485s 0.8010 Ops/s 0.8008 Ops/s $\color{#35bf28}+0.02\%$
test_reshape_pytree 0.1274ms 25.8847μs 38.6328 KOps/s 39.1947 KOps/s $\color{#d91a1a}-1.43\%$
test_reshape_td 0.1910ms 31.6164μs 31.6292 KOps/s 32.4318 KOps/s $\color{#d91a1a}-2.47\%$
test_view_pytree 0.1158ms 25.3226μs 39.4904 KOps/s 40.1217 KOps/s $\color{#d91a1a}-1.57\%$
test_view_td 0.1153ms 36.7192μs 27.2337 KOps/s 27.8512 KOps/s $\color{#d91a1a}-2.22\%$
test_unbind_pytree 0.2072ms 31.6613μs 31.5843 KOps/s 30.7769 KOps/s $\color{#35bf28}+2.62\%$
test_unbind_td 0.4410ms 40.6313μs 24.6116 KOps/s 24.3480 KOps/s $\color{#35bf28}+1.08\%$
test_split_pytree 64.1810μs 33.3298μs 30.0032 KOps/s 29.5853 KOps/s $\color{#35bf28}+1.41\%$
test_split_td 0.4667ms 38.0610μs 26.2736 KOps/s 25.6235 KOps/s $\color{#35bf28}+2.54\%$
test_add_pytree 0.2401ms 36.9306μs 27.0778 KOps/s 25.8020 KOps/s $\color{#35bf28}+4.94\%$
test_add_td 0.2315ms 49.5802μs 20.1693 KOps/s 19.8491 KOps/s $\color{#35bf28}+1.61\%$
test_distributed 3.9460ms 0.1014ms 9.8581 KOps/s 14.3684 KOps/s $\textbf{\color{#d91a1a}-31.39\%}$
test_tdmodule 36.4810μs 15.3079μs 65.3257 KOps/s 71.8478 KOps/s $\textbf{\color{#d91a1a}-9.08\%}$
test_tdmodule_dispatch 55.7110μs 30.7564μs 32.5136 KOps/s 36.1904 KOps/s $\textbf{\color{#d91a1a}-10.16\%}$
test_tdseq 33.0200μs 17.3044μs 57.7889 KOps/s 61.3675 KOps/s $\textbf{\color{#d91a1a}-5.83\%}$
test_tdseq_dispatch 0.1448ms 33.3282μs 30.0046 KOps/s 32.0679 KOps/s $\textbf{\color{#d91a1a}-6.43\%}$
test_instantiation_functorch 1.7050ms 1.5121ms 661.3246 Ops/s 665.9927 Ops/s $\color{#d91a1a}-0.70\%$
test_instantiation_td 1.5526ms 1.0363ms 965.0127 Ops/s 961.4961 Ops/s $\color{#35bf28}+0.37\%$
test_exec_functorch 0.2216ms 0.1528ms 6.5443 KOps/s 6.4403 KOps/s $\color{#35bf28}+1.61\%$
test_exec_functional_call 0.3384ms 0.1417ms 7.0587 KOps/s 6.9391 KOps/s $\color{#35bf28}+1.72\%$
test_exec_td 0.2654ms 0.1396ms 7.1640 KOps/s 6.8175 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_exec_td_decorator 0.4045ms 0.2115ms 4.7271 KOps/s 4.6041 KOps/s $\color{#35bf28}+2.67\%$
test_vmap_mlp_speed[True-True] 0.7814ms 0.5759ms 1.7363 KOps/s 1.7078 KOps/s $\color{#35bf28}+1.67\%$
test_vmap_mlp_speed[True-False] 0.7727ms 0.5750ms 1.7391 KOps/s 1.7404 KOps/s $\color{#d91a1a}-0.08\%$
test_vmap_mlp_speed[False-True] 0.7210ms 0.5131ms 1.9488 KOps/s 1.9209 KOps/s $\color{#35bf28}+1.45\%$
test_vmap_mlp_speed[False-False] 0.7054ms 0.5006ms 1.9975 KOps/s 1.8898 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_vmap_mlp_speed_decorator[True-True] 0.9317ms 0.6349ms 1.5751 KOps/s 1.5497 KOps/s $\color{#35bf28}+1.64\%$
test_vmap_mlp_speed_decorator[True-False] 0.8205ms 0.6310ms 1.5849 KOps/s 1.5951 KOps/s $\color{#d91a1a}-0.64\%$
test_vmap_mlp_speed_decorator[False-True] 0.7614ms 0.5637ms 1.7740 KOps/s 1.7815 KOps/s $\color{#d91a1a}-0.42\%$
test_vmap_mlp_speed_decorator[False-False] 0.7597ms 0.5611ms 1.7823 KOps/s 1.7822 KOps/s $+0.01\%$
test_vmap_transformer_speed[True-True] 7.7578ms 7.5243ms 132.9030 Ops/s 133.8065 Ops/s $\color{#d91a1a}-0.68\%$
test_vmap_transformer_speed[True-False] 7.7514ms 7.5100ms 133.1567 Ops/s 133.2696 Ops/s $\color{#d91a1a}-0.08\%$
test_vmap_transformer_speed[False-True] 7.8667ms 7.4341ms 134.5154 Ops/s 134.2408 Ops/s $\color{#35bf28}+0.20\%$
test_vmap_transformer_speed[False-False] 7.6877ms 7.4453ms 134.3135 Ops/s 134.1741 Ops/s $\color{#35bf28}+0.10\%$
test_vmap_transformer_speed_decorator[True-True] 18.6834ms 18.3136ms 54.6041 Ops/s 54.8818 Ops/s $\color{#d91a1a}-0.51\%$
test_vmap_transformer_speed_decorator[True-False] 19.0053ms 18.2858ms 54.6872 Ops/s 54.6912 Ops/s $-0.01\%$
test_vmap_transformer_speed_decorator[False-True] 18.5985ms 18.1686ms 55.0399 Ops/s 54.8993 Ops/s $\color{#35bf28}+0.26\%$
test_vmap_transformer_speed_decorator[False-False] 18.4492ms 18.1172ms 55.1962 Ops/s 54.9917 Ops/s $\color{#35bf28}+0.37\%$
test_to_module_speed[True] 1.6913ms 1.5254ms 655.5774 Ops/s 642.6382 Ops/s $\color{#35bf28}+2.01\%$
test_to_module_speed[False] 1.6764ms 1.5050ms 664.4465 Ops/s 663.2418 Ops/s $\color{#35bf28}+0.18\%$
test_tc_init 0.1269ms 26.8571μs 37.2342 KOps/s 45.4551 KOps/s $\textbf{\color{#d91a1a}-18.09\%}$
test_tc_init_nested 0.2513ms 53.4242μs 18.7181 KOps/s 21.9932 KOps/s $\textbf{\color{#d91a1a}-14.89\%}$
test_tc_first_layer_tensor 4.9486μs 0.3598μs 2.7790 MOps/s 2.7678 MOps/s $\color{#35bf28}+0.41\%$
test_tc_first_layer_nontensor 14.6180μs 0.3893μs 2.5686 MOps/s 2.5424 MOps/s $\color{#35bf28}+1.03\%$
test_tc_second_layer_tensor 37.8228μs 0.9645μs 1.0368 MOps/s 1.0324 MOps/s $\color{#35bf28}+0.43\%$
test_tc_second_layer_nontensor 1.7680μs 0.8064μs 1.2401 MOps/s 1.2302 MOps/s $\color{#35bf28}+0.81\%$
test_unbind 0.1015s 8.0588ms 124.0874 Ops/s 153.1255 Ops/s $\textbf{\color{#d91a1a}-18.96\%}$
test_full_like 13.9110ms 13.2293ms 75.5900 Ops/s 75.6114 Ops/s $\color{#d91a1a}-0.03\%$
test_zeros_like 8.0645ms 7.7822ms 128.4985 Ops/s 129.0711 Ops/s $\color{#d91a1a}-0.44\%$
test_ones_like 8.1298ms 7.8562ms 127.2886 Ops/s 128.8224 Ops/s $\color{#d91a1a}-1.19\%$
test_clone 9.7008ms 9.4446ms 105.8801 Ops/s 106.1344 Ops/s $\color{#d91a1a}-0.24\%$
test_squeeze 86.5020μs 11.0140μs 90.7939 KOps/s 91.4108 KOps/s $\color{#d91a1a}-0.67\%$
test_unsqueeze 0.2550ms 51.6083μs 19.3767 KOps/s 18.6111 KOps/s $\color{#35bf28}+4.11\%$
test_split 0.1383ms 95.1524μs 10.5095 KOps/s 9.9705 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_permute 0.3116ms 0.1090ms 9.1778 KOps/s 8.6219 KOps/s $\textbf{\color{#35bf28}+6.45\%}$
test_stack 27.7835ms 27.5204ms 36.3367 Ops/s 36.3577 Ops/s $\color{#d91a1a}-0.06\%$
test_cat 28.9422ms 27.4390ms 36.4445 Ops/s 36.4738 Ops/s $\color{#d91a1a}-0.08\%$

@vmoens added the enhancement label Jun 25, 2024
@vmoens merged commit 9f942ba into main Jun 25, 2024
41 of 43 checks passed
@vmoens deleted the from-pytree branch October 21, 2024 14:02