[BugFix] Fix non-blocking arg in copy_ #590

Merged
merged 1 commit into from
Dec 4, 2023

@vmoens (Contributor) commented Dec 4, 2023

No description provided.

@facebook-github-bot added the CLA Signed label Dec 4, 2023
@vmoens marked this pull request as ready for review December 4, 2023 13:54
@vmoens merged commit a25b22b into main Dec 4, 2023
25 of 33 checks passed
@vmoens added the bug label Dec 4, 2023
@vmoens deleted the fix-copy branch December 4, 2023 13:54

github-actions bot commented Dec 4, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 113. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}9$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 28.5030μs | 16.1078μs | 62.0817 KOps/s | 64.6332 KOps/s | $\color{#d91a1a}-3.95\%$ |
| test_plain_set_stack_nested | 0.2163ms | 0.1424ms | 7.0240 KOps/s | 7.0069 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_plain_set_nested_inplace | 53.3490μs | 17.9520μs | 55.7040 KOps/s | 56.3685 KOps/s | $\color{#d91a1a}-1.18\%$ |
| test_plain_set_stack_nested_inplace | 0.2435ms | 0.1779ms | 5.6216 KOps/s | 5.7137 KOps/s | $\color{#d91a1a}-1.61\%$ |
| test_items | 45.6250μs | 2.4784μs | 403.4862 KOps/s | 371.5470 KOps/s | $\textbf{\color{#35bf28}+8.60\%}$ |
| test_items_nested | 0.3386ms | 0.2678ms | 3.7347 KOps/s | 3.7268 KOps/s | $\color{#35bf28}+0.21\%$ |
| test_items_nested_locked | 0.3307ms | 0.2703ms | 3.6992 KOps/s | 3.7044 KOps/s | $\color{#d91a1a}-0.14\%$ |
| test_items_nested_leaf | 0.7612ms | 0.1670ms | 5.9890 KOps/s | 5.9480 KOps/s | $\color{#35bf28}+0.69\%$ |
| test_items_stack_nested | 3.9993ms | 1.5175ms | 658.9813 Ops/s | 685.3528 Ops/s | $\color{#d91a1a}-3.85\%$ |
| test_items_stack_nested_leaf | 2.1408ms | 1.3730ms | 728.3579 Ops/s | 754.1273 Ops/s | $\color{#d91a1a}-3.42\%$ |
| test_items_stack_nested_locked | 1.9555ms | 0.7766ms | 1.2876 KOps/s | 1.2978 KOps/s | $\color{#d91a1a}-0.78\%$ |
| test_keys | 51.3150μs | 3.8763μs | 257.9754 KOps/s | 258.7448 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_keys_nested | 0.5399ms | 0.1397ms | 7.1602 KOps/s | 6.4841 KOps/s | $\textbf{\color{#35bf28}+10.43\%}$ |
| test_keys_nested_locked | 0.2706ms | 0.1386ms | 7.2140 KOps/s | 7.0163 KOps/s | $\color{#35bf28}+2.82\%$ |
| test_keys_nested_leaf | 0.3921ms | 0.1392ms | 7.1838 KOps/s | 7.1254 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_keys_stack_nested | 1.6015ms | 1.4108ms | 708.8027 Ops/s | 710.7868 Ops/s | $\color{#d91a1a}-0.28\%$ |
| test_keys_stack_nested_leaf | 2.5342ms | 1.4174ms | 705.5386 Ops/s | 711.5283 Ops/s | $\color{#d91a1a}-0.84\%$ |
| test_keys_stack_nested_locked | 0.7899ms | 0.6782ms | 1.4744 KOps/s | 1.4470 KOps/s | $\color{#35bf28}+1.89\%$ |
| test_values | 12.6185μs | 1.1515μs | 868.4328 KOps/s | 837.4843 KOps/s | $\color{#35bf28}+3.70\%$ |
| test_values_nested | 0.1234ms | 48.6959μs | 20.5356 KOps/s | 20.4750 KOps/s | $\color{#35bf28}+0.30\%$ |
| test_values_nested_locked | 0.1202ms | 49.6861μs | 20.1264 KOps/s | 20.7065 KOps/s | $\color{#d91a1a}-2.80\%$ |
| test_values_nested_leaf | 81.9330μs | 43.8930μs | 22.7827 KOps/s | 22.9146 KOps/s | $\color{#d91a1a}-0.58\%$ |
| test_values_stack_nested | 1.3496ms | 1.2151ms | 822.9954 Ops/s | 837.2354 Ops/s | $\color{#d91a1a}-1.70\%$ |
| test_values_stack_nested_leaf | 2.1042ms | 1.2120ms | 825.0740 Ops/s | 844.7286 Ops/s | $\color{#d91a1a}-2.33\%$ |
| test_values_stack_nested_locked | 0.8929ms | 0.5198ms | 1.9240 KOps/s | 1.9321 KOps/s | $\color{#d91a1a}-0.42\%$ |
| test_membership | 19.2560μs | 1.3597μs | 735.4564 KOps/s | 740.7804 KOps/s | $\color{#d91a1a}-0.72\%$ |
| test_membership_nested | 25.2770μs | 2.8136μs | 355.4226 KOps/s | 352.7159 KOps/s | $\color{#35bf28}+0.77\%$ |
| test_membership_nested_leaf | 28.8930μs | 2.8666μs | 348.8483 KOps/s | 351.7047 KOps/s | $\color{#d91a1a}-0.81\%$ |
| test_membership_stacked_nested | 56.7160μs | 11.7955μs | 84.7782 KOps/s | 85.7627 KOps/s | $\color{#d91a1a}-1.15\%$ |
| test_membership_stacked_nested_leaf | 0.1967ms | 12.5982μs | 79.3761 KOps/s | 86.2923 KOps/s | $\textbf{\color{#d91a1a}-8.01\%}$ |
| test_membership_nested_last | 32.6710μs | 6.0283μs | 165.8839 KOps/s | 165.3198 KOps/s | $\color{#35bf28}+0.34\%$ |
| test_membership_nested_leaf_last | 44.7820μs | 6.0133μs | 166.2990 KOps/s | 172.4669 KOps/s | $\color{#d91a1a}-3.58\%$ |
| test_membership_stacked_nested_last | 0.3616ms | 0.1678ms | 5.9584 KOps/s | 6.0669 KOps/s | $\color{#d91a1a}-1.79\%$ |
| test_membership_stacked_nested_leaf_last | 79.9090μs | 14.0537μs | 71.1558 KOps/s | 72.9510 KOps/s | $\color{#d91a1a}-2.46\%$ |
| test_nested_getleaf | 60.2530μs | 10.7118μs | 93.3550 KOps/s | 94.7822 KOps/s | $\color{#d91a1a}-1.51\%$ |
| test_nested_get | 58.8600μs | 10.1370μs | 98.6482 KOps/s | 100.8677 KOps/s | $\color{#d91a1a}-2.20\%$ |
| test_stacked_getleaf | 1.0929ms | 0.6541ms | 1.5288 KOps/s | 1.5751 KOps/s | $\color{#d91a1a}-2.94\%$ |
| test_stacked_get | 4.8093ms | 0.6264ms | 1.5963 KOps/s | 1.6586 KOps/s | $\color{#d91a1a}-3.76\%$ |
| test_nested_getitemleaf | 63.5480μs | 10.8960μs | 91.7765 KOps/s | 95.2524 KOps/s | $\color{#d91a1a}-3.65\%$ |
| test_nested_getitem | 57.3270μs | 10.2876μs | 97.2045 KOps/s | 99.4481 KOps/s | $\color{#d91a1a}-2.26\%$ |
| test_stacked_getitemleaf | 1.2039ms | 0.6611ms | 1.5125 KOps/s | 1.5736 KOps/s | $\color{#d91a1a}-3.88\%$ |
| test_stacked_getitem | 0.9182ms | 0.6229ms | 1.6053 KOps/s | 1.6589 KOps/s | $\color{#d91a1a}-3.23\%$ |
| test_lock_nested | 72.6735ms | 0.6442ms | 1.5523 KOps/s | 1.7842 KOps/s | $\textbf{\color{#d91a1a}-13.00\%}$ |
| test_lock_stack_nested | 8.9668ms | 5.3424ms | 187.1809 Ops/s | 193.8497 Ops/s | $\color{#d91a1a}-3.44\%$ |
| test_unlock_nested | 1.0260ms | 0.4451ms | 2.2469 KOps/s | 2.2335 KOps/s | $\color{#35bf28}+0.60\%$ |
| test_unlock_stack_nested | 86.4107ms | 7.4292ms | 134.6045 Ops/s | 129.2356 Ops/s | $\color{#35bf28}+4.15\%$ |
| test_flatten_speed | 0.3590ms | 0.2672ms | 3.7426 KOps/s | 3.7685 KOps/s | $\color{#d91a1a}-0.69\%$ |
| test_unflatten_speed | 0.5915ms | 0.4626ms | 2.1616 KOps/s | 2.2228 KOps/s | $\color{#d91a1a}-2.75\%$ |
| test_common_ops | 3.0959ms | 0.6862ms | 1.4572 KOps/s | 1.4931 KOps/s | $\color{#d91a1a}-2.40\%$ |
| test_creation | 47.6990μs | 2.5015μs | 399.7563 KOps/s | 399.6632 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_creation_empty | 22.0920μs | 8.2318μs | 121.4802 KOps/s | 123.9570 KOps/s | $\color{#d91a1a}-2.00\%$ |
| test_creation_nested_1 | 56.4050μs | 11.6667μs | 85.7142 KOps/s | 88.5428 KOps/s | $\color{#d91a1a}-3.19\%$ |
| test_creation_nested_2 | 46.3570μs | 15.2465μs | 65.5889 KOps/s | 67.1439 KOps/s | $\color{#d91a1a}-2.32\%$ |
| test_clone | 0.1421ms | 13.5992μs | 73.5340 KOps/s | 73.5499 KOps/s | $\color{#d91a1a}-0.02\%$ |
| test_getitem[int] | 43.9320μs | 13.1331μs | 76.1438 KOps/s | 75.3394 KOps/s | $\color{#35bf28}+1.07\%$ |
| test_getitem[slice_int] | 93.1760μs | 25.1305μs | 39.7922 KOps/s | 40.1697 KOps/s | $\color{#d91a1a}-0.94\%$ |
| test_getitem[range] | 93.4040μs | 45.1757μs | 22.1358 KOps/s | 22.1209 KOps/s | $\color{#35bf28}+0.07\%$ |
| test_getitem[tuple] | 77.6420μs | 20.3202μs | 49.2122 KOps/s | 48.8001 KOps/s | $\color{#35bf28}+0.84\%$ |
| test_getitem[list] | 83.9060μs | 40.5275μs | 24.6746 KOps/s | 24.5196 KOps/s | $\color{#35bf28}+0.63\%$ |
| test_setitem_dim[int] | 84.1370μs | 27.3383μs | 36.5787 KOps/s | 35.5063 KOps/s | $\color{#35bf28}+3.02\%$ |
| test_setitem_dim[slice_int] | 88.6560μs | 50.9024μs | 19.6455 KOps/s | 19.2063 KOps/s | $\color{#35bf28}+2.29\%$ |
| test_setitem_dim[range] | 0.1481ms | 72.0636μs | 13.8766 KOps/s | 13.9180 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_setitem_dim[tuple] | 75.1700μs | 40.0877μs | 24.9453 KOps/s | 23.6639 KOps/s | $\textbf{\color{#35bf28}+5.42\%}$ |
| test_setitem | 0.1954ms | 18.5877μs | 53.7991 KOps/s | 54.9012 KOps/s | $\color{#d91a1a}-2.01\%$ |
| test_set | 0.1999ms | 17.7755μs | 56.2573 KOps/s | 56.6286 KOps/s | $\color{#d91a1a}-0.66\%$ |
| test_set_shared | 2.1245ms | 0.1423ms | 7.0275 KOps/s | 6.9475 KOps/s | $\color{#35bf28}+1.15\%$ |
| test_update | 0.1727ms | 19.3236μs | 51.7501 KOps/s | 53.8176 KOps/s | $\color{#d91a1a}-3.84\%$ |
| test_update_nested | 0.1612ms | 26.9197μs | 37.1475 KOps/s | 38.3735 KOps/s | $\color{#d91a1a}-3.19\%$ |
| test_set_nested | 0.1481ms | 19.5129μs | 51.2480 KOps/s | 51.7214 KOps/s | $\color{#d91a1a}-0.92\%$ |
| test_set_nested_new | 0.1640ms | 25.5944μs | 39.0710 KOps/s | 41.3357 KOps/s | $\textbf{\color{#d91a1a}-5.48\%}$ |
| test_select | 1.0166ms | 53.6850μs | 18.6272 KOps/s | 20.4372 KOps/s | $\textbf{\color{#d91a1a}-8.86\%}$ |
| test_unbind_speed | 0.8637ms | 0.3761ms | 2.6585 KOps/s | 2.7138 KOps/s | $\color{#d91a1a}-2.04\%$ |
| test_unbind_speed_stack0 | 73.2662ms | 5.1454ms | 194.3469 Ops/s | 208.0051 Ops/s | $\textbf{\color{#d91a1a}-6.57\%}$ |
| test_unbind_speed_stack1 | 3.0181μs | 0.6867μs | 1.4563 MOps/s | 1.5835 MOps/s | $\textbf{\color{#d91a1a}-8.03\%}$ |
| test_split | 58.8216ms | 1.7738ms | 563.7708 Ops/s | 555.7244 Ops/s | $\color{#35bf28}+1.45\%$ |
| test_chunk | 1.8019ms | 1.6513ms | 605.5954 Ops/s | 605.0634 Ops/s | $\color{#35bf28}+0.09\%$ |
| test_creation[device0] | 60.9090ms | 0.3402ms | 2.9395 KOps/s | 3.4080 KOps/s | $\textbf{\color{#d91a1a}-13.75\%}$ |
| test_creation_from_tensor | 0.4243ms | 0.3235ms | 3.0908 KOps/s | 3.0236 KOps/s | $\color{#35bf28}+2.22\%$ |
| test_add_one[memmap_tensor0] | 0.4096ms | 25.7550μs | 38.8274 KOps/s | 39.2666 KOps/s | $\color{#d91a1a}-1.12\%$ |
| test_contiguous[memmap_tensor0] | 40.3760μs | 5.8147μs | 171.9780 KOps/s | 173.1877 KOps/s | $\color{#d91a1a}-0.70\%$ |
| test_stack[memmap_tensor0] | 93.2740μs | 19.1256μs | 52.2861 KOps/s | 53.2546 KOps/s | $\color{#d91a1a}-1.82\%$ |
| test_memmaptd_index | 0.3355ms | 0.2047ms | 4.8847 KOps/s | 4.9357 KOps/s | $\color{#d91a1a}-1.03\%$ |
| test_memmaptd_index_astensor | 0.4503ms | 0.2641ms | 3.7864 KOps/s | 3.8063 KOps/s | $\color{#d91a1a}-0.52\%$ |
| test_memmaptd_index_op | 1.6529ms | 0.5218ms | 1.9166 KOps/s | 2.0283 KOps/s | $\textbf{\color{#d91a1a}-5.51\%}$ |
| test_reshape_pytree | 94.2560μs | 23.5367μs | 42.4868 KOps/s | 43.7955 KOps/s | $\color{#d91a1a}-2.99\%$ |
| test_reshape_td | 70.7520μs | 32.5638μs | 30.7089 KOps/s | 31.2885 KOps/s | $\color{#d91a1a}-1.85\%$ |
| test_view_pytree | 77.7360μs | 23.2349μs | 43.0387 KOps/s | 43.3007 KOps/s | $\color{#d91a1a}-0.60\%$ |
| test_view_td | 26.6200μs | 4.9676μs | 201.3035 KOps/s | 202.0544 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_unbind_pytree | 76.7130μs | 26.5358μs | 37.6850 KOps/s | 37.8642 KOps/s | $\color{#d91a1a}-0.47\%$ |
| test_unbind_td | 0.1266ms | 59.7480μs | 16.7370 KOps/s | 16.9144 KOps/s | $\color{#d91a1a}-1.05\%$ |
| test_split_pytree | 93.5270μs | 26.3044μs | 38.0165 KOps/s | 37.7114 KOps/s | $\color{#35bf28}+0.81\%$ |
| test_split_td | 0.1141ms | 46.6042μs | 21.4573 KOps/s | 21.8685 KOps/s | $\color{#d91a1a}-1.88\%$ |
| test_add_pytree | 89.3860μs | 31.8475μs | 31.3997 KOps/s | 31.5296 KOps/s | $\color{#d91a1a}-0.41\%$ |
| test_add_td | 0.1112ms | 45.4573μs | 21.9987 KOps/s | 22.5008 KOps/s | $\color{#d91a1a}-2.23\%$ |
| test_distributed | 26.5500μs | 5.9820μs | 167.1692 KOps/s | 165.1314 KOps/s | $\color{#35bf28}+1.23\%$ |
| test_tdmodule | 1.5496ms | 22.6542μs | 44.1420 KOps/s | 45.8679 KOps/s | $\color{#d91a1a}-3.76\%$ |
| test_tdmodule_dispatch | 0.2245ms | 39.2703μs | 25.4645 KOps/s | 26.7646 KOps/s | $\color{#d91a1a}-4.86\%$ |
| test_tdseq | 0.1214ms | 24.3680μs | 41.0374 KOps/s | 42.2046 KOps/s | $\color{#d91a1a}-2.77\%$ |
| test_tdseq_dispatch | 0.4982ms | 43.6218μs | 22.9243 KOps/s | 23.5773 KOps/s | $\color{#d91a1a}-2.77\%$ |
| test_instantiation_functorch | 2.8794ms | 1.3096ms | 763.6179 Ops/s | 770.2210 Ops/s | $\color{#d91a1a}-0.86\%$ |
| test_instantiation_td | 1.8817ms | 1.0479ms | 954.3015 Ops/s | 980.1585 Ops/s | $\color{#d91a1a}-2.64\%$ |
| test_exec_functorch | 0.2620ms | 0.1619ms | 6.1760 KOps/s | 6.3238 KOps/s | $\color{#d91a1a}-2.34\%$ |
| test_exec_functional_call | 0.2395ms | 0.1505ms | 6.6451 KOps/s | 6.8597 KOps/s | $\color{#d91a1a}-3.13\%$ |
| test_exec_td | 0.2317ms | 0.1437ms | 6.9607 KOps/s | 6.9951 KOps/s | $\color{#d91a1a}-0.49\%$ |
| test_exec_td_decorator | 0.8031ms | 0.1776ms | 5.6297 KOps/s | 5.7024 KOps/s | $\color{#d91a1a}-1.27\%$ |
| test_vmap_mlp_speed[True-True] | 1.2382ms | 0.8796ms | 1.1369 KOps/s | 1.1154 KOps/s | $\color{#35bf28}+1.93\%$ |
| test_vmap_mlp_speed[True-False] | 0.5834ms | 0.4610ms | 2.1692 KOps/s | 2.1103 KOps/s | $\color{#35bf28}+2.79\%$ |
| test_vmap_mlp_speed[False-True] | 1.1067ms | 0.7614ms | 1.3133 KOps/s | 1.2703 KOps/s | $\color{#35bf28}+3.38\%$ |
| test_vmap_mlp_speed[False-False] | 0.6503ms | 0.3812ms | 2.6236 KOps/s | 2.5658 KOps/s | $\color{#35bf28}+2.25\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 2.6656ms | 1.7569ms | 569.1762 Ops/s | 547.7805 Ops/s | $\color{#35bf28}+3.91\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 81.8769ms | 0.5618ms | 1.7800 KOps/s | 1.9125 KOps/s | $\textbf{\color{#d91a1a}-6.93\%}$ |
| test_vmap_mlp_speed_decorator[False-True] | 1.9657ms | 1.4673ms | 681.5343 Ops/s | 665.6422 Ops/s | $\color{#35bf28}+2.39\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 1.0893ms | 0.3964ms | 2.5229 KOps/s | 2.4793 KOps/s | $\color{#35bf28}+1.76\%$ |
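
The columns in the CPU results above appear to be related as follows: Ops is the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. The bold entries, and the Improved/Worsened counts in the summary line, are consistent with a threshold of ±5%, although the bot's actual rule is not stated in this thread. The sketch below checks that reading against three rows of the table; the function names and the `threshold` parameter are illustrative assumptions, not part of the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR's throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred from the table, not documented."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) copied from the CPU table; units cancel in the ratio.
rows = {
    "test_plain_set_nested": (62.0817, 64.6332),
    "test_items": (403.4862, 371.5470),
    "test_lock_nested": (1.5523, 1.7842),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, most rows in both tables fall inside the ±5% band and are not counted as either improved or worsened.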

github-actions bot commented Dec 4, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}2$. Worsened: $\large\color{#d91a1a}19$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 25.4800μs | 12.8306μs | 77.9388 KOps/s | 78.7731 KOps/s | $\color{#d91a1a}-1.06\%$ |
| test_plain_set_stack_nested | 0.1473ms | 0.1169ms | 8.5522 KOps/s | 8.3297 KOps/s | $\color{#35bf28}+2.67\%$ |
| test_plain_set_nested_inplace | 29.9100μs | 14.1570μs | 70.6362 KOps/s | 71.0977 KOps/s | $\color{#d91a1a}-0.65\%$ |
| test_plain_set_stack_nested_inplace | 0.1826ms | 0.1462ms | 6.8407 KOps/s | 6.9111 KOps/s | $\color{#d91a1a}-1.02\%$ |
| test_items | 26.0300μs | 4.6688μs | 214.1884 KOps/s | 210.3381 KOps/s | $\color{#35bf28}+1.83\%$ |
| test_items_nested | 0.3924ms | 0.3372ms | 2.9658 KOps/s | 2.9588 KOps/s | $\color{#35bf28}+0.24\%$ |
| test_items_nested_locked | 0.3916ms | 0.3386ms | 2.9532 KOps/s | 2.9321 KOps/s | $\color{#35bf28}+0.72\%$ |
| test_items_nested_leaf | 0.2450ms | 0.1976ms | 5.0603 KOps/s | 5.0429 KOps/s | $\color{#35bf28}+0.35\%$ |
| test_items_stack_nested | 1.6304ms | 1.5286ms | 654.1915 Ops/s | 662.5899 Ops/s | $\color{#d91a1a}-1.27\%$ |
| test_items_stack_nested_leaf | 1.4501ms | 1.3459ms | 742.9795 Ops/s | 746.5671 Ops/s | $\color{#d91a1a}-0.48\%$ |
| test_items_stack_nested_locked | 2.1607ms | 0.8510ms | 1.1751 KOps/s | 1.1864 KOps/s | $\color{#d91a1a}-0.95\%$ |
| test_keys | 18.8210μs | 4.6037μs | 217.2172 KOps/s | 219.0210 KOps/s | $\color{#d91a1a}-0.82\%$ |
| test_keys_nested | 0.4830ms | 90.6809μs | 11.0277 KOps/s | 11.1203 KOps/s | $\color{#d91a1a}-0.83\%$ |
| test_keys_nested_locked | 0.1179ms | 90.1904μs | 11.0877 KOps/s | 11.1463 KOps/s | $\color{#d91a1a}-0.53\%$ |
| test_keys_nested_leaf | 43.9641ms | 87.0321μs | 11.4900 KOps/s | 12.2189 KOps/s | $\textbf{\color{#d91a1a}-5.96\%}$ |
| test_keys_stack_nested | 1.4368ms | 1.3319ms | 750.8090 Ops/s | 773.6256 Ops/s | $\color{#d91a1a}-2.95\%$ |
| test_keys_stack_nested_leaf | 1.4217ms | 1.3195ms | 757.8659 Ops/s | 773.3090 Ops/s | $\color{#d91a1a}-2.00\%$ |
| test_keys_stack_nested_locked | 0.7080ms | 0.6549ms | 1.5270 KOps/s | 1.5700 KOps/s | $\color{#d91a1a}-2.74\%$ |
| test_values | 6.7300μs | 1.8866μs | 530.0664 KOps/s | 522.8253 KOps/s | $\color{#35bf28}+1.38\%$ |
| test_values_nested | 57.5610μs | 42.7080μs | 23.4148 KOps/s | 22.7092 KOps/s | $\color{#35bf28}+3.11\%$ |
| test_values_nested_locked | 62.9810μs | 44.9519μs | 22.2460 KOps/s | 21.3124 KOps/s | $\color{#35bf28}+4.38\%$ |
| test_values_nested_leaf | 59.2610μs | 37.2939μs | 26.8140 KOps/s | 26.1867 KOps/s | $\color{#35bf28}+2.40\%$ |
| test_values_stack_nested | 1.2655ms | 1.1646ms | 858.6580 Ops/s | 868.9868 Ops/s | $\color{#d91a1a}-1.19\%$ |
| test_values_stack_nested_leaf | 1.2753ms | 1.1602ms | 861.9155 Ops/s | 883.3477 Ops/s | $\color{#d91a1a}-2.43\%$ |
| test_values_stack_nested_locked | 0.5926ms | 0.5302ms | 1.8862 KOps/s | 1.9564 KOps/s | $\color{#d91a1a}-3.59\%$ |
| test_membership | 3.6500μs | 0.9391μs | 1.0648 MOps/s | 1.0464 MOps/s | $\color{#35bf28}+1.76\%$ |
| test_membership_nested | 23.7600μs | 2.1583μs | 463.3240 KOps/s | 459.1089 KOps/s | $\color{#35bf28}+0.92\%$ |
| test_membership_nested_leaf | 10.1305μs | 2.0983μs | 476.5833 KOps/s | 476.9854 KOps/s | $\color{#d91a1a}-0.08\%$ |
| test_membership_stacked_nested | 30.2810μs | 11.0465μs | 90.5264 KOps/s | 90.1905 KOps/s | $\color{#35bf28}+0.37\%$ |
| test_membership_stacked_nested_leaf | 35.0110μs | 10.9795μs | 91.0792 KOps/s | 90.2692 KOps/s | $\color{#35bf28}+0.90\%$ |
| test_membership_nested_last | 18.0710μs | 4.5537μs | 219.6037 KOps/s | 220.8910 KOps/s | $\color{#d91a1a}-0.58\%$ |
| test_membership_nested_leaf_last | 19.8310μs | 4.5839μs | 218.1530 KOps/s | 220.6416 KOps/s | $\color{#d91a1a}-1.13\%$ |
| test_membership_stacked_nested_last | 0.1740ms | 0.1345ms | 7.4353 KOps/s | 7.4437 KOps/s | $\color{#d91a1a}-0.11\%$ |
| test_membership_stacked_nested_leaf_last | 31.4300μs | 13.0207μs | 76.8006 KOps/s | 77.4676 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_nested_getleaf | 23.0800μs | 8.3920μs | 119.1612 KOps/s | 118.8625 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_nested_get | 26.6000μs | 7.9119μs | 126.3926 KOps/s | 126.3992 KOps/s | $-0.01\%$ |
| test_stacked_getleaf | 0.6486ms | 0.5814ms | 1.7201 KOps/s | 1.7253 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_stacked_get | 0.5900ms | 0.5252ms | 1.9042 KOps/s | 1.8527 KOps/s | $\color{#35bf28}+2.78\%$ |
| test_nested_getitemleaf | 22.6110μs | 8.4488μs | 118.3594 KOps/s | 118.4758 KOps/s | $\color{#d91a1a}-0.10\%$ |
| test_nested_getitem | 21.9600μs | 7.9492μs | 125.7989 KOps/s | 125.0643 KOps/s | $\color{#35bf28}+0.59\%$ |
| test_stacked_getitemleaf | 0.6404ms | 0.5705ms | 1.7527 KOps/s | 1.7404 KOps/s | $\color{#35bf28}+0.71\%$ |
| test_stacked_getitem | 0.6076ms | 0.5447ms | 1.8360 KOps/s | 1.8499 KOps/s | $\color{#d91a1a}-0.75\%$ |
| test_lock_nested | 3.2293ms | 0.5651ms | 1.7695 KOps/s | 1.8107 KOps/s | $\color{#d91a1a}-2.27\%$ |
| test_lock_stack_nested | 81.9023ms | 7.4007ms | 135.1219 Ops/s | 137.4924 Ops/s | $\color{#d91a1a}-1.72\%$ |
| test_unlock_nested | 2.4230ms | 0.4434ms | 2.2551 KOps/s | 2.3240 KOps/s | $\color{#d91a1a}-2.97\%$ |
| test_unlock_stack_nested | 67.8373ms | 6.4354ms | 155.3915 Ops/s | 162.6617 Ops/s | $\color{#d91a1a}-4.47\%$ |
| test_flatten_speed | 0.2382ms | 0.1874ms | 5.3359 KOps/s | 5.3636 KOps/s | $\color{#d91a1a}-0.52\%$ |
| test_unflatten_speed | 0.4188ms | 0.3643ms | 2.7452 KOps/s | 2.7787 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_common_ops | 1.1720ms | 0.6355ms | 1.5736 KOps/s | 1.6173 KOps/s | $\color{#d91a1a}-2.70\%$ |
| test_creation | 32.3800μs | 2.1362μs | 468.1209 KOps/s | 474.7009 KOps/s | $\color{#d91a1a}-1.39\%$ |
| test_creation_empty | 20.4710μs | 7.1181μs | 140.4860 KOps/s | 140.1415 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_creation_nested_1 | 23.9300μs | 9.5247μs | 104.9900 KOps/s | 105.4605 KOps/s | $\color{#d91a1a}-0.45\%$ |
| test_creation_nested_2 | 37.4600μs | 12.0783μs | 82.7930 KOps/s | 82.8462 KOps/s | $\color{#d91a1a}-0.06\%$ |
| test_clone | 92.7020μs | 15.0847μs | 66.2921 KOps/s | 68.2404 KOps/s | $\color{#d91a1a}-2.85\%$ |
| test_getitem[int] | 27.1000μs | 12.6511μs | 79.0446 KOps/s | 79.5639 KOps/s | $\color{#d91a1a}-0.65\%$ |
| test_getitem[slice_int] | 50.2410μs | 25.0973μs | 39.8450 KOps/s | 42.2764 KOps/s | $\textbf{\color{#d91a1a}-5.75\%}$ |
| test_getitem[range] | 67.7610μs | 43.5909μs | 22.9406 KOps/s | 24.2779 KOps/s | $\textbf{\color{#d91a1a}-5.51\%}$ |
| test_getitem[tuple] | 40.2210μs | 20.9360μs | 47.7646 KOps/s | 48.5056 KOps/s | $\color{#d91a1a}-1.53\%$ |
| test_getitem[list] | 0.3130ms | 39.7386μs | 25.1644 KOps/s | 26.1616 KOps/s | $\color{#d91a1a}-3.81\%$ |
| test_setitem_dim[int] | 46.2910μs | 27.6330μs | 36.1886 KOps/s | 37.7237 KOps/s | $\color{#d91a1a}-4.07\%$ |
| test_setitem_dim[slice_int] | 68.0010μs | 49.1357μs | 20.3518 KOps/s | 21.4540 KOps/s | $\textbf{\color{#d91a1a}-5.14\%}$ |
| test_setitem_dim[range] | 90.6610μs | 66.9909μs | 14.9274 KOps/s | 15.7620 KOps/s | $\textbf{\color{#d91a1a}-5.29\%}$ |
| test_setitem_dim[tuple] | 59.4910μs | 42.3692μs | 23.6020 KOps/s | 25.0521 KOps/s | $\textbf{\color{#d91a1a}-5.79\%}$ |
| test_setitem | 0.1078ms | 19.8063μs | 50.4891 KOps/s | 54.1196 KOps/s | $\textbf{\color{#d91a1a}-6.71\%}$ |
| test_set | 85.6410μs | 18.6904μs | 53.5034 KOps/s | 54.2346 KOps/s | $\color{#d91a1a}-1.35\%$ |
| test_set_shared | 2.4812ms | 0.1103ms | 9.0673 KOps/s | 8.4759 KOps/s | $\textbf{\color{#35bf28}+6.98\%}$ |
| test_update | 95.6710μs | 20.4499μs | 48.9001 KOps/s | 50.9149 KOps/s | $\color{#d91a1a}-3.96\%$ |
| test_update_nested | 92.8220μs | 26.6439μs | 37.5320 KOps/s | 37.3758 KOps/s | $\color{#35bf28}+0.42\%$ |
| test_set_nested | 93.0110μs | 20.4245μs | 48.9607 KOps/s | 50.3404 KOps/s | $\color{#d91a1a}-2.74\%$ |
| test_set_nested_new | 0.1268ms | 24.7211μs | 40.4512 KOps/s | 41.8311 KOps/s | $\color{#d91a1a}-3.30\%$ |
| test_select | 0.1129ms | 50.0465μs | 19.9814 KOps/s | 21.5182 KOps/s | $\textbf{\color{#d91a1a}-7.14\%}$ |
| test_to | 78.5910μs | 56.4286μs | 17.7215 KOps/s | 18.6708 KOps/s | $\textbf{\color{#d91a1a}-5.08\%}$ |
| test_to_nonblocking | 67.4210μs | 37.6170μs | 26.5838 KOps/s | 27.2182 KOps/s | $\color{#d91a1a}-2.33\%$ |
| test_unbind_speed | 0.4487ms | 0.3689ms | 2.7108 KOps/s | 2.7742 KOps/s | $\color{#d91a1a}-2.29\%$ |
| test_unbind_speed_stack0 | 63.5311ms | 4.5284ms | 220.8280 Ops/s | 250.7490 Ops/s | $\textbf{\color{#d91a1a}-11.93\%}$ |
| test_unbind_speed_stack1 | 1.5300μs | 0.5235μs | 1.9103 MOps/s | 1.8866 MOps/s | $\color{#35bf28}+1.26\%$ |
| test_split | 54.2051ms | 1.9305ms | 517.9913 Ops/s | 564.1756 Ops/s | $\textbf{\color{#d91a1a}-8.19\%}$ |
| test_chunk | 1.9101ms | 1.8194ms | 549.6223 Ops/s | 568.7057 Ops/s | $\color{#d91a1a}-3.36\%$ |
| test_creation[device0] | 0.5733ms | 0.3108ms | 3.2173 KOps/s | 3.2507 KOps/s | $\color{#d91a1a}-1.03\%$ |
| test_creation[device1] | 0.7028ms | 0.3145ms | 3.1801 KOps/s | 3.2109 KOps/s | $\color{#d91a1a}-0.96\%$ |
| test_creation_from_tensor | 0.5650ms | 0.3389ms | 2.9510 KOps/s | 2.7460 KOps/s | $\textbf{\color{#35bf28}+7.47\%}$ |
| test_add_one[memmap_tensor0] | 0.2746ms | 26.6706μs | 37.4945 KOps/s | 39.5285 KOps/s | $\textbf{\color{#d91a1a}-5.15\%}$ |
| test_add_one[memmap_tensor1] | 0.2137ms | 77.6819μs | 12.8730 KOps/s | 13.2468 KOps/s | $\color{#d91a1a}-2.82\%$ |
| test_contiguous[memmap_tensor0] | 31.3800μs | 6.4431μs | 155.2052 KOps/s | 164.9268 KOps/s | $\textbf{\color{#d91a1a}-5.89\%}$ |
| test_contiguous[memmap_tensor1] | 52.2500μs | 22.8859μs | 43.6950 KOps/s | 44.3196 KOps/s | $\color{#d91a1a}-1.41\%$ |
| test_stack[memmap_tensor0] | 51.3200μs | 21.1798μs | 47.2148 KOps/s | 50.1568 KOps/s | $\textbf{\color{#d91a1a}-5.87\%}$ |
| test_stack[memmap_tensor1] | 0.1654ms | 77.6374μs | 12.8804 KOps/s | 13.2112 KOps/s | $\color{#d91a1a}-2.50\%$ |
| test_memmaptd_index | 0.3174ms | 0.2568ms | 3.8943 KOps/s | 4.1426 KOps/s | $\textbf{\color{#d91a1a}-5.99\%}$ |
| test_memmaptd_index_astensor | 0.3743ms | 0.3125ms | 3.1996 KOps/s | 3.3556 KOps/s | $\color{#d91a1a}-4.65\%$ |
| test_memmaptd_index_op | 0.6961ms | 0.6166ms | 1.6217 KOps/s | 1.7177 KOps/s | $\textbf{\color{#d91a1a}-5.59\%}$ |
| test_reshape_pytree | 39.7310μs | 21.5853μs | 46.3278 KOps/s | 47.3457 KOps/s | $\color{#d91a1a}-2.15\%$ |
| test_reshape_td | 52.6010μs | 31.4683μs | 31.7780 KOps/s | 33.2620 KOps/s | $\color{#d91a1a}-4.46\%$ |
| test_view_pytree | 41.8110μs | 21.3205μs | 46.9033 KOps/s | 48.1585 KOps/s | $\color{#d91a1a}-2.61\%$ |
| test_view_td | 18.3610μs | 4.1067μs | 243.5050 KOps/s | 248.5765 KOps/s | $\color{#d91a1a}-2.04\%$ |
| test_unbind_pytree | 51.5910μs | 26.1320μs | 38.2673 KOps/s | 38.4220 KOps/s | $\color{#d91a1a}-0.40\%$ |
| test_unbind_td | 86.3320μs | 58.4678μs | 17.1034 KOps/s | 17.6095 KOps/s | $\color{#d91a1a}-2.87\%$ |
| test_split_pytree | 45.6210μs | 24.9699μs | 40.0483 KOps/s | 40.4766 KOps/s | $\color{#d91a1a}-1.06\%$ |
| test_split_td | 64.8100μs | 46.7983μs | 21.3683 KOps/s | 22.9378 KOps/s | $\textbf{\color{#d91a1a}-6.84\%}$ |
| test_add_pytree | 56.9110μs | 34.4658μs | 29.0143 KOps/s | 30.5062 KOps/s | $\color{#d91a1a}-4.89\%$ |
| test_add_td | 70.2410μs | 48.0806μs | 20.7984 KOps/s | 21.4485 KOps/s | $\color{#d91a1a}-3.03\%$ |
| test_distributed | 18.6510μs | 5.4414μs | 183.7779 KOps/s | 175.8735 KOps/s | $\color{#35bf28}+4.49\%$ |
| test_tdmodule | 31.9200μs | 16.8009μs | 59.5206 KOps/s | 59.1348 KOps/s | $\color{#35bf28}+0.65\%$ |
| test_tdmodule_dispatch | 0.1960ms | 33.4342μs | 29.9095 KOps/s | 29.7212 KOps/s | $\color{#35bf28}+0.63\%$ |
| test_tdseq | 35.9800μs | 20.0260μs | 49.9350 KOps/s | 49.7033 KOps/s | $\color{#35bf28}+0.47\%$ |
| test_tdseq_dispatch | 54.6610μs | 36.1507μs | 27.6620 KOps/s | 27.4540 KOps/s | $\color{#35bf28}+0.76\%$ |
| test_instantiation_functorch | 1.8656ms | 1.7116ms | 584.2470 Ops/s | 584.8492 Ops/s | $\color{#d91a1a}-0.10\%$ |
| test_instantiation_td | 65.3680ms | 1.2755ms | 784.0038 Ops/s | 846.8996 Ops/s | $\textbf{\color{#d91a1a}-7.43\%}$ |
| test_exec_functorch | 0.2052ms | 0.1651ms | 6.0554 KOps/s | 6.1230 KOps/s | $\color{#d91a1a}-1.10\%$ |
| test_exec_functional_call | 0.2239ms | 0.1666ms | 6.0036 KOps/s | 6.1941 KOps/s | $\color{#d91a1a}-3.08\%$ |
| test_exec_td | 0.1938ms | 0.1570ms | 6.3695 KOps/s | 6.5658 KOps/s | $\color{#d91a1a}-2.99\%$ |
| test_exec_td_decorator | 0.9152ms | 0.1944ms | 5.1447 KOps/s | 5.2518 KOps/s | $\color{#d91a1a}-2.04\%$ |
| test_vmap_mlp_speed[True-True] | 1.1878ms | 1.0985ms | 910.3429 Ops/s | 928.5978 Ops/s | $\color{#d91a1a}-1.97\%$ |
| test_vmap_mlp_speed[True-False] | 0.6973ms | 0.6279ms | 1.5925 KOps/s | 1.6243 KOps/s | $\color{#d91a1a}-1.95\%$ |
| test_vmap_mlp_speed[False-True] | 1.0866ms | 1.0041ms | 995.8938 Ops/s | 1.0212 KOps/s | $\color{#d91a1a}-2.48\%$ |
| test_vmap_mlp_speed[False-False] | 0.6166ms | 0.5550ms | 1.8020 KOps/s | 1.8456 KOps/s | $\color{#d91a1a}-2.36\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 2.5924ms | 2.0822ms | 480.2676 Ops/s | 487.9255 Ops/s | $\color{#d91a1a}-1.57\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 1.0798ms | 0.6734ms | 1.4849 KOps/s | 1.5212 KOps/s | $\color{#d91a1a}-2.39\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 2.2495ms | 1.8131ms | 551.5445 Ops/s | 563.4929 Ops/s | $\color{#d91a1a}-2.12\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 1.0021ms | 0.5707ms | 1.7521 KOps/s | 1.7901 KOps/s | $\color{#d91a1a}-2.12\%$ |
| test_vmap_transformer_speed[True-True] | 13.1151ms | 12.9622ms | 77.1476 Ops/s | 78.9779 Ops/s | $\color{#d91a1a}-2.32\%$ |
| test_vmap_transformer_speed[True-False] | 8.7184ms | 8.4870ms | 117.8268 Ops/s | 120.9117 Ops/s | $\color{#d91a1a}-2.55\%$ |
| test_vmap_transformer_speed[False-True] | 13.9215ms | 12.9353ms | 77.3081 Ops/s | 80.0397 Ops/s | $\color{#d91a1a}-3.41\%$ |
| test_vmap_transformer_speed[False-False] | 8.7248ms | 8.4044ms | 118.9851 Ops/s | 122.0678 Ops/s | $\color{#d91a1a}-2.53\%$ |
| test_vmap_transformer_speed_decorator[True-True] | 67.2791ms | 66.2806ms | 15.0874 Ops/s | 15.3713 Ops/s | $\color{#d91a1a}-1.85\%$ |
| test_vmap_transformer_speed_decorator[True-False] | 97.8803ms | 22.2199ms | 45.0046 Ops/s | 49.8353 Ops/s | $\textbf{\color{#d91a1a}-9.69\%}$ |
| test_vmap_transformer_speed_decorator[False-True] | 61.7245ms | 60.4289ms | 16.5484 Ops/s | 16.9925 Ops/s | $\color{#d91a1a}-2.61\%$ |
| test_vmap_transformer_speed_decorator[False-False] | 22.3982ms | 20.2392ms | 49.4090 Ops/s | 50.7167 Ops/s | $\color{#d91a1a}-2.58\%$ |

Labels: bug, CLA Signed