[Performance] Faster to_module #575
Merged
facebook-github-bot added the CLA Signed label on Nov 24, 2023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 37.1490μs | 16.0992μs | 62.1148 KOps/s | 62.1769 KOps/s | |
test_plain_set_stack_nested | 0.2536ms | 0.1454ms | 6.8769 KOps/s | 6.8277 KOps/s | |
test_plain_set_nested_inplace | 46.0060μs | 19.3833μs | 51.5909 KOps/s | 52.1739 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3055ms | 0.1771ms | 5.6462 KOps/s | 5.6618 KOps/s | |
test_items | 0.1294ms | 2.6046μs | 383.9299 KOps/s | 355.0682 KOps/s | |
test_items_nested | 0.4759ms | 0.2706ms | 3.6955 KOps/s | 3.7514 KOps/s | |
test_items_nested_locked | 0.9910ms | 0.2705ms | 3.6963 KOps/s | 3.7441 KOps/s | |
test_items_nested_leaf | 0.5848ms | 0.1650ms | 6.0608 KOps/s | 6.0306 KOps/s | |
test_items_stack_nested | 1.6879ms | 1.4844ms | 673.6512 Ops/s | 675.2712 Ops/s | |
test_items_stack_nested_leaf | 2.0860ms | 1.3681ms | 730.9665 Ops/s | 739.1634 Ops/s | |
test_items_stack_nested_locked | 0.8691ms | 0.7655ms | 1.3064 KOps/s | 1.2949 KOps/s | |
test_keys | 0.1263ms | 3.8974μs | 256.5800 KOps/s | 261.0417 KOps/s | |
test_keys_nested | 1.4512ms | 0.1427ms | 7.0060 KOps/s | 6.6669 KOps/s | |
test_keys_nested_locked | 0.1944ms | 0.1418ms | 7.0499 KOps/s | 7.0865 KOps/s | |
test_keys_nested_leaf | 0.3243ms | 0.1425ms | 7.0152 KOps/s | 7.0783 KOps/s | |
test_keys_stack_nested | 2.1390ms | 1.4132ms | 707.5926 Ops/s | 704.4243 Ops/s | |
test_keys_stack_nested_leaf | 2.1344ms | 1.4081ms | 710.1575 Ops/s | 704.5780 Ops/s | |
test_keys_stack_nested_locked | 1.1629ms | 0.6725ms | 1.4871 KOps/s | 1.4407 KOps/s | |
test_values | 8.2378μs | 1.1873μs | 842.2167 KOps/s | 856.3997 KOps/s | |
test_values_nested | 85.7500μs | 49.4506μs | 20.2222 KOps/s | 20.3051 KOps/s | |
test_values_nested_locked | 98.3640μs | 49.5034μs | 20.2006 KOps/s | 20.1147 KOps/s | |
test_values_nested_leaf | 55.7240μs | 43.9320μs | 22.7625 KOps/s | 22.5214 KOps/s | |
test_values_stack_nested | 1.3934ms | 1.1883ms | 841.5384 Ops/s | 831.9000 Ops/s | |
test_values_stack_nested_leaf | 1.2703ms | 1.1853ms | 843.6415 Ops/s | 836.3661 Ops/s | |
test_values_stack_nested_locked | 0.8939ms | 0.5101ms | 1.9605 KOps/s | 1.9189 KOps/s | |
test_membership | 16.0700μs | 1.3606μs | 734.9906 KOps/s | 739.8790 KOps/s | |
test_membership_nested | 36.8490μs | 2.7888μs | 358.5803 KOps/s | 345.7939 KOps/s | |
test_membership_nested_leaf | 21.7200μs | 2.8290μs | 353.4816 KOps/s | 344.4963 KOps/s | |
test_membership_stacked_nested | 45.5450μs | 11.7055μs | 85.4299 KOps/s | 84.3855 KOps/s | |
test_membership_stacked_nested_leaf | 40.7360μs | 11.5619μs | 86.4907 KOps/s | 83.6561 KOps/s | |
test_membership_nested_last | 28.0020μs | 5.8693μs | 170.3768 KOps/s | 162.8009 KOps/s | |
test_membership_nested_leaf_last | 39.9240μs | 5.9031μs | 169.4018 KOps/s | 167.2527 KOps/s | |
test_membership_stacked_nested_last | 0.3300ms | 0.1691ms | 5.9145 KOps/s | 5.8964 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.7510μs | 13.5994μs | 73.5329 KOps/s | 73.7285 KOps/s | |
test_nested_getleaf | 46.5970μs | 10.7366μs | 93.1396 KOps/s | 95.1910 KOps/s | |
test_nested_get | 36.1080μs | 10.2612μs | 97.4547 KOps/s | 99.3129 KOps/s | |
test_stacked_getleaf | 0.8595ms | 0.6464ms | 1.5471 KOps/s | 1.5420 KOps/s | |
test_stacked_get | 1.3914ms | 0.6133ms | 1.6306 KOps/s | 1.6055 KOps/s | |
test_nested_getitemleaf | 49.9930μs | 10.7118μs | 93.3549 KOps/s | 92.7389 KOps/s | |
test_nested_getitem | 39.4340μs | 10.1618μs | 98.4080 KOps/s | 99.0053 KOps/s | |
test_stacked_getitemleaf | 1.0988ms | 0.6451ms | 1.5502 KOps/s | 1.5464 KOps/s | |
test_stacked_getitem | 1.1529ms | 0.6268ms | 1.5953 KOps/s | 1.6313 KOps/s | |
test_lock_nested | 55.0275ms | 0.5424ms | 1.8438 KOps/s | 2.0256 KOps/s | |
test_lock_stack_nested | 74.1596ms | 8.2480ms | 121.2418 Ops/s | 124.8375 Ops/s | |
test_unlock_nested | 60.7838ms | 0.5023ms | 1.9909 KOps/s | 1.9536 KOps/s | |
test_unlock_stack_nested | 68.5761ms | 8.0016ms | 124.9752 Ops/s | 206.1984 Ops/s | |
test_flatten_speed | 1.1797ms | 0.2788ms | 3.5870 KOps/s | 3.6930 KOps/s | |
test_unflatten_speed | 0.5351ms | 0.4677ms | 2.1383 KOps/s | 2.1598 KOps/s | |
test_common_ops | 4.1997ms | 0.6804ms | 1.4698 KOps/s | 1.4932 KOps/s | |
test_creation | 25.7980μs | 2.4049μs | 415.8128 KOps/s | 418.3835 KOps/s | |
test_creation_empty | 39.6340μs | 7.9539μs | 125.7243 KOps/s | 122.5708 KOps/s | |
test_creation_nested_1 | 40.5860μs | 11.3128μs | 88.3954 KOps/s | 85.2253 KOps/s | |
test_creation_nested_2 | 38.5520μs | 14.6970μs | 68.0412 KOps/s | 66.2535 KOps/s | |
test_clone | 85.1090μs | 13.2138μs | 75.6784 KOps/s | 74.8756 KOps/s | |
test_getitem[int] | 44.7530μs | 12.9914μs | 76.9741 KOps/s | 75.2508 KOps/s | |
test_getitem[slice_int] | 72.7760μs | 25.0768μs | 39.8776 KOps/s | 39.5558 KOps/s | |
test_getitem[range] | 84.3670μs | 45.5076μs | 21.9744 KOps/s | 21.6114 KOps/s | |
test_getitem[tuple] | 66.7840μs | 20.4001μs | 49.0194 KOps/s | 48.4561 KOps/s | |
test_getitem[list] | 0.2674ms | 40.3544μs | 24.7804 KOps/s | 24.2328 KOps/s | |
test_setitem_dim[int] | 0.1004ms | 28.2860μs | 35.3532 KOps/s | 36.3653 KOps/s | |
test_setitem_dim[slice_int] | 85.9900μs | 52.4884μs | 19.0518 KOps/s | 18.9072 KOps/s | |
test_setitem_dim[range] | 0.1118ms | 72.4376μs | 13.8050 KOps/s | 13.6967 KOps/s | |
test_setitem_dim[tuple] | 87.0130μs | 41.1120μs | 24.3238 KOps/s | 24.6220 KOps/s | |
test_setitem | 0.1266ms | 18.2367μs | 54.8345 KOps/s | 53.8737 KOps/s | |
test_set | 0.1272ms | 17.4067μs | 57.4490 KOps/s | 56.1146 KOps/s | |
test_set_shared | 1.9672ms | 0.1401ms | 7.1383 KOps/s | 7.1165 KOps/s | |
test_update | 0.1106ms | 19.1860μs | 52.1215 KOps/s | 53.0755 KOps/s | |
test_update_nested | 0.1493ms | 26.5327μs | 37.6893 KOps/s | 38.1510 KOps/s | |
test_set_nested | 0.1175ms | 19.9209μs | 50.1985 KOps/s | 50.7384 KOps/s | |
test_set_nested_new | 0.1160ms | 24.7793μs | 40.3562 KOps/s | 38.8773 KOps/s | |
test_select | 0.1217ms | 50.3912μs | 19.8448 KOps/s | 19.9157 KOps/s | |
test_unbind_speed | 0.4404ms | 0.3745ms | 2.6703 KOps/s | 2.6889 KOps/s | |
test_unbind_speed_stack0 | 65.5388ms | 5.2956ms | 188.8346 Ops/s | 248.8094 Ops/s | |
test_unbind_speed_stack1 | 2.6093μs | 0.6335μs | 1.5787 MOps/s | 1.5718 MOps/s | |
test_split | 55.3698ms | 1.7556ms | 569.6016 Ops/s | 557.7401 Ops/s | |
test_chunk | 58.6198ms | 1.7392ms | 574.9751 Ops/s | 567.6994 Ops/s | |
test_creation[device0] | 5.2488ms | 0.2950ms | 3.3901 KOps/s | 3.2776 KOps/s | |
test_creation_from_tensor | 59.5799ms | 0.3580ms | 2.7933 KOps/s | 2.9859 KOps/s | |
test_add_one[memmap_tensor0] | 70.6520μs | 25.7879μs | 38.7779 KOps/s | 40.0790 KOps/s | |
test_contiguous[memmap_tensor0] | 29.4350μs | 6.0130μs | 166.3074 KOps/s | 171.7384 KOps/s | |
test_stack[memmap_tensor0] | 96.9910μs | 19.6944μs | 50.7759 KOps/s | 51.3259 KOps/s | |
test_memmaptd_index | 0.4862ms | 0.4016ms | 2.4899 KOps/s | 2.4456 KOps/s | |
test_memmaptd_index_astensor | 0.9022ms | 0.4683ms | 2.1352 KOps/s | 2.0830 KOps/s | |
test_memmaptd_index_op | 0.8051ms | 0.7051ms | 1.4183 KOps/s | 1.3855 KOps/s | |
test_reshape_pytree | 0.3323ms | 23.2215μs | 43.0635 KOps/s | 41.6877 KOps/s | |
test_reshape_td | 89.5580μs | 31.6180μs | 31.6276 KOps/s | 30.5861 KOps/s | |
test_view_pytree | 75.1900μs | 23.4766μs | 42.5956 KOps/s | 42.6087 KOps/s | |
test_view_td | 23.1530μs | 4.8547μs | 205.9853 KOps/s | 203.5797 KOps/s | |
test_unbind_pytree | 86.3010μs | 26.3338μs | 37.9740 KOps/s | 37.5214 KOps/s | |
test_unbind_td | 0.1154ms | 59.7871μs | 16.7260 KOps/s | 16.6501 KOps/s | |
test_split_pytree | 89.1680μs | 26.2076μs | 38.1568 KOps/s | 37.5763 KOps/s | |
test_split_td | 0.1298ms | 46.9680μs | 21.2911 KOps/s | 20.9089 KOps/s | |
test_add_pytree | 70.2610μs | 32.3736μs | 30.8894 KOps/s | 30.8967 KOps/s | |
test_add_td | 0.1073ms | 45.0460μs | 22.1995 KOps/s | 21.9269 KOps/s | |
test_distributed | 24.8660μs | 6.0096μs | 166.4017 KOps/s | 166.5763 KOps/s | |
test_tdmodule | 0.1021ms | 20.9949μs | 47.6305 KOps/s | 46.4625 KOps/s | |
test_tdmodule_dispatch | 0.1678ms | 38.4929μs | 25.9788 KOps/s | 25.5022 KOps/s | |
test_tdseq | 0.3638ms | 24.4907μs | 40.8318 KOps/s | 40.6807 KOps/s | |
test_tdseq_dispatch | 0.4113ms | 42.3081μs | 23.6362 KOps/s | 23.4523 KOps/s | |
test_instantiation_functorch | 1.9890ms | 1.3153ms | 760.3111 Ops/s | 766.2344 Ops/s | |
test_instantiation_td | 1.5170ms | 1.0264ms | 974.2583 Ops/s | 977.4129 Ops/s | |
test_exec_functorch | 0.2263ms | 0.1617ms | 6.1849 KOps/s | 6.2688 KOps/s | |
test_exec_functional_call | 0.3898ms | 0.1465ms | 6.8240 KOps/s | 6.6695 KOps/s | |
test_exec_td | 0.2165ms | 0.1423ms | 7.0269 KOps/s | 6.8472 KOps/s | |
test_exec_td_decorator | 1.0105ms | 0.1807ms | 5.5343 KOps/s | 3.6724 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0075ms | 0.8958ms | 1.1164 KOps/s | 1.0970 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7111ms | 0.4673ms | 2.1398 KOps/s | 2.0827 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1664ms | 0.7819ms | 1.2790 KOps/s | 1.2631 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5048ms | 0.3861ms | 2.5898 KOps/s | 2.5397 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.7442ms | 1.7742ms | 563.6298 Ops/s | 623.8952 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9940ms | 0.5179ms | 1.9310 KOps/s | 1.7831 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.3741ms | 1.4872ms | 672.3859 Ops/s | 728.0136 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.2460ms | 0.4025ms | 2.4844 KOps/s | 2.2905 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.4211ms | 12.6294μs | 79.1805 KOps/s | 78.9239 KOps/s | |
test_plain_set_stack_nested | 0.1431ms | 0.1154ms | 8.6671 KOps/s | 8.3673 KOps/s | |
test_plain_set_nested_inplace | 39.8720μs | 14.8730μs | 67.2359 KOps/s | 66.2317 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1716ms | 0.1394ms | 7.1752 KOps/s | 7.1217 KOps/s | |
test_items | 24.0710μs | 4.9031μs | 203.9512 KOps/s | 207.8566 KOps/s | |
test_items_nested | 0.4046ms | 0.3358ms | 2.9784 KOps/s | 2.9564 KOps/s | |
test_items_nested_locked | 0.3949ms | 0.3365ms | 2.9718 KOps/s | 2.9811 KOps/s | |
test_items_nested_leaf | 0.2241ms | 0.1981ms | 5.0482 KOps/s | 5.0160 KOps/s | |
test_items_stack_nested | 1.5875ms | 1.4776ms | 676.7658 Ops/s | 678.4091 Ops/s | |
test_items_stack_nested_leaf | 1.3704ms | 1.3053ms | 766.0841 Ops/s | 764.0761 Ops/s | |
test_items_stack_nested_locked | 1.7633ms | 0.7967ms | 1.2552 KOps/s | 1.2552 KOps/s | |
test_keys | 24.3810μs | 4.5738μs | 218.6377 KOps/s | 218.6668 KOps/s | |
test_keys_nested | 0.4960ms | 90.3431μs | 11.0689 KOps/s | 11.1941 KOps/s | |
test_keys_nested_locked | 0.1210ms | 89.6516μs | 11.1543 KOps/s | 11.2335 KOps/s | |
test_keys_nested_leaf | 42.0050ms | 86.9698μs | 11.4982 KOps/s | 12.3349 KOps/s | |
test_keys_stack_nested | 1.3674ms | 1.2767ms | 783.2743 Ops/s | 772.7949 Ops/s | |
test_keys_stack_nested_leaf | 1.3863ms | 1.2673ms | 789.0860 Ops/s | 776.1860 Ops/s | |
test_keys_stack_nested_locked | 0.6909ms | 0.5958ms | 1.6783 KOps/s | 1.6771 KOps/s | |
test_values | 8.2603μs | 1.8848μs | 530.5684 KOps/s | 533.7036 KOps/s | |
test_values_nested | 74.6130μs | 42.8407μs | 23.3423 KOps/s | 23.2731 KOps/s | |
test_values_nested_locked | 65.3330μs | 43.0486μs | 23.2296 KOps/s | 22.9672 KOps/s | |
test_values_nested_leaf | 64.8340μs | 37.1143μs | 26.9438 KOps/s | 26.5796 KOps/s | |
test_values_stack_nested | 1.1897ms | 1.1283ms | 886.3239 Ops/s | 893.0326 Ops/s | |
test_values_stack_nested_leaf | 1.1748ms | 1.1087ms | 901.9867 Ops/s | 899.7830 Ops/s | |
test_values_stack_nested_locked | 0.5304ms | 0.4728ms | 2.1151 KOps/s | 2.0980 KOps/s | |
test_membership | 4.5818μs | 0.9278μs | 1.0778 MOps/s | 1.0693 MOps/s | |
test_membership_nested | 28.3510μs | 2.1844μs | 457.7935 KOps/s | 468.9322 KOps/s | |
test_membership_nested_leaf | 16.0055μs | 2.1360μs | 468.1577 KOps/s | 475.1231 KOps/s | |
test_membership_stacked_nested | 44.0520μs | 10.7129μs | 93.3450 KOps/s | 92.6988 KOps/s | |
test_membership_stacked_nested_leaf | 35.3520μs | 10.7207μs | 93.2775 KOps/s | 91.1184 KOps/s | |
test_membership_nested_last | 21.5510μs | 4.5702μs | 218.8085 KOps/s | 217.3792 KOps/s | |
test_membership_nested_leaf_last | 33.2210μs | 4.5652μs | 219.0503 KOps/s | 218.4525 KOps/s | |
test_membership_stacked_nested_last | 0.1649ms | 0.1330ms | 7.5187 KOps/s | 7.4985 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.1320μs | 12.5882μs | 79.4396 KOps/s | 79.1724 KOps/s | |
test_nested_getleaf | 29.7310μs | 8.3609μs | 119.6048 KOps/s | 118.9282 KOps/s | |
test_nested_get | 29.3810μs | 7.8730μs | 127.0170 KOps/s | 125.7927 KOps/s | |
test_stacked_getleaf | 0.6145ms | 0.5606ms | 1.7839 KOps/s | 1.7410 KOps/s | |
test_stacked_get | 0.5766ms | 0.5282ms | 1.8932 KOps/s | 1.8597 KOps/s | |
test_nested_getitemleaf | 31.4010μs | 8.3752μs | 119.3996 KOps/s | 118.1746 KOps/s | |
test_nested_getitem | 28.8910μs | 7.9380μs | 125.9770 KOps/s | 124.9850 KOps/s | |
test_stacked_getitemleaf | 0.6342ms | 0.5613ms | 1.7814 KOps/s | 1.7608 KOps/s | |
test_stacked_getitem | 0.6048ms | 0.5378ms | 1.8596 KOps/s | 1.8769 KOps/s | |
test_lock_nested | 4.3950ms | 0.4539ms | 2.2033 KOps/s | 2.1870 KOps/s | |
test_lock_stack_nested | 67.9166ms | 6.5143ms | 153.5091 Ops/s | 151.5574 Ops/s | |
test_unlock_nested | 1.2988ms | 0.4327ms | 2.3108 KOps/s | 2.0322 KOps/s | |
test_unlock_stack_nested | 63.8541ms | 7.2507ms | 137.9185 Ops/s | 138.2704 Ops/s | |
test_flatten_speed | 0.5201ms | 0.1869ms | 5.3505 KOps/s | 5.3941 KOps/s | |
test_unflatten_speed | 0.4305ms | 0.3620ms | 2.7625 KOps/s | 2.7980 KOps/s | |
test_common_ops | 1.0558ms | 0.5823ms | 1.7174 KOps/s | 1.6958 KOps/s | |
test_creation | 13.5710μs | 1.9300μs | 518.1304 KOps/s | 520.0878 KOps/s | |
test_creation_empty | 24.9510μs | 6.5387μs | 152.9363 KOps/s | 141.9366 KOps/s | |
test_creation_nested_1 | 41.6520μs | 8.9675μs | 111.5143 KOps/s | 106.3376 KOps/s | |
test_creation_nested_2 | 30.2810μs | 11.6067μs | 86.1575 KOps/s | 83.5301 KOps/s | |
test_clone | 0.1066ms | 13.7770μs | 72.5850 KOps/s | 72.5156 KOps/s | |
test_getitem[int] | 39.7020μs | 11.8738μs | 84.2192 KOps/s | 83.8293 KOps/s | |
test_getitem[slice_int] | 48.3020μs | 22.5313μs | 44.3828 KOps/s | 43.1771 KOps/s | |
test_getitem[range] | 61.6730μs | 38.7413μs | 25.8123 KOps/s | 25.1509 KOps/s | |
test_getitem[tuple] | 49.5520μs | 19.3865μs | 51.5824 KOps/s | 50.3319 KOps/s | |
test_getitem[list] | 0.3019ms | 35.5470μs | 28.1317 KOps/s | 27.2563 KOps/s | |
test_setitem_dim[int] | 40.8520μs | 24.0906μs | 41.5099 KOps/s | 38.5385 KOps/s | |
test_setitem_dim[slice_int] | 61.1930μs | 43.5627μs | 22.9554 KOps/s | 21.8642 KOps/s | |
test_setitem_dim[range] | 83.4740μs | 61.2008μs | 16.3397 KOps/s | 15.8878 KOps/s | |
test_setitem_dim[tuple] | 78.6440μs | 37.5969μs | 26.5980 KOps/s | 25.8887 KOps/s | |
test_setitem | 0.1126ms | 17.5392μs | 57.0153 KOps/s | 57.2480 KOps/s | |
test_set | 0.1082ms | 16.8393μs | 59.3850 KOps/s | 58.2965 KOps/s | |
test_set_shared | 2.6463ms | 98.9800μs | 10.1030 KOps/s | 9.3730 KOps/s | |
test_update | 0.1069ms | 18.0709μs | 55.3375 KOps/s | 54.4232 KOps/s | |
test_update_nested | 0.1285ms | 24.5734μs | 40.6944 KOps/s | 40.5572 KOps/s | |
test_set_nested | 0.1069ms | 18.2419μs | 54.8187 KOps/s | 55.2669 KOps/s | |
test_set_nested_new | 0.1208ms | 22.5770μs | 44.2930 KOps/s | 43.8337 KOps/s | |
test_select | 0.1467ms | 46.2471μs | 21.6230 KOps/s | 21.9435 KOps/s | |
test_to | 73.6140μs | 51.1312μs | 19.5575 KOps/s | 19.7928 KOps/s | |
test_to_nonblocking | 62.2230μs | 33.2236μs | 30.0990 KOps/s | 29.6131 KOps/s | |
test_unbind_speed | 0.3799ms | 0.3452ms | 2.8967 KOps/s | 2.8854 KOps/s | |
test_unbind_speed_stack0 | 60.0775ms | 5.0693ms | 197.2671 Ops/s | 195.3944 Ops/s | |
test_unbind_speed_stack1 | 1.9831μs | 0.5290μs | 1.8902 MOps/s | 1.9091 MOps/s | |
test_split | 53.0894ms | 1.7574ms | 569.0289 Ops/s | 562.7664 Ops/s | |
test_chunk | 53.0076ms | 1.7579ms | 568.8754 Ops/s | 567.4860 Ops/s | |
test_creation[device0] | 0.4213ms | 0.3093ms | 3.2334 KOps/s | 3.2287 KOps/s | |
test_creation[device1] | 54.9972ms | 0.3358ms | 2.9781 KOps/s | 3.1891 KOps/s | |
test_creation_from_tensor | 0.5787ms | 0.3376ms | 2.9620 KOps/s | 2.6957 KOps/s | |
test_add_one[memmap_tensor0] | 70.2840μs | 22.6726μs | 44.1062 KOps/s | 42.6968 KOps/s | |
test_add_one[memmap_tensor1] | 0.2095ms | 71.9854μs | 13.8917 KOps/s | 13.9939 KOps/s | |
test_contiguous[memmap_tensor0] | 29.9920μs | 5.7976μs | 172.4843 KOps/s | 174.7782 KOps/s | |
test_contiguous[memmap_tensor1] | 53.0330μs | 21.0752μs | 47.4491 KOps/s | 47.8573 KOps/s | |
test_stack[memmap_tensor0] | 49.0720μs | 18.9636μs | 52.7326 KOps/s | 51.9975 KOps/s | |
test_stack[memmap_tensor1] | 0.1528ms | 72.1606μs | 13.8580 KOps/s | 13.4228 KOps/s | |
test_memmaptd_index | 0.4609ms | 0.4199ms | 2.3812 KOps/s | 2.3747 KOps/s | |
test_memmaptd_index_astensor | 0.5294ms | 0.4776ms | 2.0939 KOps/s | 2.0862 KOps/s | |
test_memmaptd_index_op | 0.7716ms | 0.7150ms | 1.3987 KOps/s | 1.3514 KOps/s | |
test_reshape_pytree | 42.7710μs | 20.6918μs | 48.3282 KOps/s | 47.9222 KOps/s | |
test_reshape_td | 48.2920μs | 29.8524μs | 33.4982 KOps/s | 33.5291 KOps/s | |
test_view_pytree | 35.4120μs | 20.4906μs | 48.8028 KOps/s | 48.3753 KOps/s | |
test_view_td | 23.4510μs | 4.0882μs | 244.6036 KOps/s | 245.3667 KOps/s | |
test_unbind_pytree | 49.5920μs | 25.4534μs | 39.2875 KOps/s | 39.3311 KOps/s | |
test_unbind_td | 77.6140μs | 54.9088μs | 18.2120 KOps/s | 17.8613 KOps/s | |
test_split_pytree | 0.8458ms | 24.1525μs | 41.4035 KOps/s | 41.6226 KOps/s | |
test_split_td | 73.0030μs | 42.3103μs | 23.6349 KOps/s | 22.4112 KOps/s | |
test_add_pytree | 47.7130μs | 30.4811μs | 32.8073 KOps/s | 33.1077 KOps/s | |
test_add_td | 66.9230μs | 40.8044μs | 24.5071 KOps/s | 23.9581 KOps/s | |
test_distributed | 17.6010μs | 5.5205μs | 181.1427 KOps/s | 175.9385 KOps/s | |
test_tdmodule | 35.4810μs | 16.4018μs | 60.9690 KOps/s | 59.7600 KOps/s | |
test_tdmodule_dispatch | 0.2673ms | 32.1188μs | 31.1344 KOps/s | 30.5154 KOps/s | |
test_tdseq | 34.6420μs | 19.4664μs | 51.3705 KOps/s | 49.8794 KOps/s | |
test_tdseq_dispatch | 0.1510ms | 34.8325μs | 28.7088 KOps/s | 26.9404 KOps/s | |
test_instantiation_functorch | 1.7643ms | 1.6793ms | 595.5019 Ops/s | 593.8954 Ops/s | |
test_instantiation_td | 1.9085ms | 1.1762ms | 850.2305 Ops/s | 852.1358 Ops/s | |
test_exec_functorch | 0.2046ms | 0.1543ms | 6.4801 KOps/s | 6.4676 KOps/s | |
test_exec_functional_call | 0.2163ms | 0.1533ms | 6.5229 KOps/s | 6.5892 KOps/s | |
test_exec_td | 0.1734ms | 0.1420ms | 7.0409 KOps/s | 6.9983 KOps/s | |
test_exec_td_decorator | 63.9216ms | 0.2010ms | 4.9750 KOps/s | 4.6120 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.5470ms | 1.0549ms | 947.9602 Ops/s | 945.3579 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.6434ms | 0.5903ms | 1.6941 KOps/s | 1.6718 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0100ms | 0.9586ms | 1.0432 KOps/s | 1.0312 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5813ms | 0.5237ms | 1.9095 KOps/s | 1.8943 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.7425ms | 2.0231ms | 494.2838 Ops/s | 566.1121 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1712ms | 0.6410ms | 1.5600 KOps/s | 1.4916 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.2573ms | 1.7400ms | 574.7166 Ops/s | 630.2028 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0054ms | 0.5454ms | 1.8335 KOps/s | 1.7749 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.3646ms | 12.2601ms | 81.5656 Ops/s | 81.3220 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.0023ms | 7.9440ms | 125.8818 Ops/s | 125.1025 Ops/s | |
test_vmap_transformer_speed[False-True] | 12.2011ms | 12.1082ms | 82.5887 Ops/s | 81.2234 Ops/s | |
test_vmap_transformer_speed[False-False] | 7.9540ms | 7.8556ms | 127.2983 Ops/s | 126.3453 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 64.2383ms | 63.2098ms | 15.8203 Ops/s | 23.7193 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.9064ms | 19.2437ms | 51.9650 Ops/s | 47.1718 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 58.8411ms | 57.6712ms | 17.3397 Ops/s | 23.9580 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 99.0540ms | 20.4384ms | 48.9275 Ops/s | 48.1389 Ops/s |
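
The Change column is empty in both tables as captured here. A minimal sketch of how a relative change could be recomputed from the two Ops columns follows. It assumes that "Ops" is the throughput measured on this PR and "Ops on Repo HEAD" is the baseline, which the page does not state explicitly. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
import re

# Multipliers for the unit suffixes that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '62.1148 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*(M|K)?Ops/s\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized Ops cell: {cell!r}")
    value, prefix = match.groups()
    return float(value) * _UNIT_SCALE[(prefix or "") + "Ops/s"]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumption: the first argument is the 'Ops' column and the second
    is the 'Ops on Repo HEAD' column.
    """
    head = parse_ops(ops_head)
    return 100.0 * (parse_ops(ops_pr) - head) / head


# Two rows copied from the first table above.
rows = {
    "test_exec_td_decorator": ("5.5343 KOps/s", "3.6724 KOps/s"),
    "test_unlock_stack_nested": ("124.9752 Ops/s", "206.1984 Ops/s"),
}
for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1f}%")
```

Each row comes from a single benchmark run, so rows with a Max far above the Mean (for example the lock/unlock tests) are probably dominated by outliers, and their computed change should be read with caution.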
Labels: CLA Signed, Performance
No description provided.