[BugFix] Fix stack of tensorclasses (and nontensors) #820
Merged
facebook-github-bot added the CLA Signed label on Jun 19, 2024
vmoens added the bug label and removed the CLA Signed label on Jun 19, 2024
facebook-github-bot added the CLA Signed label on Jun 19, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.0930μs | 17.1897μs | 58.1745 KOps/s | 60.8753 KOps/s | |
test_plain_set_stack_nested | 49.8340μs | 17.4352μs | 57.3552 KOps/s | 60.6307 KOps/s | |
test_plain_set_nested_inplace | 57.3780μs | 19.5598μs | 51.1253 KOps/s | 53.0998 KOps/s | |
test_plain_set_stack_nested_inplace | 73.4380μs | 19.2301μs | 52.0018 KOps/s | 53.5923 KOps/s | |
test_items | 36.1380μs | 2.7389μs | 365.1092 KOps/s | 384.6385 KOps/s | |
test_items_nested | 0.4142ms | 0.2633ms | 3.7976 KOps/s | 3.7425 KOps/s | |
test_items_nested_locked | 1.1477ms | 0.2648ms | 3.7764 KOps/s | 3.7382 KOps/s | |
test_items_nested_leaf | 0.1361ms | 76.9078μs | 13.0026 KOps/s | 12.8973 KOps/s | |
test_items_stack_nested | 0.4188ms | 0.2666ms | 3.7509 KOps/s | 3.7272 KOps/s | |
test_items_stack_nested_leaf | 0.1428ms | 77.5607μs | 12.8931 KOps/s | 13.1804 KOps/s | |
test_items_stack_nested_locked | 1.1430ms | 0.2674ms | 3.7401 KOps/s | 3.7389 KOps/s | |
test_keys | 22.5420μs | 4.0271μs | 248.3178 KOps/s | 260.7721 KOps/s | |
test_keys_nested | 0.2122ms | 0.1381ms | 7.2434 KOps/s | 7.1114 KOps/s | |
test_keys_nested_locked | 0.7168ms | 0.1432ms | 6.9824 KOps/s | 6.9589 KOps/s | |
test_keys_nested_leaf | 0.2075ms | 0.1162ms | 8.6034 KOps/s | 8.4797 KOps/s | |
test_keys_stack_nested | 0.2378ms | 0.1387ms | 7.2121 KOps/s | 7.3019 KOps/s | |
test_keys_stack_nested_leaf | 0.1965ms | 0.1169ms | 8.5561 KOps/s | 8.6310 KOps/s | |
test_keys_stack_nested_locked | 0.3057ms | 0.1447ms | 6.9102 KOps/s | 7.1197 KOps/s | |
test_values | 9.3298μs | 1.1735μs | 852.1274 KOps/s | 880.8301 KOps/s | |
test_values_nested | 0.1048ms | 50.3781μs | 19.8499 KOps/s | 19.4865 KOps/s | |
test_values_nested_locked | 0.1010ms | 50.3453μs | 19.8628 KOps/s | 19.4414 KOps/s | |
test_values_nested_leaf | 85.7710μs | 45.9300μs | 21.7723 KOps/s | 21.5985 KOps/s | |
test_values_stack_nested | 99.8160μs | 51.0966μs | 19.5708 KOps/s | 19.3302 KOps/s | |
test_values_stack_nested_leaf | 87.6140μs | 45.6974μs | 21.8831 KOps/s | 22.1686 KOps/s | |
test_values_stack_nested_locked | 0.1102ms | 50.2708μs | 19.8923 KOps/s | 19.5574 KOps/s | |
test_membership | 35.9770μs | 1.3268μs | 753.7140 KOps/s | 732.8642 KOps/s | |
test_membership_nested | 40.6160μs | 3.4294μs | 291.5935 KOps/s | 280.7667 KOps/s | |
test_membership_nested_leaf | 39.0840μs | 3.4240μs | 292.0522 KOps/s | 280.2631 KOps/s | |
test_membership_stacked_nested | 38.7630μs | 3.4397μs | 290.7196 KOps/s | 284.0692 KOps/s | |
test_membership_stacked_nested_leaf | 29.1150μs | 3.4303μs | 291.5191 KOps/s | 284.2234 KOps/s | |
test_membership_nested_last | 40.6360μs | 4.2312μs | 236.3413 KOps/s | 232.5825 KOps/s | |
test_membership_nested_leaf_last | 49.7270μs | 4.1845μs | 238.9746 KOps/s | 232.9815 KOps/s | |
test_membership_stacked_nested_last | 24.2150μs | 4.2971μs | 232.7133 KOps/s | 73.8089 KOps/s | |
test_membership_stacked_nested_leaf_last | 41.4670μs | 4.2196μs | 236.9900 KOps/s | 74.1847 KOps/s | |
test_nested_getleaf | 36.7790μs | 10.5047μs | 95.1956 KOps/s | 93.0481 KOps/s | |
test_nested_get | 47.8800μs | 10.0411μs | 99.5910 KOps/s | 98.6265 KOps/s | |
test_stacked_getleaf | 52.7480μs | 10.3842μs | 96.3004 KOps/s | 93.7722 KOps/s | |
test_stacked_get | 45.5850μs | 9.9835μs | 100.1650 KOps/s | 99.9561 KOps/s | |
test_nested_getitemleaf | 66.7780μs | 11.0328μs | 90.6385 KOps/s | 91.5765 KOps/s | |
test_nested_getitem | 46.8870μs | 10.2331μs | 97.7225 KOps/s | 97.9164 KOps/s | |
test_stacked_getitemleaf | 45.3250μs | 11.2147μs | 89.1686 KOps/s | 90.4063 KOps/s | |
test_stacked_getitem | 31.8900μs | 10.2192μs | 97.8548 KOps/s | 98.1783 KOps/s | |
test_lock_nested | 0.8087ms | 0.3447ms | 2.9010 KOps/s | 2.9173 KOps/s | |
test_lock_stack_nested | 0.6857ms | 0.3137ms | 3.1881 KOps/s | 3.3438 KOps/s | |
test_unlock_nested | 0.9498ms | 0.3477ms | 2.8757 KOps/s | 2.8355 KOps/s | |
test_unlock_stack_nested | 0.6346ms | 0.3216ms | 3.1099 KOps/s | 3.2646 KOps/s | |
test_flatten_speed | 0.5393ms | 93.9752μs | 10.6411 KOps/s | 10.4144 KOps/s | |
test_unflatten_speed | 0.6190ms | 0.4038ms | 2.4765 KOps/s | 2.4080 KOps/s | |
test_common_ops | 4.6699ms | 0.7476ms | 1.3376 KOps/s | 1.4180 KOps/s | |
test_creation | 51.6870μs | 1.9809μs | 504.8305 KOps/s | 522.7033 KOps/s | |
test_creation_empty | 46.6170μs | 11.0524μs | 90.4778 KOps/s | 108.0304 KOps/s | |
test_creation_nested_1 | 41.2970μs | 13.7409μs | 72.7752 KOps/s | 82.7492 KOps/s | |
test_creation_nested_2 | 58.4290μs | 17.1237μs | 58.3984 KOps/s | 64.9887 KOps/s | |
test_clone | 0.1894ms | 13.4275μs | 74.4743 KOps/s | 74.9993 KOps/s | |
test_getitem[int] | 52.7880μs | 11.3909μs | 87.7894 KOps/s | 87.8218 KOps/s | |
test_getitem[slice_int] | 72.5650μs | 22.8362μs | 43.7900 KOps/s | 43.6397 KOps/s | |
test_getitem[range] | 86.4310μs | 61.8414μs | 16.1704 KOps/s | 16.6605 KOps/s | |
test_getitem[tuple] | 64.7820μs | 18.7414μs | 53.3577 KOps/s | 52.8816 KOps/s | |
test_getitem[list] | 0.1778ms | 42.7920μs | 23.3688 KOps/s | 24.2160 KOps/s | |
test_setitem_dim[int] | 63.5790μs | 34.9728μs | 28.5936 KOps/s | 28.5599 KOps/s | |
test_setitem_dim[slice_int] | 0.1114ms | 63.1980μs | 15.8233 KOps/s | 15.4684 KOps/s | |
test_setitem_dim[range] | 0.2618ms | 87.3200μs | 11.4521 KOps/s | 11.7661 KOps/s | |
test_setitem_dim[tuple] | 0.1181ms | 51.1064μs | 19.5670 KOps/s | 19.7985 KOps/s | |
test_setitem | 71.4140μs | 20.5245μs | 48.7223 KOps/s | 52.1022 KOps/s | |
test_set | 64.4410μs | 20.0689μs | 49.8283 KOps/s | 53.6159 KOps/s | |
test_set_shared | 4.0178ms | 0.1466ms | 6.8191 KOps/s | 6.9158 KOps/s | |
test_update | 0.1681ms | 23.1562μs | 43.1850 KOps/s | 50.8227 KOps/s | |
test_update_nested | 96.6210μs | 31.6188μs | 31.6267 KOps/s | 34.6748 KOps/s | |
test_update__nested | 76.2330μs | 25.0452μs | 39.9278 KOps/s | 38.5926 KOps/s | |
test_set_nested | 0.1054ms | 22.1874μs | 45.0707 KOps/s | 47.1343 KOps/s | |
test_set_nested_new | 97.8130μs | 26.1974μs | 38.1718 KOps/s | 38.2946 KOps/s | |
test_select | 0.1240ms | 41.7797μs | 23.9351 KOps/s | 24.8441 KOps/s | |
test_select_nested | 0.1121ms | 59.8738μs | 16.7018 KOps/s | 16.8578 KOps/s | |
test_exclude_nested | 0.2918ms | 0.1182ms | 8.4605 KOps/s | 8.3963 KOps/s | |
test_empty[True] | 0.6454ms | 0.4030ms | 2.4816 KOps/s | 2.5359 KOps/s | |
test_empty[False] | 8.1472μs | 1.1681μs | 856.0737 KOps/s | 851.4972 KOps/s | |
test_unbind_speed | 0.4437ms | 0.2538ms | 3.9405 KOps/s | 3.8822 KOps/s | |
test_unbind_speed_stack0 | 0.4097ms | 0.2568ms | 3.8940 KOps/s | 4.0572 KOps/s | |
test_unbind_speed_stack1 | 0.8171ms | 0.6517ms | 1.5344 KOps/s | 1.4004 KOps/s | |
test_split | 69.6045ms | 1.6156ms | 618.9544 Ops/s | 625.4742 Ops/s | |
test_chunk | 70.8881ms | 1.6148ms | 619.2741 Ops/s | 619.4332 Ops/s | |
test_creation[device0] | 0.2052ms | 86.5312μs | 11.5565 KOps/s | 11.7076 KOps/s | |
test_creation_from_tensor | 3.5792ms | 87.8421μs | 11.3841 KOps/s | 11.3096 KOps/s | |
test_add_one[memmap_tensor0] | 0.1261ms | 5.6842μs | 175.9262 KOps/s | 182.0096 KOps/s | |
test_contiguous[memmap_tensor0] | 7.5440μs | 0.6337μs | 1.5779 MOps/s | 1.5731 MOps/s | |
test_stack[memmap_tensor0] | 26.5590μs | 3.7435μs | 267.1298 KOps/s | 282.4426 KOps/s | |
test_memmaptd_index | 0.9995ms | 0.2522ms | 3.9651 KOps/s | 4.0007 KOps/s | |
test_memmaptd_index_astensor | 0.7350ms | 0.3279ms | 3.0501 KOps/s | 3.0871 KOps/s | |
test_memmaptd_index_op | 1.0183ms | 0.6244ms | 1.6014 KOps/s | 1.6947 KOps/s | |
test_serialize_model | 0.1821s | 0.1153s | 8.6741 Ops/s | 8.5203 Ops/s | |
test_serialize_model_pickle | 0.4475s | 0.3799s | 2.6323 Ops/s | 2.4591 Ops/s | |
test_serialize_weights | 0.1759s | 0.1122s | 8.9154 Ops/s | 8.5751 Ops/s | |
test_serialize_weights_returnearly | 0.1977s | 0.1352s | 7.3990 Ops/s | 7.1674 Ops/s | |
test_serialize_weights_pickle | 0.7076s | 0.4853s | 2.0605 Ops/s | 2.4651 Ops/s | |
test_serialize_weights_filesystem | 94.6158ms | 91.8466ms | 10.8877 Ops/s | 10.7686 Ops/s | |
test_serialize_model_filesystem | 0.1631s | 0.1023s | 9.7789 Ops/s | 9.6710 Ops/s | |
test_reshape_pytree | 53.7000μs | 25.3638μs | 39.4263 KOps/s | 39.0742 KOps/s | |
test_reshape_td | 78.1660μs | 33.8176μs | 29.5704 KOps/s | 28.5995 KOps/s | |
test_view_pytree | 68.6990μs | 25.4950μs | 39.2234 KOps/s | 38.8954 KOps/s | |
test_view_td | 0.1212ms | 38.4239μs | 26.0255 KOps/s | 25.7361 KOps/s | |
test_unbind_pytree | 73.6980μs | 29.5471μs | 33.8442 KOps/s | 33.9126 KOps/s | |
test_unbind_td | 0.3653ms | 37.7611μs | 26.4823 KOps/s | 26.2891 KOps/s | |
test_split_pytree | 73.0470μs | 29.5196μs | 33.8758 KOps/s | 34.1940 KOps/s | |
test_split_td | 0.1237ms | 40.4752μs | 24.7065 KOps/s | 24.8254 KOps/s | |
test_add_pytree | 88.3360μs | 35.8756μs | 27.8741 KOps/s | 28.6214 KOps/s | |
test_add_td | 0.1196ms | 55.8544μs | 17.9037 KOps/s | 18.4731 KOps/s | |
test_distributed | 0.2018ms | 0.1019ms | 9.8172 KOps/s | 9.6132 KOps/s | |
test_tdmodule | 46.1970μs | 18.4091μs | 54.3210 KOps/s | 58.7876 KOps/s | |
test_tdmodule_dispatch | 69.1190μs | 35.2257μs | 28.3883 KOps/s | 29.9371 KOps/s | |
test_tdseq | 54.8830μs | 20.6105μs | 48.5191 KOps/s | 51.4226 KOps/s | |
test_tdseq_dispatch | 75.7820μs | 40.5059μs | 24.6877 KOps/s | 26.0618 KOps/s | |
test_instantiation_functorch | 1.5852ms | 1.2897ms | 775.3525 Ops/s | 753.5812 Ops/s | |
test_instantiation_td | 1.7091ms | 1.0221ms | 978.4017 Ops/s | 971.2304 Ops/s | |
test_exec_functorch | 0.3041ms | 0.1617ms | 6.1833 KOps/s | 6.1664 KOps/s | |
test_exec_functional_call | 0.2813ms | 0.1494ms | 6.6944 KOps/s | 6.6970 KOps/s | |
test_exec_td | 0.2485ms | 0.1452ms | 6.8879 KOps/s | 6.8474 KOps/s | |
test_exec_td_decorator | 0.9500ms | 0.2241ms | 4.4626 KOps/s | 4.3612 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7117ms | 0.5000ms | 2.0000 KOps/s | 2.0449 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8166ms | 0.4973ms | 2.0109 KOps/s | 2.0845 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6065ms | 0.4050ms | 2.4691 KOps/s | 2.5202 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8009ms | 0.4090ms | 2.4448 KOps/s | 2.5198 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3417ms | 0.5733ms | 1.7443 KOps/s | 1.7863 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7897ms | 0.5682ms | 1.7599 KOps/s | 1.7904 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7250ms | 0.4682ms | 2.1356 KOps/s | 2.1770 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6578ms | 0.4653ms | 2.1492 KOps/s | 2.1687 KOps/s | |
test_to_module_speed[True] | 2.3785ms | 1.6990ms | 588.5954 Ops/s | 589.1059 Ops/s | |
test_to_module_speed[False] | 72.9227ms | 1.8010ms | 555.2496 Ops/s | 550.5737 Ops/s | |
test_tc_init | 64.4610μs | 30.1319μs | 33.1874 KOps/s | 38.1334 KOps/s | |
test_tc_init_nested | 0.1227ms | 61.1356μs | 16.3571 KOps/s | 18.8585 KOps/s | |
test_tc_first_layer_tensor | 3.7749μs | 0.6812μs | 1.4680 MOps/s | 1.4143 MOps/s | |
test_tc_first_layer_nontensor | 2.1153μs | 0.6657μs | 1.5022 MOps/s | 1.4439 MOps/s | |
test_tc_second_layer_tensor | 18.7980μs | 1.8719μs | 534.2093 KOps/s | 526.8644 KOps/s | |
test_tc_second_layer_nontensor | 42.8410μs | 1.6297μs | 613.6222 KOps/s | 653.6085 KOps/s | |
test_unbind | 81.9658ms | 7.4390ms | 134.4266 Ops/s | 138.2737 Ops/s | |
test_full_like | 15.7704ms | 10.6495ms | 93.9008 Ops/s | 94.6739 Ops/s | |
test_zeros_like | 12.1753ms | 6.1137ms | 163.5678 Ops/s | 176.0668 Ops/s | |
test_ones_like | 12.2791ms | 6.3496ms | 157.4911 Ops/s | 162.4875 Ops/s | |
test_clone | 16.5575ms | 8.0128ms | 124.7999 Ops/s | 128.6842 Ops/s | |
test_squeeze | 60.7440μs | 13.6691μs | 73.1580 KOps/s | 69.9096 KOps/s | |
test_unsqueeze | 0.1257ms | 59.6242μs | 16.7717 KOps/s | 16.7404 KOps/s | |
test_split | 0.1960ms | 0.1130ms | 8.8484 KOps/s | 8.7849 KOps/s | |
test_permute | 0.2000ms | 0.1286ms | 7.7780 KOps/s | 7.7739 KOps/s | |
test_stack | 27.6823ms | 22.6909ms | 44.0705 Ops/s | 44.9535 Ops/s | |
test_cat | 28.7136ms | 24.3236ms | 41.1124 Ops/s | 44.9512 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 24.8710μs | 12.5577μs | 79.6323 KOps/s | 80.7910 KOps/s | |
test_plain_set_stack_nested | 28.5110μs | 12.8887μs | 77.5873 KOps/s | 78.8146 KOps/s | |
test_plain_set_nested_inplace | 37.6120μs | 14.0120μs | 71.3673 KOps/s | 72.8356 KOps/s | |
test_plain_set_stack_nested_inplace | 74.2150μs | 14.0182μs | 71.3356 KOps/s | 72.5156 KOps/s | |
test_items | 28.7620μs | 4.7223μs | 211.7611 KOps/s | 209.8478 KOps/s | |
test_items_nested | 0.3633ms | 0.3414ms | 2.9288 KOps/s | 2.9541 KOps/s | |
test_items_nested_locked | 0.3674ms | 0.3404ms | 2.9373 KOps/s | 2.9391 KOps/s | |
test_items_nested_leaf | 0.1012ms | 82.2934μs | 12.1516 KOps/s | 12.0547 KOps/s | |
test_items_stack_nested | 0.3585ms | 0.3371ms | 2.9666 KOps/s | 2.9086 KOps/s | |
test_items_stack_nested_leaf | 0.1110ms | 82.6973μs | 12.0923 KOps/s | 12.0161 KOps/s | |
test_items_stack_nested_locked | 0.3812ms | 0.3410ms | 2.9322 KOps/s | 2.9262 KOps/s | |
test_keys | 22.8410μs | 4.3647μs | 229.1095 KOps/s | 229.2003 KOps/s | |
test_keys_nested | 88.8450μs | 66.6552μs | 15.0026 KOps/s | 14.7529 KOps/s | |
test_keys_nested_locked | 2.4762ms | 71.4138μs | 14.0029 KOps/s | 13.7910 KOps/s | |
test_keys_nested_leaf | 80.8250μs | 57.1412μs | 17.5005 KOps/s | 17.1968 KOps/s | |
test_keys_stack_nested | 94.4660μs | 65.8606μs | 15.1836 KOps/s | 14.8218 KOps/s | |
test_keys_stack_nested_leaf | 84.0750μs | 57.0160μs | 17.5389 KOps/s | 17.2275 KOps/s | |
test_keys_stack_nested_locked | 0.1015ms | 70.4269μs | 14.1991 KOps/s | 13.9533 KOps/s | |
test_values | 10.4740μs | 1.8258μs | 547.7190 KOps/s | 537.5097 KOps/s | |
test_values_nested | 58.3540μs | 35.2060μs | 28.4042 KOps/s | 28.5012 KOps/s | |
test_values_nested_locked | 56.3040μs | 36.9057μs | 27.0961 KOps/s | 26.9379 KOps/s | |
test_values_nested_leaf | 51.3930μs | 31.3639μs | 31.8838 KOps/s | 32.0048 KOps/s | |
test_values_stack_nested | 63.9240μs | 35.9578μs | 27.8103 KOps/s | 27.7501 KOps/s | |
test_values_stack_nested_leaf | 55.1540μs | 31.8362μs | 31.4108 KOps/s | 31.2306 KOps/s | |
test_values_stack_nested_locked | 62.6640μs | 37.4814μs | 26.6799 KOps/s | 26.4866 KOps/s | |
test_membership | 4.3931μs | 0.7263μs | 1.3768 MOps/s | 1.4216 MOps/s | |
test_membership_nested | 19.9210μs | 2.5892μs | 386.2189 KOps/s | 383.5603 KOps/s | |
test_membership_nested_leaf | 25.1020μs | 2.6154μs | 382.3499 KOps/s | 380.2530 KOps/s | |
test_membership_stacked_nested | 21.4910μs | 2.6163μs | 382.2127 KOps/s | 386.7757 KOps/s | |
test_membership_stacked_nested_leaf | 20.5510μs | 2.5689μs | 389.2777 KOps/s | 387.0145 KOps/s | |
test_membership_nested_last | 32.6620μs | 3.0998μs | 322.6055 KOps/s | 319.5239 KOps/s | |
test_membership_nested_leaf_last | 21.6610μs | 3.0826μs | 324.4057 KOps/s | 318.1077 KOps/s | |
test_membership_stacked_nested_last | 40.3120μs | 9.8155μs | 101.8801 KOps/s | 278.4616 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.3610μs | 9.7845μs | 102.2021 KOps/s | 278.7410 KOps/s | |
test_nested_getleaf | 34.3320μs | 8.4383μs | 118.5073 KOps/s | 119.4465 KOps/s | |
test_nested_get | 34.5630μs | 7.9280μs | 126.1356 KOps/s | 127.3868 KOps/s | |
test_stacked_getleaf | 27.7320μs | 8.4314μs | 118.6037 KOps/s | 118.7666 KOps/s | |
test_stacked_get | 37.8520μs | 7.9316μs | 126.0776 KOps/s | 126.3014 KOps/s | |
test_nested_getitemleaf | 32.6520μs | 8.5825μs | 116.5164 KOps/s | 116.7016 KOps/s | |
test_nested_getitem | 23.8110μs | 8.0598μs | 124.0720 KOps/s | 124.9090 KOps/s | |
test_stacked_getitemleaf | 35.3820μs | 8.6591μs | 115.4855 KOps/s | 116.0807 KOps/s | |
test_stacked_getitem | 29.9120μs | 8.1707μs | 122.3884 KOps/s | 124.3248 KOps/s | |
test_lock_nested | 59.9662ms | 0.4000ms | 2.4998 KOps/s | 2.5272 KOps/s | |
test_lock_stack_nested | 0.3108ms | 0.2896ms | 3.4532 KOps/s | 3.3663 KOps/s | |
test_unlock_nested | 61.5180ms | 0.4036ms | 2.4774 KOps/s | 2.4761 KOps/s | |
test_unlock_stack_nested | 0.3368ms | 0.3003ms | 3.3305 KOps/s | 3.2672 KOps/s | |
test_flatten_speed | 0.3934ms | 0.1014ms | 9.8666 KOps/s | 9.8981 KOps/s | |
test_unflatten_speed | 0.3353ms | 0.2924ms | 3.4201 KOps/s | 3.4369 KOps/s | |
test_common_ops | 1.0374ms | 0.5715ms | 1.7497 KOps/s | 1.7599 KOps/s | |
test_creation | 37.1420μs | 1.6643μs | 600.8709 KOps/s | 600.5082 KOps/s | |
test_creation_empty | 23.1420μs | 8.2462μs | 121.2686 KOps/s | 124.7682 KOps/s | |
test_creation_nested_1 | 34.3820μs | 10.0159μs | 99.8414 KOps/s | 102.3222 KOps/s | |
test_creation_nested_2 | 44.5030μs | 12.2744μs | 81.4703 KOps/s | 83.2697 KOps/s | |
test_clone | 0.1059ms | 11.6566μs | 85.7886 KOps/s | 85.7024 KOps/s | |
test_getitem[int] | 26.1110μs | 10.7838μs | 92.7320 KOps/s | 93.0268 KOps/s | |
test_getitem[slice_int] | 57.6030μs | 20.3667μs | 49.0997 KOps/s | 48.2879 KOps/s | |
test_getitem[range] | 65.2440μs | 47.7970μs | 20.9218 KOps/s | 21.6972 KOps/s | |
test_getitem[tuple] | 55.7340μs | 18.5628μs | 53.8711 KOps/s | 53.9451 KOps/s | |
test_getitem[list] | 0.1189ms | 34.0958μs | 29.3291 KOps/s | 29.8544 KOps/s | |
test_setitem_dim[int] | 50.3440μs | 28.0381μs | 35.6658 KOps/s | 35.9692 KOps/s | |
test_setitem_dim[slice_int] | 70.5350μs | 49.4391μs | 20.2269 KOps/s | 20.8307 KOps/s | |
test_setitem_dim[range] | 0.1081ms | 66.1750μs | 15.1114 KOps/s | 15.2565 KOps/s | |
test_setitem_dim[tuple] | 62.3740μs | 41.5762μs | 24.0522 KOps/s | 23.7802 KOps/s | |
test_setitem | 50.1730μs | 16.1784μs | 61.8108 KOps/s | 63.5600 KOps/s | |
test_set | 0.1377ms | 15.6956μs | 63.7123 KOps/s | 65.7019 KOps/s | |
test_set_shared | 1.6249ms | 0.1004ms | 9.9605 KOps/s | 10.0918 KOps/s | |
test_update | 91.3960μs | 18.1164μs | 55.1987 KOps/s | 58.3043 KOps/s | |
test_update_nested | 58.0240μs | 23.6856μs | 42.2198 KOps/s | 45.0623 KOps/s | |
test_update__nested | 69.4150μs | 22.4512μs | 44.5410 KOps/s | 44.8206 KOps/s | |
test_set_nested | 52.1130μs | 16.6743μs | 59.9724 KOps/s | 61.2423 KOps/s | |
test_set_nested_new | 60.7340μs | 19.3520μs | 51.6743 KOps/s | 52.5873 KOps/s | |
test_select | 76.9750μs | 32.4296μs | 30.8360 KOps/s | 30.5816 KOps/s | |
test_select_nested | 0.8763ms | 55.8618μs | 17.9013 KOps/s | 18.4689 KOps/s | |
test_exclude_nested | 0.1331ms | 0.1086ms | 9.2046 KOps/s | 9.2439 KOps/s | |
test_empty[True] | 0.3831ms | 0.3484ms | 2.8699 KOps/s | 2.9225 KOps/s | |
test_empty[False] | 2.3232μs | 0.9336μs | 1.0712 MOps/s | 1.0706 MOps/s | |
test_to | 0.1033ms | 77.4261μs | 12.9155 KOps/s | 13.3318 KOps/s | |
test_to_nonblocking | 0.2114ms | 62.2499μs | 16.0643 KOps/s | 16.5890 KOps/s | |
test_unbind_speed | 0.8515ms | 0.2562ms | 3.9025 KOps/s | 3.8484 KOps/s | |
test_unbind_speed_stack0 | 0.2956ms | 0.2561ms | 3.9046 KOps/s | 3.8443 KOps/s | |
test_unbind_speed_stack1 | 76.6163ms | 0.7929ms | 1.2613 KOps/s | 1.2532 KOps/s | |
test_split | 76.6260ms | 1.6779ms | 596.0008 Ops/s | 595.0581 Ops/s | |
test_chunk | 76.8836ms | 1.6705ms | 598.6310 Ops/s | 599.8837 Ops/s | |
test_creation[device0] | 0.1129ms | 58.4498μs | 17.1087 KOps/s | 17.1486 KOps/s | |
test_creation_from_tensor | 0.1621ms | 54.2785μs | 18.4235 KOps/s | 18.5918 KOps/s | |
test_add_one[memmap_tensor0] | 87.2060μs | 6.7916μs | 147.2418 KOps/s | 149.0618 KOps/s | |
test_contiguous[memmap_tensor0] | 25.3410μs | 0.7085μs | 1.4115 MOps/s | 1.4504 MOps/s | |
test_stack[memmap_tensor0] | 33.3920μs | 4.7359μs | 211.1519 KOps/s | 217.9426 KOps/s | |
test_memmaptd_index | 1.0840ms | 0.2852ms | 3.5060 KOps/s | 3.4759 KOps/s | |
test_memmaptd_index_astensor | 0.7045ms | 0.3557ms | 2.8115 KOps/s | 2.8025 KOps/s | |
test_memmaptd_index_op | 1.0395ms | 0.6408ms | 1.5605 KOps/s | 1.5872 KOps/s | |
test_serialize_model | 0.1829s | 0.1099s | 9.0955 Ops/s | 9.5211 Ops/s | |
test_serialize_model_pickle | 1.3704s | 1.2387s | 0.8073 Ops/s | 0.8071 Ops/s | |
test_serialize_weights | 0.1815s | 0.1085s | 9.2152 Ops/s | 8.7908 Ops/s | |
test_serialize_weights_returnearly | 0.2577s | 0.1001s | 9.9853 Ops/s | 12.4686 Ops/s | |
test_serialize_weights_pickle | 1.3507s | 1.2488s | 0.8008 Ops/s | 0.8009 Ops/s | |
test_reshape_pytree | 0.1719ms | 26.3667μs | 37.9266 KOps/s | 38.4222 KOps/s | |
test_reshape_td | 0.1610ms | 31.4599μs | 31.7865 KOps/s | 32.8143 KOps/s | |
test_view_pytree | 0.1574ms | 26.3161μs | 37.9996 KOps/s | 38.7853 KOps/s | |
test_view_td | 0.1572ms | 36.3987μs | 27.4735 KOps/s | 27.1546 KOps/s | |
test_unbind_pytree | 58.6840μs | 31.4727μs | 31.7736 KOps/s | 29.8248 KOps/s | |
test_unbind_td | 0.4608ms | 39.8834μs | 25.0731 KOps/s | 25.1870 KOps/s | |
test_split_pytree | 54.0730μs | 33.9993μs | 29.4124 KOps/s | 27.8065 KOps/s | |
test_split_td | 0.1046ms | 39.9286μs | 25.0447 KOps/s | 25.5118 KOps/s | |
test_add_pytree | 72.5750μs | 37.1987μs | 26.8827 KOps/s | 25.6463 KOps/s | |
test_add_td | 82.3750μs | 50.5184μs | 19.7948 KOps/s | 20.0095 KOps/s | |
test_distributed | 0.1830ms | 70.5772μs | 14.1689 KOps/s | 13.8423 KOps/s | |
test_tdmodule | 0.1360ms | 14.5357μs | 68.7963 KOps/s | 66.9373 KOps/s | |
test_tdmodule_dispatch | 43.7620μs | 28.6674μs | 34.8828 KOps/s | 35.2070 KOps/s | |
test_tdseq | 32.5420μs | 16.4918μs | 60.6363 KOps/s | 60.2860 KOps/s | |
test_tdseq_dispatch | 47.8830μs | 31.8433μs | 31.4038 KOps/s | 31.0279 KOps/s | |
test_instantiation_functorch | 1.6269ms | 1.5167ms | 659.3109 Ops/s | 652.7405 Ops/s | |
test_instantiation_td | 1.8628ms | 1.0350ms | 966.2251 Ops/s | 937.4874 Ops/s | |
test_exec_functorch | 0.1858ms | 0.1459ms | 6.8525 KOps/s | 6.6546 KOps/s | |
test_exec_functional_call | 0.1795ms | 0.1310ms | 7.6342 KOps/s | 7.4389 KOps/s | |
test_exec_td | 0.1647ms | 0.1301ms | 7.6879 KOps/s | 7.1609 KOps/s | |
test_exec_td_decorator | 0.5109ms | 0.2008ms | 4.9788 KOps/s | 4.7954 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7561ms | 0.5658ms | 1.7673 KOps/s | 1.7208 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6578ms | 0.5624ms | 1.7779 KOps/s | 1.7527 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5621ms | 0.5035ms | 1.9860 KOps/s | 1.9552 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5683ms | 0.5127ms | 1.9506 KOps/s | 1.9567 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1656ms | 0.6385ms | 1.5662 KOps/s | 1.5527 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7920ms | 0.6364ms | 1.5713 KOps/s | 1.5686 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7044ms | 0.5647ms | 1.7708 KOps/s | 1.5819 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7598ms | 0.5837ms | 1.7131 KOps/s | 1.7466 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.0190ms | 7.5788ms | 131.9463 Ops/s | 136.6080 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.0025ms | 7.5023ms | 133.2925 Ops/s | 137.0938 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.2188ms | 7.4683ms | 133.8988 Ops/s | 136.6549 Ops/s | |
test_vmap_transformer_speed[False-False] | 7.7921ms | 7.4334ms | 134.5273 Ops/s | 138.0213 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.0319ms | 18.2517ms | 54.7894 Ops/s | 56.1436 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.7769ms | 18.2600ms | 54.7645 Ops/s | 55.0494 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.4784ms | 18.1192ms | 55.1900 Ops/s | 56.7702 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.8025ms | 18.1893ms | 54.9774 Ops/s | 56.7676 Ops/s | |
test_to_module_speed[True] | 1.8819ms | 1.5726ms | 635.8944 Ops/s | 650.9032 Ops/s | |
test_to_module_speed[False] | 1.8102ms | 1.5475ms | 646.2233 Ops/s | 655.8360 Ops/s | |
test_tc_init | 0.1576ms | 24.4753μs | 40.8575 KOps/s | 41.1552 KOps/s | |
test_tc_init_nested | 0.1903ms | 53.4223μs | 18.7188 KOps/s | 20.9665 KOps/s | |
test_tc_first_layer_tensor | 3.4955μs | 0.3649μs | 2.7402 MOps/s | 2.7547 MOps/s | |
test_tc_first_layer_nontensor | 10.4935μs | 0.3992μs | 2.5049 MOps/s | 2.5375 MOps/s | |
test_tc_second_layer_tensor | 26.0616μs | 0.9839μs | 1.0163 MOps/s | 931.8189 KOps/s | |
test_tc_second_layer_nontensor | 21.6780μs | 0.8385μs | 1.1925 MOps/s | 1.2187 MOps/s | |
test_unbind | 0.1061s | 6.3835ms | 156.6537 Ops/s | 126.5784 Ops/s | |
test_full_like | 12.3694ms | 11.8126ms | 84.6552 Ops/s | 75.1922 Ops/s | |
test_zeros_like | 8.6668ms | 7.9469ms | 125.8351 Ops/s | 126.6928 Ops/s | |
test_ones_like | 8.3081ms | 7.8801ms | 126.9020 Ops/s | 125.3878 Ops/s | |
test_clone | 9.8927ms | 9.4707ms | 105.5885 Ops/s | 104.4127 Ops/s | |
test_squeeze | 55.5530μs | 10.8906μs | 91.8221 KOps/s | 90.3298 KOps/s | |
test_unsqueeze | 95.0160μs | 51.0622μs | 19.5840 KOps/s | 18.8186 KOps/s | |
test_split | 0.1562ms | 98.1719μs | 10.1862 KOps/s | 9.9531 KOps/s | |
test_permute | 0.1421ms | 0.1095ms | 9.1350 KOps/s | 8.7085 KOps/s | |
test_stack | 28.3179ms | 27.5490ms | 36.2990 Ops/s | 36.2740 Ops/s | |
test_cat | 27.6968ms | 27.2959ms | 36.6356 Ops/s | 36.5803 Ops/s |
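
The Change column is empty in both tables above. The following is a minimal sketch of how it could be filled in by hand, assuming Change is meant to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. That interpretation is an assumption, since the benchmark bot's own formula is not shown here. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of any benchmark tooling.

```python
import re

# Multipliers for the unit suffixes that appear in the two "Ops" columns.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '232.7133 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, Ops / Ops on Repo HEAD - 1) for one table row.

    Assumes the column order used in the tables above:
    Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops_pr, ops_head = cells[0], cells[3], cells[4]
    return name, parse_ops(ops_pr) / parse_ops(ops_head) - 1.0


if __name__ == "__main__":
    # Rows copied verbatim from the two tables (first table, then second).
    rows = [
        "test_membership_stacked_nested_last | 24.2150μs | 4.2971μs | 232.7133 KOps/s | 73.8089 KOps/s | |",
        "test_membership_stacked_nested_last | 40.3120μs | 9.8155μs | 101.8801 KOps/s | 278.4616 KOps/s | |",
        "test_tc_second_layer_tensor | 26.0616μs | 0.9839μs | 1.0163 MOps/s | 931.8189 KOps/s | |",
    ]
    for row in rows:
        name, change = relative_change(row)
        print(f"{name}: {change:+.1%}")
```

Converting every cell to operations per second before dividing matters for rows such as `test_tc_second_layer_tensor`, where the two columns use different units (MOps/s and KOps/s). A positive result means the PR run was faster than HEAD under this reading. The two `test_membership_stacked_nested_last` rows move in opposite directions across the two tables, which is the kind of discrepancy a populated Change column would make visible.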
Labels: bug, CLA Signed
No description provided.