[Refactor] use from_file instead of mmap+from_buffer for readonly files #808
Merged
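The title contrasts two ways of getting at the contents of a read-only file. As a minimal, standard-library-only sketch of the `mmap` half of that contrast, a file can be mapped with `mmap.ACCESS_READ`, which yields a buffer that reports itself as read-only. This does not call PyTorch and is not the code from this PR; the helper names `write_floats` and `read_floats_readonly` and the file name `data.bin` are hypothetical.

```python
import mmap
import os
import struct
import tempfile


def write_floats(path, values):
    """Write `values` to `path` as little-endian float32 (illustrative helper)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))


def read_floats_readonly(path):
    """Map `path` read-only and decode its float32 contents."""
    with open(path, "rb") as f:
        # ACCESS_READ maps the file without write permission.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The exported buffer is flagged read-only; release the view
            # before the mapping is closed.
            with memoryview(mm) as view:
                readonly = view.readonly
            count = len(mm) // 4  # 4 bytes per float32
            values = list(struct.unpack_from(f"<{count}f", mm, 0))
    return readonly, values


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    write_floats(path, [1.0, 2.5, -3.0])
    readonly, values = read_floats_readonly(path)
    print(readonly, values)
```

Any consumer that expects a writable buffer has to cope with that read-only flag, which is presumably part of what makes a file-based constructor attractive for read-only files.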
facebook-github-bot added the CLA Signed label on Jun 10, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.2727ms | 18.5146μs | 54.0116 KOps/s | 57.9460 KOps/s | |
test_plain_set_stack_nested | 0.2785ms | 17.5949μs | 56.8346 KOps/s | 57.1607 KOps/s | |
test_plain_set_nested_inplace | 52.9380μs | 18.2429μs | 54.8157 KOps/s | 50.4429 KOps/s | |
test_plain_set_stack_nested_inplace | 45.6950μs | 18.3260μs | 54.5672 KOps/s | 50.5923 KOps/s | |
test_items | 16.1500μs | 2.6635μs | 375.4445 KOps/s | 384.7576 KOps/s | |
test_items_nested | 0.4016ms | 0.2655ms | 3.7669 KOps/s | 3.7647 KOps/s | |
test_items_nested_locked | 1.1236ms | 0.2680ms | 3.7313 KOps/s | 3.7364 KOps/s | |
test_items_nested_leaf | 0.1480ms | 76.1620μs | 13.1299 KOps/s | 12.8894 KOps/s | |
test_items_stack_nested | 0.3405ms | 0.2699ms | 3.7051 KOps/s | 3.6961 KOps/s | |
test_items_stack_nested_leaf | 0.1917ms | 79.0886μs | 12.6441 KOps/s | 11.9649 KOps/s | |
test_items_stack_nested_locked | 1.0170ms | 0.2692ms | 3.7154 KOps/s | 3.7118 KOps/s | |
test_keys | 24.4350μs | 3.8029μs | 262.9571 KOps/s | 255.2959 KOps/s | |
test_keys_nested | 0.2529ms | 0.1362ms | 7.3441 KOps/s | 7.1271 KOps/s | |
test_keys_nested_locked | 0.7401ms | 0.1415ms | 7.0681 KOps/s | 6.9209 KOps/s | |
test_keys_nested_leaf | 0.2185ms | 0.1166ms | 8.5769 KOps/s | 8.3734 KOps/s | |
test_keys_stack_nested | 0.2869ms | 0.1369ms | 7.3050 KOps/s | 7.1770 KOps/s | |
test_keys_stack_nested_leaf | 0.2070ms | 0.1149ms | 8.7005 KOps/s | 8.4141 KOps/s | |
test_keys_stack_nested_locked | 0.2008ms | 0.1414ms | 7.0722 KOps/s | 6.9730 KOps/s | |
test_values | 7.7795μs | 1.1804μs | 847.1778 KOps/s | 861.0168 KOps/s | |
test_values_nested | 0.1860ms | 51.9600μs | 19.2456 KOps/s | 19.6810 KOps/s | |
test_values_nested_locked | 0.3054ms | 51.9235μs | 19.2591 KOps/s | 19.7050 KOps/s | |
test_values_nested_leaf | 0.1040ms | 46.9658μs | 21.2921 KOps/s | 21.5650 KOps/s | |
test_values_stack_nested | 97.9220μs | 52.7980μs | 18.9401 KOps/s | 19.3199 KOps/s | |
test_values_stack_nested_leaf | 77.7950μs | 46.5575μs | 21.4788 KOps/s | 21.7479 KOps/s | |
test_values_stack_nested_locked | 86.8210μs | 52.7520μs | 18.9566 KOps/s | 19.4860 KOps/s | |
test_membership | 16.0300μs | 1.3555μs | 737.7584 KOps/s | 735.2388 KOps/s | |
test_membership_nested | 32.1200μs | 3.5112μs | 284.8028 KOps/s | 288.7505 KOps/s | |
test_membership_nested_leaf | 22.4920μs | 3.5336μs | 282.9990 KOps/s | 292.3248 KOps/s | |
test_membership_stacked_nested | 25.9490μs | 3.4788μs | 287.4529 KOps/s | 258.5084 KOps/s | |
test_membership_stacked_nested_leaf | 0.1193ms | 3.5389μs | 282.5765 KOps/s | 294.6031 KOps/s | |
test_membership_nested_last | 0.2241ms | 4.2577μs | 234.8708 KOps/s | 241.9792 KOps/s | |
test_membership_nested_leaf_last | 30.8580μs | 4.2638μs | 234.5310 KOps/s | 238.9135 KOps/s | |
test_membership_stacked_nested_last | 28.1220μs | 4.8239μs | 207.3001 KOps/s | 208.1500 KOps/s | |
test_membership_stacked_nested_leaf_last | 21.8300μs | 4.8551μs | 205.9678 KOps/s | 208.3851 KOps/s | |
test_nested_getleaf | 52.3760μs | 10.8712μs | 91.9861 KOps/s | 92.9227 KOps/s | |
test_nested_get | 50.7850μs | 10.2488μs | 97.5729 KOps/s | 98.6171 KOps/s | |
test_stacked_getleaf | 0.2615ms | 11.2197μs | 89.1293 KOps/s | 94.2741 KOps/s | |
test_stacked_get | 57.3580μs | 10.2121μs | 97.9230 KOps/s | 99.5787 KOps/s | |
test_nested_getitemleaf | 45.1540μs | 11.2492μs | 88.8951 KOps/s | 87.3725 KOps/s | |
test_nested_getitem | 30.6970μs | 10.7002μs | 93.4562 KOps/s | 89.7092 KOps/s | |
test_stacked_getitemleaf | 49.9930μs | 11.3728μs | 87.9293 KOps/s | 88.0735 KOps/s | |
test_stacked_getitem | 31.6090μs | 10.5192μs | 95.0645 KOps/s | 96.0376 KOps/s | |
test_lock_nested | 60.1212ms | 0.4054ms | 2.4667 KOps/s | 2.8716 KOps/s | |
test_lock_stack_nested | 0.3618ms | 0.3070ms | 3.2577 KOps/s | 3.1984 KOps/s | |
test_unlock_nested | 0.7674ms | 0.3526ms | 2.8360 KOps/s | 2.4406 KOps/s | |
test_unlock_stack_nested | 0.6053ms | 0.3155ms | 3.1694 KOps/s | 3.1344 KOps/s | |
test_flatten_speed | 0.2329ms | 94.4436μs | 10.5883 KOps/s | 10.3727 KOps/s | |
test_unflatten_speed | 0.7386ms | 0.4138ms | 2.4165 KOps/s | 2.4702 KOps/s | |
test_common_ops | 4.1928ms | 0.6737ms | 1.4843 KOps/s | 1.3485 KOps/s | |
test_creation | 25.5680μs | 1.9351μs | 516.7730 KOps/s | 532.5191 KOps/s | |
test_creation_empty | 26.5000μs | 8.0864μs | 123.6651 KOps/s | 78.9303 KOps/s | |
test_creation_nested_1 | 31.1580μs | 10.9291μs | 91.4984 KOps/s | 67.4244 KOps/s | |
test_creation_nested_2 | 43.0700μs | 14.1709μs | 70.5671 KOps/s | 57.0033 KOps/s | |
test_clone | 0.1474ms | 13.4029μs | 74.6107 KOps/s | 71.4109 KOps/s | |
test_getitem[int] | 35.4860μs | 11.5314μs | 86.7198 KOps/s | 86.3591 KOps/s | |
test_getitem[slice_int] | 53.5600μs | 22.6025μs | 44.2428 KOps/s | 43.1458 KOps/s | |
test_getitem[range] | 81.9330μs | 59.4488μs | 16.8212 KOps/s | 13.7243 KOps/s | |
test_getitem[tuple] | 68.2270μs | 19.0092μs | 52.6060 KOps/s | 51.7861 KOps/s | |
test_getitem[list] | 0.1499ms | 39.7885μs | 25.1329 KOps/s | 24.0321 KOps/s | |
test_setitem_dim[int] | 71.6030μs | 32.2757μs | 30.9831 KOps/s | 27.1151 KOps/s | |
test_setitem_dim[slice_int] | 88.7860μs | 57.5912μs | 17.3638 KOps/s | 15.8124 KOps/s | |
test_setitem_dim[range] | 0.1375ms | 79.6849μs | 12.5494 KOps/s | 11.5485 KOps/s | |
test_setitem_dim[tuple] | 0.1096ms | 47.1099μs | 21.2270 KOps/s | 17.9798 KOps/s | |
test_setitem | 72.5450μs | 18.5574μs | 53.8868 KOps/s | 47.4037 KOps/s | |
test_set | 0.2609ms | 17.9965μs | 55.5664 KOps/s | 48.6647 KOps/s | |
test_set_shared | 3.4649ms | 0.1435ms | 6.9667 KOps/s | 6.8637 KOps/s | |
test_update | 0.2668ms | 18.6893μs | 53.5065 KOps/s | 42.8358 KOps/s | |
test_update_nested | 75.6810μs | 26.4448μs | 37.8146 KOps/s | 32.4332 KOps/s | |
test_update__nested | 76.6530μs | 25.2000μs | 39.6825 KOps/s | 39.7929 KOps/s | |
test_set_nested | 0.2510ms | 20.0234μs | 49.9415 KOps/s | 43.6842 KOps/s | |
test_set_nested_new | 83.6050μs | 23.9420μs | 41.7676 KOps/s | 36.8316 KOps/s | |
test_select | 0.1065ms | 38.9602μs | 25.6672 KOps/s | 23.7784 KOps/s | |
test_select_nested | 0.1535ms | 59.5686μs | 16.7874 KOps/s | 16.5713 KOps/s | |
test_exclude_nested | 0.2666ms | 0.1232ms | 8.1170 KOps/s | 8.4582 KOps/s | |
test_empty[True] | 0.6725ms | 0.3969ms | 2.5197 KOps/s | 2.5412 KOps/s | |
test_empty[False] | 10.3568μs | 1.1657μs | 857.8258 KOps/s | 847.4869 KOps/s | |
test_unbind_speed | 0.4370ms | 0.2563ms | 3.9015 KOps/s | 3.8614 KOps/s | |
test_unbind_speed_stack0 | 0.4567ms | 0.2497ms | 4.0048 KOps/s | 3.9179 KOps/s | |
test_unbind_speed_stack1 | 91.4027ms | 0.7486ms | 1.3358 KOps/s | 1.2658 KOps/s | |
test_split | 74.7153ms | 1.5943ms | 627.2510 Ops/s | 607.2579 Ops/s | |
test_chunk | 75.4501ms | 1.6065ms | 622.4547 Ops/s | 610.0858 Ops/s | |
test_creation[device0] | 3.6787ms | 90.7313μs | 11.0215 KOps/s | 11.5324 KOps/s | |
test_creation_from_tensor | 0.2741ms | 88.0888μs | 11.3522 KOps/s | 11.5337 KOps/s | |
test_add_one[memmap_tensor0] | 0.1026ms | 5.1638μs | 193.6554 KOps/s | 180.2155 KOps/s | |
test_contiguous[memmap_tensor0] | 14.5470μs | 0.6470μs | 1.5456 MOps/s | 1.5280 MOps/s | |
test_stack[memmap_tensor0] | 27.0510μs | 3.5061μs | 285.2151 KOps/s | 276.8686 KOps/s | |
test_memmaptd_index | 0.9813ms | 0.2575ms | 3.8835 KOps/s | 3.5727 KOps/s | |
test_memmaptd_index_astensor | 0.7616ms | 0.3325ms | 3.0074 KOps/s | 2.9703 KOps/s | |
test_memmaptd_index_op | 0.9619ms | 0.5702ms | 1.7536 KOps/s | 1.5604 KOps/s | |
test_serialize_model | 0.1830s | 0.1172s | 8.5322 Ops/s | 8.5727 Ops/s | |
test_serialize_model_pickle | 0.4496s | 0.3783s | 2.6433 Ops/s | 2.6097 Ops/s | |
test_serialize_weights | 0.1818s | 0.1154s | 8.6652 Ops/s | 8.6607 Ops/s | |
test_serialize_weights_returnearly | 0.2049s | 0.1401s | 7.1376 Ops/s | 7.7251 Ops/s | |
test_serialize_weights_pickle | 0.6565s | 0.4709s | 2.1238 Ops/s | 2.3445 Ops/s | |
test_serialize_weights_filesystem | 0.1043s | 96.0515ms | 10.4111 Ops/s | 10.2154 Ops/s | |
test_serialize_model_filesystem | 97.1398ms | 95.7112ms | 10.4481 Ops/s | 10.4403 Ops/s | |
test_reshape_pytree | 85.1080μs | 25.6878μs | 38.9290 KOps/s | 38.2473 KOps/s | |
test_reshape_td | 0.1228ms | 34.1973μs | 29.2421 KOps/s | 28.6946 KOps/s | |
test_view_pytree | 72.2350μs | 25.7987μs | 38.7616 KOps/s | 39.5824 KOps/s | |
test_view_td | 0.2911ms | 38.3156μs | 26.0990 KOps/s | 25.5811 KOps/s | |
test_unbind_pytree | 69.1790μs | 29.2518μs | 34.1859 KOps/s | 34.0998 KOps/s | |
test_unbind_td | 0.3954ms | 37.1197μs | 26.9399 KOps/s | 25.9088 KOps/s | |
test_split_pytree | 66.2440μs | 29.6138μs | 33.7681 KOps/s | 33.7791 KOps/s | |
test_split_td | 0.1438ms | 40.1568μs | 24.9024 KOps/s | 24.2583 KOps/s | |
test_add_pytree | 97.8020μs | 34.7723μs | 28.7585 KOps/s | 28.2876 KOps/s | |
test_add_td | 0.2394ms | 50.9015μs | 19.6458 KOps/s | 17.1104 KOps/s | |
test_distributed | 0.2613ms | 0.1022ms | 9.7876 KOps/s | 9.7104 KOps/s | |
test_tdmodule | 0.1167ms | 16.7109μs | 59.8412 KOps/s | 53.9485 KOps/s | |
test_tdmodule_dispatch | 53.0890μs | 31.9296μs | 31.3189 KOps/s | 26.9609 KOps/s | |
test_tdseq | 40.9260μs | 19.1277μs | 52.2803 KOps/s | 46.1922 KOps/s | |
test_tdseq_dispatch | 56.5560μs | 36.5074μs | 27.3917 KOps/s | 23.2326 KOps/s | |
test_instantiation_functorch | 1.6304ms | 1.3385ms | 747.0866 Ops/s | 760.2517 Ops/s | |
test_instantiation_td | 1.7355ms | 1.0292ms | 971.6668 Ops/s | 979.2481 Ops/s | |
test_exec_functorch | 0.3541ms | 0.1602ms | 6.2405 KOps/s | 6.2446 KOps/s | |
test_exec_functional_call | 0.2843ms | 0.1511ms | 6.6164 KOps/s | 6.7112 KOps/s | |
test_exec_td | 0.3508ms | 0.1443ms | 6.9281 KOps/s | 6.9466 KOps/s | |
test_exec_td_decorator | 1.0686ms | 0.2242ms | 4.4611 KOps/s | 4.4832 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8384ms | 0.4844ms | 2.0645 KOps/s | 2.0342 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7777ms | 0.4825ms | 2.0725 KOps/s | 1.9294 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.8596ms | 0.4092ms | 2.4437 KOps/s | 2.5014 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5857ms | 0.3947ms | 2.5333 KOps/s | 2.5091 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3080ms | 0.5540ms | 1.8052 KOps/s | 1.7648 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8860ms | 0.5522ms | 1.8108 KOps/s | 1.7740 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7375ms | 0.4617ms | 2.1658 KOps/s | 2.1552 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0619ms | 0.4643ms | 2.1539 KOps/s | 2.1535 KOps/s | |
test_to_module_speed[True] | 2.4842ms | 1.7271ms | 578.9984 Ops/s | 591.8048 Ops/s | |
test_to_module_speed[False] | 2.3633ms | 1.7065ms | 585.9995 Ops/s | 603.2827 Ops/s | |
test_tc_init | 54.8320μs | 23.4081μs | 42.7203 KOps/s | 32.9424 KOps/s | |
test_tc_init_nested | 0.1555ms | 44.9665μs | 22.2388 KOps/s | 17.0740 KOps/s | |
test_tc_first_layer_tensor | 4.9061μs | 0.7001μs | 1.4283 MOps/s | 1.4565 MOps/s | |
test_tc_first_layer_nontensor | 1.8605μs | 0.6673μs | 1.4986 MOps/s | 1.4757 MOps/s | |
test_tc_second_layer_tensor | 26.3390μs | 1.8322μs | 545.7951 KOps/s | 551.0102 KOps/s | |
test_tc_second_layer_nontensor | 9.5277μs | 1.5123μs | 661.2433 KOps/s | 667.6953 KOps/s | |
test_unbind | 96.6495ms | 8.6316ms | 115.8528 Ops/s | 118.7328 Ops/s | |
test_full_like | 17.2720ms | 12.3558ms | 80.9335 Ops/s | 85.6447 Ops/s | |
test_zeros_like | 15.4765ms | 6.3587ms | 157.2643 Ops/s | 155.8501 Ops/s | |
test_ones_like | 11.3124ms | 6.7885ms | 147.3073 Ops/s | 147.9961 Ops/s | |
test_clone | 16.1451ms | 8.9150ms | 112.1699 Ops/s | 119.6761 Ops/s | |
test_squeeze | 72.9960μs | 14.4515μs | 69.1971 KOps/s | 72.4818 KOps/s | |
test_unsqueeze | 0.1261ms | 60.8614μs | 16.4308 KOps/s | 16.1226 KOps/s | |
test_split | 0.2116ms | 0.1128ms | 8.8622 KOps/s | 8.9370 KOps/s | |
test_permute | 0.3046ms | 0.1286ms | 7.7787 KOps/s | 7.7788 KOps/s | |
test_stack | 30.5940ms | 24.4430ms | 40.9116 Ops/s | 41.9292 Ops/s | |
test_cat | 31.6828ms | 24.4839ms | 40.8432 Ops/s | 41.5337 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1004ms | 12.9904μs | 76.9797 KOps/s | 77.1153 KOps/s | |
test_plain_set_stack_nested | 31.4810μs | 13.1357μs | 76.1287 KOps/s | 76.3613 KOps/s | |
test_plain_set_nested_inplace | 0.2067ms | 14.2610μs | 70.1214 KOps/s | 70.1881 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1980ms | 14.4189μs | 69.3533 KOps/s | 69.3531 KOps/s | |
test_items | 0.1894ms | 4.6691μs | 214.1728 KOps/s | 211.6821 KOps/s | |
test_items_nested | 0.5318ms | 0.3421ms | 2.9235 KOps/s | 2.9362 KOps/s | |
test_items_nested_locked | 0.5413ms | 0.3489ms | 2.8658 KOps/s | 2.9243 KOps/s | |
test_items_nested_leaf | 0.1133ms | 84.1317μs | 11.8861 KOps/s | 11.9609 KOps/s | |
test_items_stack_nested | 0.3739ms | 0.3407ms | 2.9349 KOps/s | 2.8897 KOps/s | |
test_items_stack_nested_leaf | 0.1754ms | 86.2343μs | 11.5963 KOps/s | 11.8780 KOps/s | |
test_items_stack_nested_locked | 0.4000ms | 0.3454ms | 2.8950 KOps/s | 2.9028 KOps/s | |
test_keys | 21.6510μs | 4.3779μs | 228.4208 KOps/s | 227.6078 KOps/s | |
test_keys_nested | 85.0320μs | 67.8008μs | 14.7491 KOps/s | 14.8178 KOps/s | |
test_keys_nested_locked | 0.7827ms | 73.8347μs | 13.5438 KOps/s | 13.7844 KOps/s | |
test_keys_nested_leaf | 93.4620μs | 58.4386μs | 17.1120 KOps/s | 17.3024 KOps/s | |
test_keys_stack_nested | 0.1028ms | 68.7897μs | 14.5371 KOps/s | 14.8697 KOps/s | |
test_keys_stack_nested_leaf | 82.3510μs | 58.8872μs | 16.9816 KOps/s | 17.1654 KOps/s | |
test_keys_stack_nested_locked | 0.1062ms | 74.0430μs | 13.5057 KOps/s | 13.9233 KOps/s | |
test_values | 9.3737μs | 1.8346μs | 545.0735 KOps/s | 545.8520 KOps/s | |
test_values_nested | 0.1566ms | 35.3028μs | 28.3263 KOps/s | 28.4248 KOps/s | |
test_values_nested_locked | 56.6310μs | 37.2735μs | 26.8287 KOps/s | 26.5308 KOps/s | |
test_values_nested_leaf | 47.4310μs | 31.3232μs | 31.9252 KOps/s | 32.1625 KOps/s | |
test_values_stack_nested | 60.3410μs | 35.8326μs | 27.9075 KOps/s | 27.5492 KOps/s | |
test_values_stack_nested_leaf | 47.8810μs | 32.1079μs | 31.1450 KOps/s | 31.1228 KOps/s | |
test_values_stack_nested_locked | 52.8810μs | 37.7996μs | 26.4553 KOps/s | 25.7928 KOps/s | |
test_membership | 3.3057μs | 0.7381μs | 1.3548 MOps/s | 1.3562 MOps/s | |
test_membership_nested | 61.5210μs | 2.5739μs | 388.5148 KOps/s | 383.8238 KOps/s | |
test_membership_nested_leaf | 18.9410μs | 2.5973μs | 385.0203 KOps/s | 386.0992 KOps/s | |
test_membership_stacked_nested | 21.1310μs | 2.5914μs | 385.8970 KOps/s | 384.3342 KOps/s | |
test_membership_stacked_nested_leaf | 19.4400μs | 2.5859μs | 386.7177 KOps/s | 388.8727 KOps/s | |
test_membership_nested_last | 18.5210μs | 3.1158μs | 320.9443 KOps/s | 319.8583 KOps/s | |
test_membership_nested_leaf_last | 19.8200μs | 3.1096μs | 321.5804 KOps/s | 320.4185 KOps/s | |
test_membership_stacked_nested_last | 24.5100μs | 3.0984μs | 322.7506 KOps/s | 281.9671 KOps/s | |
test_membership_stacked_nested_leaf_last | 16.0500μs | 3.1138μs | 321.1544 KOps/s | 281.0003 KOps/s | |
test_nested_getleaf | 53.6510μs | 8.3491μs | 119.7741 KOps/s | 119.2427 KOps/s | |
test_nested_get | 23.2000μs | 7.8833μs | 126.8512 KOps/s | 126.6812 KOps/s | |
test_stacked_getleaf | 30.1600μs | 8.4057μs | 118.9671 KOps/s | 118.3994 KOps/s | |
test_stacked_get | 24.1910μs | 7.9066μs | 126.4767 KOps/s | 125.1162 KOps/s | |
test_nested_getitemleaf | 0.1440ms | 8.5540μs | 116.9050 KOps/s | 116.9990 KOps/s | |
test_nested_getitem | 22.7300μs | 8.0628μs | 124.0258 KOps/s | 124.0704 KOps/s | |
test_stacked_getitemleaf | 22.6200μs | 8.5978μs | 116.3091 KOps/s | 115.4514 KOps/s | |
test_stacked_getitem | 29.9600μs | 8.0682μs | 123.9440 KOps/s | 122.8375 KOps/s | |
test_lock_nested | 57.5754ms | 0.4063ms | 2.4612 KOps/s | 2.3387 KOps/s | |
test_lock_stack_nested | 0.3777ms | 0.3063ms | 3.2642 KOps/s | 3.1988 KOps/s | |
test_unlock_nested | 59.2340ms | 0.4101ms | 2.4384 KOps/s | 2.7459 KOps/s | |
test_unlock_stack_nested | 0.4163ms | 0.3134ms | 3.1907 KOps/s | 3.1077 KOps/s | |
test_flatten_speed | 0.2738ms | 0.1032ms | 9.6867 KOps/s | 9.7332 KOps/s | |
test_unflatten_speed | 0.4909ms | 0.2978ms | 3.3575 KOps/s | 3.4481 KOps/s | |
test_common_ops | 1.1765ms | 0.6031ms | 1.6580 KOps/s | 1.5943 KOps/s | |
test_creation | 16.9000μs | 1.6863μs | 593.0283 KOps/s | 592.7644 KOps/s | |
test_creation_empty | 0.2043ms | 9.2940μs | 107.5960 KOps/s | 108.6429 KOps/s | |
test_creation_nested_1 | 37.8510μs | 11.0453μs | 90.5364 KOps/s | 91.4645 KOps/s | |
test_creation_nested_2 | 31.1310μs | 13.3503μs | 74.9047 KOps/s | 76.4294 KOps/s | |
test_clone | 68.6310μs | 11.6809μs | 85.6096 KOps/s | 78.2995 KOps/s | |
test_getitem[int] | 33.3110μs | 10.9589μs | 91.2499 KOps/s | 87.7883 KOps/s | |
test_getitem[slice_int] | 43.1310μs | 21.0283μs | 47.5551 KOps/s | 45.5897 KOps/s | |
test_getitem[range] | 67.5420μs | 47.8410μs | 20.9026 KOps/s | 20.8156 KOps/s | |
test_getitem[tuple] | 48.5010μs | 18.7960μs | 53.2027 KOps/s | 51.5726 KOps/s | |
test_getitem[list] | 0.1600ms | 34.3182μs | 29.1391 KOps/s | 27.1697 KOps/s | |
test_setitem_dim[int] | 75.8110μs | 30.1852μs | 33.1288 KOps/s | 31.4886 KOps/s | |
test_setitem_dim[slice_int] | 0.1757ms | 50.7379μs | 19.7091 KOps/s | 19.0531 KOps/s | |
test_setitem_dim[range] | 88.3810μs | 68.9637μs | 14.5004 KOps/s | 14.1992 KOps/s | |
test_setitem_dim[tuple] | 68.8210μs | 45.2917μs | 22.0791 KOps/s | 21.6357 KOps/s | |
test_setitem | 0.1101ms | 16.7773μs | 59.6042 KOps/s | 55.6846 KOps/s | |
test_set | 69.8310μs | 16.3561μs | 61.1392 KOps/s | 57.2855 KOps/s | |
test_set_shared | 1.0396ms | 99.4263μs | 10.0577 KOps/s | 9.3514 KOps/s | |
test_update | 83.2420μs | 18.4937μs | 54.0724 KOps/s | 46.4774 KOps/s | |
test_update_nested | 68.3710μs | 23.8762μs | 41.8827 KOps/s | 39.1844 KOps/s | |
test_update__nested | 0.1208ms | 22.0631μs | 45.3246 KOps/s | 40.6638 KOps/s | |
test_set_nested | 54.8710μs | 17.2856μs | 57.8515 KOps/s | 53.9638 KOps/s | |
test_set_nested_new | 59.5310μs | 20.3965μs | 49.0281 KOps/s | 45.5441 KOps/s | |
test_select | 87.8720μs | 33.2514μs | 30.0739 KOps/s | 26.9332 KOps/s | |
test_select_nested | 86.4310μs | 55.6407μs | 17.9724 KOps/s | 18.2330 KOps/s | |
test_exclude_nested | 0.1887ms | 0.1102ms | 9.0754 KOps/s | 8.9946 KOps/s | |
test_empty[True] | 0.4136ms | 0.3494ms | 2.8620 KOps/s | 2.8491 KOps/s | |
test_empty[False] | 2.0786μs | 0.9240μs | 1.0822 MOps/s | 1.0711 MOps/s | |
test_to | 0.1032ms | 78.1064μs | 12.8031 KOps/s | 12.7346 KOps/s | |
test_to_nonblocking | 0.2141ms | 63.3908μs | 15.7752 KOps/s | 15.1690 KOps/s | |
test_unbind_speed | 0.3302ms | 0.2654ms | 3.7676 KOps/s | 3.6315 KOps/s | |
test_unbind_speed_stack0 | 0.3960ms | 0.2663ms | 3.7554 KOps/s | 3.6596 KOps/s | |
test_unbind_speed_stack1 | 75.0457ms | 0.8380ms | 1.1934 KOps/s | 1.1579 KOps/s | |
test_split | 75.2094ms | 1.6997ms | 588.3245 Ops/s | 640.8934 Ops/s | |
test_chunk | 1.6080ms | 1.5685ms | 637.5373 Ops/s | 594.8318 Ops/s | |
test_creation[device0] | 0.1992ms | 59.1409μs | 16.9088 KOps/s | 15.6778 KOps/s | |
test_creation_from_tensor | 0.2042ms | 56.3072μs | 17.7597 KOps/s | 16.6699 KOps/s | |
test_add_one[memmap_tensor0] | 0.1366ms | 7.1421μs | 140.0158 KOps/s | 131.4054 KOps/s | |
test_contiguous[memmap_tensor0] | 20.3610μs | 0.7185μs | 1.3918 MOps/s | 1.4124 MOps/s | |
test_stack[memmap_tensor0] | 42.2410μs | 5.0880μs | 196.5407 KOps/s | 195.0689 KOps/s | |
test_memmaptd_index | 1.1502ms | 0.2953ms | 3.3866 KOps/s | 3.2982 KOps/s | |
test_memmaptd_index_astensor | 0.6388ms | 0.3658ms | 2.7334 KOps/s | 2.6595 KOps/s | |
test_memmaptd_index_op | 1.1584ms | 0.6796ms | 1.4715 KOps/s | 1.3986 KOps/s | |
test_serialize_model | 0.1064s | 0.1034s | 9.6732 Ops/s | 8.5488 Ops/s | |
test_serialize_model_pickle | 1.3661s | 1.2391s | 0.8071 Ops/s | 0.8074 Ops/s | |
test_serialize_weights | 0.1802s | 0.1104s | 9.0557 Ops/s | 8.7058 Ops/s | |
test_serialize_weights_returnearly | 0.2423s | 0.1019s | 9.8128 Ops/s | 10.1879 Ops/s | |
test_serialize_weights_pickle | 1.3533s | 1.2365s | 0.8087 Ops/s | 0.8009 Ops/s | |
test_reshape_pytree | 93.3810μs | 26.4478μs | 37.8104 KOps/s | 35.4348 KOps/s | |
test_reshape_td | 90.5820μs | 31.9686μs | 31.2807 KOps/s | 31.0524 KOps/s | |
test_view_pytree | 0.1705ms | 26.0873μs | 38.3328 KOps/s | 37.2699 KOps/s | |
test_view_td | 66.6510μs | 36.1114μs | 27.6921 KOps/s | 26.4914 KOps/s | |
test_unbind_pytree | 64.7310μs | 32.0318μs | 31.2190 KOps/s | 30.2990 KOps/s | |
test_unbind_td | 0.4014ms | 41.2026μs | 24.2703 KOps/s | 23.7051 KOps/s | |
test_split_pytree | 57.6710μs | 34.9665μs | 28.5988 KOps/s | 28.0048 KOps/s | |
test_split_td | 0.1073ms | 40.5883μs | 24.6376 KOps/s | 24.7891 KOps/s | |
test_add_pytree | 0.1834ms | 39.5709μs | 25.2711 KOps/s | 23.7788 KOps/s | |
test_add_td | 0.2255ms | 54.2830μs | 18.4220 KOps/s | 18.2438 KOps/s | |
test_distributed | 4.1594ms | 80.7307μs | 12.3869 KOps/s | 10.4972 KOps/s | |
test_tdmodule | 0.1422ms | 15.6800μs | 63.7753 KOps/s | 66.3276 KOps/s | |
test_tdmodule_dispatch | 50.9010μs | 30.1538μs | 33.1633 KOps/s | 33.3456 KOps/s | |
test_tdseq | 34.0200μs | 17.4082μs | 57.4442 KOps/s | 57.8409 KOps/s | |
test_tdseq_dispatch | 50.1910μs | 33.7372μs | 29.6409 KOps/s | 29.6784 KOps/s | |
test_instantiation_functorch | 1.7200ms | 1.5706ms | 636.7174 Ops/s | 629.2309 Ops/s | |
test_instantiation_td | 1.5387ms | 1.0766ms | 928.8247 Ops/s | 851.7520 Ops/s | |
test_exec_functorch | 0.2239ms | 0.1563ms | 6.3975 KOps/s | 6.2070 KOps/s | |
test_exec_functional_call | 0.3137ms | 0.1435ms | 6.9674 KOps/s | 6.8144 KOps/s | |
test_exec_td | 0.1707ms | 0.1387ms | 7.2111 KOps/s | 6.8823 KOps/s | |
test_exec_td_decorator | 0.4949ms | 0.2188ms | 4.5694 KOps/s | 4.5753 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8193ms | 0.6203ms | 1.6120 KOps/s | 1.5972 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7819ms | 0.6185ms | 1.6167 KOps/s | 1.5993 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7264ms | 0.5688ms | 1.7580 KOps/s | 1.8068 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7062ms | 0.5494ms | 1.8203 KOps/s | 1.8127 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1395ms | 0.6870ms | 1.4557 KOps/s | 1.4396 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8485ms | 0.6849ms | 1.4601 KOps/s | 1.4479 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7765ms | 0.6086ms | 1.6431 KOps/s | 1.6346 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7993ms | 0.6092ms | 1.6416 KOps/s | 1.6390 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.4705ms | 8.1799ms | 122.2509 Ops/s | 120.1829 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.6357ms | 8.2111ms | 121.7867 Ops/s | 120.4085 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.3459ms | 8.1078ms | 123.3381 Ops/s | 121.4150 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.8301ms | 8.1633ms | 122.5000 Ops/s | 121.5368 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.2955ms | 19.9154ms | 50.2124 Ops/s | 49.6955 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.4971ms | 19.9208ms | 50.1989 Ops/s | 49.7917 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.2115ms | 19.7983ms | 50.5094 Ops/s | 49.9736 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.0132ms | 19.7769ms | 50.5642 Ops/s | 50.0538 Ops/s | |
test_to_module_speed[True] | 2.1141ms | 1.5723ms | 635.9973 Ops/s | 642.4245 Ops/s | |
test_to_module_speed[False] | 1.6535ms | 1.5365ms | 650.8329 Ops/s | 647.7895 Ops/s | |
test_tc_init | 85.8310μs | 25.3702μs | 39.4163 KOps/s | 39.3794 KOps/s | |
test_tc_init_nested | 93.7910μs | 50.7250μs | 19.7141 KOps/s | 19.2576 KOps/s | |
test_tc_first_layer_tensor | 0.7558μs | 0.3745μs | 2.6700 MOps/s | 2.6742 MOps/s | |
test_tc_first_layer_nontensor | 4.0462μs | 0.4038μs | 2.4762 MOps/s | 2.5066 MOps/s | |
test_tc_second_layer_tensor | 4.2720μs | 0.9975μs | 1.0025 MOps/s | 999.4594 KOps/s | |
test_tc_second_layer_nontensor | 4.4250μs | 0.8523μs | 1.1733 MOps/s | 1.1656 MOps/s | |
test_unbind | 92.7126ms | 6.6425ms | 150.5448 Ops/s | 188.2249 Ops/s | |
test_full_like | 14.2627ms | 13.4837ms | 74.1633 Ops/s | 73.1956 Ops/s | |
test_zeros_like | 8.2594ms | 7.8389ms | 127.5697 Ops/s | 126.6101 Ops/s | |
test_ones_like | 8.3005ms | 7.8327ms | 127.6701 Ops/s | 128.1007 Ops/s | |
test_clone | 9.9532ms | 9.4858ms | 105.4208 Ops/s | 105.0317 Ops/s | |
test_squeeze | 0.1438ms | 11.3147μs | 88.3808 KOps/s | 89.4251 KOps/s | |
test_unsqueeze | 0.1828ms | 53.8997μs | 18.5530 KOps/s | 18.4146 KOps/s | |
test_split | 0.2280ms | 0.1017ms | 9.8323 KOps/s | 9.7586 KOps/s | |
test_permute | 0.2351ms | 0.1145ms | 8.7370 KOps/s | 8.7716 KOps/s | |
test_stack | 28.8452ms | 27.8106ms | 35.9576 Ops/s | 36.0777 Ops/s | |
test_cat | 28.3947ms | 27.7569ms | 36.0271 Ops/s | 36.3131 Ops/s |
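
The Change column is empty in both tables. One way to fill it in by hand is to express the difference between Ops and Ops on Repo HEAD relative to the HEAD value. The sketch below does that for two rows copied from the tables above. It assumes that Ops is the run for this PR and Ops on Repo HEAD is the baseline, which the page does not state explicitly, and the helper names (`parse_ops`, `relative_change`, `parse_row`) are hypothetical.

```python
# Scale factors for the throughput units that appear in the tables.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '54.0116 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative difference of `ops` with respect to `ops_head` (assumed baseline)."""
    head = parse_ops(ops_head)
    return (parse_ops(ops) - head) / head


def parse_row(row: str):
    """Split one pipe-delimited table row and compute its relative change."""
    cells = [c.strip() for c in row.split("|")]
    name, _max, _mean, ops, ops_head = cells[:5]
    return name, relative_change(ops, ops_head)


# Two rows copied verbatim from the tables above.
rows = [
    "test_creation_empty | 26.5000μs | 8.0864μs | 123.6651 KOps/s | 78.9303 KOps/s | |",
    "test_tc_second_layer_tensor | 4.2720μs | 0.9975μs | 1.0025 MOps/s | 999.4594 KOps/s | |",
]

for row in rows:
    name, change = parse_row(row)
    print(f"{name}: {change:+.1%}")
```

A positive value means the Ops column is higher than the HEAD column. The second row shows why the unit conversion matters: the two cells use different units (MOps/s and KOps/s). Each figure comes from a single benchmark run, so small differences may well be noise.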
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Refactor: Refactoring code - not a new feature
However, this fails on our tests.
For a minimal repro, see https://gist.github.com/vmoens/83cdb5b9059e319829607c22927f0383
cc @teopir @albanD @mikaylagawarecki