[BugFix] Ensure grads are noned when needed #1069
Merged
vmoens added a commit that referenced this pull request on Nov 1, 2024
ghstack-source-id: 5e9c5a974e5a5c73b033e5b85c3eb70c2f433512
Pull Request resolved: #1069
facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Nov 1, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.6130μs | 21.4264μs | 46.6714 KOps/s | 46.0430 KOps/s | |
test_plain_set_stack_nested | 60.1230μs | 21.4382μs | 46.6456 KOps/s | 45.5783 KOps/s | |
test_plain_set_nested_inplace | 61.1640μs | 23.5136μs | 42.5285 KOps/s | 42.2147 KOps/s | |
test_plain_set_stack_nested_inplace | 61.3350μs | 23.5616μs | 42.4420 KOps/s | 42.0425 KOps/s | |
test_items | 25.1470μs | 4.1992μs | 238.1395 KOps/s | 240.2238 KOps/s | |
test_items_nested | 0.4065ms | 0.3353ms | 2.9824 KOps/s | 2.9166 KOps/s | |
test_items_nested_locked | 0.6329ms | 0.3410ms | 2.9328 KOps/s | 2.9302 KOps/s | |
test_items_nested_leaf | 0.1300ms | 71.0052μs | 14.0835 KOps/s | 13.7194 KOps/s | |
test_items_stack_nested | 0.6525ms | 0.3407ms | 2.9356 KOps/s | 2.8923 KOps/s | |
test_items_stack_nested_leaf | 0.1424ms | 74.1460μs | 13.4869 KOps/s | 13.2453 KOps/s | |
test_items_stack_nested_locked | 0.6212ms | 0.3399ms | 2.9423 KOps/s | 2.9050 KOps/s | |
test_keys | 17.7340μs | 4.8409μs | 206.5738 KOps/s | 278.2824 KOps/s | |
test_keys_nested | 0.2003ms | 0.1371ms | 7.2965 KOps/s | 7.0518 KOps/s | |
test_keys_nested_locked | 0.6684ms | 0.1413ms | 7.0793 KOps/s | 6.9443 KOps/s | |
test_keys_nested_leaf | 0.2444ms | 0.1142ms | 8.7550 KOps/s | 8.3382 KOps/s | |
test_keys_stack_nested | 0.2517ms | 0.1351ms | 7.4026 KOps/s | 7.1741 KOps/s | |
test_keys_stack_nested_leaf | 0.2002ms | 0.1154ms | 8.6639 KOps/s | 8.6160 KOps/s | |
test_keys_stack_nested_locked | 0.2705ms | 0.1403ms | 7.1262 KOps/s | 6.9075 KOps/s | |
test_values | 5.2058μs | 1.0354μs | 965.7813 KOps/s | 942.7813 KOps/s | |
test_values_nested | 0.1091ms | 55.1086μs | 18.1460 KOps/s | 17.3217 KOps/s | |
test_values_nested_locked | 0.1111ms | 54.6999μs | 18.2816 KOps/s | 17.4438 KOps/s | |
test_values_nested_leaf | 0.1072ms | 59.5297μs | 16.7983 KOps/s | 16.3596 KOps/s | |
test_values_stack_nested | 0.1100ms | 56.2781μs | 17.7689 KOps/s | 16.9999 KOps/s | |
test_values_stack_nested_leaf | 0.1124ms | 59.7659μs | 16.7319 KOps/s | 16.3294 KOps/s | |
test_values_stack_nested_locked | 0.1075ms | 55.9499μs | 17.8731 KOps/s | 17.1308 KOps/s | |
test_membership | 5.6191μs | 0.7299μs | 1.3701 MOps/s | 1.3123 MOps/s | |
test_membership_nested | 33.5130μs | 2.7396μs | 365.0187 KOps/s | 359.7195 KOps/s | |
test_membership_nested_leaf | 23.8250μs | 2.7362μs | 365.4651 KOps/s | 355.8225 KOps/s | |
test_membership_stacked_nested | 22.1320μs | 2.7166μs | 368.1096 KOps/s | 357.7841 KOps/s | |
test_membership_stacked_nested_leaf | 15.4690μs | 2.6897μs | 371.7944 KOps/s | 351.5829 KOps/s | |
test_membership_nested_last | 19.9570μs | 3.9851μs | 250.9333 KOps/s | 240.7254 KOps/s | |
test_membership_nested_leaf_last | 37.3090μs | 4.0172μs | 248.9301 KOps/s | 238.4439 KOps/s | |
test_membership_stacked_nested_last | 27.1110μs | 3.9664μs | 252.1182 KOps/s | 158.9135 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.8770μs | 4.0215μs | 248.6638 KOps/s | 158.9444 KOps/s | |
test_nested_getleaf | 36.2580μs | 10.4859μs | 95.3661 KOps/s | 95.9031 KOps/s | |
test_nested_get | 28.1820μs | 9.9769μs | 100.2317 KOps/s | 96.9187 KOps/s | |
test_stacked_getleaf | 48.9320μs | 10.4643μs | 95.5628 KOps/s | 95.0474 KOps/s | |
test_stacked_get | 47.3420μs | 10.0308μs | 99.6933 KOps/s | 99.2650 KOps/s | |
test_nested_getitemleaf | 28.5730μs | 10.8662μs | 92.0287 KOps/s | 88.4752 KOps/s | |
test_nested_getitem | 33.6730μs | 10.0651μs | 99.3532 KOps/s | 93.2774 KOps/s | |
test_stacked_getitemleaf | 33.6930μs | 10.9282μs | 91.5067 KOps/s | 89.8618 KOps/s | |
test_stacked_getitem | 33.4330μs | 10.2571μs | 97.4937 KOps/s | 95.5856 KOps/s | |
test_lock_nested | 0.9772ms | 0.4851ms | 2.0612 KOps/s | 2.0656 KOps/s | |
test_lock_stack_nested | 0.5613ms | 0.4565ms | 2.1904 KOps/s | 2.2320 KOps/s | |
test_unlock_nested | 0.9912ms | 0.4073ms | 2.4555 KOps/s | 2.4789 KOps/s | |
test_unlock_stack_nested | 0.6383ms | 0.3785ms | 2.6423 KOps/s | 2.7296 KOps/s | |
test_flatten_speed | 0.1653ms | 90.9944μs | 10.9897 KOps/s | 10.8559 KOps/s | |
test_unflatten_speed | 0.8835ms | 0.4717ms | 2.1200 KOps/s | 2.0901 KOps/s | |
test_common_ops | 2.0218ms | 1.1079ms | 902.6039 Ops/s | 865.2676 Ops/s | |
test_creation | 16.7510μs | 2.1111μs | 473.6904 KOps/s | 479.1918 KOps/s | |
test_creation_empty | 50.1330μs | 18.0529μs | 55.3927 KOps/s | 53.2980 KOps/s | |
test_creation_nested_1 | 1.2426ms | 21.2275μs | 47.1087 KOps/s | 45.2853 KOps/s | |
test_creation_nested_2 | 72.2050μs | 25.5662μs | 39.1142 KOps/s | 38.2977 KOps/s | |
test_clone | 48.4100μs | 17.3422μs | 57.6627 KOps/s | 57.6336 KOps/s | |
test_getitem[int] | 0.8518ms | 16.4705μs | 60.7147 KOps/s | 58.3670 KOps/s | |
test_getitem[slice_int] | 0.1374ms | 30.3036μs | 32.9994 KOps/s | 31.4157 KOps/s | |
test_getitem[range] | 0.1753ms | 58.4010μs | 17.1230 KOps/s | 16.3407 KOps/s | |
test_getitem[tuple] | 0.1314ms | 24.7842μs | 40.3484 KOps/s | 39.8414 KOps/s | |
test_getitem[list] | 0.1607ms | 53.6984μs | 18.6225 KOps/s | 17.9551 KOps/s | |
test_setitem_dim[int] | 65.7530μs | 32.4588μs | 30.8083 KOps/s | 30.2550 KOps/s | |
test_setitem_dim[slice_int] | 87.1230μs | 60.2406μs | 16.6001 KOps/s | 15.6774 KOps/s | |
test_setitem_dim[range] | 0.1485ms | 83.1816μs | 12.0219 KOps/s | 11.6316 KOps/s | |
test_setitem_dim[tuple] | 93.4950μs | 47.9060μs | 20.8742 KOps/s | 19.8161 KOps/s | |
test_setitem | 0.1057ms | 30.0711μs | 33.2545 KOps/s | 32.7496 KOps/s | |
test_set | 0.1550ms | 29.4654μs | 33.9381 KOps/s | 33.8503 KOps/s | |
test_set_shared | 4.6396ms | 0.2172ms | 4.6030 KOps/s | 4.6285 KOps/s | |
test_update | 0.4188ms | 37.9177μs | 26.3729 KOps/s | 26.4824 KOps/s | |
test_update_nested | 0.1988ms | 46.4509μs | 21.5281 KOps/s | 20.3288 KOps/s | |
test_update__nested | 0.9772ms | 41.5548μs | 24.0646 KOps/s | 24.0927 KOps/s | |
test_set_nested | 0.1054ms | 32.0966μs | 31.1560 KOps/s | 29.5728 KOps/s | |
test_set_nested_new | 0.1009ms | 36.7691μs | 27.1968 KOps/s | 26.3619 KOps/s | |
test_select | 0.1233ms | 53.9404μs | 18.5390 KOps/s | 17.2096 KOps/s | |
test_select_nested | 0.1180ms | 58.1263μs | 17.2039 KOps/s | 16.7919 KOps/s | |
test_exclude_nested | 0.5464ms | 77.8463μs | 12.8458 KOps/s | 13.3917 KOps/s | |
test_empty[True] | 0.4581ms | 0.3491ms | 2.8646 KOps/s | 2.8263 KOps/s | |
test_empty[False] | 9.2450μs | 1.2207μs | 819.2289 KOps/s | 749.1616 KOps/s | |
test_unbind_speed | 0.3584ms | 0.3047ms | 3.2822 KOps/s | 3.3607 KOps/s | |
test_unbind_speed_stack0 | 0.9042ms | 0.2977ms | 3.3588 KOps/s | 3.4455 KOps/s | |
test_unbind_speed_stack1 | 98.8955ms | 0.8115ms | 1.2323 KOps/s | 1.3729 KOps/s | |
test_split | 2.4152ms | 1.9673ms | 508.3194 Ops/s | 507.1358 Ops/s | |
test_chunk | 0.1055s | 2.1924ms | 456.1262 Ops/s | 420.2582 Ops/s | |
test_creation[device0] | 0.5197ms | 0.1162ms | 8.6035 KOps/s | 8.4894 KOps/s | |
test_creation_from_tensor | 3.3715ms | 0.1190ms | 8.4047 KOps/s | 8.3343 KOps/s | |
test_add_one[memmap_tensor0] | 0.2638ms | 7.4802μs | 133.6855 KOps/s | 137.8046 KOps/s | |
test_contiguous[memmap_tensor0] | 18.6950μs | 1.8623μs | 536.9591 KOps/s | 516.9724 KOps/s | |
test_stack[memmap_tensor0] | 81.0520μs | 5.3121μs | 188.2481 KOps/s | 175.4203 KOps/s | |
test_memmaptd_index | 1.0931ms | 0.3875ms | 2.5803 KOps/s | 2.5127 KOps/s | |
test_memmaptd_index_astensor | 1.1106ms | 0.4714ms | 2.1214 KOps/s | 2.0662 KOps/s | |
test_memmaptd_index_op | 1.3613ms | 1.0071ms | 992.9887 Ops/s | 974.0518 Ops/s | |
test_serialize_model | 0.1257s | 0.1159s | 8.6251 Ops/s | 8.4671 Ops/s | |
test_serialize_model_pickle | 0.4356s | 0.3883s | 2.5756 Ops/s | 2.5038 Ops/s | |
test_serialize_weights | 0.1156s | 0.1126s | 8.8802 Ops/s | 7.6255 Ops/s | |
test_serialize_weights_returnearly | 0.2670s | 0.1743s | 5.7388 Ops/s | 6.3599 Ops/s | |
test_serialize_weights_pickle | 0.5175s | 0.4292s | 2.3299 Ops/s | 2.5267 Ops/s | |
test_serialize_weights_filesystem | 0.2369s | 0.1542s | 6.4852 Ops/s | 6.9500 Ops/s | |
test_serialize_model_filesystem | 0.1609s | 0.1438s | 6.9523 Ops/s | 6.7945 Ops/s | |
test_reshape_pytree | 0.1165ms | 39.2502μs | 25.4776 KOps/s | 24.4193 KOps/s | |
test_reshape_td | 0.1175ms | 48.0432μs | 20.8146 KOps/s | 21.3112 KOps/s | |
test_view_pytree | 87.7140μs | 38.7729μs | 25.7912 KOps/s | 24.9740 KOps/s | |
test_view_td | 0.1143ms | 54.2698μs | 18.4265 KOps/s | 18.9347 KOps/s | |
test_unbind_pytree | 74.1490μs | 35.1557μs | 28.4449 KOps/s | 27.2432 KOps/s | |
test_unbind_td | 0.3073ms | 45.0138μs | 22.2154 KOps/s | 22.2173 KOps/s | |
test_split_pytree | 0.1151ms | 38.0296μs | 26.2953 KOps/s | 25.6700 KOps/s | |
test_split_td | 0.1919ms | 56.9988μs | 17.5442 KOps/s | 17.1918 KOps/s | |
test_add_pytree | 0.1466ms | 47.2111μs | 21.1815 KOps/s | 21.5570 KOps/s | |
test_add_td | 0.1706ms | 86.3540μs | 11.5802 KOps/s | 11.6684 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1523ms | 70.0422μs | 14.2771 KOps/s | 13.8260 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3548ms | 0.1877ms | 5.3284 KOps/s | 5.3839 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1197ms | 53.3437μs | 18.7464 KOps/s | 17.8109 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2276ms | 0.1454ms | 6.8758 KOps/s | 6.7806 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 83.9970μs | 25.6571μs | 38.9756 KOps/s | 37.9497 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1196ms | 70.3331μs | 14.2181 KOps/s | 14.2758 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1367ms | 78.0182μs | 12.8175 KOps/s | 12.4027 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1230ms | 67.5378μs | 14.8065 KOps/s | 14.5954 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1910ms | 0.1130ms | 8.8502 KOps/s | 8.4252 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.7595ms | 0.2101ms | 4.7597 KOps/s | 4.6530 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1269ms | 52.2822μs | 19.1270 KOps/s | 17.4504 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4547ms | 71.3767μs | 14.0102 KOps/s | 14.1363 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.6306ms | 0.1136ms | 8.8061 KOps/s | 8.6940 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6002ms | 0.2995ms | 3.3391 KOps/s | 3.2430 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3872ms | 0.2176ms | 4.5947 KOps/s | 4.5223 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2125ms | 0.1148ms | 8.7085 KOps/s | 8.5921 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3152ms | 63.0258μs | 15.8665 KOps/s | 15.8816 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1233ms | 53.8319μs | 18.5763 KOps/s | 17.8261 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6478ms | 0.2454ms | 4.0749 KOps/s | 4.0201 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1890ms | 0.1118ms | 8.9424 KOps/s | 8.7276 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 57.4870μs | 21.6270μs | 46.2385 KOps/s | 47.2645 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1316ms | 60.0472μs | 16.6536 KOps/s | 16.7233 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1632ms | 79.4468μs | 12.5870 KOps/s | 11.3378 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1270ms | 69.6111μs | 14.3655 KOps/s | 14.1133 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4545ms | 0.2191ms | 4.5631 KOps/s | 4.5311 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8607ms | 1.7673ms | 565.8406 Ops/s | 561.3136 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2954ms | 0.2110ms | 4.7384 KOps/s | 4.6418 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.9785ms | 1.1743ms | 851.5532 Ops/s | 838.4011 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5726ms | 0.4648ms | 2.1514 KOps/s | 2.1203 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.2133ms | 3.9498ms | 253.1745 Ops/s | 247.6838 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1092ms | 44.4576μs | 22.4934 KOps/s | 21.9444 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5112ms | 50.5057μs | 19.7997 KOps/s | 19.4237 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1072ms | 36.8802μs | 27.1148 KOps/s | 25.6426 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 78.4670μs | 28.8343μs | 34.6809 KOps/s | 33.7988 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1151ms | 37.2440μs | 26.8500 KOps/s | 24.9589 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1011ms | 28.8898μs | 34.6143 KOps/s | 33.4919 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1424ms | 77.9583μs | 12.8274 KOps/s | 12.3422 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5571ms | 29.1719μs | 34.2795 KOps/s | 34.8753 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.5569ms | 73.0235μs | 13.6942 KOps/s | 13.7989 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 80.9520μs | 23.4144μs | 42.7088 KOps/s | 42.2024 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1570ms | 71.6312μs | 13.9604 KOps/s | 13.5407 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 73.5170μs | 23.1745μs | 43.1508 KOps/s | 42.1678 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2259ms | 80.6972μs | 12.3920 KOps/s | 12.2007 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1007ms | 29.3952μs | 34.0192 KOps/s | 34.6861 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1617ms | 71.4076μs | 14.0041 KOps/s | 13.7303 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 68.2280μs | 23.2690μs | 42.9755 KOps/s | 41.8765 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1598ms | 71.0495μs | 14.0747 KOps/s | 13.7412 KOps/s | |
test_compile_indexing[int-pytree-eager] | 87.2130μs | 23.5938μs | 42.3841 KOps/s | 42.2316 KOps/s | |
test_mod_add[eager] | 85.0690μs | 26.7797μs | 37.3418 KOps/s | 36.8857 KOps/s | |
test_mod_add[compile] | 0.1215ms | 44.4078μs | 22.5186 KOps/s | 21.1640 KOps/s | |
test_mod_add[compile-overhead] | 0.1181ms | 44.5136μs | 22.4650 KOps/s | 21.6689 KOps/s | |
test_mod_wrap[eager] | 0.4209ms | 0.2200ms | 4.5459 KOps/s | 4.5140 KOps/s | |
test_mod_wrap[compile] | 1.6748ms | 0.2036ms | 4.9110 KOps/s | 4.8314 KOps/s | |
test_mod_wrap[compile-overhead] | 1.6821ms | 0.2021ms | 4.9475 KOps/s | 4.8427 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.7637ms | 13.1051ms | 76.3064 Ops/s | 81.8971 Ops/s | |
test_mod_wrap_and_backward[compile] | 17.6042ms | 12.1402ms | 82.3713 Ops/s | 74.8248 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.1888ms | 10.7032ms | 93.4299 Ops/s | 87.4501 Ops/s | |
test_seq_add[eager] | 0.1638ms | 90.5444μs | 11.0443 KOps/s | 10.3746 KOps/s | |
test_seq_add[compile] | 0.1182ms | 59.3116μs | 16.8601 KOps/s | 16.6813 KOps/s | |
test_seq_add[compile-overhead] | 0.1198ms | 58.2017μs | 17.1816 KOps/s | 17.0095 KOps/s | |
test_seq_wrap[eager] | 0.7278ms | 0.4017ms | 2.4893 KOps/s | 2.4767 KOps/s | |
test_seq_wrap[compile] | 0.3647ms | 0.2268ms | 4.4095 KOps/s | 4.4148 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4283ms | 0.2261ms | 4.4230 KOps/s | 4.4593 KOps/s | |
test_func_call_runtime[False-eager] | 1.2254ms | 0.5549ms | 1.8021 KOps/s | 1.7828 KOps/s | |
test_func_call_runtime[False-compile] | 0.7822ms | 0.4247ms | 2.3547 KOps/s | 2.3310 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5228ms | 0.4260ms | 2.3476 KOps/s | 2.2045 KOps/s | |
test_func_call_runtime[True-eager] | 0.9246ms | 0.7709ms | 1.2972 KOps/s | 1.3022 KOps/s | |
test_func_call_runtime[True-compile] | 0.6935ms | 0.4625ms | 2.1623 KOps/s | 2.1097 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6157ms | 0.4670ms | 2.1414 KOps/s | 2.1302 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9651ms | 0.5566ms | 1.7966 KOps/s | 1.8013 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8147ms | 0.4261ms | 2.3469 KOps/s | 2.3156 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.8769ms | 0.4243ms | 2.3570 KOps/s | 2.3353 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3964ms | 0.9084ms | 1.1008 KOps/s | 1.1030 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0011ms | 0.4892ms | 2.0443 KOps/s | 2.0132 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0177ms | 0.4940ms | 2.0244 KOps/s | 1.9634 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5140ms | 1.8629ms | 536.8002 Ops/s | 525.5669 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8839ms | 0.5153ms | 1.9405 KOps/s | 1.9178 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.6189ms | 0.5170ms | 1.9342 KOps/s | 1.9153 KOps/s | |
test_distributed | 0.2825ms | 0.1233ms | 8.1096 KOps/s | 7.8963 KOps/s | |
test_tdmodule | 34.2540μs | 18.7153μs | 53.4321 KOps/s | 51.5716 KOps/s | |
test_tdmodule_dispatch | 62.4470μs | 36.5667μs | 27.3473 KOps/s | 26.3815 KOps/s | |
test_tdseq | 47.6890μs | 21.3079μs | 46.9309 KOps/s | 45.4890 KOps/s | |
test_tdseq_dispatch | 67.6170μs | 40.8210μs | 24.4972 KOps/s | 23.1888 KOps/s | |
test_instantiation_functorch | 2.4139ms | 1.5329ms | 652.3370 Ops/s | 649.6572 Ops/s | |
test_exec_functorch | 0.3363ms | 0.1795ms | 5.5700 KOps/s | 5.4672 KOps/s | |
test_exec_functional_call | 0.2873ms | 0.1752ms | 5.7092 KOps/s | 5.5880 KOps/s | |
test_exec_td_decorator | 0.5023ms | 0.2304ms | 4.3397 KOps/s | 4.3127 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8587ms | 0.6277ms | 1.5930 KOps/s | 1.5635 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9204ms | 0.6262ms | 1.5969 KOps/s | 1.5541 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6986ms | 0.5134ms | 1.9478 KOps/s | 1.8767 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7983ms | 0.5134ms | 1.9477 KOps/s | 1.9055 KOps/s | |
test_to_module_speed[True] | 1.3820ms | 1.2839ms | 778.9051 Ops/s | 784.6799 Ops/s | |
test_to_module_speed[False] | 1.6255ms | 1.2676ms | 788.9102 Ops/s | 795.8454 Ops/s | |
test_tc_init | 98.3140μs | 45.8439μs | 21.8132 KOps/s | 21.6767 KOps/s | |
test_tc_init_nested | 0.1985ms | 90.2572μs | 11.0794 KOps/s | 10.9272 KOps/s | |
test_tc_first_layer_tensor | 26.0980μs | 1.5335μs | 652.1134 KOps/s | 657.9800 KOps/s | |
test_tc_first_layer_nontensor | 43.9930μs | 4.7536μs | 210.3669 KOps/s | 201.8592 KOps/s | |
test_tc_second_layer_tensor | 18.4550μs | 2.8354μs | 352.6895 KOps/s | 363.7948 KOps/s | |
test_tc_second_layer_nontensor | 24.5770μs | 6.0851μs | 164.3359 KOps/s | 159.4528 KOps/s | |
test_unbind | 0.2080s | 12.1825ms | 82.0846 Ops/s | 76.4124 Ops/s | |
test_full_like | 7.8346ms | 6.7981ms | 147.0993 Ops/s | 143.9310 Ops/s | |
test_zeros_like | 3.0139ms | 2.6110ms | 382.9892 Ops/s | 367.2204 Ops/s | |
test_ones_like | 3.3971ms | 3.0680ms | 325.9440 Ops/s | 320.4610 Ops/s | |
test_clone | 5.3023ms | 4.8360ms | 206.7840 Ops/s | 204.9076 Ops/s | |
test_squeeze | 62.4660μs | 12.0414μs | 83.0471 KOps/s | 86.9250 KOps/s | |
test_unsqueeze | 0.2105ms | 88.4146μs | 11.3104 KOps/s | 11.5021 KOps/s | |
test_split | 0.4854ms | 0.1841ms | 5.4305 KOps/s | 5.2653 KOps/s | |
test_permute | 0.3362ms | 0.2135ms | 4.6842 KOps/s | 4.6497 KOps/s | |
test_stack | 26.8154ms | 24.2457ms | 41.2444 Ops/s | 39.9090 Ops/s | |
test_cat | 29.1559ms | 24.0184ms | 41.6347 Ops/s | 40.1004 Ops/s |
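
The Change column is empty in this rendering. A minimal sketch of how a relative change could be recomputed from the two Ops columns, assuming the change is defined as (Ops - Ops on Repo HEAD) / Ops on Repo HEAD. `parse_ops` and `relative_change` are hypothetical helper names for illustration, not part of the benchmark tooling:

```python
import re

# Multipliers for the unit prefixes that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '46.6714 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized Ops cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Fractional change of the PR throughput relative to repo HEAD.

    Assumed definition (hypothetical): (pr - head) / head.
    """
    head = parse_ops(ops_head)
    return (parse_ops(ops_pr) - head) / head


# Values copied from the test_keys row of the table above.
change = relative_change("206.5738 KOps/s", "278.2824 KOps/s")
print(f"test_keys: {change:+.1%}")
```

A negative value means the PR run was slower than repo HEAD for that benchmark. Single-run differences of a few percent are often within measurement noise, so large outliers are the rows worth re-running.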

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 45.3910μs | 14.8827μs | 67.1923 KOps/s | 63.2061 KOps/s | |
test_plain_set_stack_nested | 0.1188ms | 15.1261μs | 66.1110 KOps/s | 62.3097 KOps/s | |
test_plain_set_nested_inplace | 60.5710μs | 15.8973μs | 62.9038 KOps/s | 59.4723 KOps/s | |
test_plain_set_stack_nested_inplace | 80.2110μs | 15.8711μs | 63.0075 KOps/s | 59.1612 KOps/s | |
test_items | 22.9300μs | 2.9212μs | 342.3247 KOps/s | 341.9150 KOps/s | |
test_items_nested | 0.4480ms | 0.3190ms | 3.1349 KOps/s | 3.1615 KOps/s | |
test_items_nested_locked | 0.4144ms | 0.3217ms | 3.1086 KOps/s | 3.1491 KOps/s | |
test_items_nested_leaf | 88.9710μs | 58.3364μs | 17.1420 KOps/s | 17.1112 KOps/s | |
test_items_stack_nested | 0.3831ms | 0.3256ms | 3.0717 KOps/s | 3.1757 KOps/s | |
test_items_stack_nested_leaf | 88.9020μs | 59.3970μs | 16.8359 KOps/s | 17.0379 KOps/s | |
test_items_stack_nested_locked | 0.4122ms | 0.3266ms | 3.0617 KOps/s | 3.1594 KOps/s | |
test_keys | 0.6847ms | 3.4859μs | 286.8707 KOps/s | 288.5850 KOps/s | |
test_keys_nested | 0.1002ms | 71.9301μs | 13.9024 KOps/s | 13.8355 KOps/s | |
test_keys_nested_locked | 2.7462ms | 77.5944μs | 12.8875 KOps/s | 12.8478 KOps/s | |
test_keys_nested_leaf | 91.5710μs | 63.2571μs | 15.8085 KOps/s | 15.7421 KOps/s | |
test_keys_stack_nested | 0.1201ms | 72.6514μs | 13.7644 KOps/s | 13.6552 KOps/s | |
test_keys_stack_nested_leaf | 91.4020μs | 63.6645μs | 15.7074 KOps/s | 15.6833 KOps/s | |
test_keys_stack_nested_locked | 0.2483ms | 77.4961μs | 12.9039 KOps/s | 12.8106 KOps/s | |
test_values | 32.2523μs | 0.8591μs | 1.1640 MOps/s | 1.1553 MOps/s | |
test_values_nested | 57.9910μs | 30.9086μs | 32.3534 KOps/s | 32.2493 KOps/s | |
test_values_nested_locked | 65.3410μs | 32.5035μs | 30.7659 KOps/s | 30.5315 KOps/s | |
test_values_nested_leaf | 61.9610μs | 33.5559μs | 29.8010 KOps/s | 29.7061 KOps/s | |
test_values_stack_nested | 63.8910μs | 31.4634μs | 31.7830 KOps/s | 31.9635 KOps/s | |
test_values_stack_nested_leaf | 55.5210μs | 34.0709μs | 29.3506 KOps/s | 29.5765 KOps/s | |
test_values_stack_nested_locked | 59.9710μs | 32.6876μs | 30.5926 KOps/s | 30.3586 KOps/s | |
test_membership | 1.6331μs | 0.5053μs | 1.9790 MOps/s | 1.9640 MOps/s | |
test_membership_nested | 15.8855μs | 1.9422μs | 514.8729 KOps/s | 532.5396 KOps/s | |
test_membership_nested_leaf | 13.8700μs | 1.9351μs | 516.7654 KOps/s | 531.4385 KOps/s | |
test_membership_stacked_nested | 28.5510μs | 2.0208μs | 494.8518 KOps/s | 515.1788 KOps/s | |
test_membership_stacked_nested_leaf | 0.1912ms | 2.0177μs | 495.6232 KOps/s | 521.6396 KOps/s | |
test_membership_nested_last | 23.2210μs | 2.8269μs | 353.7411 KOps/s | 355.3344 KOps/s | |
test_membership_nested_leaf_last | 33.1810μs | 2.8690μs | 348.5521 KOps/s | 356.7887 KOps/s | |
test_membership_stacked_nested_last | 0.1997ms | 2.8886μs | 346.1929 KOps/s | 358.2392 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.5100μs | 2.8447μs | 351.5342 KOps/s | 358.8436 KOps/s | |
test_nested_getleaf | 26.2100μs | 5.9999μs | 166.6688 KOps/s | 167.2306 KOps/s | |
test_nested_get | 33.8510μs | 5.6845μs | 175.9158 KOps/s | 176.7904 KOps/s | |
test_stacked_getleaf | 30.8810μs | 5.9709μs | 167.4781 KOps/s | 166.9726 KOps/s | |
test_stacked_get | 31.1300μs | 5.6780μs | 176.1181 KOps/s | 176.2536 KOps/s | |
test_nested_getitemleaf | 30.6010μs | 6.0791μs | 164.4979 KOps/s | 165.1749 KOps/s | |
test_nested_getitem | 35.1310μs | 5.7838μs | 172.8974 KOps/s | 174.1798 KOps/s | |
test_stacked_getitemleaf | 26.2310μs | 6.1089μs | 163.6950 KOps/s | 164.6724 KOps/s | |
test_stacked_getitem | 32.2700μs | 5.7904μs | 172.6984 KOps/s | 172.9578 KOps/s | |
test_lock_nested | 5.4375ms | 0.4261ms | 2.3468 KOps/s | 2.3218 KOps/s | |
test_lock_stack_nested | 0.5375ms | 0.3930ms | 2.5447 KOps/s | 2.4841 KOps/s | |
test_unlock_nested | 0.7615ms | 0.3624ms | 2.7596 KOps/s | 2.6749 KOps/s | |
test_unlock_stack_nested | 0.4249ms | 0.3327ms | 3.0053 KOps/s | 2.9243 KOps/s | |
test_flatten_speed | 0.1335ms | 74.0817μs | 13.4986 KOps/s | 13.4666 KOps/s | |
test_unflatten_speed | 0.3314ms | 0.2923ms | 3.4208 KOps/s | 3.4058 KOps/s | |
test_common_ops | 1.5689ms | 1.2352ms | 809.5987 Ops/s | 777.0831 Ops/s | |
test_creation | 23.5010μs | 1.5893μs | 629.2000 KOps/s | 628.1802 KOps/s | |
test_creation_empty | 56.5310μs | 16.2965μs | 61.3628 KOps/s | 54.2265 KOps/s | |
test_creation_nested_1 | 53.0710μs | 18.0808μs | 55.3073 KOps/s | 49.6555 KOps/s | |
test_creation_nested_2 | 47.6110μs | 20.6102μs | 48.5198 KOps/s | 44.0294 KOps/s | |
test_clone | 0.1832ms | 28.6559μs | 34.8968 KOps/s | 34.5448 KOps/s | |
test_getitem[int] | 1.2449ms | 16.5248μs | 60.5152 KOps/s | 59.7923 KOps/s | |
test_getitem[slice_int] | 0.1342ms | 29.0648μs | 34.4059 KOps/s | 34.0083 KOps/s | |
test_getitem[range] | 0.2399ms | 0.1130ms | 8.8500 KOps/s | 8.9038 KOps/s | |
test_getitem[tuple] | 0.1507ms | 25.3655μs | 39.4236 KOps/s | 39.9591 KOps/s | |
test_getitem[list] | 0.2710ms | 0.1005ms | 9.9489 KOps/s | 9.9172 KOps/s | |
test_setitem_dim[int] | 68.7510μs | 43.8002μs | 22.8309 KOps/s | 22.9551 KOps/s | |
test_setitem_dim[slice_int] | 0.2086ms | 66.3746μs | 15.0660 KOps/s | 15.0765 KOps/s | |
test_setitem_dim[range] | 0.2467ms | 0.1273ms | 7.8531 KOps/s | 7.8574 KOps/s | |
test_setitem_dim[tuple] | 0.1999ms | 59.7974μs | 16.7231 KOps/s | 16.7032 KOps/s | |
test_setitem | 0.1932ms | 41.8945μs | 23.8695 KOps/s | 23.5264 KOps/s | |
test_set | 0.2196ms | 40.3789μs | 24.7654 KOps/s | 23.7098 KOps/s | |
test_set_shared | 0.4014ms | 50.2158μs | 19.9141 KOps/s | 19.6094 KOps/s | |
test_update | 0.2261ms | 49.1518μs | 20.3451 KOps/s | 18.9404 KOps/s | |
test_update_nested | 0.2339ms | 56.8174μs | 17.6002 KOps/s | 16.8676 KOps/s | |
test_update__nested | 0.1991ms | 58.4123μs | 17.1197 KOps/s | 16.2464 KOps/s | |
test_set_nested | 0.2290ms | 42.5436μs | 23.5053 KOps/s | 22.0009 KOps/s | |
test_set_nested_new | 0.2038ms | 46.5038μs | 21.5036 KOps/s | 20.6735 KOps/s | |
test_select | 0.2069ms | 60.0347μs | 16.6570 KOps/s | 16.2701 KOps/s | |
test_select_nested | 70.9110μs | 42.5966μs | 23.4760 KOps/s | 22.8503 KOps/s | |
test_exclude_nested | 97.0710μs | 60.1861μs | 16.6151 KOps/s | 16.4659 KOps/s | |
test_empty[True] | 0.3378ms | 0.2614ms | 3.8253 KOps/s | 3.8942 KOps/s | |
test_empty[False] | 2.9760μs | 0.8275μs | 1.2084 MOps/s | 1.1964 MOps/s | |
test_to | 0.1517ms | 51.0070μs | 19.6052 KOps/s | 18.8272 KOps/s | |
test_to_nonblocking | 83.2610μs | 49.1861μs | 20.3309 KOps/s | 19.3611 KOps/s | |
test_unbind_speed | 1.2034ms | 0.2776ms | 3.6019 KOps/s | 3.4613 KOps/s | |
test_unbind_speed_stack0 | 0.3764ms | 0.2763ms | 3.6190 KOps/s | 3.5306 KOps/s | |
test_unbind_speed_stack1 | 97.3206ms | 0.7064ms | 1.4156 KOps/s | 1.3679 KOps/s | |
test_split | 0.1002s | 2.2768ms | 439.2089 Ops/s | 433.9057 Ops/s | |
test_chunk | 0.1022s | 2.2984ms | 435.0909 Ops/s | 430.5495 Ops/s | |
test_to[False] | 6.2445ms | 5.9792ms | 167.2460 Ops/s | 157.6223 Ops/s | |
test_to[True] | 4.8230ms | 4.4319ms | 225.6348 Ops/s | 220.7713 Ops/s | |
test_to_njt[False] | 0.3596s | 0.2754s | 3.6306 Ops/s | 3.5710 Ops/s | |
test_to_njt[True] | 0.3735s | 0.2878s | 3.4744 Ops/s | 3.7192 Ops/s | |
test_creation[device0] | 0.4214ms | 0.1289ms | 7.7598 KOps/s | 7.7932 KOps/s | |
test_creation_from_tensor | 0.4662ms | 0.1301ms | 7.6875 KOps/s | 7.6663 KOps/s | |
test_add_one[memmap_tensor0] | 0.1382ms | 8.8341μs | 113.1978 KOps/s | 111.5715 KOps/s | |
test_contiguous[memmap_tensor0] | 35.0410μs | 2.1896μs | 456.7103 KOps/s | 440.6131 KOps/s | |
test_stack[memmap_tensor0] | 0.1517ms | 6.9770μs | 143.3284 KOps/s | 137.8855 KOps/s | |
test_memmaptd_index | 1.0569ms | 0.4381ms | 2.2828 KOps/s | 2.2593 KOps/s | |
test_memmaptd_index_astensor | 0.8567ms | 0.4933ms | 2.0273 KOps/s | 1.9877 KOps/s | |
test_memmaptd_index_op | 1.4155ms | 1.0239ms | 976.6815 Ops/s | 909.3106 Ops/s | |
test_serialize_model | 0.1322s | 0.1307s | 7.6497 Ops/s | 6.8789 Ops/s | |
test_serialize_model_pickle | 1.3604s | 1.2132s | 0.8243 Ops/s | 0.8233 Ops/s | |
test_serialize_weights | 0.1308s | 0.1302s | 7.6793 Ops/s | 7.6795 Ops/s | |
test_serialize_weights_returnearly | 0.2239s | 55.8344ms | 17.9101 Ops/s | 17.8015 Ops/s | |
test_serialize_weights_pickle | 1.3457s | 1.2128s | 0.8245 Ops/s | 0.8191 Ops/s | |
test_reshape_pytree | 0.1564ms | 36.4005μs | 27.4721 KOps/s | 26.7927 KOps/s | |
test_reshape_td | 73.6510μs | 41.4370μs | 24.1330 KOps/s | 23.8291 KOps/s | |
test_view_pytree | 0.1703ms | 36.4133μs | 27.4625 KOps/s | 27.0844 KOps/s | |
test_view_td | 0.1727ms | 46.3442μs | 21.5777 KOps/s | 21.4751 KOps/s | |
test_unbind_pytree | 0.1359ms | 35.4885μs | 28.1782 KOps/s | 27.8468 KOps/s | |
test_unbind_td | 0.5523ms | 42.6241μs | 23.4609 KOps/s | 21.9740 KOps/s | |
test_split_pytree | 0.5121ms | 47.2680μs | 21.1560 KOps/s | 21.4317 KOps/s | |
test_split_td | 0.2733ms | 59.1310μs | 16.9116 KOps/s | 14.4819 KOps/s | |
test_add_pytree | 0.2093ms | 59.2161μs | 16.8873 KOps/s | 16.6475 KOps/s | |
test_add_td | 0.2438ms | 89.3914μs | 11.1868 KOps/s | 10.1921 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.3098ms | 0.1634ms | 6.1183 KOps/s | 5.9152 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5504ms | 0.1527ms | 6.5476 KOps/s | 6.3960 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.5701ms | 0.1582ms | 6.3226 KOps/s | 6.1999 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.5876ms | 0.1830ms | 5.4638 KOps/s | 5.4085 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4304ms | 21.8075μs | 45.8557 KOps/s | 45.3592 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.4710ms | 45.6090μs | 21.9255 KOps/s | 21.9116 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4669ms | 68.5084μs | 14.5967 KOps/s | 14.7691 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4440ms | 52.6213μs | 19.0037 KOps/s | 19.3001 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4529ms | 0.3184ms | 3.1411 KOps/s | 3.1147 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6074ms | 0.2147ms | 4.6584 KOps/s | 4.6036 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2836ms | 0.1362ms | 7.3443 KOps/s | 7.2187 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4667ms | 62.9575μs | 15.8837 KOps/s | 15.7623 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4607ms | 0.3252ms | 3.0749 KOps/s | 3.0350 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 1.0318ms | 0.6500ms | 1.5385 KOps/s | 1.6019 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6605ms | 0.2553ms | 3.9164 KOps/s | 3.8592 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5183ms | 0.3217ms | 3.1086 KOps/s | 3.0925 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.4922ms | 73.3706μs | 13.6294 KOps/s | 14.1336 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.5313ms | 0.1362ms | 7.3412 KOps/s | 7.4573 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.9240ms | 0.5133ms | 1.9482 KOps/s | 1.9203 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4776ms | 0.3262ms | 3.0658 KOps/s | 3.0281 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.4206ms | 18.3225μs | 54.5776 KOps/s | 54.3835 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4083ms | 28.7589μs | 34.7718 KOps/s | 34.4127 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2406ms | 69.7395μs | 14.3391 KOps/s | 14.1845 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4553ms | 52.1770μs | 19.1655 KOps/s | 19.1614 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3991ms | 0.8302ms | 1.2046 KOps/s | 1.0996 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.7344ms | 3.3650ms | 297.1794 Ops/s | 309.5542 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4014ms | 0.8461ms | 1.1819 KOps/s | 1.0788 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6678ms | 3.2047ms | 312.0444 Ops/s | 306.9339 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.3045ms | 0.1244ms | 8.0413 KOps/s | 8.0439 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.4594ms | 62.2380μs | 16.0674 KOps/s | 16.2546 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2688ms | 0.1235ms | 8.0973 KOps/s | 8.4906 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2255ms | 47.1888μs | 21.1915 KOps/s | 23.4059 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2803ms | 0.1231ms | 8.1213 KOps/s | 8.4178 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.4351ms | 45.1487μs | 22.1490 KOps/s | 23.5004 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.5608ms | 0.1542ms | 6.4850 KOps/s | 6.5438 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4180ms | 26.8091μs | 37.3007 KOps/s | 37.1050 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2960ms | 0.1449ms | 6.9009 KOps/s | 6.8299 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.4257ms | 21.7490μs | 45.9792 KOps/s | 45.3865 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5658ms | 0.1458ms | 6.8584 KOps/s | 6.8147 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.4097ms | 25.1111μs | 39.8230 KOps/s | 45.0698 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5449ms | 0.1527ms | 6.5495 KOps/s | 6.5138 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4630ms | 26.6045μs | 37.5876 KOps/s | 36.7807 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5335ms | 0.1461ms | 6.8449 KOps/s | 6.8245 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.4103ms | 21.6698μs | 46.1471 KOps/s | 45.7231 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.5426ms | 0.1460ms | 6.8507 KOps/s | 6.8249 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1480ms | 21.5909μs | 46.3158 KOps/s | 45.7294 KOps/s | |
test_mod_add[eager] | 0.2173ms | 33.4338μs | 29.9099 KOps/s | 29.1580 KOps/s | |
test_mod_add[compile] | 0.2397ms | 77.8976μs | 12.8374 KOps/s | 12.8783 KOps/s | |
test_mod_add[compile-overhead] | 0.3102ms | 0.1540ms | 6.4954 KOps/s | 5.6300 KOps/s | |
test_mod_wrap[eager] | 0.4464ms | 0.2537ms | 3.9409 KOps/s | 4.1129 KOps/s | |
test_mod_wrap[compile] | 0.4338ms | 0.2834ms | 3.5286 KOps/s | 3.5027 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7923ms | 4.1204ms | 242.6965 Ops/s | 243.7769 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6594ms | 1.3365ms | 748.2189 Ops/s | 705.5210 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5125ms | 1.2576ms | 795.1582 Ops/s | 719.7293 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3438ms | 0.9116ms | 1.0970 KOps/s | 955.8758 Ops/s | |
test_seq_add[eager] | 0.2458ms | 97.9066μs | 10.2138 KOps/s | 9.8975 KOps/s | |
test_seq_add[compile] | 0.2380ms | 87.1892μs | 11.4693 KOps/s | 11.3944 KOps/s | |
test_seq_add[compile-overhead] | 0.2740ms | 0.1257ms | 7.9563 KOps/s | 7.7745 KOps/s | |
test_seq_wrap[eager] | 0.5166ms | 0.3756ms | 2.6627 KOps/s | 2.5319 KOps/s | |
test_seq_wrap[compile] | 0.4500ms | 0.3008ms | 3.3248 KOps/s | 3.2818 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3734ms | 0.2226ms | 4.4915 KOps/s | 4.4314 KOps/s | |
test_func_call_runtime[False-eager] | 0.8984ms | 0.7251ms | 1.3792 KOps/s | 1.2802 KOps/s | |
test_func_call_runtime[False-compile] | 0.9765ms | 0.7573ms | 1.3204 KOps/s | 1.2909 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5438ms | 0.3649ms | 2.7403 KOps/s | 2.7358 KOps/s | |
test_func_call_runtime[True-eager] | 1.0699ms | 0.8826ms | 1.1330 KOps/s | 1.1198 KOps/s | |
test_func_call_runtime[True-compile] | 0.9377ms | 0.7799ms | 1.2823 KOps/s | 1.2773 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5759ms | 0.3825ms | 2.6141 KOps/s | 2.6055 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8678ms | 0.7230ms | 1.3831 KOps/s | 1.3188 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9123ms | 0.7590ms | 1.3175 KOps/s | 1.2954 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5045ms | 0.3639ms | 2.7478 KOps/s | 2.7318 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1897ms | 0.9938ms | 1.0062 KOps/s | 1.0035 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9863ms | 0.8077ms | 1.2380 KOps/s | 1.2180 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5667ms | 0.4093ms | 2.4434 KOps/s | 2.4200 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6249ms | 2.0356ms | 491.2613 Ops/s | 489.1744 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0223ms | 0.8263ms | 1.2101 KOps/s | 1.2035 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5308ms | 0.4163ms | 2.4019 KOps/s | 2.4138 KOps/s | |
test_distributed | 6.4762ms | 0.1791ms | 5.5831 KOps/s | 8.7140 KOps/s | |
test_tdmodule | 45.6110μs | 14.8708μs | 67.2459 KOps/s | 60.1476 KOps/s | |
test_tdmodule_dispatch | 62.0510μs | 29.6577μs | 33.7181 KOps/s | 31.3906 KOps/s | |
test_tdseq | 35.3800μs | 16.3612μs | 61.1201 KOps/s | 56.8305 KOps/s | |
test_tdseq_dispatch | 60.7020μs | 32.6996μs | 30.5814 KOps/s | 28.3619 KOps/s | |
test_instantiation_functorch | 2.0698ms | 1.8945ms | 527.8395 Ops/s | 520.0393 Ops/s | |
test_exec_functorch | 0.3547ms | 0.2093ms | 4.7781 KOps/s | 4.6817 KOps/s | |
test_exec_functional_call | 0.3963ms | 0.2077ms | 4.8143 KOps/s | 4.6846 KOps/s | |
test_exec_td_decorator | 0.4463ms | 0.2543ms | 3.9326 KOps/s | 3.8467 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8523ms | 0.6628ms | 1.5086 KOps/s | 1.5021 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8608ms | 0.6610ms | 1.5129 KOps/s | 1.5079 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7589ms | 0.5783ms | 1.7291 KOps/s | 1.7279 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7810ms | 0.5821ms | 1.7178 KOps/s | 1.7266 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.4123ms | 19.0125ms | 52.5971 Ops/s | 52.6883 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.2397ms | 19.0099ms | 52.6042 Ops/s | 52.5981 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.1731ms | 18.8795ms | 52.9676 Ops/s | 53.1178 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.2046ms | 18.8750ms | 52.9802 Ops/s | 52.6553 Ops/s | |
test_to_module_speed[True] | 1.4757ms | 0.9570ms | 1.0449 KOps/s | 1.0457 KOps/s | |
test_to_module_speed[False] | 1.3281ms | 0.9462ms | 1.0569 KOps/s | 1.0639 KOps/s | |
test_tc_init | 71.2210μs | 34.2584μs | 29.1900 KOps/s | 25.8693 KOps/s | |
test_tc_init_nested | 0.2539ms | 69.6717μs | 14.3530 KOps/s | 12.5843 KOps/s | |
test_tc_first_layer_tensor | 27.2861μs | 0.7179μs | 1.3929 MOps/s | 1.3959 MOps/s | |
test_tc_first_layer_nontensor | 18.9310μs | 2.4453μs | 408.9441 KOps/s | 436.9755 KOps/s | |
test_tc_second_layer_tensor | 9.1527μs | 1.4537μs | 687.9199 KOps/s | 683.2213 KOps/s | |
test_tc_second_layer_nontensor | 41.4810μs | 3.1917μs | 313.3079 KOps/s | 322.5501 KOps/s | |
test_unbind | 0.2043s | 11.1398ms | 89.7684 Ops/s | 88.3935 Ops/s | |
test_full_like | 0.7905ms | 0.5763ms | 1.7353 KOps/s | 1.7388 KOps/s | |
test_zeros_like | 0.3624ms | 0.1982ms | 5.0450 KOps/s | 5.0438 KOps/s | |
test_ones_like | 0.3469ms | 0.1981ms | 5.0483 KOps/s | 5.0468 KOps/s | |
test_clone | 0.5550ms | 0.4153ms | 2.4080 KOps/s | 2.4067 KOps/s | |
test_squeeze | 0.1228ms | 9.4100μs | 106.2700 KOps/s | 106.0284 KOps/s | |
test_unsqueeze | 0.2168ms | 72.3942μs | 13.8133 KOps/s | 13.5006 KOps/s | |
test_split | 0.4019ms | 0.1647ms | 6.0712 KOps/s | 5.9016 KOps/s | |
test_permute | 0.2822ms | 0.1754ms | 5.7007 KOps/s | 5.6767 KOps/s | |
test_stack | 1.2757ms | 0.8697ms | 1.1498 KOps/s | 1.2025 KOps/s | |
test_cat | 1.3439ms | 1.2317ms | 811.8582 Ops/s | 811.9529 Ops/s |
Stack from ghstack (oldest at bottom):