[Feature] Densify lazy tensordicts #955
Merged
facebook-github-bot added the CLA Signed label on Aug 9, 2024

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 49.5830μs | 22.1152μs | 45.2177 KOps/s | 46.3063 KOps/s | |
test_plain_set_stack_nested | 57.2970μs | 22.3371μs | 44.7685 KOps/s | 45.9930 KOps/s | |
test_plain_set_nested_inplace | 57.6380μs | 25.0612μs | 39.9023 KOps/s | 42.0509 KOps/s | |
test_plain_set_stack_nested_inplace | 61.6250μs | 24.3103μs | 41.1348 KOps/s | 42.4230 KOps/s | |
test_items | 23.0740μs | 2.7052μs | 369.6620 KOps/s | 352.0356 KOps/s | |
test_items_nested | 1.4549ms | 0.3704ms | 2.7001 KOps/s | 2.9797 KOps/s | |
test_items_nested_locked | 0.5983ms | 0.3380ms | 2.9586 KOps/s | 2.9583 KOps/s | |
test_items_nested_leaf | 0.1627ms | 87.5593μs | 11.4208 KOps/s | 11.9237 KOps/s | |
test_items_stack_nested | 0.6323ms | 0.3403ms | 2.9387 KOps/s | 2.9457 KOps/s | |
test_items_stack_nested_leaf | 0.1760ms | 84.3198μs | 11.8596 KOps/s | 12.0171 KOps/s | |
test_items_stack_nested_locked | 0.6228ms | 0.3391ms | 2.9487 KOps/s | 2.9423 KOps/s | |
test_keys | 31.7600μs | 3.9907μs | 250.5809 KOps/s | 257.4958 KOps/s | |
test_keys_nested | 0.2518ms | 0.1454ms | 6.8767 KOps/s | 6.8638 KOps/s | |
test_keys_nested_locked | 0.6694ms | 0.1508ms | 6.6331 KOps/s | 6.6195 KOps/s | |
test_keys_nested_leaf | 0.2136ms | 0.1247ms | 8.0210 KOps/s | 8.1322 KOps/s | |
test_keys_stack_nested | 0.2790ms | 0.1463ms | 6.8330 KOps/s | 6.9731 KOps/s | |
test_keys_stack_nested_leaf | 0.2383ms | 0.1249ms | 8.0033 KOps/s | 8.1730 KOps/s | |
test_keys_stack_nested_locked | 0.2486ms | 0.1502ms | 6.6572 KOps/s | 6.7425 KOps/s | |
test_values | 9.4803μs | 1.4443μs | 692.3908 KOps/s | 856.5606 KOps/s | |
test_values_nested | 0.1217ms | 51.3316μs | 19.4812 KOps/s | 19.3442 KOps/s | |
test_values_nested_locked | 97.8930μs | 50.5112μs | 19.7976 KOps/s | 19.9921 KOps/s | |
test_values_nested_leaf | 82.9350μs | 46.0704μs | 21.7059 KOps/s | 22.4109 KOps/s | |
test_values_stack_nested | 96.8610μs | 52.2080μs | 19.1542 KOps/s | 19.6169 KOps/s | |
test_values_stack_nested_leaf | 0.1051ms | 45.5290μs | 21.9640 KOps/s | 22.5449 KOps/s | |
test_values_stack_nested_locked | 0.1010ms | 52.0264μs | 19.2210 KOps/s | 19.7326 KOps/s | |
test_membership | 2.8734μs | 0.7495μs | 1.3342 MOps/s | 1.0913 MOps/s | |
test_membership_nested | 30.8980μs | 2.6875μs | 372.0961 KOps/s | 367.0342 KOps/s | |
test_membership_nested_leaf | 30.8280μs | 2.6859μs | 372.3194 KOps/s | 383.6181 KOps/s | |
test_membership_stacked_nested | 38.5020μs | 2.6692μs | 374.6372 KOps/s | 382.2072 KOps/s | |
test_membership_stacked_nested_leaf | 38.3710μs | 2.6542μs | 376.7670 KOps/s | 382.0374 KOps/s | |
test_membership_nested_last | 31.2180μs | 3.9769μs | 251.4516 KOps/s | 257.3222 KOps/s | |
test_membership_nested_leaf_last | 29.4050μs | 3.9623μs | 252.3776 KOps/s | 258.6231 KOps/s | |
test_membership_stacked_nested_last | 57.0670μs | 13.0736μs | 76.4898 KOps/s | 202.0362 KOps/s | |
test_membership_stacked_nested_leaf_last | 61.4650μs | 12.5924μs | 79.4128 KOps/s | 203.7740 KOps/s | |
test_nested_getleaf | 40.3560μs | 10.4307μs | 95.8712 KOps/s | 95.3393 KOps/s | |
test_nested_get | 43.2010μs | 9.9345μs | 100.6596 KOps/s | 101.8667 KOps/s | |
test_stacked_getleaf | 38.4720μs | 10.4234μs | 95.9377 KOps/s | 95.8762 KOps/s | |
test_stacked_get | 48.5820μs | 9.7560μs | 102.5009 KOps/s | 101.7012 KOps/s | |
test_nested_getitemleaf | 41.1770μs | 10.8906μs | 91.8224 KOps/s | 91.8996 KOps/s | |
test_nested_getitem | 42.8400μs | 10.0550μs | 99.4530 KOps/s | 99.7175 KOps/s | |
test_stacked_getitemleaf | 0.1888ms | 10.9600μs | 91.2406 KOps/s | 93.0340 KOps/s | |
test_stacked_getitem | 42.4100μs | 10.0555μs | 99.4480 KOps/s | 102.8447 KOps/s | |
test_lock_nested | 80.6786ms | 0.5780ms | 1.7301 KOps/s | 1.9760 KOps/s | |
test_lock_stack_nested | 0.7115ms | 0.4467ms | 2.2384 KOps/s | 2.1635 KOps/s | |
test_unlock_nested | 88.5185ms | 0.5178ms | 1.9314 KOps/s | 2.3908 KOps/s | |
test_unlock_stack_nested | 0.6307ms | 0.3634ms | 2.7521 KOps/s | 2.6603 KOps/s | |
test_flatten_speed | 0.5522ms | 0.1073ms | 9.3154 KOps/s | 9.7910 KOps/s | |
test_unflatten_speed | 0.5984ms | 0.4648ms | 2.1513 KOps/s | 2.1953 KOps/s | |
test_common_ops | 5.4312ms | 1.1783ms | 848.6853 Ops/s | 916.8127 Ops/s | |
test_creation | 0.1375ms | 2.2168μs | 451.1008 KOps/s | 495.2284 KOps/s | |
test_creation_empty | 53.5910μs | 19.3333μs | 51.7243 KOps/s | 56.3804 KOps/s | |
test_creation_nested_1 | 60.3430μs | 22.5377μs | 44.3702 KOps/s | 47.0005 KOps/s | |
test_creation_nested_2 | 1.3476ms | 27.4419μs | 36.4406 KOps/s | 39.4596 KOps/s | |
test_clone | 0.2651ms | 17.9813μs | 55.6132 KOps/s | 62.0612 KOps/s | |
test_getitem[int] | 0.8255ms | 17.1901μs | 58.1730 KOps/s | 61.0094 KOps/s | |
test_getitem[slice_int] | 0.1356ms | 33.6375μs | 29.7287 KOps/s | 32.4183 KOps/s | |
test_getitem[range] | 0.1623ms | 58.1922μs | 17.1844 KOps/s | 17.9576 KOps/s | |
test_getitem[tuple] | 0.1388ms | 26.1217μs | 38.2824 KOps/s | 40.2078 KOps/s | |
test_getitem[list] | 0.2893ms | 52.9444μs | 18.8877 KOps/s | 19.5446 KOps/s | |
test_setitem_dim[int] | 88.1450μs | 43.3310μs | 23.0782 KOps/s | 24.0836 KOps/s | |
test_setitem_dim[slice_int] | 0.1258ms | 74.4821μs | 13.4260 KOps/s | 13.5735 KOps/s | |
test_setitem_dim[range] | 0.2011ms | 96.7834μs | 10.3324 KOps/s | 10.7608 KOps/s | |
test_setitem_dim[tuple] | 0.1097ms | 61.0217μs | 16.3876 KOps/s | 16.9919 KOps/s | |
test_setitem | 0.1675ms | 31.8820μs | 31.3657 KOps/s | 35.4565 KOps/s | |
test_set | 0.3143ms | 31.4596μs | 31.7868 KOps/s | 36.0598 KOps/s | |
test_set_shared | 1.2976ms | 0.2204ms | 4.5382 KOps/s | 4.6820 KOps/s | |
test_update | 0.3441ms | 38.8928μs | 25.7117 KOps/s | 28.5334 KOps/s | |
test_update_nested | 0.2483ms | 49.6538μs | 20.1394 KOps/s | 22.3366 KOps/s | |
test_update__nested | 0.1282ms | 35.3600μs | 28.2806 KOps/s | 30.4925 KOps/s | |
test_set_nested | 0.1469ms | 34.3228μs | 29.1351 KOps/s | 33.2697 KOps/s | |
test_set_nested_new | 0.1339ms | 38.5237μs | 25.9580 KOps/s | 28.8362 KOps/s | |
test_select | 0.1679ms | 56.2533μs | 17.7767 KOps/s | 19.0070 KOps/s | |
test_select_nested | 0.1172ms | 59.5073μs | 16.8047 KOps/s | 16.9734 KOps/s | |
test_exclude_nested | 0.1627ms | 78.0333μs | 12.8150 KOps/s | 13.1391 KOps/s | |
test_empty[True] | 0.7217ms | 0.3252ms | 3.0748 KOps/s | 3.0950 KOps/s | |
test_empty[False] | 11.8497μs | 1.1605μs | 861.6754 KOps/s | 862.6629 KOps/s | |
test_unbind_speed | 0.5256ms | 0.3110ms | 3.2155 KOps/s | 3.2230 KOps/s | |
test_unbind_speed_stack0 | 0.5736ms | 0.2953ms | 3.3865 KOps/s | 3.3675 KOps/s | |
test_unbind_speed_stack1 | 81.9724ms | 0.7647ms | 1.3077 KOps/s | 1.3964 KOps/s | |
test_split | 89.4074ms | 2.1988ms | 454.7959 Ops/s | 469.0029 Ops/s | |
test_chunk | 83.8390ms | 2.1872ms | 457.2155 Ops/s | 467.7117 Ops/s | |
test_creation[device0] | 0.2177ms | 0.1216ms | 8.2223 KOps/s | 8.2205 KOps/s | |
test_creation_from_tensor | 4.1673ms | 0.1232ms | 8.1155 KOps/s | 8.3875 KOps/s | |
test_add_one[memmap_tensor0] | 0.2251ms | 8.0027μs | 124.9575 KOps/s | 133.2127 KOps/s | |
test_contiguous[memmap_tensor0] | 18.3140μs | 2.0929μs | 477.7989 KOps/s | 500.9327 KOps/s | |
test_stack[memmap_tensor0] | 50.8450μs | 5.9258μs | 168.7541 KOps/s | 179.8554 KOps/s | |
test_memmaptd_index | 1.2011ms | 0.4244ms | 2.3564 KOps/s | 2.4935 KOps/s | |
test_memmaptd_index_astensor | 1.0639ms | 0.5008ms | 1.9967 KOps/s | 2.0933 KOps/s | |
test_memmaptd_index_op | 1.8573ms | 1.0897ms | 917.7253 Ops/s | 984.1313 Ops/s | |
test_serialize_model | 0.1281s | 0.1211s | 8.2589 Ops/s | 7.4242 Ops/s | |
test_serialize_model_pickle | 0.4711s | 0.3881s | 2.5769 Ops/s | 2.5405 Ops/s | |
test_serialize_weights | 0.1313s | 0.1213s | 8.2471 Ops/s | 8.7053 Ops/s | |
test_serialize_weights_returnearly | 0.1774s | 0.1607s | 6.2244 Ops/s | 6.2154 Ops/s | |
test_serialize_weights_pickle | 0.4415s | 0.4019s | 2.4884 Ops/s | 2.4186 Ops/s | |
test_serialize_weights_filesystem | 0.1465s | 0.1416s | 7.0640 Ops/s | 6.5801 Ops/s | |
test_serialize_model_filesystem | 0.2288s | 0.1603s | 6.2378 Ops/s | 6.6309 Ops/s | |
test_reshape_pytree | 0.1031ms | 41.0704μs | 24.3484 KOps/s | 25.2112 KOps/s | |
test_reshape_td | 96.0500μs | 47.7815μs | 20.9286 KOps/s | 20.6710 KOps/s | |
test_view_pytree | 94.9480μs | 39.7903μs | 25.1317 KOps/s | 25.0476 KOps/s | |
test_view_td | 0.1070ms | 54.1267μs | 18.4752 KOps/s | 18.9574 KOps/s | |
test_unbind_pytree | 78.5570μs | 37.9453μs | 26.3538 KOps/s | 27.2752 KOps/s | |
test_unbind_td | 0.3646ms | 47.0479μs | 21.2550 KOps/s | 21.8943 KOps/s | |
test_split_pytree | 80.5700μs | 41.8738μs | 23.8813 KOps/s | 25.1560 KOps/s | |
test_split_td | 0.5653ms | 60.8105μs | 16.4445 KOps/s | 17.2782 KOps/s | |
test_add_pytree | 0.1250ms | 50.0850μs | 19.9661 KOps/s | 21.8675 KOps/s | |
test_add_td | 0.1890ms | 89.8703μs | 11.1272 KOps/s | 11.9204 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1217ms | 55.3685μs | 18.0608 KOps/s | 18.4384 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4200ms | 0.1934ms | 5.1713 KOps/s | 5.2596 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2098ms | 55.9354μs | 17.8778 KOps/s | 18.8698 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2963ms | 0.1487ms | 6.7263 KOps/s | 7.0572 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 66.2840μs | 20.9232μs | 47.7938 KOps/s | 48.1187 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1893ms | 65.2790μs | 15.3189 KOps/s | 15.7408 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1803ms | 79.7734μs | 12.5355 KOps/s | 12.7461 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1349ms | 70.0495μs | 14.2756 KOps/s | 14.2735 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2966ms | 0.1800ms | 5.5562 KOps/s | 5.7898 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2966ms | 0.1977ms | 5.0582 KOps/s | 5.1643 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1101ms | 40.2565μs | 24.8407 KOps/s | 26.2943 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 1.1929ms | 72.3399μs | 13.8236 KOps/s | 14.2006 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3034ms | 0.1773ms | 5.6393 KOps/s | 5.8604 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3842ms | 0.3038ms | 3.2912 KOps/s | 3.4493 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.2989ms | 0.2111ms | 4.7371 KOps/s | 4.8729 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2833ms | 0.1803ms | 5.5471 KOps/s | 5.7067 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1932ms | 63.7792μs | 15.6791 KOps/s | 15.8233 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 79.1590μs | 41.0246μs | 24.3756 KOps/s | 25.0798 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3642ms | 0.2513ms | 3.9795 KOps/s | 4.2061 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3010ms | 0.1780ms | 5.6187 KOps/s | 5.8169 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2701ms | 0.1101ms | 9.0828 KOps/s | 9.3687 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1192ms | 56.5721μs | 17.6766 KOps/s | 18.1291 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1531ms | 79.2012μs | 12.6261 KOps/s | 11.9595 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1380ms | 70.2825μs | 14.2283 KOps/s | 14.0584 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2950ms | 0.1918ms | 5.2150 KOps/s | 5.2640 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.8921ms | 1.6936ms | 590.4551 Ops/s | 613.2494 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3763ms | 0.1881ms | 5.3155 KOps/s | 5.3266 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.4652ms | 1.1297ms | 885.1958 Ops/s | 935.5714 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.6248ms | 0.4055ms | 2.4662 KOps/s | 2.4109 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.3552ms | 4.0027ms | 249.8329 Ops/s | 260.9257 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 91.5020μs | 33.2263μs | 30.0966 KOps/s | 30.6204 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.4968ms | 50.0832μs | 19.9668 KOps/s | 21.0963 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 86.2520μs | 29.6650μs | 33.7098 KOps/s | 36.9389 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 92.1730μs | 30.7161μs | 32.5562 KOps/s | 32.7913 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 64.4310μs | 29.8210μs | 33.5334 KOps/s | 36.7996 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 83.2860μs | 31.0768μs | 32.1783 KOps/s | 32.7270 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1963ms | 73.2297μs | 13.6557 KOps/s | 13.7970 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.7271ms | 28.9713μs | 34.5169 KOps/s | 36.0616 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1366ms | 69.3495μs | 14.4197 KOps/s | 14.9320 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 67.2360μs | 25.3465μs | 39.4532 KOps/s | 40.8123 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1310ms | 68.7588μs | 14.5436 KOps/s | 15.1598 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 72.6960μs | 25.0241μs | 39.9615 KOps/s | 39.7813 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1385ms | 73.6514μs | 13.5775 KOps/s | 13.9787 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0381ms | 29.0201μs | 34.4589 KOps/s | 36.4577 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1635ms | 68.9071μs | 14.5123 KOps/s | 14.6545 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1408ms | 25.5335μs | 39.1643 KOps/s | 40.5168 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1701ms | 69.3363μs | 14.4225 KOps/s | 14.9700 KOps/s | |
test_compile_indexing[int-pytree-eager] | 73.6760μs | 24.6729μs | 40.5303 KOps/s | 39.5369 KOps/s | |
test_mod_add[eager] | 76.8140μs | 27.0835μs | 36.9228 KOps/s | 40.0176 KOps/s | |
test_mod_add[compile] | 87.4340μs | 38.0226μs | 26.3002 KOps/s | 27.6859 KOps/s | |
test_mod_add[compile-overhead] | 81.1020μs | 37.3596μs | 26.7669 KOps/s | 27.3214 KOps/s | |
test_mod_wrap[eager] | 0.4138ms | 0.2143ms | 4.6659 KOps/s | 4.4457 KOps/s | |
test_mod_wrap[compile] | 2.0368ms | 0.2322ms | 4.3073 KOps/s | 4.2833 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4546ms | 0.2266ms | 4.4136 KOps/s | 4.2088 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.8402ms | 11.4367ms | 87.4379 Ops/s | 91.3787 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.6393ms | 11.6260ms | 86.0140 Ops/s | 92.0014 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 14.9062ms | 12.2259ms | 81.7937 Ops/s | 91.7629 Ops/s | |
test_seq_add[eager] | 0.1702ms | 92.2231μs | 10.8433 KOps/s | 11.5547 KOps/s | |
test_seq_add[compile] | 0.1560ms | 59.7094μs | 16.7478 KOps/s | 16.6438 KOps/s | |
test_seq_add[compile-overhead] | 0.1742ms | 59.7875μs | 16.7259 KOps/s | 16.8632 KOps/s | |
test_seq_wrap[eager] | 0.5076ms | 0.3840ms | 2.6044 KOps/s | 2.6840 KOps/s | |
test_seq_wrap[compile] | 0.5511ms | 0.2626ms | 3.8076 KOps/s | 3.7932 KOps/s | |
test_seq_wrap[compile-overhead] | 0.5157ms | 0.2630ms | 3.8025 KOps/s | 3.7904 KOps/s | |
test_func_call_runtime[False-eager] | 0.7349ms | 0.5507ms | 1.8159 KOps/s | 1.9015 KOps/s | |
test_func_call_runtime[False-compile] | 0.9221ms | 0.4982ms | 2.0074 KOps/s | 2.0418 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6113ms | 0.4929ms | 2.0288 KOps/s | 2.0451 KOps/s | |
test_func_call_runtime[True-eager] | 1.2247ms | 0.7743ms | 1.2914 KOps/s | 1.3263 KOps/s | |
test_func_call_runtime[True-compile] | 0.5936ms | 0.5094ms | 1.9631 KOps/s | 1.9936 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9091ms | 0.5112ms | 1.9562 KOps/s | 1.9680 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.5901ms | 0.5640ms | 1.7730 KOps/s | 1.9219 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.5994ms | 0.4930ms | 2.0282 KOps/s | 2.0502 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 1.0161ms | 0.4957ms | 2.0175 KOps/s | 2.0359 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1310ms | 0.8992ms | 1.1121 KOps/s | 1.1133 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1447ms | 0.8482ms | 1.1790 KOps/s | 1.1823 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0676ms | 0.8460ms | 1.1821 KOps/s | 1.1892 KOps/s | |
test_distributed | 0.2667ms | 0.1294ms | 7.7303 KOps/s | 7.4154 KOps/s | |
test_tdmodule | 93.4250μs | 18.2857μs | 54.6875 KOps/s | 57.9267 KOps/s | |
test_tdmodule_dispatch | 62.8180μs | 37.4287μs | 26.7175 KOps/s | 27.9936 KOps/s | |
test_tdseq | 35.6060μs | 20.0617μs | 49.8462 KOps/s | 52.3524 KOps/s | |
test_tdseq_dispatch | 84.7080μs | 41.8683μs | 23.8844 KOps/s | 24.9460 KOps/s | |
test_instantiation_functorch | 1.9260ms | 1.6675ms | 599.6831 Ops/s | 602.4616 Ops/s | |
test_instantiation_td | 1.9740ms | 1.2058ms | 829.3365 Ops/s | 827.4871 Ops/s | |
test_exec_functorch | 0.4057ms | 0.1843ms | 5.4260 KOps/s | 5.5778 KOps/s | |
test_exec_functional_call | 0.3280ms | 0.1737ms | 5.7556 KOps/s | 5.8513 KOps/s | |
test_exec_td | 0.2827ms | 0.1745ms | 5.7307 KOps/s | 5.7649 KOps/s | |
test_exec_td_decorator | 0.8630ms | 0.2285ms | 4.3754 KOps/s | 4.4504 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8393ms | 0.5939ms | 1.6839 KOps/s | 1.7322 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8583ms | 0.5863ms | 1.7056 KOps/s | 1.7624 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7927ms | 0.4878ms | 2.0499 KOps/s | 2.1271 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6706ms | 0.4882ms | 2.0484 KOps/s | 2.1316 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9491ms | 0.6457ms | 1.5488 KOps/s | 1.5985 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0580ms | 0.6451ms | 1.5501 KOps/s | 1.6083 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8574ms | 0.5314ms | 1.8819 KOps/s | 1.9345 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7380ms | 0.5316ms | 1.8811 KOps/s | 1.9440 KOps/s | |
test_to_module_speed[True] | 2.2819ms | 1.3249ms | 754.7476 Ops/s | 754.8753 Ops/s | |
test_to_module_speed[False] | 2.1091ms | 1.2911ms | 774.5321 Ops/s | 770.4757 Ops/s | |
test_tc_init | 77.6560μs | 46.1553μs | 21.6660 KOps/s | 22.0187 KOps/s | |
test_tc_init_nested | 0.1738ms | 94.8299μs | 10.5452 KOps/s | 10.9411 KOps/s | |
test_tc_first_layer_tensor | 23.8140μs | 1.4859μs | 672.9845 KOps/s | 690.6866 KOps/s | |
test_tc_first_layer_nontensor | 28.7830μs | 4.2970μs | 232.7203 KOps/s | 236.0247 KOps/s | |
test_tc_second_layer_tensor | 39.8640μs | 2.6909μs | 371.6186 KOps/s | 369.1007 KOps/s | |
test_tc_second_layer_nontensor | 29.0140μs | 5.5165μs | 181.2747 KOps/s | 181.5853 KOps/s | |
test_unbind | 0.4549s | 17.2802ms | 57.8698 Ops/s | 67.5783 Ops/s | |
test_full_like | 17.3406ms | 11.3495ms | 88.1096 Ops/s | 141.6104 Ops/s | |
test_zeros_like | 15.4088ms | 7.1767ms | 139.3399 Ops/s | 131.9552 Ops/s | |
test_ones_like | 12.1370ms | 7.4477ms | 134.2704 Ops/s | 128.3583 Ops/s | |
test_clone | 16.9052ms | 8.8800ms | 112.6130 Ops/s | 102.5625 Ops/s | |
test_squeeze | 73.4470μs | 13.4700μs | 74.2388 KOps/s | 76.6820 KOps/s | |
test_unsqueeze | 0.3276ms | 96.2925μs | 10.3850 KOps/s | 10.7905 KOps/s | |
test_split | 0.3920ms | 0.2039ms | 4.9050 KOps/s | 4.9867 KOps/s | |
test_permute | 0.3700ms | 0.2205ms | 4.5355 KOps/s | 4.5835 KOps/s | |
test_stack | 29.5706ms | 24.1686ms | 41.3760 Ops/s | 41.6282 Ops/s | |
test_cat | 29.5340ms | 24.2424ms | 41.2500 Ops/s | 41.5835 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.2985ms | 16.5736μs | 60.3371 KOps/s | 55.5278 KOps/s | |
test_plain_set_stack_nested | 34.9100μs | 16.6564μs | 60.0369 KOps/s | 54.9337 KOps/s | |
test_plain_set_nested_inplace | 0.2094ms | 17.8321μs | 56.0788 KOps/s | 52.2699 KOps/s | |
test_plain_set_stack_nested_inplace | 43.7110μs | 17.7074μs | 56.4734 KOps/s | 50.8548 KOps/s | |
test_items | 0.1936ms | 4.7056μs | 212.5136 KOps/s | 212.4430 KOps/s | |
test_items_nested | 0.5555ms | 0.3620ms | 2.7621 KOps/s | 2.6930 KOps/s | |
test_items_nested_locked | 0.5455ms | 0.3636ms | 2.7501 KOps/s | 2.6494 KOps/s | |
test_items_nested_leaf | 0.1123ms | 84.2212μs | 11.8735 KOps/s | 11.6745 KOps/s | |
test_items_stack_nested | 0.5583ms | 0.3606ms | 2.7731 KOps/s | 2.7039 KOps/s | |
test_items_stack_nested_leaf | 0.2812ms | 85.7132μs | 11.6668 KOps/s | 11.5298 KOps/s | |
test_items_stack_nested_locked | 0.5491ms | 0.3649ms | 2.7402 KOps/s | 2.7010 KOps/s | |
test_keys | 0.1883ms | 4.3702μs | 228.8200 KOps/s | 227.7542 KOps/s | |
test_keys_nested | 0.2577ms | 66.8962μs | 14.9485 KOps/s | 15.2018 KOps/s | |
test_keys_nested_locked | 0.9129ms | 72.3059μs | 13.8301 KOps/s | 13.8115 KOps/s | |
test_keys_nested_leaf | 74.9310μs | 57.2151μs | 17.4779 KOps/s | 17.4365 KOps/s | |
test_keys_stack_nested | 0.2620ms | 67.3925μs | 14.8385 KOps/s | 14.9438 KOps/s | |
test_keys_stack_nested_leaf | 0.2537ms | 57.6332μs | 17.3511 KOps/s | 17.1302 KOps/s | |
test_keys_stack_nested_locked | 0.2674ms | 72.7481μs | 13.7461 KOps/s | 13.6921 KOps/s | |
test_values | 63.9513μs | 1.7771μs | 562.7289 KOps/s | 561.0806 KOps/s | |
test_values_nested | 0.2169ms | 33.7256μs | 29.6510 KOps/s | 29.4816 KOps/s | |
test_values_nested_locked | 0.2204ms | 35.6307μs | 28.0657 KOps/s | 27.6198 KOps/s | |
test_values_nested_leaf | 52.7610μs | 30.0000μs | 33.3333 KOps/s | 32.7862 KOps/s | |
test_values_stack_nested | 0.2274ms | 34.4075μs | 29.0635 KOps/s | 29.1783 KOps/s | |
test_values_stack_nested_leaf | 0.2188ms | 30.3418μs | 32.9579 KOps/s | 32.5252 KOps/s | |
test_values_stack_nested_locked | 0.2231ms | 36.2816μs | 27.5622 KOps/s | 27.6465 KOps/s | |
test_membership | 1.2836μs | 0.5389μs | 1.8557 MOps/s | 1.7751 MOps/s | |
test_membership_nested | 0.1039ms | 1.9581μs | 510.6983 KOps/s | 511.4504 KOps/s | |
test_membership_nested_leaf | 99.9565μs | 1.9543μs | 511.6888 KOps/s | 504.1702 KOps/s | |
test_membership_stacked_nested | 26.9600μs | 2.0045μs | 498.8844 KOps/s | 504.0078 KOps/s | |
test_membership_stacked_nested_leaf | 17.6310μs | 2.0085μs | 497.8840 KOps/s | 497.3662 KOps/s | |
test_membership_nested_last | 0.1941ms | 2.9204μs | 342.4173 KOps/s | 340.4623 KOps/s | |
test_membership_nested_leaf_last | 16.1000μs | 2.9445μs | 339.6205 KOps/s | 340.0258 KOps/s | |
test_membership_stacked_nested_last | 27.6310μs | 2.9637μs | 337.4202 KOps/s | 343.4002 KOps/s | |
test_membership_stacked_nested_leaf_last | 0.2000ms | 2.9322μs | 341.0416 KOps/s | 341.7601 KOps/s | |
test_nested_getleaf | 0.1986ms | 7.9156μs | 126.3327 KOps/s | 126.4926 KOps/s | |
test_nested_get | 22.5800μs | 7.4479μs | 134.2667 KOps/s | 133.5551 KOps/s | |
test_stacked_getleaf | 0.2035ms | 7.9687μs | 125.4914 KOps/s | 125.6733 KOps/s | |
test_stacked_get | 21.5700μs | 7.4852μs | 133.5974 KOps/s | 135.2651 KOps/s | |
test_nested_getitemleaf | 0.1976ms | 8.1207μs | 123.1421 KOps/s | 121.8433 KOps/s | |
test_nested_getitem | 21.2100μs | 7.7050μs | 129.7862 KOps/s | 130.1232 KOps/s | |
test_stacked_getitemleaf | 30.7000μs | 8.2027μs | 121.9114 KOps/s | 121.9459 KOps/s | |
test_stacked_getitem | 0.2018ms | 7.7070μs | 129.7521 KOps/s | 130.0029 KOps/s | |
test_lock_nested | 1.2952ms | 0.4824ms | 2.0731 KOps/s | 2.0534 KOps/s | |
test_lock_stack_nested | 0.5010ms | 0.4440ms | 2.2525 KOps/s | 2.2398 KOps/s | |
test_unlock_nested | 0.9116ms | 0.4025ms | 2.4842 KOps/s | 2.4594 KOps/s | |
test_unlock_stack_nested | 0.5014ms | 0.3642ms | 2.7458 KOps/s | 2.7322 KOps/s | |
test_flatten_speed | 0.3118ms | 0.1068ms | 9.3661 KOps/s | 9.5279 KOps/s | |
test_unflatten_speed | 0.5123ms | 0.3190ms | 3.1352 KOps/s | 3.1369 KOps/s | |
test_common_ops | 1.7399ms | 1.4389ms | 694.9654 Ops/s | 691.7279 Ops/s | |
test_creation | 19.7110μs | 1.6578μs | 603.2191 KOps/s | 592.2792 KOps/s | |
test_creation_empty | 43.3610μs | 16.3219μs | 61.2673 KOps/s | 51.9830 KOps/s | |
test_creation_nested_1 | 35.7010μs | 18.6457μs | 53.6315 KOps/s | 47.5933 KOps/s | |
test_creation_nested_2 | 0.2233ms | 21.5330μs | 46.4403 KOps/s | 40.2115 KOps/s | |
test_clone | 54.0510μs | 32.8600μs | 30.4322 KOps/s | 30.1221 KOps/s | |
test_getitem[int] | 1.0586ms | 18.2880μs | 54.6807 KOps/s | 53.0161 KOps/s | |
test_getitem[slice_int] | 0.2430ms | 30.3968μs | 32.8982 KOps/s | 32.1034 KOps/s | |
test_getitem[range] | 0.2904ms | 0.1184ms | 8.4469 KOps/s | 8.5596 KOps/s | |
test_getitem[tuple] | 0.1444ms | 26.6580μs | 37.5122 KOps/s | 36.2181 KOps/s | |
test_getitem[list] | 0.4035ms | 0.1074ms | 9.3119 KOps/s | 8.7844 KOps/s | |
test_setitem_dim[int] | 78.2420μs | 55.8493μs | 17.9053 KOps/s | 15.7706 KOps/s | |
test_setitem_dim[slice_int] | 0.1021ms | 79.6506μs | 12.5548 KOps/s | 11.6263 KOps/s | |
test_setitem_dim[range] | 0.1671ms | 0.1432ms | 6.9857 KOps/s | 6.4248 KOps/s | |
test_setitem_dim[tuple] | 0.2227ms | 73.6526μs | 13.5773 KOps/s | 12.2162 KOps/s | |
test_setitem | 0.1914ms | 45.4787μs | 21.9883 KOps/s | 20.5316 KOps/s | |
test_set | 0.2389ms | 44.8307μs | 22.3062 KOps/s | 19.8443 KOps/s | |
test_set_shared | 0.3922ms | 56.5816μs | 17.6736 KOps/s | 17.2435 KOps/s | |
test_update | 0.2032ms | 53.3731μs | 18.7360 KOps/s | 17.4172 KOps/s | |
test_update_nested | 0.2590ms | 61.3139μs | 16.3095 KOps/s | 14.4105 KOps/s | |
test_update__nested | 0.2675ms | 65.6733μs | 15.2269 KOps/s | 14.2046 KOps/s | |
test_set_nested | 0.2478ms | 47.8081μs | 20.9170 KOps/s | 19.5774 KOps/s | |
test_set_nested_new | 0.2033ms | 51.8052μs | 19.3031 KOps/s | 17.5298 KOps/s | |
test_select | 0.2682ms | 66.2801μs | 15.0875 KOps/s | 13.7352 KOps/s | |
test_select_nested | 74.4010μs | 51.1689μs | 19.5431 KOps/s | 19.2942 KOps/s | |
test_exclude_nested | 0.2569ms | 69.9652μs | 14.2928 KOps/s | 14.3221 KOps/s | |
test_empty[True] | 0.4803ms | 0.2838ms | 3.5238 KOps/s | 3.4937 KOps/s | |
test_empty[False] | 19.2453μs | 0.8658μs | 1.1550 MOps/s | 1.1052 MOps/s | |
test_to | 49.1610μs | 28.4711μs | 35.1233 KOps/s | 35.4237 KOps/s | |
test_to_nonblocking | 0.2264ms | 28.3789μs | 35.2375 KOps/s | 37.3586 KOps/s | |
test_unbind_speed | 1.1815ms | 0.3165ms | 3.1591 KOps/s | 3.1614 KOps/s | |
test_unbind_speed_stack0 | 0.5105ms | 0.3169ms | 3.1553 KOps/s | 3.1718 KOps/s | |
test_unbind_speed_stack1 | 89.6224ms | 0.7871ms | 1.2704 KOps/s | 1.3758 KOps/s | |
test_split | 92.4675ms | 2.4759ms | 403.9010 Ops/s | 403.7185 Ops/s | |
test_chunk | 92.7088ms | 2.4837ms | 402.6261 Ops/s | 404.6127 Ops/s | |
test_creation[device0] | 0.2493ms | 0.1073ms | 9.3208 KOps/s | 9.3143 KOps/s | |
test_creation_from_tensor | 0.3158ms | 0.1047ms | 9.5501 KOps/s | 9.2820 KOps/s | |
test_add_one[memmap_tensor0] | 0.1288ms | 10.0660μs | 99.3444 KOps/s | 99.1214 KOps/s | |
test_contiguous[memmap_tensor0] | 0.1904ms | 2.3505μs | 425.4451 KOps/s | 421.8040 KOps/s | |
test_stack[memmap_tensor0] | 33.6510μs | 7.2983μs | 137.0184 KOps/s | 130.6959 KOps/s | |
test_memmaptd_index | 1.2428ms | 0.4770ms | 2.0966 KOps/s | 2.0861 KOps/s | |
test_memmaptd_index_astensor | 0.7904ms | 0.5431ms | 1.8412 KOps/s | 1.8512 KOps/s | |
test_memmaptd_index_op | 1.5526ms | 1.1397ms | 877.4285 Ops/s | 839.8908 Ops/s | |
test_serialize_model | 93.7375ms | 89.2760ms | 11.2012 Ops/s | 10.8196 Ops/s | |
test_serialize_model_pickle | 1.3491s | 1.2362s | 0.8089 Ops/s | 0.8083 Ops/s | |
test_serialize_weights | 0.1826s | 96.6039ms | 10.3515 Ops/s | 9.7019 Ops/s | |
test_serialize_weights_returnearly | 0.2732s | 67.8955ms | 14.7285 Ops/s | 14.9354 Ops/s | |
test_serialize_weights_pickle | 1.3550s | 1.2370s | 0.8084 Ops/s | 0.8029 Ops/s | |
test_reshape_pytree | 0.1197ms | 39.8769μs | 25.0772 KOps/s | 24.4544 KOps/s | |
test_reshape_td | 0.1482ms | 50.0326μs | 19.9870 KOps/s | 20.9994 KOps/s | |
test_view_pytree | 85.0420μs | 41.0625μs | 24.3531 KOps/s | 24.7627 KOps/s | |
test_view_td | 0.1963ms | 51.9561μs | 19.2470 KOps/s | 18.7668 KOps/s | |
test_unbind_pytree | 0.1782ms | 39.4786μs | 25.3302 KOps/s | 25.0898 KOps/s | |
test_unbind_td | 0.3942ms | 48.4505μs | 20.6396 KOps/s | 20.5387 KOps/s | |
test_split_pytree | 0.2301ms | 55.1450μs | 18.1340 KOps/s | 18.6660 KOps/s | |
test_split_td | 0.5602ms | 64.4068μs | 15.5263 KOps/s | 15.7340 KOps/s | |
test_add_pytree | 0.2356ms | 66.7445μs | 14.9825 KOps/s | 15.9179 KOps/s | |
test_add_td | 0.2743ms | 98.7287μs | 10.1288 KOps/s | 9.3595 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4234ms | 0.2238ms | 4.4681 KOps/s | 4.4062 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3220ms | 0.1788ms | 5.5918 KOps/s | 5.5808 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.3373ms | 0.1554ms | 6.4332 KOps/s | 6.4438 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3841ms | 0.2025ms | 4.9371 KOps/s | 4.7727 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.2203ms | 23.3914μs | 42.7508 KOps/s | 41.8699 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 87.3710μs | 50.0557μs | 19.9778 KOps/s | 19.8509 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2791ms | 74.5599μs | 13.4120 KOps/s | 13.4291 KOps/s | |
test_compile_copy_nested[pytree-eager] | 85.9410μs | 59.8062μs | 16.7207 KOps/s | 16.7698 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4744ms | 0.3438ms | 2.9091 KOps/s | 2.8936 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3701ms | 0.2257ms | 4.4316 KOps/s | 4.4517 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2899ms | 0.1386ms | 7.2143 KOps/s | 6.9034 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2058ms | 64.7664μs | 15.4401 KOps/s | 15.2622 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4503ms | 0.3428ms | 2.9174 KOps/s | 2.9022 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8195ms | 0.6725ms | 1.4869 KOps/s | 1.4398 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4203ms | 0.2740ms | 3.6492 KOps/s | 3.6039 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4694ms | 0.3481ms | 2.8724 KOps/s | 2.8771 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2306ms | 76.0074μs | 13.1566 KOps/s | 13.0137 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2878ms | 0.1385ms | 7.2203 KOps/s | 6.8739 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.7200ms | 0.5741ms | 1.7418 KOps/s | 1.6861 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4519ms | 0.3432ms | 2.9140 KOps/s | 2.9016 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1613ms | 20.3449μs | 49.1523 KOps/s | 48.7364 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 64.4810μs | 32.5812μs | 30.6925 KOps/s | 30.4628 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1842ms | 77.0855μs | 12.9726 KOps/s | 12.8907 KOps/s | |
test_compile_copy_flat[pytree-eager] | 93.1320μs | 60.0547μs | 16.6515 KOps/s | 16.4629 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.5072ms | 0.8737ms | 1.1446 KOps/s | 1.0632 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.7284ms | 3.5239ms | 283.7751 Ops/s | 281.1193 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.5108ms | 0.8612ms | 1.1612 KOps/s | 1.0682 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.9175ms | 3.6066ms | 277.2728 Ops/s | 276.1444 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2632ms | 0.1195ms | 8.3662 KOps/s | 8.0101 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.3107ms | 66.8198μs | 14.9656 KOps/s | 14.3023 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2723ms | 0.1159ms | 8.6295 KOps/s | 8.6640 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2352ms | 51.4709μs | 19.4285 KOps/s | 20.5001 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2801ms | 0.1155ms | 8.6605 KOps/s | 8.8064 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2409ms | 51.2303μs | 19.5197 KOps/s | 20.2853 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.3695ms | 0.1524ms | 6.5599 KOps/s | 6.5622 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2396ms | 29.3606μs | 34.0592 KOps/s | 35.0555 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2891ms | 0.1407ms | 7.1095 KOps/s | 6.8271 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1586ms | 23.6932μs | 42.2062 KOps/s | 40.7193 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.3015ms | 0.1405ms | 7.1167 KOps/s | 6.8680 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.2254ms | 23.8495μs | 41.9296 KOps/s | 41.5179 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.4086ms | 0.1514ms | 6.6032 KOps/s | 6.4556 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5075ms | 27.5010μs | 36.3623 KOps/s | 35.4150 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2957ms | 0.1407ms | 7.1069 KOps/s | 7.0473 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1350ms | 25.1764μs | 39.7197 KOps/s | 40.8442 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.3200ms | 0.1465ms | 6.8245 KOps/s | 7.1011 KOps/s | |
test_compile_indexing[int-pytree-eager] | 53.8500μs | 25.3930μs | 39.3809 KOps/s | 41.6349 KOps/s | |
test_mod_add[eager] | 0.2093ms | 35.4128μs | 28.2384 KOps/s | 27.8625 KOps/s | |
test_mod_add[compile] | 0.2072ms | 74.9678μs | 13.3391 KOps/s | 13.1028 KOps/s | |
test_mod_add[compile-overhead] | 0.2707ms | 0.1425ms | 7.0183 KOps/s | 6.4285 KOps/s | |
test_mod_wrap[eager] | 0.4012ms | 0.2489ms | 4.0180 KOps/s | 3.8881 KOps/s | |
test_mod_wrap[compile] | 1.2326ms | 0.3070ms | 3.2578 KOps/s | 3.1664 KOps/s | |
test_mod_wrap[compile-overhead] | 8.4132ms | 4.3819ms | 228.2117 Ops/s | 231.6199 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5631ms | 1.3902ms | 719.3127 Ops/s | 704.6603 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5439ms | 1.3892ms | 719.8564 Ops/s | 662.5210 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3375ms | 0.9270ms | 1.0787 KOps/s | 966.9465 Ops/s | |
test_seq_add[eager] | 0.2529ms | 0.1043ms | 9.5914 KOps/s | 9.2046 KOps/s | |
test_seq_add[compile] | 0.2347ms | 87.3096μs | 11.4535 KOps/s | 11.0183 KOps/s | |
test_seq_add[compile-overhead] | 0.2659ms | 0.1246ms | 8.0261 KOps/s | 7.9537 KOps/s | |
test_seq_wrap[eager] | 0.5615ms | 0.4048ms | 2.4705 KOps/s | 2.4129 KOps/s | |
test_seq_wrap[compile] | 0.5138ms | 0.3349ms | 2.9860 KOps/s | 2.9863 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3822ms | 0.2422ms | 4.1280 KOps/s | 4.1108 KOps/s | |
test_func_call_runtime[False-eager] | 0.9049ms | 0.7605ms | 1.3149 KOps/s | 1.2982 KOps/s | |
test_func_call_runtime[False-compile] | 1.0351ms | 0.8683ms | 1.1516 KOps/s | 1.1880 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4722ms | 0.3850ms | 2.5972 KOps/s | 2.5764 KOps/s | |
test_func_call_runtime[True-eager] | 1.1765ms | 0.9584ms | 1.0434 KOps/s | 1.0407 KOps/s | |
test_func_call_runtime[True-compile] | 1.0188ms | 0.8782ms | 1.1387 KOps/s | 1.1280 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6080ms | 0.4315ms | 2.3176 KOps/s | 2.3132 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9634ms | 0.7907ms | 1.2647 KOps/s | 1.3024 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0574ms | 0.8470ms | 1.1806 KOps/s | 1.1804 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5384ms | 0.3879ms | 2.5779 KOps/s | 2.5459 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2330ms | 1.0636ms | 940.2349 Ops/s | 927.3585 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1852ms | 1.0387ms | 962.7831 Ops/s | 947.7602 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1877ms | 1.0315ms | 969.4998 Ops/s | 946.4511 Ops/s | |
test_distributed | 1.2916ms | 71.7629μs | 13.9348 KOps/s | 14.6363 KOps/s | |
test_tdmodule | 69.8810μs | 16.0064μs | 62.4751 KOps/s | 55.8470 KOps/s | |
test_tdmodule_dispatch | 53.6600μs | 31.8687μs | 31.3788 KOps/s | 27.9016 KOps/s | |
test_tdseq | 33.3310μs | 16.5918μs | 60.2708 KOps/s | 54.0815 KOps/s | |
test_tdseq_dispatch | 50.7410μs | 33.1169μs | 30.1961 KOps/s | 26.5467 KOps/s | |
test_instantiation_functorch | 2.2553ms | 2.0908ms | 478.2768 Ops/s | 478.6613 Ops/s | |
test_instantiation_td | 2.0537ms | 1.3550ms | 737.9967 Ops/s | 737.9661 Ops/s | |
test_exec_functorch | 0.3749ms | 0.2355ms | 4.2455 KOps/s | 4.2760 KOps/s | |
test_exec_functional_call | 0.4148ms | 0.2393ms | 4.1784 KOps/s | 4.4427 KOps/s | |
test_exec_td | 0.4373ms | 0.2437ms | 4.1032 KOps/s | 4.2906 KOps/s | |
test_exec_td_decorator | 0.6028ms | 0.2874ms | 3.4790 KOps/s | 3.4770 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8229ms | 0.6610ms | 1.5129 KOps/s | 1.4897 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8804ms | 0.6877ms | 1.4542 KOps/s | 1.5008 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7819ms | 0.6091ms | 1.6417 KOps/s | 1.7231 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7695ms | 0.6105ms | 1.6379 KOps/s | 1.7231 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3559ms | 0.7125ms | 1.4034 KOps/s | 1.3838 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9632ms | 0.7391ms | 1.3530 KOps/s | 1.3835 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8326ms | 0.6491ms | 1.5407 KOps/s | 1.6051 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7998ms | 0.6310ms | 1.5847 KOps/s | 1.5997 KOps/s | |
test_vmap_transformer_speed[True-True] | 9.3381ms | 8.8883ms | 112.5076 Ops/s | 111.7936 Ops/s | |
test_vmap_transformer_speed[True-False] | 9.2134ms | 8.8588ms | 112.8820 Ops/s | 112.5895 Ops/s | |
test_vmap_transformer_speed[False-True] | 9.1413ms | 8.7763ms | 113.9428 Ops/s | 113.3041 Ops/s | |
test_vmap_transformer_speed[False-False] | 9.2636ms | 8.7850ms | 113.8299 Ops/s | 113.7433 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 22.0769ms | 21.0437ms | 47.5202 Ops/s | 47.8268 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 0.1984s | 24.7288ms | 40.4387 Ops/s | 47.6432 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 21.8742ms | 20.8787ms | 47.8958 Ops/s | 48.0046 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.7712ms | 20.8365ms | 47.9927 Ops/s | 47.9871 Ops/s | |
test_to_module_speed[True] | 1.6815ms | 1.1513ms | 868.5457 Ops/s | 866.5182 Ops/s | |
test_to_module_speed[False] | 1.6016ms | 1.1216ms | 891.5949 Ops/s | 889.7193 Ops/s | |
test_tc_init | 0.1351ms | 37.7610μs | 26.4823 KOps/s | 23.7695 KOps/s | |
test_tc_init_nested | 0.2044ms | 78.6685μs | 12.7116 KOps/s | 11.8059 KOps/s | |
test_tc_first_layer_tensor | 3.5502μs | 0.7794μs | 1.2831 MOps/s | 1.2756 MOps/s | |
test_tc_first_layer_nontensor | 16.8100μs | 2.5558μs | 391.2723 KOps/s | 397.0708 KOps/s | |
test_tc_second_layer_tensor | 25.9900μs | 1.7052μs | 586.4248 KOps/s | 618.7240 KOps/s | |
test_tc_second_layer_nontensor | 27.6210μs | 3.4540μs | 289.5202 KOps/s | 296.8910 KOps/s | |
test_unbind | 0.1819s | 12.9144ms | 77.4328 Ops/s | 84.2086 Ops/s | |
test_full_like | 0.7558ms | 0.5782ms | 1.7296 KOps/s | 1.7346 KOps/s | |
test_zeros_like | 0.3406ms | 0.1979ms | 5.0525 KOps/s | 5.0591 KOps/s | |
test_ones_like | 0.3506ms | 0.1979ms | 5.0524 KOps/s | 5.0608 KOps/s | |
test_clone | 0.5628ms | 0.4153ms | 2.4080 KOps/s | 2.4138 KOps/s | |
test_squeeze | 0.1457ms | 11.2988μs | 88.5047 KOps/s | 89.2232 KOps/s | |
test_unsqueeze | 0.2814ms | 82.9407μs | 12.0568 KOps/s | 12.0587 KOps/s | |
test_split | 0.2948ms | 0.1805ms | 5.5404 KOps/s | 5.5363 KOps/s | |
test_permute | 0.3108ms | 0.1972ms | 5.0704 KOps/s | 5.1269 KOps/s | |
test_stack | 1.3791ms | 0.9141ms | 1.0940 KOps/s | 1.0905 KOps/s | |
test_cat | 1.2743ms | 1.2314ms | 812.0684 Ops/s | 811.6571 Ops/s |
vmoens added a commit that referenced this pull request on Aug 9, 2024
ghstack-source-id: 5bf45ecbe91cc5172d67f33761a3cb6e4a0e5fb2
Pull Request resolved: #955
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Stack from ghstack (oldest at bottom):