[Feature] Non-blocking for consolidated TD #1020
Merged
This was referenced Oct 2, 2024
facebook-github-bot added the CLA Signed label on Oct 2, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 52.6380μs | 26.0287μs | 38.4192 KOps/s | 40.6917 KOps/s | |
test_plain_set_stack_nested | 57.8880μs | 26.2237μs | 38.1335 KOps/s | 40.4269 KOps/s | |
test_plain_set_nested_inplace | 62.8380μs | 28.5825μs | 34.9865 KOps/s | 37.0205 KOps/s | |
test_plain_set_stack_nested_inplace | 67.0860μs | 28.2036μs | 35.4564 KOps/s | 36.5039 KOps/s | |
test_items | 28.7540μs | 4.2781μs | 233.7487 KOps/s | 229.8629 KOps/s | |
test_items_nested | 0.5258ms | 0.3806ms | 2.6274 KOps/s | 2.5650 KOps/s | |
test_items_nested_locked | 0.7185ms | 0.3853ms | 2.5951 KOps/s | 2.5656 KOps/s | |
test_items_nested_leaf | 0.1471ms | 80.9546μs | 12.3526 KOps/s | 12.1977 KOps/s | |
test_items_stack_nested | 0.8131ms | 0.3890ms | 2.5709 KOps/s | 2.5545 KOps/s | |
test_items_stack_nested_leaf | 0.1494ms | 84.8068μs | 11.7915 KOps/s | 11.7265 KOps/s | |
test_items_stack_nested_locked | 0.7096ms | 0.3906ms | 2.5604 KOps/s | 2.5619 KOps/s | |
test_keys | 28.6330μs | 3.5929μs | 278.3267 KOps/s | 281.1507 KOps/s | |
test_keys_nested | 0.2293ms | 0.1368ms | 7.3077 KOps/s | 7.2484 KOps/s | |
test_keys_nested_locked | 0.7032ms | 0.1429ms | 6.9983 KOps/s | 6.9209 KOps/s | |
test_keys_nested_leaf | 0.1957ms | 0.1197ms | 8.3519 KOps/s | 8.2171 KOps/s | |
test_keys_stack_nested | 0.2760ms | 0.1373ms | 7.2838 KOps/s | 7.2415 KOps/s | |
test_keys_stack_nested_leaf | 0.2079ms | 0.1200ms | 8.3351 KOps/s | 8.3472 KOps/s | |
test_keys_stack_nested_locked | 0.3405ms | 0.1439ms | 6.9472 KOps/s | 6.9497 KOps/s | |
test_values | 5.9972μs | 1.0577μs | 945.4583 KOps/s | 919.1975 KOps/s | |
test_values_nested | 0.1773ms | 94.0094μs | 10.6372 KOps/s | 10.5050 KOps/s | |
test_values_nested_locked | 0.1580ms | 92.7870μs | 10.7774 KOps/s | 10.4981 KOps/s | |
test_values_nested_leaf | 0.1757ms | 79.2858μs | 12.6126 KOps/s | 12.3476 KOps/s | |
test_values_stack_nested | 0.1449ms | 94.0777μs | 10.6295 KOps/s | 10.4799 KOps/s | |
test_values_stack_nested_leaf | 0.1554ms | 79.5996μs | 12.5629 KOps/s | 12.5748 KOps/s | |
test_values_stack_nested_locked | 0.1576ms | 92.7963μs | 10.7763 KOps/s | 10.5382 KOps/s | |
test_membership | 6.1111μs | 0.7381μs | 1.3548 MOps/s | 1.3632 MOps/s | |
test_membership_nested | 18.9050μs | 2.7518μs | 363.4006 KOps/s | 363.8772 KOps/s | |
test_membership_nested_leaf | 30.6170μs | 2.7850μs | 359.0662 KOps/s | 360.5264 KOps/s | |
test_membership_stacked_nested | 26.4690μs | 2.7764μs | 360.1734 KOps/s | 360.4654 KOps/s | |
test_membership_stacked_nested_leaf | 23.4940μs | 2.8086μs | 356.0519 KOps/s | 345.5350 KOps/s | |
test_membership_nested_last | 45.8680μs | 4.2825μs | 233.5079 KOps/s | 237.9729 KOps/s | |
test_membership_nested_leaf_last | 98.3230μs | 4.3631μs | 229.1928 KOps/s | 238.5868 KOps/s | |
test_membership_stacked_nested_last | 33.5720μs | 4.2633μs | 234.5589 KOps/s | 169.5345 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.9610μs | 4.2871μs | 233.2566 KOps/s | 169.9251 KOps/s | |
test_nested_getleaf | 38.1020μs | 10.8945μs | 91.7892 KOps/s | 92.8809 KOps/s | |
test_nested_get | 36.4580μs | 10.5377μs | 94.8972 KOps/s | 98.2193 KOps/s | |
test_stacked_getleaf | 45.4150μs | 10.9887μs | 91.0025 KOps/s | 94.3366 KOps/s | |
test_stacked_get | 36.9490μs | 10.4774μs | 95.4433 KOps/s | 98.3367 KOps/s | |
test_nested_getitemleaf | 37.5610μs | 11.4004μs | 87.7163 KOps/s | 89.3315 KOps/s | |
test_nested_getitem | 40.7460μs | 10.3505μs | 96.6136 KOps/s | 96.0038 KOps/s | |
test_stacked_getitemleaf | 35.8670μs | 11.4276μs | 87.5075 KOps/s | 88.8264 KOps/s | |
test_stacked_getitem | 36.0570μs | 10.2864μs | 97.2157 KOps/s | 95.8853 KOps/s | |
test_lock_nested | 85.5422ms | 0.5979ms | 1.6725 KOps/s | 1.9676 KOps/s | |
test_lock_stack_nested | 0.7483ms | 0.4849ms | 2.0622 KOps/s | 2.1296 KOps/s | |
test_unlock_nested | 85.8721ms | 0.5128ms | 1.9501 KOps/s | 2.3770 KOps/s | |
test_unlock_stack_nested | 0.7134ms | 0.3960ms | 2.5250 KOps/s | 2.6108 KOps/s | |
test_flatten_speed | 0.2063ms | 0.1018ms | 9.8190 KOps/s | 9.8980 KOps/s | |
test_unflatten_speed | 0.7012ms | 0.5306ms | 1.8846 KOps/s | 1.9174 KOps/s | |
test_common_ops | 6.3955ms | 1.2181ms | 820.9513 Ops/s | 862.7404 Ops/s | |
test_creation | 36.0880μs | 2.0935μs | 477.6626 KOps/s | 469.3700 KOps/s | |
test_creation_empty | 51.3560μs | 20.3174μs | 49.2190 KOps/s | 54.8212 KOps/s | |
test_creation_nested_1 | 74.6900μs | 24.1624μs | 41.3867 KOps/s | 45.5840 KOps/s | |
test_creation_nested_2 | 71.1840μs | 28.0966μs | 35.5915 KOps/s | 37.1035 KOps/s | |
test_clone | 0.1461ms | 17.5250μs | 57.0614 KOps/s | 58.0671 KOps/s | |
test_getitem[int] | 1.1730ms | 17.5475μs | 56.9881 KOps/s | 59.7277 KOps/s | |
test_getitem[slice_int] | 0.1571ms | 33.0245μs | 30.2805 KOps/s | 32.1597 KOps/s | |
test_getitem[range] | 0.1877ms | 59.0228μs | 16.9426 KOps/s | 16.8939 KOps/s | |
test_getitem[tuple] | 0.1450ms | 26.0494μs | 38.3885 KOps/s | 39.3414 KOps/s | |
test_getitem[list] | 0.1645ms | 54.8637μs | 18.2270 KOps/s | 18.2725 KOps/s | |
test_setitem_dim[int] | 66.8450μs | 34.0141μs | 29.3995 KOps/s | 29.0570 KOps/s | |
test_setitem_dim[slice_int] | 0.1133ms | 62.6627μs | 15.9584 KOps/s | 15.8745 KOps/s | |
test_setitem_dim[range] | 0.1446ms | 85.6468μs | 11.6759 KOps/s | 11.4693 KOps/s | |
test_setitem_dim[tuple] | 87.4130μs | 50.1390μs | 19.9446 KOps/s | 19.8189 KOps/s | |
test_setitem | 92.3130μs | 32.4241μs | 30.8412 KOps/s | 33.1869 KOps/s | |
test_set | 76.3020μs | 31.2186μs | 32.0321 KOps/s | 33.9443 KOps/s | |
test_set_shared | 3.8809ms | 0.2220ms | 4.5042 KOps/s | 4.5006 KOps/s | |
test_update | 0.1403ms | 40.8770μs | 24.4636 KOps/s | 26.5258 KOps/s | |
test_update_nested | 0.1943ms | 51.9247μs | 19.2587 KOps/s | 20.3695 KOps/s | |
test_update__nested | 1.0118ms | 38.1497μs | 26.2125 KOps/s | 26.6678 KOps/s | |
test_set_nested | 83.0040μs | 35.2756μs | 28.3482 KOps/s | 30.3337 KOps/s | |
test_set_nested_new | 96.5300μs | 40.1146μs | 24.9286 KOps/s | 26.2681 KOps/s | |
test_select | 0.2446ms | 57.4027μs | 17.4208 KOps/s | 17.9137 KOps/s | |
test_select_nested | 0.1214ms | 60.3048μs | 16.5824 KOps/s | 16.7976 KOps/s | |
test_exclude_nested | 0.1673ms | 75.9072μs | 13.1740 KOps/s | 13.2284 KOps/s | |
test_empty[True] | 0.6688ms | 0.3620ms | 2.7622 KOps/s | 2.7594 KOps/s | |
test_empty[False] | 9.6102μs | 1.2536μs | 797.7187 KOps/s | 723.7633 KOps/s | |
test_unbind_speed | 0.5191ms | 0.3080ms | 3.2472 KOps/s | 3.2847 KOps/s | |
test_unbind_speed_stack0 | 0.9010ms | 0.3087ms | 3.2399 KOps/s | 3.4468 KOps/s | |
test_unbind_speed_stack1 | 90.1346ms | 0.8208ms | 1.2183 KOps/s | 1.3794 KOps/s | |
test_split | 3.2966ms | 2.0163ms | 495.9555 Ops/s | 455.6154 Ops/s | |
test_chunk | 94.9570ms | 2.2056ms | 453.3907 Ops/s | 451.2668 Ops/s | |
test_creation[device0] | 0.2748ms | 0.1192ms | 8.3878 KOps/s | 8.2238 KOps/s | |
test_creation_from_tensor | 4.0891ms | 0.1217ms | 8.2147 KOps/s | 8.4289 KOps/s | |
test_add_one[memmap_tensor0] | 0.6369ms | 7.1150μs | 140.5490 KOps/s | 132.4672 KOps/s | |
test_contiguous[memmap_tensor0] | 20.3380μs | 1.8728μs | 533.9637 KOps/s | 517.3099 KOps/s | |
test_stack[memmap_tensor0] | 29.3550μs | 5.6265μs | 177.7311 KOps/s | 174.0301 KOps/s | |
test_memmaptd_index | 1.2025ms | 0.4118ms | 2.4284 KOps/s | 2.3429 KOps/s | |
test_memmaptd_index_astensor | 1.0297ms | 0.5124ms | 1.9517 KOps/s | 1.8891 KOps/s | |
test_memmaptd_index_op | 2.2664ms | 1.1027ms | 906.8953 Ops/s | 942.4719 Ops/s | |
test_serialize_model | 0.1233s | 0.1179s | 8.4826 Ops/s | 8.5199 Ops/s | |
test_serialize_model_pickle | 0.4269s | 0.4006s | 2.4962 Ops/s | 2.5256 Ops/s | |
test_serialize_weights | 0.1311s | 0.1170s | 8.5464 Ops/s | 8.7400 Ops/s | |
test_serialize_weights_returnearly | 0.2814s | 0.1722s | 5.8061 Ops/s | 5.5995 Ops/s | |
test_serialize_weights_pickle | 1.0524s | 0.6660s | 1.5015 Ops/s | 2.3733 Ops/s | |
test_serialize_weights_filesystem | 0.1500s | 0.1407s | 7.1092 Ops/s | 7.1684 Ops/s | |
test_serialize_model_filesystem | 0.1513s | 0.1411s | 7.0847 Ops/s | 6.7164 Ops/s | |
test_reshape_pytree | 0.1168ms | 40.8407μs | 24.4854 KOps/s | 25.4148 KOps/s | |
test_reshape_td | 0.1084ms | 46.6729μs | 21.4257 KOps/s | 20.9636 KOps/s | |
test_view_pytree | 0.1035ms | 40.3886μs | 24.7595 KOps/s | 25.3585 KOps/s | |
test_view_td | 98.0240μs | 52.3517μs | 19.1016 KOps/s | 18.7751 KOps/s | |
test_unbind_pytree | 70.8730μs | 37.4719μs | 26.6867 KOps/s | 27.6753 KOps/s | |
test_unbind_td | 0.3159ms | 46.6561μs | 21.4334 KOps/s | 22.0447 KOps/s | |
test_split_pytree | 84.0770μs | 39.7916μs | 25.1310 KOps/s | 26.1449 KOps/s | |
test_split_td | 84.2913ms | 69.4311μs | 14.4028 KOps/s | 17.0024 KOps/s | |
test_add_pytree | 0.1085ms | 45.4481μs | 22.0031 KOps/s | 21.4160 KOps/s | |
test_add_td | 0.1844ms | 93.5935μs | 10.6845 KOps/s | 11.4399 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1046ms | 57.1702μs | 17.4916 KOps/s | 16.9021 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3861ms | 0.1999ms | 5.0021 KOps/s | 4.8893 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1203ms | 55.8066μs | 17.9190 KOps/s | 17.3490 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3248ms | 0.1420ms | 7.0402 KOps/s | 6.9650 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 76.4230μs | 23.0956μs | 43.2983 KOps/s | 43.3169 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1467ms | 74.3684μs | 13.4466 KOps/s | 12.9084 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1528ms | 76.7800μs | 13.0242 KOps/s | 13.1338 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1358ms | 68.3473μs | 14.6312 KOps/s | 14.4388 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3989ms | 0.1806ms | 5.5370 KOps/s | 5.4905 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.5059ms | 0.2391ms | 4.1831 KOps/s | 4.0344 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1143ms | 47.8461μs | 20.9003 KOps/s | 20.2525 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1777ms | 77.9574μs | 12.8275 KOps/s | 12.7105 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3649ms | 0.1753ms | 5.7048 KOps/s | 5.6905 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6371ms | 0.2887ms | 3.4633 KOps/s | 3.3414 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5340ms | 0.2729ms | 3.6642 KOps/s | 3.5408 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5386ms | 0.1831ms | 5.4624 KOps/s | 5.4063 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1594ms | 74.7201μs | 13.3833 KOps/s | 13.2014 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1161ms | 48.8297μs | 20.4793 KOps/s | 20.0060 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4472ms | 0.2330ms | 4.2918 KOps/s | 4.1718 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2868ms | 0.1760ms | 5.6830 KOps/s | 5.7491 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2427ms | 0.1112ms | 8.9962 KOps/s | 9.0368 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1567ms | 77.2072μs | 12.9522 KOps/s | 12.5735 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1440ms | 77.3859μs | 12.9222 KOps/s | 12.7017 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1502ms | 69.5137μs | 14.3856 KOps/s | 14.1095 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3360ms | 0.1931ms | 5.1780 KOps/s | 5.1823 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9584ms | 1.7360ms | 576.0494 Ops/s | 564.3420 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3808ms | 0.1925ms | 5.1954 KOps/s | 5.1951 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3292ms | 1.0904ms | 917.0850 Ops/s | 878.7278 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.6738ms | 0.4134ms | 2.4190 KOps/s | 2.3897 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.7154ms | 4.2808ms | 233.5998 Ops/s | 240.0512 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 81.9030μs | 33.7374μs | 29.6407 KOps/s | 28.3522 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.9220ms | 50.5278μs | 19.7911 KOps/s | 20.2785 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 73.6570μs | 30.4504μs | 32.8403 KOps/s | 33.2482 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 68.4170μs | 29.5584μs | 33.8313 KOps/s | 34.0388 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 76.5030μs | 30.1312μs | 33.1882 KOps/s | 33.0106 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 95.6990μs | 29.5206μs | 33.8747 KOps/s | 34.1011 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1811ms | 73.7724μs | 13.5552 KOps/s | 13.3520 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.6452ms | 28.7052μs | 34.8369 KOps/s | 34.2855 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1214ms | 68.9938μs | 14.4941 KOps/s | 14.6381 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 63.2780μs | 23.7056μs | 42.1842 KOps/s | 41.9194 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1253ms | 68.4042μs | 14.6190 KOps/s | 14.5806 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 63.9790μs | 23.6004μs | 42.3722 KOps/s | 42.8127 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1376ms | 73.4189μs | 13.6205 KOps/s | 13.3620 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.2070ms | 28.7648μs | 34.7647 KOps/s | 34.1354 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1607ms | 69.0793μs | 14.4761 KOps/s | 14.5963 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 64.7000μs | 23.3906μs | 42.7523 KOps/s | 41.7736 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1330ms | 67.3923μs | 14.8385 KOps/s | 14.7102 KOps/s | |
test_compile_indexing[int-pytree-eager] | 69.2900μs | 23.3515μs | 42.8237 KOps/s | 42.6733 KOps/s | |
test_mod_add[eager] | 98.2540μs | 28.0360μs | 35.6684 KOps/s | 38.6727 KOps/s | |
test_mod_add[compile] | 0.1109ms | 38.3592μs | 26.0694 KOps/s | 26.0929 KOps/s | |
test_mod_add[compile-overhead] | 87.9840μs | 38.5922μs | 25.9120 KOps/s | 26.0052 KOps/s | |
test_mod_wrap[eager] | 0.4041ms | 0.2089ms | 4.7859 KOps/s | 4.5880 KOps/s | |
test_mod_wrap[compile] | 0.3089ms | 0.2288ms | 4.3710 KOps/s | 4.3391 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3091ms | 0.2260ms | 4.4241 KOps/s | 4.4013 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.3277ms | 10.7536ms | 92.9917 Ops/s | 93.5066 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.6167ms | 11.0992ms | 90.0964 Ops/s | 82.4422 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.7725ms | 11.3706ms | 87.9462 Ops/s | 82.6547 Ops/s | |
test_seq_add[eager] | 0.1644ms | 94.3135μs | 10.6029 KOps/s | 10.3084 KOps/s | |
test_seq_add[compile] | 0.1248ms | 64.1098μs | 15.5982 KOps/s | 14.8683 KOps/s | |
test_seq_add[compile-overhead] | 0.1306ms | 63.2830μs | 15.8020 KOps/s | 15.6746 KOps/s | |
test_seq_wrap[eager] | 0.5971ms | 0.3950ms | 2.5317 KOps/s | 2.5327 KOps/s | |
test_seq_wrap[compile] | 1.1867ms | 0.2676ms | 3.7374 KOps/s | 3.7205 KOps/s | |
test_seq_wrap[compile-overhead] | 1.2183ms | 0.2665ms | 3.7528 KOps/s | 3.5977 KOps/s | |
test_func_call_runtime[False-eager] | 0.9956ms | 0.5370ms | 1.8624 KOps/s | 1.8859 KOps/s | |
test_func_call_runtime[False-compile] | 0.8780ms | 0.5005ms | 1.9979 KOps/s | 1.9830 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6374ms | 0.5022ms | 1.9914 KOps/s | 1.9977 KOps/s | |
test_func_call_runtime[True-eager] | 1.2901ms | 0.7575ms | 1.3201 KOps/s | 1.3320 KOps/s | |
test_func_call_runtime[True-compile] | 0.6240ms | 0.5138ms | 1.9462 KOps/s | 1.9292 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6030ms | 0.5109ms | 1.9575 KOps/s | 1.9486 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9037ms | 0.5271ms | 1.8972 KOps/s | 1.8843 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6731ms | 0.4951ms | 2.0198 KOps/s | 1.9724 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9270ms | 0.4967ms | 2.0134 KOps/s | 1.9774 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0609ms | 0.8973ms | 1.1144 KOps/s | 1.1056 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.2185ms | 0.7468ms | 1.3390 KOps/s | 1.3517 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.9294ms | 0.7451ms | 1.3421 KOps/s | 1.3416 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4289ms | 1.9167ms | 521.7280 Ops/s | 524.8040 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6320ms | 1.9762ms | 506.0166 Ops/s | 512.2920 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6886ms | 1.9797ms | 505.1160 Ops/s | 503.3427 Ops/s | |
test_distributed | 0.2300ms | 0.1259ms | 7.9451 KOps/s | 7.7393 KOps/s | |
test_tdmodule | 42.8700μs | 19.7457μs | 50.6440 KOps/s | 55.3814 KOps/s | |
test_tdmodule_dispatch | 70.3420μs | 39.2082μs | 25.5049 KOps/s | 27.7483 KOps/s | |
test_tdseq | 67.8270μs | 22.2603μs | 44.9230 KOps/s | 47.3509 KOps/s | |
test_tdseq_dispatch | 79.0170μs | 44.8318μs | 22.3056 KOps/s | 22.2833 KOps/s | |
test_instantiation_functorch | 2.5040ms | 1.6129ms | 620.0087 Ops/s | 614.5818 Ops/s | |
test_instantiation_td | 2.0066ms | 1.1951ms | 836.7588 Ops/s | 820.6131 Ops/s | |
test_exec_functorch | 0.4155ms | 0.1899ms | 5.2648 KOps/s | 5.3141 KOps/s | |
test_exec_functional_call | 0.3088ms | 0.1714ms | 5.8359 KOps/s | 5.5830 KOps/s | |
test_exec_td | 0.3644ms | 0.2003ms | 4.9917 KOps/s | 4.9298 KOps/s | |
test_exec_td_decorator | 0.7814ms | 0.2337ms | 4.2785 KOps/s | 4.1876 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0157ms | 0.6921ms | 1.4449 KOps/s | 1.4585 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.0081ms | 0.6879ms | 1.4536 KOps/s | 1.4693 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6686ms | 0.5379ms | 1.8590 KOps/s | 1.8668 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8124ms | 0.5417ms | 1.8461 KOps/s | 1.8598 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8833ms | 0.6512ms | 1.5357 KOps/s | 1.5536 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0798ms | 0.6546ms | 1.5277 KOps/s | 1.5521 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9341ms | 0.5363ms | 1.8647 KOps/s | 1.8832 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8477ms | 0.5367ms | 1.8634 KOps/s | 1.8930 KOps/s | |
test_to_module_speed[True] | 1.6080ms | 1.3888ms | 720.0479 Ops/s | 688.4168 Ops/s | |
test_to_module_speed[False] | 2.1808ms | 1.3767ms | 726.3889 Ops/s | 723.1174 Ops/s | |
test_tc_init | 94.7570μs | 50.1545μs | 19.9384 KOps/s | 21.0269 KOps/s | |
test_tc_init_nested | 0.1749ms | 99.6678μs | 10.0333 KOps/s | 10.4540 KOps/s | |
test_tc_first_layer_tensor | 22.9830μs | 1.5531μs | 643.8852 KOps/s | 632.4159 KOps/s | |
test_tc_first_layer_nontensor | 26.9800μs | 4.7718μs | 209.5652 KOps/s | 204.8666 KOps/s | |
test_tc_second_layer_tensor | 24.2150μs | 2.8075μs | 356.1893 KOps/s | 345.1799 KOps/s | |
test_tc_second_layer_nontensor | 22.3520μs | 6.0015μs | 166.6257 KOps/s | 159.6663 KOps/s | |
test_unbind | 0.4604s | 13.0939ms | 76.3716 Ops/s | 78.2471 Ops/s | |
test_full_like | 15.0701ms | 11.5648ms | 86.4694 Ops/s | 133.5770 Ops/s | |
test_zeros_like | 12.6066ms | 6.8810ms | 145.3284 Ops/s | 350.8841 Ops/s | |
test_ones_like | 14.7679ms | 7.4271ms | 134.6426 Ops/s | 308.0310 Ops/s | |
test_clone | 14.7695ms | 9.1160ms | 109.6972 Ops/s | 196.5174 Ops/s | |
test_squeeze | 70.2210μs | 13.7338μs | 72.8130 KOps/s | 78.5917 KOps/s | |
test_unsqueeze | 0.1919ms | 97.1448μs | 10.2939 KOps/s | 10.8015 KOps/s | |
test_split | 0.4971ms | 0.1975ms | 5.0642 KOps/s | 5.0553 KOps/s | |
test_permute | 0.3020ms | 0.2207ms | 4.5314 KOps/s | 4.4768 KOps/s | |
test_stack | 30.1211ms | 24.1658ms | 41.3808 Ops/s | 38.2408 Ops/s | |
test_cat | 25.8665ms | 23.9055ms | 41.8314 Ops/s | 38.0515 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1257ms | 16.6185μs | 60.1738 KOps/s | 56.8502 KOps/s | |
test_plain_set_stack_nested | 87.8350μs | 16.7516μs | 59.6957 KOps/s | 57.1452 KOps/s | |
test_plain_set_nested_inplace | 67.3140μs | 18.0026μs | 55.5477 KOps/s | 53.3854 KOps/s | |
test_plain_set_stack_nested_inplace | 57.9440μs | 17.9768μs | 55.6272 KOps/s | 52.9722 KOps/s | |
test_items | 28.8820μs | 2.8403μs | 352.0770 KOps/s | 348.8082 KOps/s | |
test_items_nested | 0.4479ms | 0.3443ms | 2.9045 KOps/s | 2.8985 KOps/s | |
test_items_nested_locked | 0.4985ms | 0.3480ms | 2.8736 KOps/s | 2.8970 KOps/s | |
test_items_nested_leaf | 91.5660μs | 62.8693μs | 15.9060 KOps/s | 16.0177 KOps/s | |
test_items_stack_nested | 0.4860ms | 0.3529ms | 2.8335 KOps/s | 2.9021 KOps/s | |
test_items_stack_nested_leaf | 0.1056ms | 63.6398μs | 15.7134 KOps/s | 15.9581 KOps/s | |
test_items_stack_nested_locked | 0.5219ms | 0.3487ms | 2.8674 KOps/s | 2.8821 KOps/s | |
test_keys | 38.9230μs | 3.4306μs | 291.4944 KOps/s | 293.8702 KOps/s | |
test_keys_nested | 0.1109ms | 71.5381μs | 13.9786 KOps/s | 14.0279 KOps/s | |
test_keys_nested_locked | 2.8001ms | 77.2597μs | 12.9434 KOps/s | 13.0082 KOps/s | |
test_keys_nested_leaf | 99.7060μs | 62.0571μs | 16.1142 KOps/s | 16.2676 KOps/s | |
test_keys_stack_nested | 0.1118ms | 71.7270μs | 13.9417 KOps/s | 14.1843 KOps/s | |
test_keys_stack_nested_leaf | 0.1239ms | 62.7191μs | 15.9441 KOps/s | 16.2861 KOps/s | |
test_keys_stack_nested_locked | 0.1915ms | 76.9617μs | 12.9935 KOps/s | 12.8707 KOps/s | |
test_values | 4.4403μs | 0.8602μs | 1.1625 MOps/s | 1.1399 MOps/s | |
test_values_nested | 0.4248ms | 48.8463μs | 20.4724 KOps/s | 20.5658 KOps/s | |
test_values_nested_locked | 0.4402ms | 50.3905μs | 19.8450 KOps/s | 19.8867 KOps/s | |
test_values_nested_leaf | 69.6840μs | 43.0022μs | 23.2546 KOps/s | 23.5702 KOps/s | |
test_values_stack_nested | 0.4433ms | 49.7886μs | 20.0849 KOps/s | 20.2743 KOps/s | |
test_values_stack_nested_leaf | 0.4384ms | 43.6414μs | 22.9140 KOps/s | 23.3047 KOps/s | |
test_values_stack_nested_locked | 0.4298ms | 51.2322μs | 19.5190 KOps/s | 19.8732 KOps/s | |
test_membership | 19.7723μs | 0.5017μs | 1.9933 MOps/s | 1.9875 MOps/s | |
test_membership_nested | 0.1908ms | 1.8720μs | 534.2009 KOps/s | 530.6618 KOps/s | |
test_membership_nested_leaf | 0.1285ms | 1.8365μs | 544.5264 KOps/s | 528.9284 KOps/s | |
test_membership_stacked_nested | 29.1320μs | 1.9189μs | 521.1316 KOps/s | 519.0438 KOps/s | |
test_membership_stacked_nested_leaf | 47.1330μs | 1.9474μs | 513.5049 KOps/s | 521.6067 KOps/s | |
test_membership_nested_last | 0.3869ms | 2.9561μs | 338.2851 KOps/s | 334.2080 KOps/s | |
test_membership_nested_leaf_last | 27.0420μs | 2.9594μs | 337.9106 KOps/s | 335.8815 KOps/s | |
test_membership_stacked_nested_last | 20.6110μs | 2.9436μs | 339.7179 KOps/s | 331.2112 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.1220μs | 2.9429μs | 339.7972 KOps/s | 332.8099 KOps/s | |
test_nested_getleaf | 35.7020μs | 6.1079μs | 163.7212 KOps/s | 164.3559 KOps/s | |
test_nested_get | 0.3849ms | 5.8009μs | 172.3882 KOps/s | 172.9740 KOps/s | |
test_stacked_getleaf | 0.3895ms | 6.0551μs | 165.1506 KOps/s | 166.1092 KOps/s | |
test_stacked_get | 36.7420μs | 5.7518μs | 173.8599 KOps/s | 175.8366 KOps/s | |
test_nested_getitemleaf | 0.3872ms | 6.2130μs | 160.9522 KOps/s | 162.6437 KOps/s | |
test_nested_getitem | 31.6020μs | 5.8720μs | 170.2991 KOps/s | 172.8287 KOps/s | |
test_stacked_getitemleaf | 0.3924ms | 6.1441μs | 162.7578 KOps/s | 164.7630 KOps/s | |
test_stacked_getitem | 29.1220μs | 5.7504μs | 173.9009 KOps/s | 173.1550 KOps/s | |
test_lock_nested | 6.8677ms | 0.4365ms | 2.2910 KOps/s | 2.3210 KOps/s | |
test_lock_stack_nested | 0.4371ms | 0.3968ms | 2.5202 KOps/s | 2.5599 KOps/s | |
test_unlock_nested | 0.7721ms | 0.3691ms | 2.7097 KOps/s | 2.7148 KOps/s | |
test_unlock_stack_nested | 0.3665ms | 0.3338ms | 2.9962 KOps/s | 3.0241 KOps/s | |
test_flatten_speed | 0.4628ms | 76.2808μs | 13.1095 KOps/s | 13.1791 KOps/s | |
test_unflatten_speed | 0.3927ms | 0.3238ms | 3.0887 KOps/s | 3.0488 KOps/s | |
test_common_ops | 1.6582ms | 1.2866ms | 777.2187 Ops/s | 774.0121 Ops/s | |
test_creation | 0.3763ms | 1.4758μs | 677.5786 KOps/s | 671.0766 KOps/s | |
test_creation_empty | 45.9130μs | 15.4134μs | 64.8788 KOps/s | 58.2219 KOps/s | |
test_creation_nested_1 | 0.4146ms | 17.0596μs | 58.6181 KOps/s | 52.1152 KOps/s | |
test_creation_nested_2 | 61.4940μs | 19.6898μs | 50.7878 KOps/s | 47.1154 KOps/s | |
test_clone | 71.5850μs | 29.3244μs | 34.1013 KOps/s | 35.1792 KOps/s | |
test_getitem[int] | 1.2098ms | 16.5152μs | 60.5504 KOps/s | 63.2557 KOps/s | |
test_getitem[slice_int] | 0.1192ms | 28.2324μs | 35.4203 KOps/s | 35.7426 KOps/s | |
test_getitem[range] | 0.1791ms | 0.1092ms | 9.1551 KOps/s | 9.1721 KOps/s | |
test_getitem[tuple] | 0.1178ms | 24.1052μs | 41.4848 KOps/s | 41.9750 KOps/s | |
test_getitem[list] | 0.5003ms | 99.0685μs | 10.0940 KOps/s | 10.0334 KOps/s | |
test_setitem_dim[int] | 65.8650μs | 44.6869μs | 22.3779 KOps/s | 22.1536 KOps/s | |
test_setitem_dim[slice_int] | 99.9370μs | 67.5130μs | 14.8120 KOps/s | 14.7270 KOps/s | |
test_setitem_dim[range] | 0.1553ms | 0.1290ms | 7.7532 KOps/s | 7.8445 KOps/s | |
test_setitem_dim[tuple] | 0.4667ms | 61.4001μs | 16.2866 KOps/s | 16.3782 KOps/s | |
test_setitem | 88.6450μs | 42.7115μs | 23.4129 KOps/s | 23.6923 KOps/s | |
test_set | 0.4457ms | 41.7099μs | 23.9751 KOps/s | 23.9910 KOps/s | |
test_set_shared | 0.3776ms | 54.6473μs | 18.2992 KOps/s | 18.4706 KOps/s | |
test_update | 0.4325ms | 51.4649μs | 19.4307 KOps/s | 18.5020 KOps/s | |
test_update_nested | 0.4586ms | 59.5362μs | 16.7965 KOps/s | 16.0491 KOps/s | |
test_update__nested | 0.1086ms | 62.6071μs | 15.9726 KOps/s | 15.7888 KOps/s | |
test_set_nested | 0.4355ms | 44.8759μs | 22.2837 KOps/s | 21.2395 KOps/s | |
test_set_nested_new | 98.2060μs | 48.6411μs | 20.5587 KOps/s | 19.8255 KOps/s | |
test_select | 0.1116ms | 61.7999μs | 16.1813 KOps/s | 15.5360 KOps/s | |
test_select_nested | 0.4283ms | 41.7290μs | 23.9641 KOps/s | 21.9884 KOps/s | |
test_exclude_nested | 0.4432ms | 58.4259μs | 17.1157 KOps/s | 16.3774 KOps/s | |
test_empty[True] | 0.6552ms | 0.2608ms | 3.8342 KOps/s | 3.8159 KOps/s | |
test_empty[False] | 4.7683μs | 0.7365μs | 1.3578 MOps/s | 1.3538 MOps/s | |
test_to | 0.4078ms | 26.3262μs | 37.9850 KOps/s | 38.6573 KOps/s | |
test_to_nonblocking | 59.8740μs | 25.5647μs | 39.1164 KOps/s | 39.8495 KOps/s | |
test_unbind_speed | 1.5111ms | 0.2848ms | 3.5108 KOps/s | 3.4946 KOps/s | |
test_unbind_speed_stack0 | 0.6862ms | 0.2802ms | 3.5686 KOps/s | 3.5685 KOps/s | |
test_unbind_speed_stack1 | 91.1781ms | 0.7122ms | 1.4041 KOps/s | 1.3959 KOps/s | |
test_split | 93.1798ms | 2.2470ms | 445.0385 Ops/s | 456.1572 Ops/s | |
test_chunk | 93.2378ms | 2.2527ms | 443.9182 Ops/s | 458.1654 Ops/s | |
test_creation[device0] | 0.5048ms | 0.1290ms | 7.7529 KOps/s | 7.8983 KOps/s | |
test_creation_from_tensor | 0.3783ms | 0.1305ms | 7.6649 KOps/s | 7.6484 KOps/s | |
test_add_one[memmap_tensor0] | 0.2663ms | 8.5477μs | 116.9909 KOps/s | 113.6537 KOps/s | |
test_contiguous[memmap_tensor0] | 17.7010μs | 2.1938μs | 455.8384 KOps/s | 458.0518 KOps/s | |
test_stack[memmap_tensor0] | 36.4330μs | 6.8708μs | 145.5439 KOps/s | 151.2467 KOps/s | |
test_memmaptd_index | 1.3094ms | 0.4455ms | 2.2448 KOps/s | 2.3501 KOps/s | |
test_memmaptd_index_astensor | 1.0112ms | 0.5137ms | 1.9466 KOps/s | 2.0175 KOps/s | |
test_memmaptd_index_op | 1.4210ms | 1.0393ms | 962.1907 Ops/s | 931.3130 Ops/s | |
test_serialize_model | 0.1309s | 0.1294s | 7.7268 Ops/s | 7.6712 Ops/s | |
test_serialize_model_pickle | 1.3473s | 1.2125s | 0.8247 Ops/s | 0.8246 Ops/s | |
test_serialize_weights | 0.2132s | 0.1414s | 7.0706 Ops/s | 6.9608 Ops/s | |
test_serialize_weights_returnearly | 0.2148s | 55.9685ms | 17.8672 Ops/s | 18.0293 Ops/s | |
test_serialize_weights_pickle | 1.3825s | 1.2183s | 0.8208 Ops/s | 0.8217 Ops/s | |
test_reshape_pytree | 66.7340μs | 35.8876μs | 27.8647 KOps/s | 27.8510 KOps/s | |
test_reshape_td | 71.5240μs | 41.6169μs | 24.0287 KOps/s | 23.6934 KOps/s | |
test_view_pytree | 75.1350μs | 36.4172μs | 27.4596 KOps/s | 28.6591 KOps/s | |
test_view_td | 78.3750μs | 46.5388μs | 21.4874 KOps/s | 22.1342 KOps/s | |
test_unbind_pytree | 68.1140μs | 34.8244μs | 28.7155 KOps/s | 29.4502 KOps/s | |
test_unbind_td | 0.3878ms | 43.6703μs | 22.8989 KOps/s | 23.4293 KOps/s | |
test_split_pytree | 86.4950μs | 47.0056μs | 21.2741 KOps/s | 21.4403 KOps/s | |
test_split_td | 0.7066ms | 58.3102μs | 17.1497 KOps/s | 17.7862 KOps/s | |
test_add_pytree | 97.5660μs | 58.6546μs | 17.0489 KOps/s | 17.8383 KOps/s | |
test_add_td | 0.1365ms | 98.3858μs | 10.1641 KOps/s | 10.3049 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2287ms | 0.1618ms | 6.1795 KOps/s | 5.9889 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2521ms | 0.1667ms | 5.9986 KOps/s | 6.2638 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1778ms | 0.1450ms | 6.8963 KOps/s | 6.9080 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2618ms | 0.1885ms | 5.3060 KOps/s | 5.4912 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 64.5140μs | 22.2656μs | 44.9124 KOps/s | 47.3740 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 83.1750μs | 49.1678μs | 20.3385 KOps/s | 20.1494 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2310ms | 64.9618μs | 15.3937 KOps/s | 15.4547 KOps/s | |
test_compile_copy_nested[pytree-eager] | 84.8550μs | 50.2655μs | 19.8944 KOps/s | 19.9871 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3568ms | 0.3215ms | 3.1104 KOps/s | 3.1223 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3431ms | 0.2418ms | 4.1356 KOps/s | 4.2935 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1721ms | 0.1274ms | 7.8470 KOps/s | 7.8338 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1756ms | 67.4633μs | 14.8229 KOps/s | 15.3934 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3629ms | 0.3200ms | 3.1255 KOps/s | 3.1534 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7671ms | 0.6348ms | 1.5752 KOps/s | 1.6386 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3453ms | 0.2886ms | 3.4656 KOps/s | 3.5167 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3740ms | 0.3210ms | 3.1153 KOps/s | 3.1116 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1696ms | 78.8933μs | 12.6754 KOps/s | 13.1454 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1866ms | 0.1306ms | 7.6596 KOps/s | 7.6020 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6238ms | 0.5477ms | 1.8258 KOps/s | 1.9392 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3601ms | 0.3193ms | 3.1323 KOps/s | 3.1431 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 70.8240μs | 19.5859μs | 51.0572 KOps/s | 51.2055 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 69.1740μs | 38.1989μs | 26.1788 KOps/s | 24.3866 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1059ms | 69.4339μs | 14.4022 KOps/s | 14.3951 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.7660μs | 51.2345μs | 19.5181 KOps/s | 19.3914 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3699ms | 0.8405ms | 1.1898 KOps/s | 1.1231 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4529ms | 3.3019ms | 302.8534 Ops/s | 314.8395 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3501ms | 0.8253ms | 1.2117 KOps/s | 1.1301 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4166ms | 3.2671ms | 306.0800 Ops/s | 318.5538 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1425ms | 0.1086ms | 9.2059 KOps/s | 9.1195 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1900ms | 60.8971μs | 16.4212 KOps/s | 15.9640 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1550ms | 0.1031ms | 9.7022 KOps/s | 9.6877 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1467ms | 44.2565μs | 22.5955 KOps/s | 23.2013 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1815ms | 0.1068ms | 9.3620 KOps/s | 9.6021 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 85.0960μs | 47.5803μs | 21.0171 KOps/s | 23.1239 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1742ms | 0.1377ms | 7.2606 KOps/s | 7.2862 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1619ms | 25.9298μs | 38.5657 KOps/s | 39.8013 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1779ms | 0.1312ms | 7.6227 KOps/s | 7.6073 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 59.5440μs | 21.5170μs | 46.4750 KOps/s | 48.3435 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1912ms | 0.1354ms | 7.3874 KOps/s | 7.5869 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 89.0860μs | 20.9572μs | 47.7163 KOps/s | 47.8067 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1880ms | 0.1385ms | 7.2182 KOps/s | 7.2191 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4850ms | 25.2703μs | 39.5722 KOps/s | 39.4738 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5816ms | 0.1330ms | 7.5214 KOps/s | 7.5698 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 51.7630μs | 21.3027μs | 46.9425 KOps/s | 48.0774 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1925ms | 0.1326ms | 7.5431 KOps/s | 7.5610 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.1830μs | 20.9972μs | 47.6254 KOps/s | 39.4005 KOps/s | |
test_mod_add[eager] | 64.7240μs | 33.0189μs | 30.2857 KOps/s | 29.9693 KOps/s | |
test_mod_add[compile] | 0.1093ms | 71.0353μs | 14.0775 KOps/s | 14.2314 KOps/s | |
test_mod_add[compile-overhead] | 0.2590ms | 0.1350ms | 7.4098 KOps/s | 6.8049 KOps/s | |
test_mod_wrap[eager] | 0.8540ms | 0.7934ms | 1.2604 KOps/s | 1.2574 KOps/s | |
test_mod_wrap[compile] | 2.0035ms | 0.8477ms | 1.1797 KOps/s | 1.1806 KOps/s | |
test_mod_wrap[compile-overhead] | 4.8545ms | 3.0389ms | 329.0658 Ops/s | 320.9817 Ops/s | |
test_mod_wrap_and_backward[eager] | 4.5993ms | 4.0799ms | 245.1057 Ops/s | 237.9322 Ops/s | |
test_mod_wrap_and_backward[compile] | 5.0422ms | 4.1094ms | 243.3458 Ops/s | 239.5729 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3510ms | 0.9051ms | 1.1048 KOps/s | 991.4427 Ops/s | |
test_seq_add[eager] | 0.2000ms | 0.1035ms | 9.6583 KOps/s | 9.8867 KOps/s | |
test_seq_add[compile] | 0.1750ms | 81.5427μs | 12.2635 KOps/s | 12.3823 KOps/s | |
test_seq_add[compile-overhead] | 0.1508ms | 0.1142ms | 8.7531 KOps/s | 8.7423 KOps/s | |
test_seq_wrap[eager] | 1.0160ms | 0.9418ms | 1.0618 KOps/s | 1.0435 KOps/s | |
test_seq_wrap[compile] | 0.9551ms | 0.8643ms | 1.1570 KOps/s | 1.1571 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2970ms | 0.2193ms | 4.5591 KOps/s | 4.5373 KOps/s | |
test_func_call_runtime[False-eager] | 2.6259ms | 2.4073ms | 415.4015 Ops/s | 414.3886 Ops/s | |
test_func_call_runtime[False-compile] | 2.5334ms | 2.4476ms | 408.5715 Ops/s | 411.6690 Ops/s | |
test_func_call_runtime[False-compile-overhead] | 0.4294ms | 0.3609ms | 2.7710 KOps/s | 2.7852 KOps/s | |
test_func_call_runtime[True-eager] | 2.6733ms | 2.5623ms | 390.2690 Ops/s | 388.9693 Ops/s | |
test_func_call_runtime[True-compile] | 2.5482ms | 2.4720ms | 404.5255 Ops/s | 407.4661 Ops/s | |
test_func_call_runtime[True-compile-overhead] | 0.4324ms | 0.3845ms | 2.6007 KOps/s | 2.6404 KOps/s | |
test_func_call_cm_runtime[False-eager] | 2.4797ms | 2.4048ms | 415.8415 Ops/s | 414.7612 Ops/s | |
test_func_call_cm_runtime[False-compile] | 2.5299ms | 2.4457ms | 408.8758 Ops/s | 414.1822 Ops/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4173ms | 0.3653ms | 2.7376 KOps/s | 2.7646 KOps/s | |
test_func_call_cm_runtime[True-eager] | 2.8352ms | 2.6804ms | 373.0727 Ops/s | 373.0262 Ops/s | |
test_func_call_cm_runtime[True-compile] | 2.6041ms | 2.5063ms | 398.9988 Ops/s | 402.8327 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4973ms | 0.4088ms | 2.4465 KOps/s | 2.4571 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 4.2696ms | 3.8435ms | 260.1819 Ops/s | 263.5596 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.5813ms | 2.5072ms | 398.8504 Ops/s | 403.4707 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4597ms | 0.4118ms | 2.4287 KOps/s | 2.4517 KOps/s | |
test_distributed | 2.5392ms | 0.1757ms | 5.6927 KOps/s | 8.8475 KOps/s | |
test_tdmodule | 62.3740μs | 15.2659μs | 65.5056 KOps/s | 61.6484 KOps/s | |
test_tdmodule_dispatch | 59.5040μs | 29.5661μs | 33.8225 KOps/s | 30.6073 KOps/s | |
test_tdseq | 36.1520μs | 16.2337μs | 61.6003 KOps/s | 57.2296 KOps/s | |
test_tdseq_dispatch | 53.4230μs | 32.3248μs | 30.9360 KOps/s | 28.3367 KOps/s | |
test_instantiation_functorch | 2.0974ms | 1.8776ms | 532.6061 Ops/s | 545.2851 Ops/s | |
test_instantiation_td | 1.8255ms | 1.2147ms | 823.2339 Ops/s | 831.8947 Ops/s | |
test_exec_functorch | 1.1319ms | 1.0177ms | 982.6443 Ops/s | 987.9135 Ops/s | |
test_exec_functional_call | 1.0736ms | 1.0275ms | 973.2424 Ops/s | 977.7719 Ops/s | |
test_exec_td | 1.2473ms | 1.0608ms | 942.7256 Ops/s | 976.8026 Ops/s | |
test_exec_td_decorator | 1.9103ms | 1.0819ms | 924.3331 Ops/s | 941.5361 Ops/s | |
test_vmap_mlp_speed[True-True] | 1.3521ms | 1.2810ms | 780.6133 Ops/s | 781.0284 Ops/s | |
test_vmap_mlp_speed[True-False] | 1.3571ms | 1.2770ms | 783.0696 Ops/s | 787.1952 Ops/s | |
test_vmap_mlp_speed[False-True] | 1.2654ms | 1.1690ms | 855.4472 Ops/s | 863.3416 Ops/s | |
test_vmap_mlp_speed[False-False] | 1.2257ms | 1.1745ms | 851.4392 Ops/s | 860.4453 Ops/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.7547ms | 1.2533ms | 797.9184 Ops/s | 799.8749 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.3428ms | 1.2526ms | 798.3587 Ops/s | 800.1403 Ops/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.3211ms | 1.1723ms | 853.0006 Ops/s | 858.0404 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.2794ms | 1.1705ms | 854.3565 Ops/s | 859.9525 Ops/s | |
test_vmap_transformer_speed[True-True] | 13.2957ms | 13.1967ms | 75.7764 Ops/s | 75.6901 Ops/s | |
test_vmap_transformer_speed[True-False] | 13.3208ms | 13.1628ms | 75.9717 Ops/s | 75.8283 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.0310ms | 12.9498ms | 77.2212 Ops/s | 76.8370 Ops/s | |
test_vmap_transformer_speed[False-False] | 13.0104ms | 12.9358ms | 77.3049 Ops/s | 76.9785 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 33.9718ms | 33.8839ms | 29.5126 Ops/s | 29.6791 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 34.1805ms | 34.0526ms | 29.3664 Ops/s | 29.6879 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 34.3522ms | 34.0215ms | 29.3931 Ops/s | 29.5662 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 34.1088ms | 33.9989ms | 29.4127 Ops/s | 29.5955 Ops/s | |
test_to_module_speed[True] | 2.0741ms | 0.9998ms | 1.0002 KOps/s | 1.0084 KOps/s | |
test_to_module_speed[False] | 1.0761ms | 0.9806ms | 1.0198 KOps/s | 1.0366 KOps/s | |
test_tc_init | 63.5840μs | 35.8911μs | 27.8620 KOps/s | 27.5524 KOps/s | |
test_tc_init_nested | 0.1062ms | 74.4452μs | 13.4327 KOps/s | 13.9670 KOps/s | |
test_tc_first_layer_tensor | 4.6120μs | 0.6534μs | 1.5305 MOps/s | 1.4879 MOps/s | |
test_tc_first_layer_nontensor | 23.2120μs | 2.2066μs | 453.1885 KOps/s | 443.6885 KOps/s | |
test_tc_second_layer_tensor | 10.6180μs | 1.3601μs | 735.2245 KOps/s | 726.7870 KOps/s | |
test_tc_second_layer_nontensor | 24.0220μs | 2.9678μs | 336.9510 KOps/s | 336.9989 KOps/s | |
test_unbind | 0.1927s | 12.2283ms | 81.7777 Ops/s | 93.8199 Ops/s | |
test_full_like | 0.6561ms | 0.5740ms | 1.7423 KOps/s | 1.7408 KOps/s | |
test_zeros_like | 0.2619ms | 0.1980ms | 5.0513 KOps/s | 5.0501 KOps/s | |
test_ones_like | 0.2342ms | 0.1978ms | 5.0549 KOps/s | 5.0555 KOps/s | |
test_clone | 0.4612ms | 0.4137ms | 2.4170 KOps/s | 2.4165 KOps/s | |
test_squeeze | 43.1630μs | 10.0592μs | 99.4111 KOps/s | 104.0571 KOps/s | |
test_unsqueeze | 0.2194ms | 76.0830μs | 13.1435 KOps/s | 13.4978 KOps/s | |
test_split | 0.4295ms | 0.1616ms | 6.1869 KOps/s | 6.4290 KOps/s | |
test_permute | 0.2696ms | 0.1772ms | 5.6433 KOps/s | 5.7301 KOps/s | |
test_stack | 1.2454ms | 0.8657ms | 1.1551 KOps/s | 1.1516 KOps/s | |
test_cat | 1.2520ms | 1.2315ms | 812.0051 Ops/s | 811.9832 Ops/s |
vmoens added a commit that referenced this pull request on Oct 3, 2024
ghstack-source-id: 92120f9043653078ed1eaa693a48c3f7e1ce3412
Pull Request resolved: #1020
Labels: CLA Signed, enhancement (New feature or request)
Stack from ghstack (oldest at bottom):