[BugFix] Fix _foreach_copy_ for older versions of PT #1035
Merged
vmoens added a commit that referenced this pull request on Oct 8, 2024:
ghstack-source-id: 682b96483f0ffdad4ef8e7cdd35f133587c2c828
Pull Request resolved: #1035
facebook-github-bot added the CLA Signed label on Oct 8, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 52.8800μs | 23.5445μs | 42.4728 KOps/s | 40.9754 KOps/s | |
test_plain_set_stack_nested | 59.8920μs | 23.6062μs | 42.3617 KOps/s | 39.8222 KOps/s | |
test_plain_set_nested_inplace | 65.4130μs | 25.5057μs | 39.2070 KOps/s | 36.5333 KOps/s | |
test_plain_set_stack_nested_inplace | 90.0200μs | 25.5773μs | 39.0971 KOps/s | 36.6362 KOps/s | |
test_items | 34.0140μs | 4.0986μs | 243.9850 KOps/s | 214.6165 KOps/s | |
test_items_nested | 0.5990ms | 0.3775ms | 2.6490 KOps/s | 2.6006 KOps/s | |
test_items_nested_locked | 0.5826ms | 0.3807ms | 2.6271 KOps/s | 2.5826 KOps/s | |
test_items_nested_leaf | 0.1523ms | 80.1237μs | 12.4807 KOps/s | 12.2991 KOps/s | |
test_items_stack_nested | 0.5711ms | 0.3845ms | 2.6006 KOps/s | 2.5736 KOps/s | |
test_items_stack_nested_leaf | 0.1671ms | 83.9236μs | 11.9156 KOps/s | 11.9296 KOps/s | |
test_items_stack_nested_locked | 0.6217ms | 0.3841ms | 2.6038 KOps/s | 2.5926 KOps/s | |
test_keys | 25.1670μs | 3.6341μs | 275.1734 KOps/s | 275.3318 KOps/s | |
test_keys_nested | 0.2510ms | 0.1349ms | 7.4124 KOps/s | 7.1638 KOps/s | |
test_keys_nested_locked | 0.6918ms | 0.1436ms | 6.9620 KOps/s | 6.9163 KOps/s | |
test_keys_nested_leaf | 0.1948ms | 0.1179ms | 8.4834 KOps/s | 8.1121 KOps/s | |
test_keys_stack_nested | 0.2492ms | 0.1363ms | 7.3367 KOps/s | 7.1401 KOps/s | |
test_keys_stack_nested_leaf | 0.1868ms | 0.1201ms | 8.3280 KOps/s | 8.0927 KOps/s | |
test_keys_stack_nested_locked | 0.2611ms | 0.1420ms | 7.0406 KOps/s | 6.8039 KOps/s | |
test_values | 8.5760μs | 1.0437μs | 958.1477 KOps/s | 945.6531 KOps/s | |
test_values_nested | 0.1687ms | 92.6996μs | 10.7875 KOps/s | 10.1541 KOps/s | |
test_values_nested_locked | 0.1688ms | 93.6565μs | 10.6773 KOps/s | 10.4638 KOps/s | |
test_values_nested_leaf | 0.1464ms | 78.8075μs | 12.6891 KOps/s | 12.2573 KOps/s | |
test_values_stack_nested | 0.1636ms | 93.1914μs | 10.7306 KOps/s | 10.5700 KOps/s | |
test_values_stack_nested_leaf | 0.1616ms | 78.3801μs | 12.7583 KOps/s | 12.2240 KOps/s | |
test_values_stack_nested_locked | 0.1594ms | 94.6361μs | 10.5668 KOps/s | 10.3204 KOps/s | |
test_membership | 6.0499μs | 0.7272μs | 1.3751 MOps/s | 1.1519 MOps/s | |
test_membership_nested | 28.7540μs | 2.7318μs | 366.0596 KOps/s | 367.3263 KOps/s | |
test_membership_nested_leaf | 28.3530μs | 2.7204μs | 367.5987 KOps/s | 361.7287 KOps/s | |
test_membership_stacked_nested | 26.3590μs | 2.7440μs | 364.4311 KOps/s | 367.2817 KOps/s | |
test_membership_stacked_nested_leaf | 43.7820μs | 2.7066μs | 369.4620 KOps/s | 363.5475 KOps/s | |
test_membership_nested_last | 30.8980μs | 4.2620μs | 234.6307 KOps/s | 238.6715 KOps/s | |
test_membership_nested_leaf_last | 50.3050μs | 4.2124μs | 237.3930 KOps/s | 235.4393 KOps/s | |
test_membership_stacked_nested_last | 45.6160μs | 5.8279μs | 171.5871 KOps/s | 201.0910 KOps/s | |
test_membership_stacked_nested_leaf_last | 43.1210μs | 5.8993μs | 169.5125 KOps/s | 198.3664 KOps/s | |
test_nested_getleaf | 55.2340μs | 10.5302μs | 94.9645 KOps/s | 93.9151 KOps/s | |
test_nested_get | 38.4920μs | 10.0357μs | 99.6439 KOps/s | 99.3022 KOps/s | |
test_stacked_getleaf | 50.5550μs | 10.5022μs | 95.2179 KOps/s | 94.8649 KOps/s | |
test_stacked_get | 35.7870μs | 10.0645μs | 99.3593 KOps/s | 98.6879 KOps/s | |
test_nested_getitemleaf | 32.4810μs | 10.9496μs | 91.3276 KOps/s | 91.3077 KOps/s | |
test_nested_getitem | 33.4330μs | 10.3508μs | 96.6107 KOps/s | 96.7918 KOps/s | |
test_stacked_getitemleaf | 29.7450μs | 10.9078μs | 91.6777 KOps/s | 89.9944 KOps/s | |
test_stacked_getitem | 49.1230μs | 10.2073μs | 97.9693 KOps/s | 96.8683 KOps/s | |
test_lock_nested | 88.9962ms | 0.6058ms | 1.6508 KOps/s | 1.9551 KOps/s | |
test_lock_stack_nested | 0.8422ms | 0.4668ms | 2.1420 KOps/s | 2.0722 KOps/s | |
test_unlock_nested | 90.6623ms | 0.5207ms | 1.9205 KOps/s | 2.3270 KOps/s | |
test_unlock_stack_nested | 0.5251ms | 0.3766ms | 2.6554 KOps/s | 2.5015 KOps/s | |
test_flatten_speed | 0.3155ms | 0.1031ms | 9.6948 KOps/s | 9.9331 KOps/s | |
test_unflatten_speed | 0.7335ms | 0.5149ms | 1.9422 KOps/s | 1.9364 KOps/s | |
test_common_ops | 3.8587ms | 1.1047ms | 905.2573 Ops/s | 870.6569 Ops/s | |
test_creation | 74.9210μs | 2.0569μs | 486.1687 KOps/s | 478.9606 KOps/s | |
test_creation_empty | 48.4810μs | 17.6140μs | 56.7731 KOps/s | 51.7927 KOps/s | |
test_creation_nested_1 | 59.2110μs | 21.2707μs | 47.0130 KOps/s | 43.9139 KOps/s | |
test_creation_nested_2 | 73.8290μs | 25.5743μs | 39.1018 KOps/s | 37.4714 KOps/s | |
test_clone | 0.1531ms | 17.0243μs | 58.7396 KOps/s | 57.7807 KOps/s | |
test_getitem[int] | 1.0153ms | 16.3852μs | 61.0307 KOps/s | 57.9631 KOps/s | |
test_getitem[slice_int] | 0.1350ms | 29.7008μs | 33.6692 KOps/s | 31.4830 KOps/s | |
test_getitem[range] | 0.5151ms | 57.7821μs | 17.3064 KOps/s | 17.2950 KOps/s | |
test_getitem[tuple] | 0.1611ms | 25.1329μs | 39.7885 KOps/s | 38.2233 KOps/s | |
test_getitem[list] | 0.1794ms | 51.8156μs | 19.2992 KOps/s | 18.8411 KOps/s | |
test_setitem_dim[int] | 68.2080μs | 31.7322μs | 31.5138 KOps/s | 30.2722 KOps/s | |
test_setitem_dim[slice_int] | 95.8810μs | 58.7430μs | 17.0233 KOps/s | 15.8141 KOps/s | |
test_setitem_dim[range] | 0.1299ms | 82.3134μs | 12.1487 KOps/s | 12.0315 KOps/s | |
test_setitem_dim[tuple] | 84.3780μs | 47.6539μs | 20.9846 KOps/s | 20.1249 KOps/s | |
test_setitem | 85.4610μs | 29.4237μs | 33.9862 KOps/s | 33.2186 KOps/s | |
test_set | 0.2562ms | 28.8338μs | 34.6815 KOps/s | 33.8476 KOps/s | |
test_set_shared | 3.1137ms | 0.2151ms | 4.6482 KOps/s | 4.6240 KOps/s | |
test_update | 0.2789ms | 36.6901μs | 27.2553 KOps/s | 25.7331 KOps/s | |
test_update_nested | 0.1719ms | 47.2561μs | 21.1613 KOps/s | 20.0760 KOps/s | |
test_update__nested | 0.9757ms | 44.9865μs | 22.2289 KOps/s | 22.1386 KOps/s | |
test_set_nested | 0.2562ms | 32.7204μs | 30.5619 KOps/s | 31.2523 KOps/s | |
test_set_nested_new | 0.2499ms | 36.9826μs | 27.0397 KOps/s | 27.2985 KOps/s | |
test_select | 0.2444ms | 53.8883μs | 18.5569 KOps/s | 17.9964 KOps/s | |
test_select_nested | 0.1175ms | 59.8319μs | 16.7135 KOps/s | 16.6925 KOps/s | |
test_exclude_nested | 0.1450ms | 75.4418μs | 13.2553 KOps/s | 13.2511 KOps/s | |
test_empty[True] | 0.4772ms | 0.3534ms | 2.8299 KOps/s | 2.8191 KOps/s | |
test_empty[False] | 6.6084μs | 1.1885μs | 841.3719 KOps/s | 793.5746 KOps/s | |
test_unbind_speed | 0.4077ms | 0.3074ms | 3.2526 KOps/s | 3.2169 KOps/s | |
test_unbind_speed_stack0 | 0.4291ms | 0.2912ms | 3.4337 KOps/s | 3.2457 KOps/s | |
test_unbind_speed_stack1 | 0.1012s | 0.7958ms | 1.2565 KOps/s | 1.3309 KOps/s | |
test_split | 3.1298ms | 1.9929ms | 501.7789 Ops/s | 435.0395 Ops/s | |
test_chunk | 93.3572ms | 2.1852ms | 457.6277 Ops/s | 436.1655 Ops/s | |
test_creation[device0] | 0.2330ms | 0.1144ms | 8.7401 KOps/s | 8.4894 KOps/s | |
test_creation_from_tensor | 4.8757ms | 0.1162ms | 8.6078 KOps/s | 8.3873 KOps/s | |
test_add_one[memmap_tensor0] | 0.2311ms | 6.9211μs | 144.4856 KOps/s | 134.1030 KOps/s | |
test_contiguous[memmap_tensor0] | 28.1530μs | 1.8648μs | 536.2458 KOps/s | 522.0150 KOps/s | |
test_stack[memmap_tensor0] | 63.9500μs | 5.4033μs | 185.0729 KOps/s | 172.5575 KOps/s | |
test_memmaptd_index | 1.2009ms | 0.4107ms | 2.4348 KOps/s | 2.3665 KOps/s | |
test_memmaptd_index_astensor | 0.7984ms | 0.5091ms | 1.9642 KOps/s | 1.9003 KOps/s | |
test_memmaptd_index_op | 1.6331ms | 1.0091ms | 991.0137 Ops/s | 925.6551 Ops/s | |
test_serialize_model | 0.2160s | 0.1290s | 7.7490 Ops/s | 8.3823 Ops/s | |
test_serialize_model_pickle | 0.4987s | 0.4072s | 2.4557 Ops/s | 2.5857 Ops/s | |
test_serialize_weights | 0.1294s | 0.1166s | 8.5775 Ops/s | 7.5196 Ops/s | |
test_serialize_weights_returnearly | 0.2489s | 0.1715s | 5.8317 Ops/s | 6.2534 Ops/s | |
test_serialize_weights_pickle | 0.5001s | 0.4159s | 2.4046 Ops/s | 2.3993 Ops/s | |
test_serialize_weights_filesystem | 0.1508s | 0.1439s | 6.9492 Ops/s | 6.9915 Ops/s | |
test_serialize_model_filesystem | 0.1608s | 0.1525s | 6.5594 Ops/s | 6.1106 Ops/s | |
test_reshape_pytree | 0.1138ms | 38.6713μs | 25.8590 KOps/s | 26.0610 KOps/s | |
test_reshape_td | 0.1022ms | 48.7562μs | 20.5102 KOps/s | 20.7172 KOps/s | |
test_view_pytree | 0.1001ms | 37.9878μs | 26.3242 KOps/s | 25.9792 KOps/s | |
test_view_td | 0.1074ms | 54.1056μs | 18.4824 KOps/s | 19.2610 KOps/s | |
test_unbind_pytree | 75.5420μs | 35.3564μs | 28.2834 KOps/s | 27.3340 KOps/s | |
test_unbind_td | 0.3266ms | 46.8382μs | 21.3501 KOps/s | 21.6995 KOps/s | |
test_split_pytree | 81.2730μs | 37.3479μs | 26.7753 KOps/s | 26.3193 KOps/s | |
test_split_td | 0.2043ms | 57.5127μs | 17.3875 KOps/s | 16.5817 KOps/s | |
test_add_pytree | 93.6870μs | 43.9351μs | 22.7609 KOps/s | 22.0571 KOps/s | |
test_add_td | 0.2280ms | 85.2647μs | 11.7282 KOps/s | 11.4311 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1384ms | 58.2395μs | 17.1705 KOps/s | 17.4359 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4305ms | 0.1980ms | 5.0515 KOps/s | 5.1825 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1512ms | 57.1783μs | 17.4891 KOps/s | 17.8618 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2656ms | 0.1376ms | 7.2650 KOps/s | 7.0667 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 61.0950μs | 23.9636μs | 41.7299 KOps/s | 42.9859 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1554ms | 75.2659μs | 13.2862 KOps/s | 13.3175 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1479ms | 74.2219μs | 13.4731 KOps/s | 13.4254 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1251ms | 67.8824μs | 14.7314 KOps/s | 14.5859 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3685ms | 0.1810ms | 5.5242 KOps/s | 5.5297 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3907ms | 0.2399ms | 4.1692 KOps/s | 4.1640 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1016ms | 47.4678μs | 21.0669 KOps/s | 21.2365 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4136ms | 76.7435μs | 13.0304 KOps/s | 12.7776 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2739ms | 0.1734ms | 5.7676 KOps/s | 5.7290 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4689ms | 0.2777ms | 3.6014 KOps/s | 3.4333 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4646ms | 0.2757ms | 3.6276 KOps/s | 3.5951 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3781ms | 0.1802ms | 5.5508 KOps/s | 5.5785 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1801ms | 73.3073μs | 13.6412 KOps/s | 13.5762 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1582ms | 47.9641μs | 20.8489 KOps/s | 20.9809 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4585ms | 0.2295ms | 4.3577 KOps/s | 4.3225 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3797ms | 0.1723ms | 5.8035 KOps/s | 5.6850 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2596ms | 0.1092ms | 9.1582 KOps/s | 8.8344 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1939ms | 78.2738μs | 12.7757 KOps/s | 11.9726 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1459ms | 77.2340μs | 12.9477 KOps/s | 13.2093 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1276ms | 68.2885μs | 14.6438 KOps/s | 14.5260 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4013ms | 0.1981ms | 5.0473 KOps/s | 5.1815 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.9826ms | 1.7012ms | 587.8192 Ops/s | 567.8482 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3985ms | 0.1956ms | 5.1125 KOps/s | 5.2560 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.7908ms | 1.1071ms | 903.2470 Ops/s | 887.1507 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5352ms | 0.4204ms | 2.3789 KOps/s | 2.3854 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.1010ms | 3.8754ms | 258.0404 Ops/s | 244.4281 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 83.2570μs | 34.2601μs | 29.1885 KOps/s | 29.7841 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6434ms | 46.4505μs | 21.5283 KOps/s | 20.5776 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 4.8755ms | 29.9281μs | 33.4134 KOps/s | 34.2630 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 84.0080μs | 27.9204μs | 35.8161 KOps/s | 35.2769 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 81.1020μs | 29.6591μs | 33.7165 KOps/s | 34.8974 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1124ms | 27.4074μs | 36.4866 KOps/s | 35.5108 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1410ms | 72.9023μs | 13.7170 KOps/s | 13.6028 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5500ms | 27.5079μs | 36.3532 KOps/s | 35.1045 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1294ms | 66.5336μs | 15.0300 KOps/s | 14.8106 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 61.8360μs | 22.9128μs | 43.6438 KOps/s | 43.7330 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1533ms | 66.6310μs | 15.0080 KOps/s | 14.9089 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 81.9840μs | 22.9614μs | 43.5514 KOps/s | 43.7475 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1411ms | 72.3422μs | 13.8232 KOps/s | 13.5899 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9131ms | 27.5132μs | 36.3461 KOps/s | 35.2534 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1428ms | 66.3083μs | 15.0811 KOps/s | 14.7630 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 91.8920μs | 22.8708μs | 43.7239 KOps/s | 43.3926 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1432ms | 66.0903μs | 15.1308 KOps/s | 14.4321 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1133ms | 22.7839μs | 43.8906 KOps/s | 43.7835 KOps/s | |
test_mod_add[eager] | 0.1054ms | 24.1882μs | 41.3425 KOps/s | 39.3816 KOps/s | |
test_mod_add[compile] | 0.1073ms | 37.4242μs | 26.7207 KOps/s | 26.3768 KOps/s | |
test_mod_add[compile-overhead] | 86.5130μs | 37.7638μs | 26.4804 KOps/s | 27.0643 KOps/s | |
test_mod_wrap[eager] | 0.3347ms | 0.2080ms | 4.8068 KOps/s | 4.7259 KOps/s | |
test_mod_wrap[compile] | 0.3753ms | 0.2325ms | 4.3011 KOps/s | 4.2919 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3345ms | 0.2305ms | 4.3386 KOps/s | 4.3312 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.3936ms | 10.7155ms | 93.3230 Ops/s | 86.4881 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.5517ms | 10.6348ms | 94.0309 Ops/s | 80.7215 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 11.5605ms | 10.6760ms | 93.6678 Ops/s | 78.2329 Ops/s | |
test_seq_add[eager] | 0.2769ms | 89.8437μs | 11.1304 KOps/s | 10.8707 KOps/s | |
test_seq_add[compile] | 0.1865ms | 63.3898μs | 15.7754 KOps/s | 15.5066 KOps/s | |
test_seq_add[compile-overhead] | 0.1615ms | 61.8316μs | 16.1730 KOps/s | 15.9605 KOps/s | |
test_seq_wrap[eager] | 1.0108ms | 0.3797ms | 2.6338 KOps/s | 2.5352 KOps/s | |
test_seq_wrap[compile] | 0.4119ms | 0.2664ms | 3.7537 KOps/s | 3.6813 KOps/s | |
test_seq_wrap[compile-overhead] | 0.5166ms | 0.2698ms | 3.7067 KOps/s | 3.6683 KOps/s | |
test_func_call_runtime[False-eager] | 0.8943ms | 0.5263ms | 1.9000 KOps/s | 1.8735 KOps/s | |
test_func_call_runtime[False-compile] | 0.6168ms | 0.4982ms | 2.0072 KOps/s | 1.9695 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6372ms | 0.4991ms | 2.0035 KOps/s | 1.9400 KOps/s | |
test_func_call_runtime[True-eager] | 0.8771ms | 0.7375ms | 1.3560 KOps/s | 1.3336 KOps/s | |
test_func_call_runtime[True-compile] | 1.0889ms | 0.5175ms | 1.9322 KOps/s | 1.9371 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6270ms | 0.5155ms | 1.9398 KOps/s | 1.9274 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8172ms | 0.5275ms | 1.8958 KOps/s | 1.8904 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0801ms | 0.5035ms | 1.9860 KOps/s | 1.9620 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5823ms | 0.5017ms | 1.9930 KOps/s | 1.9579 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2377ms | 0.9093ms | 1.0997 KOps/s | 1.0860 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9060ms | 0.7497ms | 1.3338 KOps/s | 1.3168 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0720ms | 0.7589ms | 1.3177 KOps/s | 1.2953 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.5418ms | 1.9432ms | 514.6263 Ops/s | 504.8087 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.9311ms | 2.0044ms | 498.9140 Ops/s | 489.9636 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6846ms | 1.9943ms | 501.4372 Ops/s | 488.2073 Ops/s | |
test_distributed | 0.2797ms | 0.1276ms | 7.8364 KOps/s | 7.6358 KOps/s | |
test_tdmodule | 31.2190μs | 17.5192μs | 57.0804 KOps/s | 51.4164 KOps/s | |
test_tdmodule_dispatch | 56.3960μs | 34.8917μs | 28.6601 KOps/s | 26.7990 KOps/s | |
test_tdseq | 39.3440μs | 20.2508μs | 49.3808 KOps/s | 45.8632 KOps/s | |
test_tdseq_dispatch | 80.9230μs | 40.2942μs | 24.8175 KOps/s | 23.4576 KOps/s | |
test_instantiation_functorch | 1.6797ms | 1.5385ms | 649.9991 Ops/s | 624.5841 Ops/s | |
test_exec_functorch | 0.3189ms | 0.1816ms | 5.5058 KOps/s | 5.3165 KOps/s | |
test_exec_functional_call | 0.3976ms | 0.1727ms | 5.7919 KOps/s | 5.5966 KOps/s | |
test_exec_td_decorator | 0.5318ms | 0.2340ms | 4.2740 KOps/s | 4.1758 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9642ms | 0.6475ms | 1.5445 KOps/s | 1.5305 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9376ms | 0.6467ms | 1.5463 KOps/s | 1.5106 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7186ms | 0.5330ms | 1.8761 KOps/s | 1.8348 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8697ms | 0.5353ms | 1.8679 KOps/s | 1.8528 KOps/s | |
test_to_module_speed[True] | 2.3320ms | 1.4107ms | 708.8656 Ops/s | 699.8337 Ops/s | |
test_to_module_speed[False] | 1.9965ms | 1.3653ms | 732.4277 Ops/s | 720.0742 Ops/s | |
test_tc_init | 92.8450μs | 45.4230μs | 22.0153 KOps/s | 20.6985 KOps/s | |
test_tc_init_nested | 0.1589ms | 90.6594μs | 11.0303 KOps/s | 10.5114 KOps/s | |
test_tc_first_layer_tensor | 15.7890μs | 1.5974μs | 626.0218 KOps/s | 615.9985 KOps/s | |
test_tc_first_layer_nontensor | 23.2740μs | 4.7056μs | 212.5119 KOps/s | 205.6748 KOps/s | |
test_tc_second_layer_tensor | 40.6570μs | 2.8598μs | 349.6726 KOps/s | 340.0885 KOps/s | |
test_tc_second_layer_nontensor | 52.0780μs | 6.0911μs | 164.1742 KOps/s | 162.5731 KOps/s | |
test_unbind | 0.4750s | 13.5066ms | 74.0380 Ops/s | 75.1760 Ops/s | |
test_full_like | 8.5786ms | 7.4698ms | 133.8720 Ops/s | 141.3096 Ops/s | |
test_zeros_like | 3.3634ms | 2.7703ms | 360.9698 Ops/s | 363.3751 Ops/s | |
test_ones_like | 4.1743ms | 3.2060ms | 311.9197 Ops/s | 144.7759 Ops/s | |
test_clone | 5.4791ms | 4.9492ms | 202.0528 Ops/s | 116.7959 Ops/s | |
test_squeeze | 64.9130μs | 12.3877μs | 80.7253 KOps/s | 80.8775 KOps/s | |
test_unsqueeze | 0.3571ms | 94.5528μs | 10.5761 KOps/s | 10.6916 KOps/s | |
test_split | 0.3912ms | 0.1927ms | 5.1894 KOps/s | 4.9619 KOps/s | |
test_permute | 0.4011ms | 0.2236ms | 4.4718 KOps/s | 4.4253 KOps/s | |
test_stack | 31.1722ms | 24.6929ms | 40.4975 Ops/s | 40.7536 Ops/s | |
test_cat | 29.0726ms | 24.5817ms | 40.6806 Ops/s | 41.6093 Ops/s |
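
The Change column is empty in the rows above. As a minimal sketch, assuming the column is meant to hold the relative difference between "Ops" and "Ops on Repo HEAD", the value could be recomputed from a row as follows. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
from typing import Tuple

# Multipliers for the throughput units that appear in the table.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '42.4728 KOps/s' to operations per second."""
    value, unit = cell.strip().split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> Tuple[str, float]:
    """Return (test name, percent change of 'Ops' relative to 'Ops on Repo HEAD').

    Assumes the column order used above:
    Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    head = parse_ops(cells[4])
    return name, 100.0 * (ops - head) / head


row = "test_plain_set_nested | 52.8800μs | 23.5445μs | 42.4728 KOps/s | 40.9754 KOps/s | |"
name, change = relative_change(row)
print(f"{name}: {change:+.2f}%")
```

A positive value means the PR branch measured more operations per second than Repo HEAD for that test. Single-run differences of a few percent are often within measurement noise, so the sign alone should be read with caution.
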
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1476ms | 16.4937μs | 60.6294 KOps/s | 57.1575 KOps/s | |
test_plain_set_stack_nested | 37.2900μs | 16.5780μs | 60.3209 KOps/s | 57.0105 KOps/s | |
test_plain_set_nested_inplace | 47.6200μs | 17.6462μs | 56.6695 KOps/s | 53.8312 KOps/s | |
test_plain_set_stack_nested_inplace | 50.3110μs | 17.5804μs | 56.8814 KOps/s | 53.3699 KOps/s | |
test_items | 21.4800μs | 2.8458μs | 351.3926 KOps/s | 348.0382 KOps/s | |
test_items_nested | 0.3729ms | 0.3382ms | 2.9566 KOps/s | 2.9438 KOps/s | |
test_items_nested_locked | 0.3833ms | 0.3396ms | 2.9449 KOps/s | 2.9674 KOps/s | |
test_items_nested_leaf | 91.7120μs | 62.1810μs | 16.0821 KOps/s | 15.9196 KOps/s | |
test_items_stack_nested | 0.4219ms | 0.3398ms | 2.9425 KOps/s | 2.9236 KOps/s | |
test_items_stack_nested_leaf | 91.3120μs | 62.8614μs | 15.9080 KOps/s | 15.5087 KOps/s | |
test_items_stack_nested_locked | 0.3737ms | 0.3434ms | 2.9119 KOps/s | 2.9123 KOps/s | |
test_keys | 25.1410μs | 3.3987μs | 294.2317 KOps/s | 289.6323 KOps/s | |
test_keys_nested | 98.8520μs | 71.0264μs | 14.0793 KOps/s | 13.9689 KOps/s | |
test_keys_nested_locked | 2.5988ms | 76.9714μs | 12.9918 KOps/s | 12.9071 KOps/s | |
test_keys_nested_leaf | 94.6110μs | 61.9635μs | 16.1385 KOps/s | 16.1277 KOps/s | |
test_keys_stack_nested | 96.6310μs | 70.6640μs | 14.1515 KOps/s | 14.2790 KOps/s | |
test_keys_stack_nested_leaf | 88.4610μs | 61.2269μs | 16.3327 KOps/s | 16.2387 KOps/s | |
test_keys_stack_nested_locked | 0.1077ms | 77.1728μs | 12.9579 KOps/s | 13.0060 KOps/s | |
test_values | 4.1467μs | 0.8362μs | 1.1958 MOps/s | 1.1568 MOps/s | |
test_values_nested | 78.7720μs | 48.5228μs | 20.6089 KOps/s | 20.4514 KOps/s | |
test_values_nested_locked | 84.2320μs | 49.8793μs | 20.0484 KOps/s | 19.8734 KOps/s | |
test_values_nested_leaf | 69.3210μs | 42.6768μs | 23.4319 KOps/s | 23.3726 KOps/s | |
test_values_stack_nested | 86.1010μs | 49.0237μs | 20.3983 KOps/s | 20.0809 KOps/s | |
test_values_stack_nested_leaf | 66.5710μs | 43.1082μs | 23.1974 KOps/s | 23.1062 KOps/s | |
test_values_stack_nested_locked | 80.2520μs | 50.3434μs | 19.8636 KOps/s | 19.3857 KOps/s | |
test_membership | 1.7145μs | 0.5007μs | 1.9971 MOps/s | 1.9645 MOps/s | |
test_membership_nested | 16.5005μs | 1.8925μs | 528.3984 KOps/s | 507.3444 KOps/s | |
test_membership_nested_leaf | 14.9600μs | 1.8971μs | 527.1171 KOps/s | 538.5882 KOps/s | |
test_membership_stacked_nested | 36.2510μs | 1.9316μs | 517.6935 KOps/s | 520.1716 KOps/s | |
test_membership_stacked_nested_leaf | 27.4900μs | 1.9624μs | 509.5718 KOps/s | 518.6846 KOps/s | |
test_membership_nested_last | 36.1300μs | 2.9862μs | 334.8743 KOps/s | 336.2173 KOps/s | |
test_membership_nested_leaf_last | 23.8100μs | 3.0051μs | 332.7637 KOps/s | 328.6513 KOps/s | |
test_membership_stacked_nested_last | 26.6900μs | 3.0007μs | 333.2570 KOps/s | 160.5686 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.8800μs | 2.9598μs | 337.8610 KOps/s | 160.9732 KOps/s | |
test_nested_getleaf | 35.3710μs | 6.0377μs | 165.6267 KOps/s | 162.5076 KOps/s | |
test_nested_get | 37.0910μs | 5.6267μs | 177.7237 KOps/s | 172.4554 KOps/s | |
test_stacked_getleaf | 37.6200μs | 5.9844μs | 167.1001 KOps/s | 165.2412 KOps/s | |
test_stacked_get | 24.8700μs | 5.5846μs | 179.0636 KOps/s | 174.5539 KOps/s | |
test_nested_getitemleaf | 32.6710μs | 6.0719μs | 164.6935 KOps/s | 162.5136 KOps/s | |
test_nested_getitem | 32.0100μs | 5.6400μs | 177.3046 KOps/s | 175.4921 KOps/s | |
test_stacked_getitemleaf | 44.8910μs | 6.1537μs | 162.5048 KOps/s | 164.3897 KOps/s | |
test_stacked_getitem | 0.9800ms | 5.6126μs | 178.1703 KOps/s | 174.5915 KOps/s | |
test_lock_nested | 7.0441ms | 0.4327ms | 2.3111 KOps/s | 2.3030 KOps/s | |
test_lock_stack_nested | 0.4730ms | 0.3918ms | 2.5520 KOps/s | 2.5780 KOps/s | |
test_unlock_nested | 0.7924ms | 0.3642ms | 2.7455 KOps/s | 2.7061 KOps/s | |
test_unlock_stack_nested | 0.3784ms | 0.3306ms | 3.0252 KOps/s | 3.0724 KOps/s | |
test_flatten_speed | 0.1557ms | 75.9714μs | 13.1628 KOps/s | 12.9415 KOps/s | |
test_unflatten_speed | 0.3625ms | 0.3201ms | 3.1241 KOps/s | 3.1505 KOps/s | |
test_common_ops | 1.5908ms | 1.2767ms | 783.2405 Ops/s | 776.0090 Ops/s | |
test_creation | 27.4910μs | 1.4597μs | 685.0593 KOps/s | 674.2716 KOps/s | |
test_creation_empty | 45.6810μs | 15.0762μs | 66.3298 KOps/s | 58.9033 KOps/s | |
test_creation_nested_1 | 64.6010μs | 16.9982μs | 58.8298 KOps/s | 53.6792 KOps/s | |
test_creation_nested_2 | 43.7710μs | 19.4175μs | 51.5000 KOps/s | 47.7138 KOps/s | |
test_clone | 71.7410μs | 29.0431μs | 34.4316 KOps/s | 34.3150 KOps/s | |
test_getitem[int] | 1.3447ms | 16.1603μs | 61.8802 KOps/s | 60.9231 KOps/s | |
test_getitem[slice_int] | 0.1269ms | 28.0133μs | 35.6974 KOps/s | 35.5897 KOps/s | |
test_getitem[range] | 0.2281ms | 0.1121ms | 8.9207 KOps/s | 8.8508 KOps/s | |
test_getitem[tuple] | 0.1197ms | 23.8852μs | 41.8670 KOps/s | 41.1253 KOps/s | |
test_getitem[list] | 0.2046ms | 0.1010ms | 9.8967 KOps/s | 9.9256 KOps/s | |
test_setitem_dim[int] | 77.7110μs | 45.3270μs | 22.0619 KOps/s | 21.9718 KOps/s | |
test_setitem_dim[slice_int] | 93.9810μs | 67.7299μs | 14.7645 KOps/s | 14.8387 KOps/s | |
test_setitem_dim[range] | 0.1633ms | 0.1290ms | 7.7518 KOps/s | 7.7545 KOps/s | |
test_setitem_dim[tuple] | 84.7410μs | 61.5486μs | 16.2473 KOps/s | 16.3998 KOps/s | |
test_setitem | 92.3520μs | 42.7086μs | 23.4145 KOps/s | 23.2248 KOps/s | |
test_set | 72.0310μs | 41.6464μs | 24.0117 KOps/s | 23.8555 KOps/s | |
test_set_shared | 0.3533ms | 55.0887μs | 18.1526 KOps/s | 18.2615 KOps/s | |
test_update | 98.4020μs | 50.7435μs | 19.7069 KOps/s | 19.3192 KOps/s | |
test_update_nested | 0.1115ms | 58.1351μs | 17.2013 KOps/s | 16.8590 KOps/s | |
test_update__nested | 0.4233ms | 65.1936μs | 15.3389 KOps/s | 16.2805 KOps/s | |
test_set_nested | 95.7720μs | 44.8888μs | 22.2773 KOps/s | 22.6925 KOps/s | |
test_set_nested_new | 85.7610μs | 49.8259μs | 20.0699 KOps/s | 20.8435 KOps/s | |
test_select | 0.1047ms | 60.5045μs | 16.5277 KOps/s | 16.2479 KOps/s | |
test_select_nested | 80.8420μs | 41.7226μs | 23.9678 KOps/s | 24.2177 KOps/s | |
test_exclude_nested | 90.2710μs | 59.1456μs | 16.9074 KOps/s | 17.1021 KOps/s | |
test_empty[True] | 0.3050ms | 0.2587ms | 3.8648 KOps/s | 3.8964 KOps/s | |
test_empty[False] | 5.0691μs | 0.7433μs | 1.3453 MOps/s | 1.3508 MOps/s | |
test_to | 57.1500μs | 26.5358μs | 37.6850 KOps/s | 37.4427 KOps/s | |
test_to_nonblocking | 69.1610μs | 25.2376μs | 39.6235 KOps/s | 39.0636 KOps/s | |
test_unbind_speed | 1.4633ms | 0.2802ms | 3.5687 KOps/s | 3.5774 KOps/s | |
test_unbind_speed_stack0 | 0.3580ms | 0.2731ms | 3.6614 KOps/s | 3.6276 KOps/s | |
test_unbind_speed_stack1 | 91.7594ms | 0.7075ms | 1.4135 KOps/s | 1.4244 KOps/s | |
test_split | 93.6574ms | 2.1555ms | 463.9246 Ops/s | 450.5751 Ops/s | |
test_chunk | 93.3817ms | 2.1499ms | 465.1467 Ops/s | 447.0926 Ops/s | |
test_creation[device0] | 0.3474ms | 0.1294ms | 7.7261 KOps/s | 7.8266 KOps/s | |
test_creation_from_tensor | 0.3601ms | 0.1363ms | 7.3394 KOps/s | 7.5515 KOps/s | |
test_add_one[memmap_tensor0] | 0.2336ms | 9.0939μs | 109.9633 KOps/s | 113.4235 KOps/s | |
test_contiguous[memmap_tensor0] | 30.9700μs | 2.1924μs | 456.1139 KOps/s | 455.4627 KOps/s | |
test_stack[memmap_tensor0] | 42.8600μs | 6.6754μs | 149.8036 KOps/s | 141.2164 KOps/s | |
test_memmaptd_index | 1.3590ms | 0.4246ms | 2.3550 KOps/s | 2.2560 KOps/s | |
test_memmaptd_index_astensor | 0.7461ms | 0.5022ms | 1.9911 KOps/s | 1.9301 KOps/s | |
test_memmaptd_index_op | 1.4521ms | 1.0404ms | 961.1486 Ops/s | 928.6160 Ops/s | |
test_serialize_model | 0.1322s | 0.1301s | 7.6879 Ops/s | 7.6955 Ops/s | |
test_serialize_model_pickle | 1.3477s | 1.2132s | 0.8243 Ops/s | 0.8239 Ops/s | |
test_serialize_weights | 0.2245s | 0.1430s | 6.9906 Ops/s | 7.0059 Ops/s | |
test_serialize_weights_returnearly | 0.2140s | 56.9957ms | 17.5452 Ops/s | 17.7991 Ops/s | |
test_serialize_weights_pickle | 1.3721s | 1.2183s | 0.8208 Ops/s | 0.8220 Ops/s | |
test_reshape_pytree | 76.9010μs | 35.2999μs | 28.3287 KOps/s | 28.1059 KOps/s | |
test_reshape_td | 81.9710μs | 41.0699μs | 24.3487 KOps/s | 23.4307 KOps/s | |
test_view_pytree | 66.5510μs | 34.7618μs | 28.7672 KOps/s | 28.3663 KOps/s | |
test_view_td | 81.9410μs | 45.4914μs | 21.9822 KOps/s | 21.2266 KOps/s | |
test_unbind_pytree | 69.7810μs | 33.5423μs | 29.8131 KOps/s | 29.4293 KOps/s | |
test_unbind_td | 0.5296ms | 42.0449μs | 23.7841 KOps/s | 23.3351 KOps/s | |
test_split_pytree | 0.1034ms | 46.3063μs | 21.5953 KOps/s | 21.7514 KOps/s | |
test_split_td | 93.7510ms | 64.3986μs | 15.5283 KOps/s | 17.4197 KOps/s | |
test_add_pytree | 0.1031ms | 56.9738μs | 17.5519 KOps/s | 17.6258 KOps/s | |
test_add_td | 0.1666ms | 96.4142μs | 10.3719 KOps/s | 10.3460 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2112ms | 0.1603ms | 6.2369 KOps/s | 6.1721 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2866ms | 0.1644ms | 6.0825 KOps/s | 6.0360 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1824ms | 0.1442ms | 6.9364 KOps/s | 6.9177 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2361ms | 0.1850ms | 5.4060 KOps/s | 5.4390 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 64.7210μs | 21.7761μs | 45.9219 KOps/s | 45.4771 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 81.4710μs | 49.3786μs | 20.2517 KOps/s | 20.1006 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2258ms | 65.4696μs | 15.2743 KOps/s | 15.4375 KOps/s | |
test_compile_copy_nested[pytree-eager] | 92.8120μs | 49.4949μs | 20.2041 KOps/s | 19.9807 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3832ms | 0.3214ms | 3.1112 KOps/s | 3.0904 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3919ms | 0.2335ms | 4.2821 KOps/s | 4.1641 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1766ms | 0.1271ms | 7.8702 KOps/s | 7.7663 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1233ms | 65.3770μs | 15.2959 KOps/s | 14.2221 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4589ms | 0.3231ms | 3.0953 KOps/s | 3.1273 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7966ms | 0.6631ms | 1.5080 KOps/s | 1.5979 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4284ms | 0.2853ms | 3.5048 KOps/s | 3.4478 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4377ms | 0.3272ms | 3.0559 KOps/s | 3.0825 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1883ms | 78.4725μs | 12.7433 KOps/s | 12.6609 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2012ms | 0.1353ms | 7.3920 KOps/s | 7.5996 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.7211ms | 0.5420ms | 1.8450 KOps/s | 1.9084 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.6153ms | 0.3279ms | 3.0496 KOps/s | 3.1362 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1109ms | 20.2313μs | 49.4284 KOps/s | 49.4949 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1343ms | 37.9104μs | 26.3780 KOps/s | 25.3332 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1525ms | 69.9791μs | 14.2900 KOps/s | 14.3131 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1393ms | 51.5883μs | 19.3842 KOps/s | 19.3669 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3505ms | 0.7776ms | 1.2861 KOps/s | 1.1132 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.5642ms | 3.3344ms | 299.9068 Ops/s | 309.9770 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3142ms | 0.8312ms | 1.2030 KOps/s | 1.1320 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.5630ms | 3.2595ms | 306.7947 Ops/s | 311.4700 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1766ms | 0.1081ms | 9.2486 KOps/s | 9.0168 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1975ms | 61.9772μs | 16.1350 KOps/s | 15.5495 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1921ms | 0.1024ms | 9.7682 KOps/s | 9.7515 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 98.7120μs | 43.7683μs | 22.8476 KOps/s | 22.6404 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1782ms | 0.1033ms | 9.6847 KOps/s | 9.6102 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1112ms | 45.1800μs | 22.1337 KOps/s | 22.1053 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1914ms | 0.1378ms | 7.2548 KOps/s | 7.2143 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1548ms | 24.8740μs | 40.2027 KOps/s | 39.6582 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1845ms | 0.1363ms | 7.3392 KOps/s | 7.6196 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 77.9810μs | 20.6584μs | 48.4065 KOps/s | 47.8995 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2090ms | 0.1315ms | 7.6044 KOps/s | 7.5623 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 61.0710μs | 20.4194μs | 48.9731 KOps/s | 47.6644 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2054ms | 0.1410ms | 7.0935 KOps/s | 7.0558 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5054ms | 24.5594μs | 40.7176 KOps/s | 39.4309 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1999ms | 0.1372ms | 7.2884 KOps/s | 7.5599 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 70.0910μs | 21.4300μs | 46.6636 KOps/s | 47.7780 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2253ms | 0.1401ms | 7.1386 KOps/s | 7.5694 KOps/s | |
test_compile_indexing[int-pytree-eager] | 65.1710μs | 21.0618μs | 47.4792 KOps/s | 47.3638 KOps/s | |
test_mod_add[eager] | 83.9410μs | 33.8109μs | 29.5763 KOps/s | 29.8813 KOps/s | |
test_mod_add[compile] | 0.3053ms | 73.2673μs | 13.6487 KOps/s | 13.4030 KOps/s | |
test_mod_add[compile-overhead] | 0.2685ms | 0.1354ms | 7.3841 KOps/s | 7.1167 KOps/s | |
test_mod_wrap[eager] | 0.4463ms | 0.2412ms | 4.1452 KOps/s | 4.0216 KOps/s | |
test_mod_wrap[compile] | 1.4139ms | 0.3022ms | 3.3095 KOps/s | 3.3143 KOps/s | |
test_mod_wrap[compile-overhead] | 7.3649ms | 3.9822ms | 251.1184 Ops/s | 252.0569 Ops/s | |
test_mod_wrap_and_backward[eager] | 2.0268ms | 1.4318ms | 698.4413 Ops/s | 687.0644 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.7242ms | 1.4171ms | 705.6732 Ops/s | 691.6834 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.7533ms | 1.0582ms | 944.9720 Ops/s | 973.7104 Ops/s | |
test_seq_add[eager] | 0.1583ms | 0.1070ms | 9.3453 KOps/s | 9.2932 KOps/s | |
test_seq_add[compile] | 0.1558ms | 86.1136μs | 11.6126 KOps/s | 11.5218 KOps/s | |
test_seq_add[compile-overhead] | 0.2180ms | 0.1188ms | 8.4155 KOps/s | 8.6730 KOps/s | |
test_seq_wrap[eager] | 0.4521ms | 0.3819ms | 2.6185 KOps/s | 2.4201 KOps/s | |
test_seq_wrap[compile] | 0.4373ms | 0.3140ms | 3.1851 KOps/s | 3.0768 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2752ms | 0.2190ms | 4.5671 KOps/s | 4.4660 KOps/s | |
test_func_call_runtime[False-eager] | 1.2684ms | 0.7970ms | 1.2547 KOps/s | 1.2393 KOps/s | |
test_func_call_runtime[False-compile] | 0.9849ms | 0.7937ms | 1.2600 KOps/s | 1.2175 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4179ms | 0.3605ms | 2.7736 KOps/s | 2.7324 KOps/s | |
test_func_call_runtime[True-eager] | 0.9941ms | 0.9092ms | 1.0999 KOps/s | 1.0679 KOps/s | |
test_func_call_runtime[True-compile] | 0.8969ms | 0.8160ms | 1.2255 KOps/s | 1.1920 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4657ms | 0.3834ms | 2.6079 KOps/s | 2.5721 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7914ms | 0.7377ms | 1.3555 KOps/s | 1.2405 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.2303ms | 0.7948ms | 1.2581 KOps/s | 1.1909 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4058ms | 0.3613ms | 2.7677 KOps/s | 2.7388 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1324ms | 1.0216ms | 978.9032 Ops/s | 972.3170 Ops/s | |
test_func_call_cm_runtime[True-compile] | 0.8967ms | 0.8409ms | 1.1892 KOps/s | 1.1620 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5159ms | 0.4080ms | 2.4511 KOps/s | 2.4219 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7968ms | 2.1052ms | 475.0093 Ops/s | 474.2261 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.3419ms | 0.8655ms | 1.1554 KOps/s | 1.1463 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4759ms | 0.4084ms | 2.4487 KOps/s | 2.4139 KOps/s | |
test_distributed | 2.4353ms | 0.1257ms | 7.9528 KOps/s | 8.4203 KOps/s | |
test_tdmodule | 37.1610μs | 14.5963μs | 68.5107 KOps/s | 60.5615 KOps/s | |
test_tdmodule_dispatch | 53.3410μs | 30.0704μs | 33.2553 KOps/s | 31.3113 KOps/s | |
test_tdseq | 37.8400μs | 15.9109μs | 62.8499 KOps/s | 56.5135 KOps/s | |
test_tdseq_dispatch | 54.7310μs | 31.8573μs | 31.3900 KOps/s | 28.4562 KOps/s | |
test_instantiation_functorch | 2.0141ms | 1.8627ms | 536.8466 Ops/s | 530.8640 Ops/s | |
test_exec_functorch | 0.2668ms | 0.2093ms | 4.7772 KOps/s | 4.7214 KOps/s | |
test_exec_functional_call | 0.2520ms | 0.2092ms | 4.7808 KOps/s | 4.6584 KOps/s | |
test_exec_td_decorator | 0.4421ms | 0.2676ms | 3.7368 KOps/s | 3.7168 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8354ms | 0.7107ms | 1.4070 KOps/s | 1.4522 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8927ms | 0.7115ms | 1.4055 KOps/s | 1.4608 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7600ms | 0.6207ms | 1.6111 KOps/s | 1.6654 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7755ms | 0.6269ms | 1.5951 KOps/s | 1.6647 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.0367ms | 19.4618ms | 51.3827 Ops/s | 51.1022 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.2373ms | 19.5138ms | 51.2458 Ops/s | 51.1248 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.5781ms | 19.3984ms | 51.5508 Ops/s | 51.6004 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.4538ms | 19.3764ms | 51.6093 Ops/s | 51.3569 Ops/s | |
test_to_module_speed[True] | 1.2691ms | 0.9998ms | 1.0002 KOps/s | 994.2266 Ops/s | |
test_to_module_speed[False] | 1.4137ms | 0.9737ms | 1.0271 KOps/s | 1.0162 KOps/s | |
test_tc_init | 74.1410μs | 34.7403μs | 28.7851 KOps/s | 27.3324 KOps/s | |
test_tc_init_nested | 0.1329ms | 71.2356μs | 14.0379 KOps/s | 13.4444 KOps/s | |
test_tc_first_layer_tensor | 10.3416μs | 0.6703μs | 1.4918 MOps/s | 1.4770 MOps/s | |
test_tc_first_layer_nontensor | 25.7010μs | 2.2314μs | 448.1436 KOps/s | 446.5971 KOps/s | |
test_tc_second_layer_tensor | 11.6778μs | 1.3431μs | 744.5461 KOps/s | 728.0689 KOps/s | |
test_tc_second_layer_nontensor | 22.2600μs | 2.9199μs | 342.4771 KOps/s | 339.8466 KOps/s | |
test_unbind | 0.1835s | 11.9713ms | 83.5331 Ops/s | 93.2731 Ops/s | |
test_full_like | 0.6545ms | 0.5733ms | 1.7444 KOps/s | 1.7440 KOps/s | |
test_zeros_like | 0.2872ms | 0.1979ms | 5.0532 KOps/s | 5.0555 KOps/s | |
test_ones_like | 0.2353ms | 0.1977ms | 5.0576 KOps/s | 5.0540 KOps/s | |
test_clone | 0.4450ms | 0.4146ms | 2.4118 KOps/s | 2.4100 KOps/s | |
test_squeeze | 33.5610μs | 9.7537μs | 102.5252 KOps/s | 101.6714 KOps/s | |
test_unsqueeze | 0.2234ms | 74.3563μs | 13.4488 KOps/s | 13.6498 KOps/s | |
test_split | 0.4252ms | 0.1552ms | 6.4443 KOps/s | 6.3827 KOps/s | |
test_permute | 0.2214ms | 0.1805ms | 5.5414 KOps/s | 5.4155 KOps/s | |
test_stack | 1.2510ms | 0.8598ms | 1.1631 KOps/s | 1.2530 KOps/s | |
test_cat | 1.2531ms | 1.2311ms | 812.2777 Ops/s | 812.1694 Ops/s |
Stack from ghstack (oldest at bottom):