[Feature] _foreach_copy_ for update_ #1032
Merged
vmoens added a commit that referenced this pull request Oct 7, 2024
ghstack-source-id: dc32497757d1fb19dc61dea9115810890d5a4acb Pull Request resolved: #1032
facebook-github-bot added the CLA Signed label Oct 7, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 64.8010μs | 24.0141μs | 41.6422 KOps/s | 41.4010 KOps/s | |
test_plain_set_stack_nested | 86.0520μs | 24.2061μs | 41.3119 KOps/s | 41.1943 KOps/s | |
test_plain_set_nested_inplace | 69.6600μs | 26.6616μs | 37.5071 KOps/s | 37.8898 KOps/s | |
test_plain_set_stack_nested_inplace | 69.2000μs | 26.5311μs | 37.6916 KOps/s | 38.2458 KOps/s | |
test_items | 36.6990μs | 4.2327μs | 236.2547 KOps/s | 245.1108 KOps/s | |
test_items_nested | 0.5870ms | 0.3864ms | 2.5877 KOps/s | 2.6480 KOps/s | |
test_items_nested_locked | 0.5795ms | 0.3845ms | 2.6006 KOps/s | 2.6129 KOps/s | |
test_items_nested_leaf | 0.1586ms | 81.6589μs | 12.2461 KOps/s | 12.4558 KOps/s | |
test_items_stack_nested | 0.8100ms | 0.3920ms | 2.5511 KOps/s | 2.6110 KOps/s | |
test_items_stack_nested_leaf | 0.1730ms | 83.8658μs | 11.9238 KOps/s | 11.9340 KOps/s | |
test_items_stack_nested_locked | 0.5512ms | 0.3868ms | 2.5851 KOps/s | 2.5973 KOps/s | |
test_keys | 25.2770μs | 3.4763μs | 287.6638 KOps/s | 279.4155 KOps/s | |
test_keys_nested | 0.2423ms | 0.1385ms | 7.2189 KOps/s | 7.4802 KOps/s | |
test_keys_nested_locked | 0.7327ms | 0.1445ms | 6.9201 KOps/s | 7.1056 KOps/s | |
test_keys_nested_leaf | 0.2066ms | 0.1205ms | 8.2998 KOps/s | 8.4928 KOps/s | |
test_keys_stack_nested | 0.2511ms | 0.1343ms | 7.4441 KOps/s | 7.5599 KOps/s | |
test_keys_stack_nested_leaf | 0.2131ms | 0.1158ms | 8.6367 KOps/s | 8.7217 KOps/s | |
test_keys_stack_nested_locked | 0.2598ms | 0.1404ms | 7.1224 KOps/s | 7.2320 KOps/s | |
test_values | 6.5402μs | 1.0484μs | 953.8468 KOps/s | 920.6233 KOps/s | |
test_values_nested | 0.1599ms | 92.1800μs | 10.8483 KOps/s | 10.5146 KOps/s | |
test_values_nested_locked | 0.2068ms | 92.3836μs | 10.8244 KOps/s | 10.5847 KOps/s | |
test_values_nested_leaf | 0.1326ms | 80.7621μs | 12.3820 KOps/s | 12.3260 KOps/s | |
test_values_stack_nested | 0.1693ms | 93.2887μs | 10.7194 KOps/s | 10.5191 KOps/s | |
test_values_stack_nested_leaf | 0.1350ms | 79.0742μs | 12.6464 KOps/s | 12.3568 KOps/s | |
test_values_stack_nested_locked | 0.2074ms | 93.6395μs | 10.6792 KOps/s | 10.5491 KOps/s | |
test_membership | 16.0086μs | 0.7602μs | 1.3155 MOps/s | 900.6911 KOps/s | |
test_membership_nested | 0.1190ms | 2.8216μs | 354.4067 KOps/s | 359.7426 KOps/s | |
test_membership_nested_leaf | 29.3450μs | 2.7670μs | 361.4071 KOps/s | 347.6074 KOps/s | |
test_membership_stacked_nested | 28.1420μs | 2.7415μs | 364.7603 KOps/s | 359.7738 KOps/s | |
test_membership_stacked_nested_leaf | 23.7640μs | 2.7400μs | 364.9631 KOps/s | 353.6571 KOps/s | |
test_membership_nested_last | 26.0990μs | 4.2340μs | 236.1852 KOps/s | 238.9292 KOps/s | |
test_membership_nested_leaf_last | 33.0620μs | 4.1784μs | 239.3288 KOps/s | 237.6224 KOps/s | |
test_membership_stacked_nested_last | 45.9560μs | 13.6206μs | 73.4184 KOps/s | 71.0024 KOps/s | |
test_membership_stacked_nested_leaf_last | 39.8550μs | 13.5369μs | 73.8719 KOps/s | 70.5443 KOps/s | |
test_nested_getleaf | 0.1478ms | 10.9505μs | 91.3203 KOps/s | 92.3273 KOps/s | |
test_nested_get | 49.5930μs | 10.2804μs | 97.2728 KOps/s | 91.8512 KOps/s | |
test_stacked_getleaf | 29.2050μs | 10.7441μs | 93.0746 KOps/s | 91.9895 KOps/s | |
test_stacked_get | 0.1029ms | 10.2586μs | 97.4795 KOps/s | 97.2419 KOps/s | |
test_nested_getitemleaf | 31.7600μs | 11.1743μs | 89.4912 KOps/s | 89.6610 KOps/s | |
test_nested_getitem | 28.9050μs | 10.5426μs | 94.8532 KOps/s | 94.4656 KOps/s | |
test_stacked_getitemleaf | 45.4560μs | 11.1718μs | 89.5112 KOps/s | 87.8583 KOps/s | |
test_stacked_getitem | 34.6350μs | 10.3859μs | 96.2847 KOps/s | 95.3498 KOps/s | |
test_lock_nested | 86.3091ms | 0.5913ms | 1.6911 KOps/s | 1.9614 KOps/s | |
test_lock_stack_nested | 0.7304ms | 0.4583ms | 2.1818 KOps/s | 2.1914 KOps/s | |
test_unlock_nested | 86.4482ms | 0.5112ms | 1.9561 KOps/s | 2.3082 KOps/s | |
test_unlock_stack_nested | 0.5547ms | 0.3721ms | 2.6874 KOps/s | 2.6645 KOps/s | |
test_flatten_speed | 0.1904ms | 0.1008ms | 9.9238 KOps/s | 9.8639 KOps/s | |
test_unflatten_speed | 1.0373ms | 0.5172ms | 1.9334 KOps/s | 1.9067 KOps/s | |
test_common_ops | 4.0085ms | 1.1418ms | 875.7719 Ops/s | 833.2848 Ops/s | |
test_creation | 23.3340μs | 2.1202μs | 471.6623 KOps/s | 486.3882 KOps/s | |
test_creation_empty | 73.3480μs | 17.8867μs | 55.9076 KOps/s | 58.8117 KOps/s | |
test_creation_nested_1 | 55.8740μs | 21.3792μs | 46.7744 KOps/s | 50.0850 KOps/s | |
test_creation_nested_2 | 57.4570μs | 25.4260μs | 39.3298 KOps/s | 41.3657 KOps/s | |
test_clone | 77.5650μs | 16.9128μs | 59.1269 KOps/s | 56.7087 KOps/s | |
test_getitem[int] | 1.0970ms | 17.0284μs | 58.7255 KOps/s | 58.4655 KOps/s | |
test_getitem[slice_int] | 0.1433ms | 31.1787μs | 32.0731 KOps/s | 31.5664 KOps/s | |
test_getitem[range] | 0.1814ms | 59.5864μs | 16.7823 KOps/s | 16.6463 KOps/s | |
test_getitem[tuple] | 0.1286ms | 25.3944μs | 39.3788 KOps/s | 38.7449 KOps/s | |
test_getitem[list] | 0.2611ms | 55.4562μs | 18.0323 KOps/s | 18.0521 KOps/s | |
test_setitem_dim[int] | 67.3360μs | 33.6503μs | 29.7174 KOps/s | 29.2610 KOps/s | |
test_setitem_dim[slice_int] | 99.1850μs | 62.4894μs | 16.0027 KOps/s | 15.9376 KOps/s | |
test_setitem_dim[range] | 0.1538ms | 85.5772μs | 11.6854 KOps/s | 11.3074 KOps/s | |
test_setitem_dim[tuple] | 0.1060ms | 50.2240μs | 19.9108 KOps/s | 19.5747 KOps/s | |
test_setitem | 77.3150μs | 30.3030μs | 33.0001 KOps/s | 32.5585 KOps/s | |
test_set | 76.6440μs | 29.6884μs | 33.6831 KOps/s | 33.6239 KOps/s | |
test_set_shared | 2.1281ms | 0.2220ms | 4.5051 KOps/s | 4.5083 KOps/s | |
test_update | 0.1448ms | 37.6572μs | 26.5554 KOps/s | 26.4567 KOps/s | |
test_update_nested | 0.1535ms | 48.8521μs | 20.4699 KOps/s | 20.2523 KOps/s | |
test_update__nested | 0.1128ms | 42.9490μs | 23.2834 KOps/s | 26.5252 KOps/s | |
test_set_nested | 0.1261ms | 32.3520μs | 30.9100 KOps/s | 30.3484 KOps/s | |
test_set_nested_new | 0.1115ms | 38.0813μs | 26.2596 KOps/s | 26.5214 KOps/s | |
test_select | 0.1415ms | 56.1207μs | 17.8187 KOps/s | 18.2068 KOps/s | |
test_select_nested | 0.1212ms | 58.8184μs | 17.0015 KOps/s | 16.7907 KOps/s | |
test_exclude_nested | 0.1426ms | 74.2447μs | 13.4690 KOps/s | 13.3682 KOps/s | |
test_empty[True] | 0.6330ms | 0.3531ms | 2.8322 KOps/s | 2.8070 KOps/s | |
test_empty[False] | 9.8433μs | 1.2853μs | 778.0311 KOps/s | 829.2650 KOps/s | |
test_unbind_speed | 0.6464ms | 0.3019ms | 3.3125 KOps/s | 3.1905 KOps/s | |
test_unbind_speed_stack0 | 0.4936ms | 0.2930ms | 3.4132 KOps/s | 3.4370 KOps/s | |
test_unbind_speed_stack1 | 84.8765ms | 0.7792ms | 1.2834 KOps/s | 1.3980 KOps/s | |
test_split | 85.1403ms | 2.1471ms | 465.7351 Ops/s | 452.5656 Ops/s | |
test_chunk | 2.2374ms | 1.9974ms | 500.6396 Ops/s | 447.4727 Ops/s | |
test_creation[device0] | 0.2079ms | 0.1182ms | 8.4632 KOps/s | 8.5666 KOps/s | |
test_creation_from_tensor | 3.7967ms | 0.1197ms | 8.3554 KOps/s | 8.4984 KOps/s | |
test_add_one[memmap_tensor0] | 0.2148ms | 7.5537μs | 132.3854 KOps/s | 132.4761 KOps/s | |
test_contiguous[memmap_tensor0] | 16.1000μs | 1.9035μs | 525.3345 KOps/s | 525.2369 KOps/s | |
test_stack[memmap_tensor0] | 57.8380μs | 5.6703μs | 176.3576 KOps/s | 180.6350 KOps/s | |
test_memmaptd_index | 1.1048ms | 0.4202ms | 2.3797 KOps/s | 2.4098 KOps/s | |
test_memmaptd_index_astensor | 1.2127ms | 0.5254ms | 1.9032 KOps/s | 1.9445 KOps/s | |
test_memmaptd_index_op | 1.4537ms | 1.0516ms | 950.8874 Ops/s | 971.5730 Ops/s | |
test_serialize_model | 0.2028s | 0.1274s | 7.8475 Ops/s | 8.4392 Ops/s | |
test_serialize_model_pickle | 0.4637s | 0.3910s | 2.5576 Ops/s | 2.5087 Ops/s | |
test_serialize_weights | 0.1210s | 0.1148s | 8.7088 Ops/s | 7.7696 Ops/s | |
test_serialize_weights_returnearly | 0.2553s | 0.1693s | 5.9064 Ops/s | 6.4020 Ops/s | |
test_serialize_weights_pickle | 0.5466s | 0.4266s | 2.3440 Ops/s | 2.3790 Ops/s | |
test_serialize_weights_filesystem | 0.1514s | 0.1425s | 7.0197 Ops/s | 6.9718 Ops/s | |
test_serialize_model_filesystem | 0.1552s | 0.1462s | 6.8410 Ops/s | 6.1481 Ops/s | |
test_reshape_pytree | 0.1144ms | 38.6991μs | 25.8404 KOps/s | 25.2415 KOps/s | |
test_reshape_td | 98.8950μs | 45.5778μs | 21.9405 KOps/s | 20.2350 KOps/s | |
test_view_pytree | 0.1182ms | 38.9018μs | 25.7058 KOps/s | 25.4236 KOps/s | |
test_view_td | 0.1357ms | 51.3082μs | 19.4901 KOps/s | 17.7673 KOps/s | |
test_unbind_pytree | 96.8910μs | 36.4161μs | 27.4604 KOps/s | 27.0109 KOps/s | |
test_unbind_td | 0.3125ms | 45.6841μs | 21.8894 KOps/s | 21.6507 KOps/s | |
test_split_pytree | 0.1159ms | 38.0462μs | 26.2838 KOps/s | 26.0376 KOps/s | |
test_split_td | 0.2141ms | 58.3270μs | 17.1447 KOps/s | 16.7152 KOps/s | |
test_add_pytree | 0.1175ms | 45.4882μs | 21.9837 KOps/s | 20.8843 KOps/s | |
test_add_td | 0.2285ms | 87.5797μs | 11.4182 KOps/s | 11.3360 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1704ms | 61.2824μs | 16.3179 KOps/s | 16.8975 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3130ms | 0.2006ms | 4.9850 KOps/s | 5.0989 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1208ms | 57.5671μs | 17.3710 KOps/s | 17.1036 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2717ms | 0.1416ms | 7.0637 KOps/s | 6.9481 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 52.6290μs | 23.7750μs | 42.0609 KOps/s | 41.9418 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.2034ms | 75.2142μs | 13.2954 KOps/s | 13.3025 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1285ms | 75.1562μs | 13.3056 KOps/s | 13.1858 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1442ms | 69.1288μs | 14.4658 KOps/s | 14.7060 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.5272ms | 0.1865ms | 5.3619 KOps/s | 5.4316 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4200ms | 0.2505ms | 3.9917 KOps/s | 4.2161 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1037ms | 48.9749μs | 20.4186 KOps/s | 20.5329 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1805ms | 77.6162μs | 12.8839 KOps/s | 13.0675 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4163ms | 0.1766ms | 5.6626 KOps/s | 5.7469 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5393ms | 0.2891ms | 3.4591 KOps/s | 3.4175 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.6239ms | 0.2855ms | 3.5032 KOps/s | 3.6655 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3388ms | 0.1892ms | 5.2854 KOps/s | 5.4558 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1634ms | 74.7994μs | 13.3691 KOps/s | 14.0022 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 95.1080μs | 48.8925μs | 20.4530 KOps/s | 20.8502 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4933ms | 0.2316ms | 4.3187 KOps/s | 4.2714 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3616ms | 0.1762ms | 5.6746 KOps/s | 5.5778 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2427ms | 0.1131ms | 8.8434 KOps/s | 8.9198 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1797ms | 79.3708μs | 12.5991 KOps/s | 12.8076 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1293ms | 77.2184μs | 12.9503 KOps/s | 12.6412 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1446ms | 68.8545μs | 14.5234 KOps/s | 14.4263 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3614ms | 0.1940ms | 5.1555 KOps/s | 5.1582 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.9867ms | 1.7748ms | 563.4515 Ops/s | 576.7800 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2831ms | 0.1905ms | 5.2488 KOps/s | 5.1136 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3246ms | 1.1009ms | 908.3643 Ops/s | 897.5386 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.6961ms | 0.4159ms | 2.4046 KOps/s | 2.3628 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.3783ms | 4.0770ms | 245.2775 Ops/s | 244.0461 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 87.4240μs | 34.6774μs | 28.8372 KOps/s | 28.5545 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0612ms | 50.0508μs | 19.9797 KOps/s | 13.0342 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 73.6080μs | 30.6740μs | 32.6009 KOps/s | 32.0815 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 77.5450μs | 29.4581μs | 33.9466 KOps/s | 32.2257 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 76.3030μs | 30.8439μs | 32.4213 KOps/s | 32.2850 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 75.6020μs | 28.8057μs | 34.7154 KOps/s | 33.6758 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1800ms | 75.0527μs | 13.3240 KOps/s | 13.1957 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5159ms | 28.4976μs | 35.0906 KOps/s | 33.9905 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1285ms | 69.7341μs | 14.3402 KOps/s | 14.3352 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 84.8280μs | 23.7016μs | 42.1912 KOps/s | 41.8430 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1622ms | 70.0913μs | 14.2671 KOps/s | 14.3708 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 69.3000μs | 23.6711μs | 42.2456 KOps/s | 42.3783 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1679ms | 75.5571μs | 13.2350 KOps/s | 13.2502 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9145ms | 27.9842μs | 35.7344 KOps/s | 34.7270 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1331ms | 69.6435μs | 14.3588 KOps/s | 14.5225 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 73.0970μs | 23.4779μs | 42.5933 KOps/s | 42.5237 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1470ms | 69.9145μs | 14.3032 KOps/s | 14.5061 KOps/s | |
test_compile_indexing[int-pytree-eager] | 65.9930μs | 23.1608μs | 43.1765 KOps/s | 42.5996 KOps/s | |
test_mod_add[eager] | 67.8470μs | 25.2217μs | 39.6485 KOps/s | 39.7810 KOps/s | |
test_mod_add[compile] | 0.1021ms | 38.6064μs | 25.9024 KOps/s | 24.8600 KOps/s | |
test_mod_add[compile-overhead] | 0.1076ms | 38.3425μs | 26.0807 KOps/s | 24.6406 KOps/s | |
test_mod_wrap[eager] | 0.3817ms | 0.2101ms | 4.7600 KOps/s | 4.6731 KOps/s | |
test_mod_wrap[compile] | 0.3444ms | 0.2357ms | 4.2419 KOps/s | 4.2243 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3634ms | 0.2361ms | 4.2356 KOps/s | 4.2528 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.2604ms | 10.8015ms | 92.5795 Ops/s | 92.7119 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.3483ms | 10.9598ms | 91.2429 Ops/s | 92.1985 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.0849ms | 10.9239ms | 91.5422 Ops/s | 90.5969 Ops/s | |
test_seq_add[eager] | 0.2251ms | 92.7222μs | 10.7849 KOps/s | 10.7248 KOps/s | |
test_seq_add[compile] | 0.1300ms | 64.2425μs | 15.5660 KOps/s | 15.0828 KOps/s | |
test_seq_add[compile-overhead] | 0.1386ms | 61.6596μs | 16.2181 KOps/s | 15.5724 KOps/s | |
test_seq_wrap[eager] | 0.5623ms | 0.3876ms | 2.5797 KOps/s | 2.5452 KOps/s | |
test_seq_wrap[compile] | 1.1146ms | 0.2721ms | 3.6750 KOps/s | 3.6233 KOps/s | |
test_seq_wrap[compile-overhead] | 1.3154ms | 0.2713ms | 3.6859 KOps/s | 3.6310 KOps/s | |
test_func_call_runtime[False-eager] | 0.6616ms | 0.5198ms | 1.9240 KOps/s | 1.8656 KOps/s | |
test_func_call_runtime[False-compile] | 0.5957ms | 0.5011ms | 1.9955 KOps/s | 1.9687 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 1.0513ms | 0.5007ms | 1.9974 KOps/s | 1.9606 KOps/s | |
test_func_call_runtime[True-eager] | 1.0105ms | 0.7498ms | 1.3337 KOps/s | 1.3089 KOps/s | |
test_func_call_runtime[True-compile] | 0.6245ms | 0.5204ms | 1.9216 KOps/s | 1.9121 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8607ms | 0.5149ms | 1.9420 KOps/s | 1.9356 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8187ms | 0.5175ms | 1.9326 KOps/s | 1.8602 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1132ms | 0.5005ms | 1.9979 KOps/s | 1.9588 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6246ms | 0.4999ms | 2.0004 KOps/s | 1.9652 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0719ms | 0.8852ms | 1.1297 KOps/s | 1.0835 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1586ms | 0.7361ms | 1.3584 KOps/s | 1.3100 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.9540ms | 0.7353ms | 1.3600 KOps/s | 1.3078 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6782ms | 1.8951ms | 527.6697 Ops/s | 510.9624 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.4890ms | 1.9383ms | 515.9029 Ops/s | 501.6018 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.5993ms | 1.9551ms | 511.4851 Ops/s | 496.2218 Ops/s | |
test_distributed | 0.2247ms | 0.1269ms | 7.8818 KOps/s | 7.7501 KOps/s | |
test_tdmodule | 40.0750μs | 17.8151μs | 56.1322 KOps/s | 56.2238 KOps/s | |
test_tdmodule_dispatch | 68.6180μs | 35.6209μs | 28.0734 KOps/s | 28.6859 KOps/s | |
test_tdseq | 45.4850μs | 20.3813μs | 49.0645 KOps/s | 42.3135 KOps/s | |
test_tdseq_dispatch | 65.7930μs | 40.8584μs | 24.4748 KOps/s | 22.2819 KOps/s | |
test_instantiation_functorch | 1.7738ms | 1.5574ms | 642.0942 Ops/s | 624.2553 Ops/s | |
test_instantiation_td | 1.7690ms | 1.1595ms | 862.4566 Ops/s | 831.3938 Ops/s | |
test_exec_functorch | 0.2634ms | 0.1851ms | 5.4034 KOps/s | 5.2859 KOps/s | |
test_exec_functional_call | 0.2681ms | 0.1743ms | 5.7379 KOps/s | 5.4277 KOps/s | |
test_exec_td | 0.3644ms | 0.2050ms | 4.8788 KOps/s | 4.8449 KOps/s | |
test_exec_td_decorator | 0.9695ms | 0.2337ms | 4.2782 KOps/s | 4.1524 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7842ms | 0.6826ms | 1.4650 KOps/s | 1.4312 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7826ms | 0.6783ms | 1.4743 KOps/s | 1.4277 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7103ms | 0.5379ms | 1.8591 KOps/s | 1.8285 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.8541ms | 0.5363ms | 1.8646 KOps/s | 1.8200 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2993ms | 0.6435ms | 1.5541 KOps/s | 1.5212 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9532ms | 0.6439ms | 1.5530 KOps/s | 1.5011 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8189ms | 0.5322ms | 1.8788 KOps/s | 1.8280 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6539ms | 0.5289ms | 1.8908 KOps/s | 1.8193 KOps/s | |
test_to_module_speed[True] | 2.0388ms | 1.4124ms | 707.9905 Ops/s | 695.2706 Ops/s | |
test_to_module_speed[False] | 1.8509ms | 1.3787ms | 725.3073 Ops/s | 722.1913 Ops/s | |
test_tc_init | 0.1040ms | 43.1519μs | 23.1739 KOps/s | 21.6386 KOps/s | |
test_tc_init_nested | 0.1505ms | 86.0397μs | 11.6225 KOps/s | 11.0476 KOps/s | |
test_tc_first_layer_tensor | 18.9450μs | 1.5213μs | 657.3260 KOps/s | 666.7584 KOps/s | |
test_tc_first_layer_nontensor | 19.1360μs | 4.6175μs | 216.5695 KOps/s | 205.8299 KOps/s | |
test_tc_second_layer_tensor | 23.9650μs | 2.7859μs | 358.9515 KOps/s | 346.4804 KOps/s | |
test_tc_second_layer_nontensor | 39.7440μs | 5.8990μs | 169.5196 KOps/s | 164.3170 KOps/s | |
test_unbind | 0.4728s | 13.1147ms | 76.2505 Ops/s | 76.7507 Ops/s | |
test_full_like | 7.7997ms | 7.0957ms | 140.9299 Ops/s | 142.1594 Ops/s | |
test_zeros_like | 5.3568ms | 2.8380ms | 352.3569 Ops/s | 360.4499 Ops/s | |
test_ones_like | 3.4685ms | 3.0574ms | 327.0776 Ops/s | 304.1609 Ops/s | |
test_clone | 5.3208ms | 4.7771ms | 209.3338 Ops/s | 202.0753 Ops/s | |
test_squeeze | 64.0400μs | 12.7988μs | 78.1323 KOps/s | 79.5954 KOps/s | |
test_unsqueeze | 0.3343ms | 93.8482μs | 10.6555 KOps/s | 10.5460 KOps/s | |
test_split | 0.3369ms | 0.1955ms | 5.1164 KOps/s | 4.9313 KOps/s | |
test_permute | 0.3586ms | 0.2172ms | 4.6036 KOps/s | 4.4080 KOps/s | |
test_stack | 29.7721ms | 24.6588ms | 40.5535 Ops/s | 40.8199 Ops/s | |
test_cat | 28.4933ms | 24.3613ms | 41.0488 Ops/s | 41.3429 Ops/s |
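
The Change column is empty in this capture, so comparing `Ops` against `Ops on Repo HEAD` has to be done by hand. As a minimal sketch, assuming the change is read as the relative difference between the two throughput columns (the benchmark bot's own formula is not shown here), a row can be parsed like this. The helper names `parse_ops` and `relative_change` are hypothetical and not part of tensordict or the CI tooling.

```python
import re

# Multipliers for the throughput units that appear in the tables.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '41.6422 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row):
    """Return (test name, percent difference of Ops relative to Ops on Repo HEAD).

    Assumption: 'change' means (ops - head) / head; this is an illustrative
    definition, not necessarily the one used by the benchmark report.
    """
    cells = [cell.strip() for cell in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])   # 'Ops' column (this PR)
    head = parse_ops(cells[4])  # 'Ops on Repo HEAD' column
    return name, (ops - head) / head * 100.0


if __name__ == "__main__":
    row = "test_update__nested | 0.1128ms | 42.9490μs | 23.2834 KOps/s | 26.5252 KOps/s | |"
    name, pct = relative_change(row)
    print(f"{name}: {pct:+.1f}%")
```

Converting units first matters for rows such as `test_membership`, where the two columns are reported in MOps/s and KOps/s respectively. Single-run differences of a few percent are likely within noise, given how far Max sits from Mean in many rows.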

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1380ms | 16.8607μs | 59.3094 KOps/s | 62.8555 KOps/s | |
test_plain_set_stack_nested | 0.1051ms | 16.7985μs | 59.5293 KOps/s | 62.8198 KOps/s | |
test_plain_set_nested_inplace | 46.2800μs | 18.0193μs | 55.4961 KOps/s | 58.2314 KOps/s | |
test_plain_set_stack_nested_inplace | 52.3110μs | 17.9786μs | 55.6218 KOps/s | 58.5876 KOps/s | |
test_items | 24.6300μs | 2.8431μs | 351.7293 KOps/s | 345.1611 KOps/s | |
test_items_nested | 0.3803ms | 0.3425ms | 2.9201 KOps/s | 2.9395 KOps/s | |
test_items_nested_locked | 0.3711ms | 0.3403ms | 2.9383 KOps/s | 2.9166 KOps/s | |
test_items_nested_leaf | 0.1008ms | 62.6453μs | 15.9629 KOps/s | 15.8162 KOps/s | |
test_items_stack_nested | 0.3695ms | 0.3381ms | 2.9577 KOps/s | 2.9204 KOps/s | |
test_items_stack_nested_leaf | 92.1120μs | 62.8094μs | 15.9212 KOps/s | 15.5783 KOps/s | |
test_items_stack_nested_locked | 0.4305ms | 0.3420ms | 2.9242 KOps/s | 2.8992 KOps/s | |
test_keys | 32.3600μs | 3.4159μs | 292.7525 KOps/s | 291.9812 KOps/s | |
test_keys_nested | 0.1254ms | 71.1575μs | 14.0533 KOps/s | 13.9612 KOps/s | |
test_keys_nested_locked | 2.7126ms | 77.2891μs | 12.9384 KOps/s | 12.8048 KOps/s | |
test_keys_nested_leaf | 0.2393ms | 61.9084μs | 16.1529 KOps/s | 15.9094 KOps/s | |
test_keys_stack_nested | 0.2627ms | 71.1475μs | 14.0553 KOps/s | 14.0213 KOps/s | |
test_keys_stack_nested_leaf | 0.2384ms | 61.2409μs | 16.3290 KOps/s | 15.7660 KOps/s | |
test_keys_stack_nested_locked | 0.1682ms | 76.5259μs | 13.0675 KOps/s | 12.8742 KOps/s | |
test_values | 6.1152μs | 0.8351μs | 1.1974 MOps/s | 1.1935 MOps/s | |
test_values_nested | 0.1487ms | 48.4619μs | 20.6347 KOps/s | 20.4069 KOps/s | |
test_values_nested_locked | 0.1710ms | 50.2338μs | 19.9069 KOps/s | 19.7220 KOps/s | |
test_values_nested_leaf | 71.5510μs | 42.6252μs | 23.4603 KOps/s | 23.3839 KOps/s | |
test_values_stack_nested | 85.7520μs | 49.0126μs | 20.4029 KOps/s | 19.9411 KOps/s | |
test_values_stack_nested_leaf | 79.7020μs | 42.8081μs | 23.3601 KOps/s | 22.8918 KOps/s | |
test_values_stack_nested_locked | 81.3310μs | 50.3905μs | 19.8450 KOps/s | 19.2550 KOps/s | |
test_membership | 2.2626μs | 0.5042μs | 1.9832 MOps/s | 2.0020 MOps/s | |
test_membership_nested | 19.2355μs | 1.8890μs | 529.3839 KOps/s | 530.8155 KOps/s | |
test_membership_nested_leaf | 19.8305μs | 1.8827μs | 531.1591 KOps/s | 541.2089 KOps/s | |
test_membership_stacked_nested | 29.0510μs | 1.9604μs | 510.1011 KOps/s | 523.3775 KOps/s | |
test_membership_stacked_nested_leaf | 25.6000μs | 1.9575μs | 510.8473 KOps/s | 517.3982 KOps/s | |
test_membership_nested_last | 45.4510μs | 2.9868μs | 334.8101 KOps/s | 337.4363 KOps/s | |
test_membership_nested_leaf_last | 0.1035ms | 2.9944μs | 333.9550 KOps/s | 334.1197 KOps/s | |
test_membership_stacked_nested_last | 37.6600μs | 2.9516μs | 338.7993 KOps/s | 120.6990 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.3210μs | 2.9642μs | 337.3571 KOps/s | 125.0318 KOps/s | |
test_nested_getleaf | 0.6715ms | 6.0906μs | 164.1867 KOps/s | 163.8605 KOps/s | |
test_nested_get | 35.4510μs | 5.6742μs | 176.2366 KOps/s | 172.9638 KOps/s | |
test_stacked_getleaf | 39.9110μs | 6.1494μs | 162.6166 KOps/s | 165.6807 KOps/s | |
test_stacked_get | 36.2700μs | 5.7103μs | 175.1235 KOps/s | 177.8744 KOps/s | |
test_nested_getitemleaf | 30.5600μs | 6.1287μs | 163.1656 KOps/s | 163.9230 KOps/s | |
test_nested_getitem | 33.5710μs | 5.7955μs | 172.5471 KOps/s | 173.8654 KOps/s | |
test_stacked_getitemleaf | 28.0100μs | 6.1095μs | 163.6801 KOps/s | 162.7440 KOps/s | |
test_stacked_getitem | 27.1210μs | 5.6900μs | 175.7479 KOps/s | 176.7920 KOps/s | |
test_lock_nested | 6.9664ms | 0.4318ms | 2.3157 KOps/s | 2.3001 KOps/s | |
test_lock_stack_nested | 0.5425ms | 0.3936ms | 2.5404 KOps/s | 2.6166 KOps/s | |
test_unlock_nested | 0.7647ms | 0.3676ms | 2.7201 KOps/s | 2.7200 KOps/s | |
test_unlock_stack_nested | 0.4655ms | 0.3328ms | 3.0046 KOps/s | 3.1252 KOps/s | |
test_flatten_speed | 0.1711ms | 75.7155μs | 13.2073 KOps/s | 13.0303 KOps/s | |
test_unflatten_speed | 0.3666ms | 0.3198ms | 3.1265 KOps/s | 3.0971 KOps/s | |
test_common_ops | 1.5606ms | 1.2657ms | 790.0478 Ops/s | 805.6473 Ops/s | |
test_creation | 0.1668ms | 1.4921μs | 670.2035 KOps/s | 681.6119 KOps/s | |
test_creation_empty | 0.1905ms | 15.5889μs | 64.1483 KOps/s | 70.8885 KOps/s | |
test_creation_nested_1 | 0.1912ms | 17.5571μs | 56.9570 KOps/s | 64.2117 KOps/s | |
test_creation_nested_2 | 0.1876ms | 19.8967μs | 50.2597 KOps/s | 55.2850 KOps/s | |
test_clone | 0.1863ms | 29.3155μs | 34.1117 KOps/s | 35.1605 KOps/s | |
test_getitem[int] | 1.2979ms | 16.2441μs | 61.5607 KOps/s | 62.9552 KOps/s | |
test_getitem[slice_int] | 0.2519ms | 28.4083μs | 35.2010 KOps/s | 36.7439 KOps/s | |
test_getitem[range] | 0.2361ms | 0.1114ms | 8.9756 KOps/s | 8.8708 KOps/s | |
test_getitem[tuple] | 0.1205ms | 23.7375μs | 42.1275 KOps/s | 43.1797 KOps/s | |
test_getitem[list] | 0.3228ms | 0.1034ms | 9.6729 KOps/s | 9.9333 KOps/s | |
test_setitem_dim[int] | 0.2328ms | 48.7993μs | 20.4921 KOps/s | 22.1245 KOps/s | |
test_setitem_dim[slice_int] | 92.8420μs | 67.8704μs | 14.7340 KOps/s | 14.6808 KOps/s | |
test_setitem_dim[range] | 0.3058ms | 0.1302ms | 7.6791 KOps/s | 7.7262 KOps/s | |
test_setitem_dim[tuple] | 0.2354ms | 64.8314μs | 15.4246 KOps/s | 16.3759 KOps/s | |
test_setitem | 0.1916ms | 42.1623μs | 23.7179 KOps/s | 24.3541 KOps/s | |
test_set | 0.2174ms | 42.4308μs | 23.5678 KOps/s | 25.1582 KOps/s | |
test_set_shared | 0.3582ms | 56.9665μs | 17.5542 KOps/s | 18.6133 KOps/s | |
test_update | 0.2177ms | 51.8390μs | 19.2905 KOps/s | 20.5924 KOps/s | |
test_update_nested | 0.2375ms | 58.9109μs | 16.9748 KOps/s | 17.8525 KOps/s | |
test_update__nested | 0.1596ms | 60.4790μs | 16.5347 KOps/s | 16.6089 KOps/s | |
test_set_nested | 0.1947ms | 44.7508μs | 22.3460 KOps/s | 23.4992 KOps/s | |
test_set_nested_new | 0.1932ms | 47.5848μs | 21.0151 KOps/s | 21.7511 KOps/s | |
test_select | 0.2062ms | 60.2968μs | 16.5846 KOps/s | 16.7028 KOps/s | |
test_select_nested | 79.8410μs | 41.5298μs | 24.0791 KOps/s | 24.0603 KOps/s | |
test_exclude_nested | 0.1904ms | 58.7596μs | 17.0185 KOps/s | 16.8838 KOps/s | |
test_empty[True] | 0.2998ms | 0.2568ms | 3.8935 KOps/s | 3.8054 KOps/s | |
test_empty[False] | 2.7161μs | 0.7518μs | 1.3301 MOps/s | 1.3508 MOps/s | |
test_to | 66.0220μs | 26.8030μs | 37.3092 KOps/s | 36.8120 KOps/s | |
test_to_nonblocking | 0.1104ms | 26.1276μs | 38.2737 KOps/s | 39.1655 KOps/s | |
test_unbind_speed | 1.3421ms | 0.2823ms | 3.5423 KOps/s | 3.5741 KOps/s | |
test_unbind_speed_stack0 | 0.3523ms | 0.2737ms | 3.6533 KOps/s | 3.6640 KOps/s | |
test_unbind_speed_stack1 | 95.6271ms | 0.7096ms | 1.4092 KOps/s | 1.4342 KOps/s | |
test_split | 99.2242ms | 2.2240ms | 449.6315 Ops/s | 462.7561 Ops/s | |
test_chunk | 98.0074ms | 2.2467ms | 445.0942 Ops/s | 459.7034 Ops/s | |
test_creation[device0] | 0.3406ms | 0.1296ms | 7.7179 KOps/s | 7.8323 KOps/s | |
test_creation_from_tensor | 0.3674ms | 0.1316ms | 7.6000 KOps/s | 7.7118 KOps/s | |
test_add_one[memmap_tensor0] | 0.2247ms | 9.2188μs | 108.4742 KOps/s | 113.1946 KOps/s | |
test_contiguous[memmap_tensor0] | 32.1810μs | 2.2602μs | 442.4367 KOps/s | 453.1080 KOps/s | |
test_stack[memmap_tensor0] | 37.2200μs | 6.8260μs | 146.4987 KOps/s | 154.2896 KOps/s | |
test_memmaptd_index | 1.2747ms | 0.4299ms | 2.3259 KOps/s | 2.3424 KOps/s | |
test_memmaptd_index_astensor | 0.7436ms | 0.5029ms | 1.9884 KOps/s | 2.0170 KOps/s | |
test_memmaptd_index_op | 1.4457ms | 1.0404ms | 961.2068 Ops/s | 976.1727 Ops/s | |
test_serialize_model | 0.1318s | 0.1308s | 7.6429 Ops/s | 7.6096 Ops/s | |
test_serialize_model_pickle | 1.3804s | 1.2178s | 0.8211 Ops/s | 0.8235 Ops/s | |
test_serialize_weights | 0.1319s | 0.1304s | 7.6712 Ops/s | 7.6385 Ops/s | |
test_serialize_weights_returnearly | 0.2212s | 56.8085ms | 17.6030 Ops/s | 17.5145 Ops/s | |
test_serialize_weights_pickle | 1.4039s | 1.2263s | 0.8155 Ops/s | 0.8219 Ops/s | |
test_reshape_pytree | 0.1344ms | 35.9757μs | 27.7966 KOps/s | 27.1398 KOps/s | |
test_reshape_td | 0.1326ms | 42.7532μs | 23.3901 KOps/s | 24.4726 KOps/s | |
test_view_pytree | 0.1730ms | 35.8852μs | 27.8667 KOps/s | 28.7004 KOps/s | |
test_view_td | 0.1828ms | 46.5803μs | 21.4683 KOps/s | 21.5632 KOps/s | |
test_unbind_pytree | 0.1630ms | 34.5515μs | 28.9423 KOps/s | 29.9645 KOps/s | |
test_unbind_td | 0.5469ms | 42.7540μs | 23.3896 KOps/s | 23.5879 KOps/s | |
test_split_pytree | 0.1449ms | 45.4746μs | 21.9903 KOps/s | 21.4216 KOps/s | |
test_split_td | 97.6364ms | 65.7410μs | 15.2112 KOps/s | 17.9208 KOps/s | |
test_add_pytree | 0.2110ms | 57.6553μs | 17.3445 KOps/s | 16.2812 KOps/s | |
test_add_td | 0.2387ms | 92.2956μs | 10.8347 KOps/s | 10.7860 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.3055ms | 0.1613ms | 6.2003 KOps/s | 6.0615 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5730ms | 0.1659ms | 6.0260 KOps/s | 6.0185 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2834ms | 0.1444ms | 6.9269 KOps/s | 6.8508 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3478ms | 0.1879ms | 5.3213 KOps/s | 5.4806 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4002ms | 21.7900μs | 45.8927 KOps/s | 47.0499 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.4330ms | 49.0133μs | 20.4026 KOps/s | 20.6276 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4458ms | 65.2564μs | 15.3242 KOps/s | 15.5188 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4309ms | 49.9903μs | 20.0039 KOps/s | 20.2026 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3844ms | 0.3204ms | 3.1210 KOps/s | 3.0895 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6253ms | 0.2348ms | 4.2592 KOps/s | 4.1977 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.6323ms | 0.1279ms | 7.8177 KOps/s | 7.7325 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4560ms | 66.9081μs | 14.9459 KOps/s | 15.1501 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.5242ms | 0.3188ms | 3.1371 KOps/s | 3.1338 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8306ms | 0.6444ms | 1.5517 KOps/s | 1.6176 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4805ms | 0.2825ms | 3.5394 KOps/s | 3.4752 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5163ms | 0.3232ms | 3.0945 KOps/s | 3.0884 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.4650ms | 76.9646μs | 12.9930 KOps/s | 12.4066 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2925ms | 0.1296ms | 7.7151 KOps/s | 7.6595 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.9205ms | 0.5314ms | 1.8817 KOps/s | 1.8892 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4512ms | 0.3175ms | 3.1498 KOps/s | 3.1419 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.3971ms | 20.3226μs | 49.2064 KOps/s | 50.0233 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4161ms | 38.3576μs | 26.0704 KOps/s | 26.0238 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.4501ms | 69.8978μs | 14.3066 KOps/s | 14.2611 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4432ms | 51.1713μs | 19.5422 KOps/s | 19.3952 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3603ms | 0.7804ms | 1.2814 KOps/s | 1.1249 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.5713ms | 3.3048ms | 302.5868 Ops/s | 305.3275 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.3075ms | 0.8178ms | 1.2227 KOps/s | 1.1284 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.6486ms | 3.2851ms | 304.4043 Ops/s | 313.8505 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2563ms | 0.1097ms | 9.1173 KOps/s | 9.0707 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2445ms | 63.2661μs | 15.8063 KOps/s | 15.5068 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2473ms | 0.1028ms | 9.7254 KOps/s | 9.2071 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.4304ms | 45.3924μs | 22.0301 KOps/s | 21.0374 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.5020ms | 0.1076ms | 9.2973 KOps/s | 9.1297 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.4349ms | 44.1613μs | 22.6443 KOps/s | 20.7719 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.3099ms | 0.1427ms | 7.0065 KOps/s | 6.8823 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4328ms | 27.1041μs | 36.8947 KOps/s | 39.2764 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.5221ms | 0.1367ms | 7.3129 KOps/s | 7.2365 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1034ms | 20.9977μs | 47.6243 KOps/s | 48.8875 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.5280ms | 0.1324ms | 7.5511 KOps/s | 7.2235 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.4032ms | 20.8686μs | 47.9189 KOps/s | 50.0768 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5440ms | 0.1389ms | 7.1991 KOps/s | 6.7805 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4822ms | 25.0629μs | 39.8996 KOps/s | 38.8493 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.5376ms | 0.1325ms | 7.5455 KOps/s | 7.4145 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.4043ms | 20.6796μs | 48.3568 KOps/s | 49.3437 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.5179ms | 0.1330ms | 7.5171 KOps/s | 7.4378 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.2176ms | 20.9901μs | 47.6416 KOps/s | 49.3632 KOps/s | |
test_mod_add[eager] | 0.4388ms | 35.0630μs | 28.5201 KOps/s | 31.4470 KOps/s | |
test_mod_add[compile] | 0.2466ms | 71.4545μs | 13.9949 KOps/s | 14.1342 KOps/s | |
test_mod_add[compile-overhead] | 0.2521ms | 0.1313ms | 7.6137 KOps/s | 6.3716 KOps/s | |
test_mod_wrap[eager] | 1.1751ms | 0.7920ms | 1.2627 KOps/s | 1.2569 KOps/s | |
test_mod_wrap[compile] | 1.9849ms | 0.8444ms | 1.1843 KOps/s | 1.1696 KOps/s | |
test_mod_wrap[compile-overhead] | 4.8244ms | 3.0255ms | 330.5258 Ops/s | 328.2096 Ops/s | |
test_mod_wrap_and_backward[eager] | 4.2838ms | 4.0949ms | 244.2084 Ops/s | 238.8469 Ops/s | |
test_mod_wrap_and_backward[compile] | 4.4697ms | 4.0065ms | 249.5938 Ops/s | 244.8192 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3804ms | 0.9201ms | 1.0868 KOps/s | 992.7535 Ops/s | |
test_seq_add[eager] | 0.3123ms | 98.9533μs | 10.1058 KOps/s | 9.8600 KOps/s | |
test_seq_add[compile] | 0.2798ms | 86.1218μs | 11.6115 KOps/s | 12.0770 KOps/s | |
test_seq_add[compile-overhead] | 0.2640ms | 0.1145ms | 8.7308 KOps/s | 8.3125 KOps/s | |
test_seq_wrap[eager] | 1.0908ms | 0.9450ms | 1.0582 KOps/s | 1.0625 KOps/s | |
test_seq_wrap[compile] | 1.0330ms | 0.8573ms | 1.1664 KOps/s | 1.1520 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3467ms | 0.2190ms | 4.5656 KOps/s | 4.3024 KOps/s | |
test_func_call_runtime[False-eager] | 2.5971ms | 2.4209ms | 413.0613 Ops/s | 410.4836 Ops/s | |
test_func_call_runtime[False-compile] | 2.5734ms | 2.4190ms | 413.4008 Ops/s | 409.7735 Ops/s | |
test_func_call_runtime[False-compile-overhead] | 0.5031ms | 0.3608ms | 2.7720 KOps/s | 2.7222 KOps/s | |
test_func_call_runtime[True-eager] | 2.7663ms | 2.5753ms | 388.3069 Ops/s | 386.0810 Ops/s | |
test_func_call_runtime[True-compile] | 2.6160ms | 2.4576ms | 406.9045 Ops/s | 410.0992 Ops/s | |
test_func_call_runtime[True-compile-overhead] | 0.5034ms | 0.3812ms | 2.6234 KOps/s | 2.5886 KOps/s | |
test_func_call_cm_runtime[False-eager] | 2.6604ms | 2.3979ms | 417.0361 Ops/s | 414.7182 Ops/s | |
test_func_call_cm_runtime[False-compile] | 2.6029ms | 2.4275ms | 411.9524 Ops/s | 411.5932 Ops/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5102ms | 0.3624ms | 2.7595 KOps/s | 2.7166 KOps/s | |
test_func_call_cm_runtime[True-eager] | 2.8528ms | 2.6786ms | 373.3292 Ops/s | 371.7620 Ops/s | |
test_func_call_cm_runtime[True-compile] | 2.6980ms | 2.4916ms | 401.3526 Ops/s | 403.7815 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5615ms | 0.4082ms | 2.4496 KOps/s | 2.4326 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 4.2890ms | 3.8410ms | 260.3511 Ops/s | 262.0096 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.6587ms | 2.5122ms | 398.0599 Ops/s | 404.5932 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5754ms | 0.4074ms | 2.4544 KOps/s | 2.4245 KOps/s | |
test_distributed | 2.2372ms | 0.1720ms | 5.8146 KOps/s | 8.6710 KOps/s | |
test_tdmodule | 33.1700μs | 14.5177μs | 68.8815 KOps/s | 71.0861 KOps/s | |
test_tdmodule_dispatch | 49.2410μs | 28.7060μs | 34.8359 KOps/s | 37.0522 KOps/s | |
test_tdseq | 35.7710μs | 15.6466μs | 63.9117 KOps/s | 67.5509 KOps/s | |
test_tdseq_dispatch | 56.0510μs | 32.4658μs | 30.8016 KOps/s | 33.9684 KOps/s | |
test_instantiation_functorch | 2.1633ms | 1.8707ms | 534.5609 Ops/s | 518.8702 Ops/s | |
test_instantiation_td | 1.8126ms | 1.1979ms | 834.8202 Ops/s | 818.0575 Ops/s | |
test_exec_functorch | 1.1552ms | 1.0178ms | 982.5247 Ops/s | 992.9311 Ops/s | |
test_exec_functional_call | 1.3689ms | 1.0140ms | 986.1937 Ops/s | 982.1234 Ops/s | |
test_exec_td | 1.2046ms | 1.0378ms | 963.5682 Ops/s | 959.5270 Ops/s | |
test_exec_td_decorator | 1.1993ms | 1.0766ms | 928.8509 Ops/s | 932.2729 Ops/s | |
test_vmap_mlp_speed[True-True] | 1.4707ms | 1.2803ms | 781.0518 Ops/s | 785.7149 Ops/s | |
test_vmap_mlp_speed[True-False] | 1.4371ms | 1.2755ms | 783.9837 Ops/s | 789.2893 Ops/s | |
test_vmap_mlp_speed[False-True] | 1.3767ms | 1.1721ms | 853.1698 Ops/s | 856.1587 Ops/s | |
test_vmap_mlp_speed[False-False] | 1.3732ms | 1.1722ms | 853.0777 Ops/s | 855.9759 Ops/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.8067ms | 1.2564ms | 795.9208 Ops/s | 802.9177 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.4642ms | 1.2555ms | 796.4653 Ops/s | 802.1324 Ops/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.3618ms | 1.1710ms | 853.9454 Ops/s | 853.5334 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.3189ms | 1.1704ms | 854.4341 Ops/s | 856.9635 Ops/s | |
test_vmap_transformer_speed[True-True] | 13.4793ms | 13.1739ms | 75.9075 Ops/s | 75.3659 Ops/s | |
test_vmap_transformer_speed[True-False] | 13.3967ms | 13.1490ms | 76.0513 Ops/s | 75.6843 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.2303ms | 13.0249ms | 76.7759 Ops/s | 76.4357 Ops/s | |
test_vmap_transformer_speed[False-False] | 13.2962ms | 13.0044ms | 76.8972 Ops/s | 76.4627 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 34.4341ms | 34.1654ms | 29.2694 Ops/s | 29.3502 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 34.4239ms | 34.0867ms | 29.3369 Ops/s | 29.2304 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 34.7792ms | 33.9491ms | 29.4558 Ops/s | 29.5210 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 34.3171ms | 34.0173ms | 29.3968 Ops/s | 29.4775 Ops/s | |
test_to_module_speed[True] | 2.0825ms | 0.9988ms | 1.0012 KOps/s | 999.6883 Ops/s | |
test_to_module_speed[False] | 1.3562ms | 0.9754ms | 1.0253 KOps/s | 1.0212 KOps/s | |
test_tc_init | 0.1492ms | 35.8635μs | 27.8835 KOps/s | 32.2982 KOps/s | |
test_tc_init_nested | 0.1078ms | 72.8598μs | 13.7250 KOps/s | 15.7862 KOps/s | |
test_tc_first_layer_tensor | 3.5643μs | 0.6787μs | 1.4733 MOps/s | 1.4638 MOps/s | |
test_tc_first_layer_nontensor | 31.6410μs | 2.2367μs | 447.0913 KOps/s | 452.4715 KOps/s | |
test_tc_second_layer_tensor | 14.6080μs | 1.3772μs | 726.1369 KOps/s | 737.4186 KOps/s | |
test_tc_second_layer_nontensor | 24.5110μs | 2.9704μs | 336.6556 KOps/s | 340.9172 KOps/s | |
test_unbind | 0.1972s | 12.2284ms | 81.7770 Ops/s | 92.6614 Ops/s | |
test_full_like | 0.8031ms | 0.5756ms | 1.7373 KOps/s | 1.7374 KOps/s | |
test_zeros_like | 0.3553ms | 0.1983ms | 5.0431 KOps/s | 5.0400 KOps/s | |
test_ones_like | 0.3724ms | 0.1984ms | 5.0401 KOps/s | 5.0464 KOps/s | |
test_clone | 0.5932ms | 0.4146ms | 2.4122 KOps/s | 2.4131 KOps/s | |
test_squeeze | 0.1276ms | 9.8779μs | 101.2359 KOps/s | 101.5380 KOps/s | |
test_unsqueeze | 0.2192ms | 73.1197μs | 13.6762 KOps/s | 13.1540 KOps/s | |
test_split | 0.4098ms | 0.1579ms | 6.3321 KOps/s | 6.4060 KOps/s | |
test_permute | 0.3163ms | 0.1799ms | 5.5582 KOps/s | 5.4742 KOps/s | |
test_stack | 1.3172ms | 0.8188ms | 1.2213 KOps/s | 1.1593 KOps/s | |
test_cat | 1.3901ms | 1.2322ms | 811.5433 Ops/s | 811.5973 Ops/s |
vmoens added a commit that referenced this pull request on Oct 8, 2024
ghstack-source-id: dc32497757d1fb19dc61dea9115810890d5a4acb
Pull Request resolved: #1032
Stack from ghstack (oldest at bottom):