[Feature] TD+NJT to(device) support #1022
Merged
This was referenced Oct 2, 2024
vmoens added a commit that referenced this pull request on Oct 2, 2024:
ghstack-source-id: 792ce21cfa30eb2d62f4b30a469f30312d25909d Pull Request resolved: #1022
facebook-github-bot added the CLA Signed label on Oct 2, 2024.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 60.2840μs | 24.2236μs | 41.2821 KOps/s | 41.8118 KOps/s | |
test_plain_set_stack_nested | 52.0980μs | 24.6545μs | 40.5605 KOps/s | 39.9663 KOps/s | |
test_plain_set_nested_inplace | 56.3060μs | 26.5761μs | 37.6278 KOps/s | 36.6333 KOps/s | |
test_plain_set_stack_nested_inplace | 67.9070μs | 26.5238μs | 37.7019 KOps/s | 36.6728 KOps/s | |
test_items | 30.4570μs | 4.1312μs | 242.0584 KOps/s | 241.4267 KOps/s | |
test_items_nested | 0.9022ms | 0.3911ms | 2.5567 KOps/s | 2.6695 KOps/s | |
test_items_nested_locked | 0.5320ms | 0.3865ms | 2.5876 KOps/s | 2.6492 KOps/s | |
test_items_nested_leaf | 0.1557ms | 80.1320μs | 12.4794 KOps/s | 12.4413 KOps/s | |
test_items_stack_nested | 0.4532ms | 0.3876ms | 2.5798 KOps/s | 2.6076 KOps/s | |
test_items_stack_nested_leaf | 0.1895ms | 83.0969μs | 12.0341 KOps/s | 12.1389 KOps/s | |
test_items_stack_nested_locked | 0.8127ms | 0.3913ms | 2.5553 KOps/s | 2.6211 KOps/s | |
test_keys | 21.7410μs | 3.4292μs | 291.6123 KOps/s | 288.0792 KOps/s | |
test_keys_nested | 0.2497ms | 0.1339ms | 7.4702 KOps/s | 7.6575 KOps/s | |
test_keys_nested_locked | 0.8288ms | 0.1388ms | 7.2034 KOps/s | 7.2152 KOps/s | |
test_keys_nested_leaf | 0.1770ms | 0.1182ms | 8.4606 KOps/s | 8.6588 KOps/s | |
test_keys_stack_nested | 0.2342ms | 0.1335ms | 7.4932 KOps/s | 7.6051 KOps/s | |
test_keys_stack_nested_leaf | 0.2140ms | 0.1148ms | 8.7081 KOps/s | 8.8055 KOps/s | |
test_keys_stack_nested_locked | 0.2685ms | 0.1396ms | 7.1632 KOps/s | 7.3353 KOps/s | |
test_values | 7.8066μs | 0.9730μs | 1.0277 MOps/s | 892.0130 KOps/s | |
test_values_nested | 0.1645ms | 93.2604μs | 10.7227 KOps/s | 10.7249 KOps/s | |
test_values_nested_locked | 0.1717ms | 93.4814μs | 10.6973 KOps/s | 10.6026 KOps/s | |
test_values_nested_leaf | 0.1344ms | 79.0718μs | 12.6467 KOps/s | 12.5050 KOps/s | |
test_values_stack_nested | 0.1765ms | 94.1674μs | 10.6194 KOps/s | 10.7493 KOps/s | |
test_values_stack_nested_leaf | 0.1358ms | 79.3764μs | 12.5982 KOps/s | 13.1503 KOps/s | |
test_values_stack_nested_locked | 0.1649ms | 95.4184μs | 10.4802 KOps/s | 10.6197 KOps/s | |
test_membership | 2.2713μs | 0.6987μs | 1.4313 MOps/s | 1.4167 MOps/s | |
test_membership_nested | 26.4590μs | 2.6826μs | 372.7704 KOps/s | 371.2618 KOps/s | |
test_membership_nested_leaf | 24.7860μs | 2.6921μs | 371.4531 KOps/s | 371.5758 KOps/s | |
test_membership_stacked_nested | 22.5020μs | 2.6834μs | 372.6603 KOps/s | 363.8044 KOps/s | |
test_membership_stacked_nested_leaf | 21.9110μs | 2.6896μs | 371.8012 KOps/s | 364.8475 KOps/s | |
test_membership_nested_last | 30.2270μs | 4.1098μs | 243.3198 KOps/s | 245.8942 KOps/s | |
test_membership_nested_leaf_last | 26.3000μs | 4.1302μs | 242.1173 KOps/s | 246.7042 KOps/s | |
test_membership_stacked_nested_last | 51.0520μs | 4.9154μs | 203.4412 KOps/s | 145.4860 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.2490μs | 4.9672μs | 201.3216 KOps/s | 145.1583 KOps/s | |
test_nested_getleaf | 32.3410μs | 10.5564μs | 94.7289 KOps/s | 94.4945 KOps/s | |
test_nested_get | 38.7830μs | 10.0642μs | 99.3616 KOps/s | 99.7340 KOps/s | |
test_stacked_getleaf | 47.3220μs | 10.4271μs | 95.9038 KOps/s | 94.9662 KOps/s | |
test_stacked_get | 30.6280μs | 10.0199μs | 99.8015 KOps/s | 100.9739 KOps/s | |
test_nested_getitemleaf | 38.1010μs | 10.9154μs | 91.6137 KOps/s | 92.6658 KOps/s | |
test_nested_getitem | 41.3850μs | 10.3716μs | 96.4170 KOps/s | 97.4346 KOps/s | |
test_stacked_getitemleaf | 38.8430μs | 10.9232μs | 91.5482 KOps/s | 90.9344 KOps/s | |
test_stacked_getitem | 32.5610μs | 10.2449μs | 97.6092 KOps/s | 98.4391 KOps/s | |
test_lock_nested | 84.1560ms | 0.5825ms | 1.7168 KOps/s | 2.0244 KOps/s | |
test_lock_stack_nested | 0.8284ms | 0.4635ms | 2.1574 KOps/s | 2.2012 KOps/s | |
test_unlock_nested | 82.7726ms | 0.4998ms | 2.0009 KOps/s | 2.4180 KOps/s | |
test_unlock_stack_nested | 0.5820ms | 0.3776ms | 2.6480 KOps/s | 2.6575 KOps/s | |
test_flatten_speed | 0.2037ms | 0.1001ms | 9.9930 KOps/s | 9.9377 KOps/s | |
test_unflatten_speed | 0.9190ms | 0.5150ms | 1.9417 KOps/s | 1.9577 KOps/s | |
test_common_ops | 4.1091ms | 1.1414ms | 876.1197 Ops/s | 859.7592 Ops/s | |
test_creation | 66.0040μs | 2.0490μs | 488.0547 KOps/s | 485.7155 KOps/s | |
test_creation_empty | 63.7890μs | 18.0584μs | 55.3758 KOps/s | 51.8638 KOps/s | |
test_creation_nested_1 | 66.8950μs | 22.2996μs | 44.8438 KOps/s | 44.0170 KOps/s | |
test_creation_nested_2 | 78.2670μs | 25.9565μs | 38.5261 KOps/s | 37.6964 KOps/s | |
test_clone | 97.9440μs | 16.8346μs | 59.4016 KOps/s | 57.5680 KOps/s | |
test_getitem[int] | 0.8884ms | 16.7023μs | 59.8721 KOps/s | 59.5780 KOps/s | |
test_getitem[slice_int] | 0.2314ms | 31.9646μs | 31.2847 KOps/s | 32.5272 KOps/s | |
test_getitem[range] | 0.1721ms | 59.0312μs | 16.9402 KOps/s | 17.3340 KOps/s | |
test_getitem[tuple] | 0.1394ms | 25.7455μs | 38.8417 KOps/s | 39.8110 KOps/s | |
test_getitem[list] | 0.2410ms | 52.3686μs | 19.0954 KOps/s | 18.8745 KOps/s | |
test_setitem_dim[int] | 51.3660μs | 31.7871μs | 31.4593 KOps/s | 30.8478 KOps/s | |
test_setitem_dim[slice_int] | 0.1093ms | 59.5354μs | 16.7967 KOps/s | 16.1424 KOps/s | |
test_setitem_dim[range] | 0.1643ms | 83.2407μs | 12.0134 KOps/s | 11.9495 KOps/s | |
test_setitem_dim[tuple] | 67.8380μs | 47.5254μs | 21.0414 KOps/s | 20.5443 KOps/s | |
test_setitem | 0.1097ms | 30.1703μs | 33.1452 KOps/s | 32.1422 KOps/s | |
test_set | 0.1092ms | 29.3477μs | 34.0742 KOps/s | 33.0152 KOps/s | |
test_set_shared | 1.3135ms | 0.2133ms | 4.6873 KOps/s | 4.5941 KOps/s | |
test_update | 0.1506ms | 37.8792μs | 26.3997 KOps/s | 24.7757 KOps/s | |
test_update_nested | 0.1356ms | 48.8203μs | 20.4833 KOps/s | 19.2506 KOps/s | |
test_update__nested | 0.3514ms | 44.7314μs | 22.3556 KOps/s | 22.5315 KOps/s | |
test_set_nested | 89.8480μs | 32.7119μs | 30.5699 KOps/s | 29.7710 KOps/s | |
test_set_nested_new | 0.1259ms | 37.1504μs | 26.9176 KOps/s | 25.8804 KOps/s | |
test_select | 0.1346ms | 55.6227μs | 17.9783 KOps/s | 17.6949 KOps/s | |
test_select_nested | 0.1285ms | 59.1612μs | 16.9030 KOps/s | 16.9014 KOps/s | |
test_exclude_nested | 0.1524ms | 74.5453μs | 13.4147 KOps/s | 13.2552 KOps/s | |
test_empty[True] | 0.6417ms | 0.3512ms | 2.8475 KOps/s | 2.8463 KOps/s | |
test_empty[False] | 6.0594μs | 1.1905μs | 839.9576 KOps/s | 849.9397 KOps/s | |
test_unbind_speed | 0.6239ms | 0.3019ms | 3.3125 KOps/s | 3.4289 KOps/s | |
test_unbind_speed_stack0 | 0.4020ms | 0.2918ms | 3.4270 KOps/s | 3.4847 KOps/s | |
test_unbind_speed_stack1 | 87.3163ms | 0.8343ms | 1.1985 KOps/s | 1.4043 KOps/s | |
test_split | 75.0711ms | 2.1303ms | 469.4183 Ops/s | 469.3346 Ops/s | |
test_chunk | 2.1482ms | 1.9954ms | 501.1569 Ops/s | 464.6569 Ops/s | |
test_creation[device0] | 0.2317ms | 0.1171ms | 8.5416 KOps/s | 8.4670 KOps/s | |
test_creation_from_tensor | 3.0337ms | 0.1155ms | 8.6543 KOps/s | 8.6753 KOps/s | |
test_add_one[memmap_tensor0] | 0.2954ms | 7.4464μs | 134.2928 KOps/s | 134.0100 KOps/s | |
test_contiguous[memmap_tensor0] | 22.0210μs | 1.9345μs | 516.9284 KOps/s | 516.0961 KOps/s | |
test_stack[memmap_tensor0] | 50.2040μs | 5.6178μs | 178.0046 KOps/s | 174.5384 KOps/s | |
test_memmaptd_index | 1.1868ms | 0.4095ms | 2.4417 KOps/s | 2.4495 KOps/s | |
test_memmaptd_index_astensor | 86.3985ms | 0.5505ms | 1.8166 KOps/s | 1.9662 KOps/s | |
test_memmaptd_index_op | 1.9341ms | 1.0536ms | 949.1524 Ops/s | 930.8164 Ops/s | |
test_serialize_model | 0.1270s | 0.1188s | 8.4192 Ops/s | 8.5205 Ops/s | |
test_serialize_model_pickle | 0.4761s | 0.3964s | 2.5229 Ops/s | 2.5367 Ops/s | |
test_serialize_weights | 0.1188s | 0.1130s | 8.8494 Ops/s | 7.5998 Ops/s | |
test_serialize_weights_returnearly | 0.2397s | 0.1738s | 5.7527 Ops/s | 6.3589 Ops/s | |
test_serialize_weights_pickle | 0.5358s | 0.4197s | 2.3825 Ops/s | 2.4891 Ops/s | |
test_serialize_weights_filesystem | 0.1455s | 0.1411s | 7.0853 Ops/s | 7.1468 Ops/s | |
test_serialize_model_filesystem | 0.1617s | 0.1470s | 6.8012 Ops/s | 6.1395 Ops/s | |
test_reshape_pytree | 0.1049ms | 39.4591μs | 25.3427 KOps/s | 25.0509 KOps/s | |
test_reshape_td | 95.2080μs | 45.9448μs | 21.7653 KOps/s | 21.8526 KOps/s | |
test_view_pytree | 95.6590μs | 39.4134μs | 25.3721 KOps/s | 25.7766 KOps/s | |
test_view_td | 0.1277ms | 52.9120μs | 18.8993 KOps/s | 19.4589 KOps/s | |
test_unbind_pytree | 81.7130μs | 36.3285μs | 27.5266 KOps/s | 28.1439 KOps/s | |
test_unbind_td | 0.2875ms | 44.8109μs | 22.3160 KOps/s | 22.7015 KOps/s | |
test_split_pytree | 81.0620μs | 37.6361μs | 26.5702 KOps/s | 25.5594 KOps/s | |
test_split_td | 88.3124ms | 67.4802μs | 14.8192 KOps/s | 17.4333 KOps/s | |
test_add_pytree | 0.1440ms | 46.4542μs | 21.5266 KOps/s | 22.4601 KOps/s | |
test_add_td | 0.1894ms | 84.2600μs | 11.8680 KOps/s | 11.1793 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1265ms | 57.4676μs | 17.4011 KOps/s | 17.2000 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4234ms | 0.1963ms | 5.0946 KOps/s | 5.1414 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1142ms | 56.6732μs | 17.6450 KOps/s | 17.6196 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3259ms | 0.1429ms | 6.9990 KOps/s | 7.1179 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 69.8910μs | 22.7041μs | 44.0450 KOps/s | 41.9281 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1633ms | 74.0692μs | 13.5009 KOps/s | 13.5507 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1411ms | 75.4338μs | 13.2567 KOps/s | 13.1528 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1471ms | 68.7222μs | 14.5513 KOps/s | 14.7187 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3648ms | 0.1813ms | 5.5143 KOps/s | 5.4999 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3530ms | 0.2446ms | 4.0891 KOps/s | 4.2072 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1161ms | 47.4577μs | 21.0714 KOps/s | 21.3610 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.3810ms | 78.4410μs | 12.7484 KOps/s | 12.6500 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3429ms | 0.1748ms | 5.7214 KOps/s | 5.8091 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4526ms | 0.2842ms | 3.5183 KOps/s | 3.4907 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.5520ms | 0.2799ms | 3.5727 KOps/s | 3.6785 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3467ms | 0.1851ms | 5.4017 KOps/s | 5.5043 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1709ms | 73.9918μs | 13.5150 KOps/s | 13.6863 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1379ms | 48.0046μs | 20.8313 KOps/s | 20.6289 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4466ms | 0.2319ms | 4.3127 KOps/s | 4.2649 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3686ms | 0.1789ms | 5.5903 KOps/s | 5.7659 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2063ms | 0.1117ms | 8.9499 KOps/s | 9.0741 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1504ms | 76.2909μs | 13.1077 KOps/s | 12.3806 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1886ms | 78.1301μs | 12.7992 KOps/s | 12.5602 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1339ms | 68.0955μs | 14.6853 KOps/s | 14.6621 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2586ms | 0.1923ms | 5.1990 KOps/s | 5.2268 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.9332ms | 1.7367ms | 575.8187 Ops/s | 561.4636 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3788ms | 0.1929ms | 5.1832 KOps/s | 5.0960 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2150ms | 1.0962ms | 912.2289 Ops/s | 902.9832 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.7477ms | 0.4139ms | 2.4158 KOps/s | 2.4101 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.4658ms | 4.0180ms | 248.8810 Ops/s | 244.4880 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 82.7560μs | 33.7392μs | 29.6391 KOps/s | 29.3661 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.8048ms | 47.4350μs | 21.0815 KOps/s | 20.5879 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 95.1280μs | 29.9568μs | 33.3814 KOps/s | 34.2075 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 69.4300μs | 29.2082μs | 34.2369 KOps/s | 34.6861 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 76.9740μs | 29.7382μs | 33.6268 KOps/s | 32.6604 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 86.7330μs | 28.9922μs | 34.4920 KOps/s | 35.2652 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1468ms | 76.1901μs | 13.1251 KOps/s | 13.6208 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4188ms | 27.8991μs | 35.8435 KOps/s | 35.6230 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1210ms | 68.7964μs | 14.5357 KOps/s | 14.6928 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 69.2000μs | 23.1130μs | 43.2656 KOps/s | 42.8497 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1536ms | 68.6189μs | 14.5732 KOps/s | 14.7104 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 57.5280μs | 23.3430μs | 42.8393 KOps/s | 42.8797 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1337ms | 74.5561μs | 13.4127 KOps/s | 13.7452 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8845ms | 27.3666μs | 36.5409 KOps/s | 36.2310 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1265ms | 68.1529μs | 14.6729 KOps/s | 14.8023 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 66.1740μs | 23.3290μs | 42.8650 KOps/s | 43.8347 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1419ms | 68.8612μs | 14.5220 KOps/s | 15.2932 KOps/s | |
test_compile_indexing[int-pytree-eager] | 95.2120μs | 23.1996μs | 43.1042 KOps/s | 43.4891 KOps/s | |
test_mod_add[eager] | 79.7490μs | 25.2258μs | 39.6420 KOps/s | 37.7587 KOps/s | |
test_mod_add[compile] | 85.1590μs | 38.8868μs | 25.7157 KOps/s | 26.2546 KOps/s | |
test_mod_add[compile-overhead] | 81.6430μs | 38.7507μs | 25.8060 KOps/s | 25.9805 KOps/s | |
test_mod_wrap[eager] | 0.2885ms | 0.2075ms | 4.8196 KOps/s | 4.7671 KOps/s | |
test_mod_wrap[compile] | 0.3341ms | 0.2314ms | 4.3217 KOps/s | 4.2369 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3575ms | 0.2282ms | 4.3820 KOps/s | 4.2820 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.5830ms | 10.6842ms | 93.5961 Ops/s | 86.4794 Ops/s | |
test_mod_wrap_and_backward[compile] | 11.7456ms | 10.5685ms | 94.6212 Ops/s | 81.3863 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 11.2693ms | 10.5597ms | 94.6997 Ops/s | 85.1734 Ops/s | |
test_seq_add[eager] | 0.2492ms | 90.9683μs | 10.9928 KOps/s | 10.7897 KOps/s | |
test_seq_add[compile] | 0.1447ms | 64.7227μs | 15.4505 KOps/s | 15.3636 KOps/s | |
test_seq_add[compile-overhead] | 0.1235ms | 63.2175μs | 15.8184 KOps/s | 15.5001 KOps/s | |
test_seq_wrap[eager] | 0.5700ms | 0.3853ms | 2.5952 KOps/s | 2.5800 KOps/s | |
test_seq_wrap[compile] | 0.5416ms | 0.2712ms | 3.6871 KOps/s | 3.6673 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4958ms | 0.2653ms | 3.7698 KOps/s | 3.6526 KOps/s | |
test_func_call_runtime[False-eager] | 0.8506ms | 0.5134ms | 1.9477 KOps/s | 1.9752 KOps/s | |
test_func_call_runtime[False-compile] | 0.8203ms | 0.4955ms | 2.0180 KOps/s | 1.9814 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.9124ms | 0.4954ms | 2.0187 KOps/s | 1.9733 KOps/s | |
test_func_call_runtime[True-eager] | 1.0643ms | 0.7389ms | 1.3534 KOps/s | 1.3557 KOps/s | |
test_func_call_runtime[True-compile] | 0.5901ms | 0.5079ms | 1.9690 KOps/s | 1.9416 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6807ms | 0.5031ms | 1.9876 KOps/s | 1.9389 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8771ms | 0.5114ms | 1.9554 KOps/s | 1.9751 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9287ms | 0.4958ms | 2.0170 KOps/s | 1.9909 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6023ms | 0.4939ms | 2.0246 KOps/s | 2.0149 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2099ms | 0.8744ms | 1.1436 KOps/s | 1.1291 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.4620ms | 0.7256ms | 1.3781 KOps/s | 1.3584 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8803ms | 0.7135ms | 1.4016 KOps/s | 1.3594 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.3859ms | 1.8718ms | 534.2461 Ops/s | 529.1913 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.5380ms | 1.8995ms | 526.4413 Ops/s | 509.5502 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6699ms | 1.9330ms | 517.3307 Ops/s | 507.0062 Ops/s | |
test_distributed | 0.2225ms | 0.1268ms | 7.8854 KOps/s | 7.7157 KOps/s | |
test_tdmodule | 70.4620μs | 18.0195μs | 55.4953 KOps/s | 51.1454 KOps/s | |
test_tdmodule_dispatch | 63.5590μs | 36.2494μs | 27.5867 KOps/s | 26.2214 KOps/s | |
test_tdseq | 44.0830μs | 20.9180μs | 47.8058 KOps/s | 44.2148 KOps/s | |
test_tdseq_dispatch | 68.9790μs | 41.3659μs | 24.1745 KOps/s | 22.8464 KOps/s | |
test_instantiation_functorch | 1.7939ms | 1.5572ms | 642.1638 Ops/s | 628.5840 Ops/s | |
test_exec_functorch | 0.2676ms | 0.1810ms | 5.5234 KOps/s | 5.4126 KOps/s | |
test_exec_functional_call | 0.2674ms | 0.1729ms | 5.7824 KOps/s | 5.9611 KOps/s | |
test_exec_td_decorator | 0.4700ms | 0.2334ms | 4.2843 KOps/s | 4.3537 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9156ms | 0.6311ms | 1.5846 KOps/s | 1.5386 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8806ms | 0.6358ms | 1.5729 KOps/s | 1.5053 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8930ms | 0.5232ms | 1.9113 KOps/s | 1.7732 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6557ms | 0.5219ms | 1.9162 KOps/s | 1.6818 KOps/s | |
test_to_module_speed[True] | 2.3225ms | 1.4403ms | 694.3082 Ops/s | 712.7067 Ops/s | |
test_to_module_speed[False] | 1.6762ms | 1.3895ms | 719.6918 Ops/s | 729.6667 Ops/s | |
test_tc_init | 0.1110ms | 45.9697μs | 21.7534 KOps/s | 21.3986 KOps/s | |
test_tc_init_nested | 0.1891ms | 93.0233μs | 10.7500 KOps/s | 10.7240 KOps/s | |
test_tc_first_layer_tensor | 17.0210μs | 1.5255μs | 655.5219 KOps/s | 657.9079 KOps/s | |
test_tc_first_layer_nontensor | 47.3050μs | 4.6337μs | 215.8121 KOps/s | 217.5200 KOps/s | |
test_tc_second_layer_tensor | 25.2380μs | 2.7567μs | 362.7571 KOps/s | 358.0951 KOps/s | |
test_tc_second_layer_nontensor | 30.5070μs | 5.8913μs | 169.7425 KOps/s | 168.0896 KOps/s | |
test_unbind | 0.4712s | 13.0604ms | 76.5673 Ops/s | 78.0733 Ops/s | |
test_full_like | 8.0689ms | 7.0764ms | 141.3151 Ops/s | 142.9196 Ops/s | |
test_zeros_like | 3.0802ms | 2.7143ms | 368.4125 Ops/s | 378.7263 Ops/s | |
test_ones_like | 11.3424ms | 6.0383ms | 165.6101 Ops/s | 308.0212 Ops/s | |
test_clone | 16.6946ms | 7.9779ms | 125.3462 Ops/s | 206.1655 Ops/s | |
test_squeeze | 61.5550μs | 12.6954μs | 78.7687 KOps/s | 80.5539 KOps/s | |
test_unsqueeze | 0.3435ms | 93.2422μs | 10.7248 KOps/s | 10.8858 KOps/s | |
test_split | 0.3748ms | 0.1937ms | 5.1633 KOps/s | 5.1639 KOps/s | |
test_permute | 0.3770ms | 0.2156ms | 4.6376 KOps/s | 4.5404 KOps/s | |
test_stack | 31.3457ms | 24.9280ms | 40.1156 Ops/s | 40.3925 Ops/s | |
test_cat | 28.2500ms | 24.4273ms | 40.9379 Ops/s | 40.7908 Ops/s |
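
The Change column is empty in the table above, and the page does not say how it is meant to be computed. As a rough aid for reading the numbers, the sketch below assumes it would be the relative difference between Ops (this PR) and Ops on Repo HEAD. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling; the sample rows are copied from the table.

```python
# Unit multipliers for the "Ops" columns in the benchmark table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '41.2821 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Fractional throughput change of the PR relative to repo HEAD.

    Assumption: positive means the PR run was faster than HEAD.
    """
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base


# A few rows copied from the table: (Ops, Ops on Repo HEAD).
rows = {
    "test_values": ("1.0277 MOps/s", "892.0130 KOps/s"),
    "test_membership_stacked_nested_last": ("203.4412 KOps/s", "145.4860 KOps/s"),
    "test_ones_like": ("165.6101 Ops/s", "308.0212 Ops/s"),
}

for name, (ops, head) in rows.items():
    print(f"{name}: {relative_change(ops, head):+.1%}")
```

Most rows differ from HEAD by only a few percent, which is plausibly run-to-run noise. Rows with large Max outliers (for example `test_lock_nested`) are probably best compared on Mean rather than on a single run.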

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1492ms | 16.6220μs | 60.1611 KOps/s | 60.1800 KOps/s | |
test_plain_set_stack_nested | 0.3915ms | 16.6041μs | 60.2262 KOps/s | 60.7220 KOps/s | |
test_plain_set_nested_inplace | 47.7510μs | 17.7504μs | 56.3368 KOps/s | 56.2550 KOps/s | |
test_plain_set_stack_nested_inplace | 0.4007ms | 17.7847μs | 56.2281 KOps/s | 57.8237 KOps/s | |
test_items | 0.3840ms | 2.8619μs | 349.4200 KOps/s | 345.3645 KOps/s | |
test_items_nested | 0.7300ms | 0.3498ms | 2.8585 KOps/s | 2.8467 KOps/s | |
test_items_nested_locked | 0.4096ms | 0.3485ms | 2.8694 KOps/s | 2.8385 KOps/s | |
test_items_nested_leaf | 0.4449ms | 62.6363μs | 15.9652 KOps/s | 16.0078 KOps/s | |
test_items_stack_nested | 0.5298ms | 0.3466ms | 2.8855 KOps/s | 2.8563 KOps/s | |
test_items_stack_nested_leaf | 0.1020ms | 62.6751μs | 15.9553 KOps/s | 15.4977 KOps/s | |
test_items_stack_nested_locked | 0.5266ms | 0.3513ms | 2.8467 KOps/s | 2.8130 KOps/s | |
test_keys | 22.6800μs | 3.4876μs | 286.7339 KOps/s | 272.3354 KOps/s | |
test_keys_nested | 98.1120μs | 70.2703μs | 14.2308 KOps/s | 14.0143 KOps/s | |
test_keys_nested_locked | 2.3870ms | 76.1699μs | 13.1285 KOps/s | 12.9691 KOps/s | |
test_keys_nested_leaf | 0.1045ms | 61.8880μs | 16.1582 KOps/s | 15.9934 KOps/s | |
test_keys_stack_nested | 0.1155ms | 69.7953μs | 14.3276 KOps/s | 14.0778 KOps/s | |
test_keys_stack_nested_leaf | 98.7510μs | 61.7131μs | 16.2040 KOps/s | 15.8822 KOps/s | |
test_keys_stack_nested_locked | 0.1275ms | 76.8881μs | 13.0059 KOps/s | 13.0091 KOps/s | |
test_values | 4.4683μs | 0.8344μs | 1.1985 MOps/s | 1.1587 MOps/s | |
test_values_nested | 81.2510μs | 48.4438μs | 20.6425 KOps/s | 20.5189 KOps/s | |
test_values_nested_locked | 78.7020μs | 50.7065μs | 19.7213 KOps/s | 19.7910 KOps/s | |
test_values_nested_leaf | 75.4020μs | 42.4023μs | 23.5836 KOps/s | 23.3607 KOps/s | |
test_values_stack_nested | 85.3010μs | 49.0567μs | 20.3846 KOps/s | 20.0711 KOps/s | |
test_values_stack_nested_leaf | 73.5810μs | 43.1496μs | 23.1752 KOps/s | 23.0483 KOps/s | |
test_values_stack_nested_locked | 80.8810μs | 50.9508μs | 19.6268 KOps/s | 19.2803 KOps/s | |
test_membership | 1.5926μs | 0.5106μs | 1.9584 MOps/s | 2.0040 MOps/s | |
test_membership_nested | 17.4905μs | 1.8965μs | 527.2736 KOps/s | 511.3687 KOps/s | |
test_membership_nested_leaf | 13.8467μs | 1.8638μs | 536.5457 KOps/s | 537.8368 KOps/s | |
test_membership_stacked_nested | 33.9010μs | 1.9327μs | 517.4216 KOps/s | 510.1132 KOps/s | |
test_membership_stacked_nested_leaf | 25.9400μs | 1.9451μs | 514.1034 KOps/s | 506.1495 KOps/s | |
test_membership_nested_last | 26.7700μs | 3.0264μs | 330.4217 KOps/s | 336.4528 KOps/s | |
test_membership_nested_leaf_last | 30.4410μs | 3.0206μs | 331.0604 KOps/s | 331.9962 KOps/s | |
test_membership_stacked_nested_last | 29.0910μs | 2.9971μs | 333.6549 KOps/s | 122.1304 KOps/s | |
test_membership_stacked_nested_leaf_last | 27.0900μs | 2.9952μs | 333.8681 KOps/s | 122.2034 KOps/s | |
test_nested_getleaf | 31.2400μs | 6.1107μs | 163.6471 KOps/s | 163.2516 KOps/s | |
test_nested_get | 33.2700μs | 5.7740μs | 173.1901 KOps/s | 173.1733 KOps/s | |
test_stacked_getleaf | 37.1500μs | 6.0648μs | 164.8849 KOps/s | 165.5486 KOps/s | |
test_stacked_get | 32.7710μs | 5.6987μs | 175.4791 KOps/s | 175.7518 KOps/s | |
test_nested_getitemleaf | 40.7510μs | 6.1463μs | 162.7006 KOps/s | 161.2979 KOps/s | |
test_nested_getitem | 31.0010μs | 5.8957μs | 169.6145 KOps/s | 171.7504 KOps/s | |
test_stacked_getitemleaf | 39.2210μs | 6.1728μs | 162.0014 KOps/s | 162.4643 KOps/s | |
test_stacked_getitem | 34.0810μs | 5.7420μs | 174.1565 KOps/s | 172.1604 KOps/s | |
test_lock_nested | 4.8928ms | 0.4290ms | 2.3311 KOps/s | 2.3297 KOps/s | |
test_lock_stack_nested | 0.5316ms | 0.3928ms | 2.5461 KOps/s | 2.6206 KOps/s | |
test_unlock_nested | 0.7662ms | 0.3645ms | 2.7438 KOps/s | 2.7350 KOps/s | |
test_unlock_stack_nested | 0.4063ms | 0.3299ms | 3.0315 KOps/s | 3.1229 KOps/s | |
test_flatten_speed | 0.1553ms | 76.6775μs | 13.0416 KOps/s | 12.9138 KOps/s | |
test_unflatten_speed | 0.3905ms | 0.3265ms | 3.0625 KOps/s | 3.0880 KOps/s | |
test_common_ops | 1.4844ms | 1.2287ms | 813.8540 Ops/s | 807.8529 Ops/s | |
test_creation | 24.0800μs | 1.5024μs | 665.6112 KOps/s | 667.8802 KOps/s | |
test_creation_empty | 42.9400μs | 15.3886μs | 64.9831 KOps/s | 67.2230 KOps/s | |
test_creation_nested_1 | 48.3610μs | 17.0169μs | 58.7651 KOps/s | 60.2488 KOps/s | |
test_creation_nested_2 | 50.8000μs | 19.7367μs | 50.6671 KOps/s | 51.6879 KOps/s | |
test_clone | 72.4110μs | 27.9791μs | 35.7410 KOps/s | 34.9239 KOps/s | |
test_getitem[int] | 1.2945ms | 15.8342μs | 63.1544 KOps/s | 63.2800 KOps/s | |
test_getitem[slice_int] | 0.1201ms | 26.6973μs | 37.4569 KOps/s | 36.4511 KOps/s | |
test_getitem[range] | 0.2179ms | 0.1078ms | 9.2725 KOps/s | 9.2464 KOps/s | |
test_getitem[tuple] | 0.1202ms | 23.6495μs | 42.2842 KOps/s | 42.0702 KOps/s | |
test_getitem[list] | 0.1936ms | 96.5096μs | 10.3617 KOps/s | 10.0156 KOps/s | |
test_setitem_dim[int] | 67.6620μs | 44.2192μs | 22.6146 KOps/s | 21.5801 KOps/s | |
test_setitem_dim[slice_int] | 90.6910μs | 64.1498μs | 15.5885 KOps/s | 15.3227 KOps/s | |
test_setitem_dim[range] | 0.1503ms | 0.1233ms | 8.1119 KOps/s | 7.8415 KOps/s | |
test_setitem_dim[tuple] | 96.3610μs | 58.2458μs | 17.1686 KOps/s | 17.0143 KOps/s | |
test_setitem | 79.6710μs | 40.2458μs | 24.8473 KOps/s | 24.1776 KOps/s | |
test_set | 87.5710μs | 42.0251μs | 23.7953 KOps/s | 25.2149 KOps/s | |
test_set_shared | 0.3822ms | 54.4254μs | 18.3738 KOps/s | 18.6017 KOps/s | |
test_update | 90.0410μs | 49.7449μs | 20.1026 KOps/s | 19.5427 KOps/s | |
test_update_nested | 99.2710μs | 59.7307μs | 16.7418 KOps/s | 16.9511 KOps/s | |
test_update__nested | 0.1608ms | 60.8926μs | 16.4224 KOps/s | 15.0337 KOps/s | |
test_set_nested | 82.9210μs | 42.9205μs | 23.2989 KOps/s | 23.4258 KOps/s | |
test_set_nested_new | 91.8010μs | 48.6058μs | 20.5737 KOps/s | 21.7995 KOps/s | |
test_select | 0.1020ms | 63.4615μs | 15.7576 KOps/s | 16.8386 KOps/s | |
test_select_nested | 73.5410μs | 42.2931μs | 23.6445 KOps/s | 23.9804 KOps/s | |
test_exclude_nested | 90.4510μs | 60.1127μs | 16.6354 KOps/s | 16.7186 KOps/s | |
test_empty[True] | 0.3933ms | 0.2661ms | 3.7584 KOps/s | 3.7622 KOps/s | |
test_empty[False] | 2.8011μs | 0.7431μs | 1.3458 MOps/s | 1.3458 MOps/s | |
test_to | 63.6410μs | 26.5139μs | 37.7160 KOps/s | 38.2096 KOps/s | |
test_to_nonblocking | 61.8210μs | 25.2478μs | 39.6074 KOps/s | 39.5748 KOps/s | |
test_unbind_speed | 0.3257ms | 0.2779ms | 3.5982 KOps/s | 3.5354 KOps/s | |
test_unbind_speed_stack0 | 0.4089ms | 0.2736ms | 3.6556 KOps/s | 3.6795 KOps/s | |
test_unbind_speed_stack1 | 92.2033ms | 0.7116ms | 1.4053 KOps/s | 1.5578 KOps/s | |
test_split | 93.5394ms | 2.1605ms | 462.8644 Ops/s | 455.7592 Ops/s | |
test_chunk | 94.2277ms | 2.1700ms | 460.8229 Ops/s | 455.3258 Ops/s | |
test_creation[device0] | 0.3565ms | 0.1258ms | 7.9518 KOps/s | 7.9602 KOps/s | |
test_creation_from_tensor | 0.3848ms | 0.1325ms | 7.5472 KOps/s | 7.5491 KOps/s | |
test_add_one[memmap_tensor0] | 0.2354ms | 8.8253μs | 113.3106 KOps/s | 114.2236 KOps/s | |
test_contiguous[memmap_tensor0] | 17.5100μs | 2.1328μs | 468.8716 KOps/s | 474.1445 KOps/s | |
test_stack[memmap_tensor0] | 37.0100μs | 6.6570μs | 150.2184 KOps/s | 146.6851 KOps/s | |
test_memmaptd_index | 1.4572ms | 0.4280ms | 2.3365 KOps/s | 2.3407 KOps/s | |
test_memmaptd_index_astensor | 0.7669ms | 0.5002ms | 1.9990 KOps/s | 1.9895 KOps/s | |
test_memmaptd_index_op | 1.4355ms | 1.0173ms | 983.0127 Ops/s | 972.8972 Ops/s | |
test_serialize_model | 0.1308s | 0.1297s | 7.7121 Ops/s | 7.6537 Ops/s | |
test_serialize_model_pickle | 1.3700s | 1.2181s | 0.8210 Ops/s | 0.8219 Ops/s | |
test_serialize_weights | 0.2216s | 0.1426s | 7.0112 Ops/s | 7.6755 Ops/s | |
test_serialize_weights_returnearly | 0.2150s | 57.0347ms | 17.5332 Ops/s | 20.8190 Ops/s | |
test_serialize_weights_pickle | 1.3464s | 1.1858s | 0.8433 Ops/s | 0.8369 Ops/s | |
test_reshape_pytree | 79.5110μs | 37.1218μs | 26.9383 KOps/s | 28.2268 KOps/s | |
test_reshape_td | 0.1698ms | 44.4624μs | 22.4909 KOps/s | 23.8956 KOps/s | |
test_view_pytree | 79.4510μs | 37.4137μs | 26.7282 KOps/s | 28.8165 KOps/s | |
test_view_td | 97.2410μs | 49.1858μs | 20.3311 KOps/s | 22.0470 KOps/s | |
test_unbind_pytree | 78.0720μs | 34.9508μs | 28.6117 KOps/s | 29.2415 KOps/s | |
test_unbind_td | 0.5457ms | 42.3191μs | 23.6300 KOps/s | 23.2616 KOps/s | |
test_split_pytree | 73.7410μs | 46.0915μs | 21.6960 KOps/s | 21.7559 KOps/s | |
test_split_td | 0.7132ms | 55.0364μs | 18.1698 KOps/s | 17.4699 KOps/s | |
test_add_pytree | 96.7820μs | 55.5916μs | 17.9883 KOps/s | 17.0890 KOps/s | |
test_add_td | 0.1223ms | 89.4392μs | 11.1808 KOps/s | 11.0456 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2574ms | 0.1589ms | 6.2927 KOps/s | 6.0436 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2979ms | 0.1597ms | 6.2616 KOps/s | 6.2927 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1960ms | 0.1507ms | 6.6339 KOps/s | 6.5450 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2309ms | 0.1816ms | 5.5078 KOps/s | 5.4522 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 58.2010μs | 21.7524μs | 45.9718 KOps/s | 47.8408 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1052ms | 48.3999μs | 20.6612 KOps/s | 20.8042 KOps/s | |
test_compile_copy_nested[pytree-compile] | 1.1456ms | 65.5340μs | 15.2593 KOps/s | 15.4350 KOps/s | |
test_compile_copy_nested[pytree-eager] | 87.2520μs | 50.3415μs | 19.8643 KOps/s | 20.2174 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3623ms | 0.3137ms | 3.1881 KOps/s | 3.1781 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3157ms | 0.2313ms | 4.3236 KOps/s | 4.3156 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2011ms | 0.1283ms | 7.7942 KOps/s | 7.8385 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1279ms | 66.8238μs | 14.9647 KOps/s | 15.1896 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3994ms | 0.3230ms | 3.0956 KOps/s | 3.0878 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7446ms | 0.6202ms | 1.6124 KOps/s | 1.6027 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4150ms | 0.2820ms | 3.5467 KOps/s | 3.5173 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3656ms | 0.3150ms | 3.1746 KOps/s | 3.1778 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1909ms | 79.1544μs | 12.6335 KOps/s | 13.0880 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1802ms | 0.1314ms | 7.6098 KOps/s | 7.8336 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6794ms | 0.5673ms | 1.7626 KOps/s | 1.8924 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3729ms | 0.3217ms | 3.1087 KOps/s | 3.0885 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 63.1610μs | 20.5378μs | 48.6907 KOps/s | 50.1036 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 72.5620μs | 38.3241μs | 26.0932 KOps/s | 26.0229 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1010ms | 69.2937μs | 14.4313 KOps/s | 14.4083 KOps/s | |
test_compile_copy_flat[pytree-eager] | 96.2410μs | 51.6226μs | 19.3713 KOps/s | 19.4751 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3452ms | 0.8124ms | 1.2310 KOps/s | 1.1297 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.3313ms | 3.2017ms | 312.3356 Ops/s | 309.4735 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.4001ms | 0.8262ms | 1.2103 KOps/s | 1.1191 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.1621ms | 3.0909ms | 323.5282 Ops/s | 318.2927 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2579ms | 0.1156ms | 8.6506 KOps/s | 8.5982 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1845ms | 58.4270μs | 17.1154 KOps/s | 16.5010 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2251ms | 0.1156ms | 8.6510 KOps/s | 8.9649 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 88.7620μs | 47.2184μs | 21.1782 KOps/s | 21.5347 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1565ms | 0.1168ms | 8.5582 KOps/s | 8.4444 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1129ms | 47.0080μs | 21.2730 KOps/s | 21.8449 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1854ms | 0.1486ms | 6.7305 KOps/s | 6.9993 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1591ms | 26.4017μs | 37.8763 KOps/s | 39.9515 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1829ms | 0.1431ms | 6.9873 KOps/s | 7.3201 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 50.8710μs | 21.5494μs | 46.4050 KOps/s | 48.4246 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1807ms | 0.1424ms | 7.0222 KOps/s | 7.1238 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 52.8610μs | 20.5452μs | 48.6732 KOps/s | 48.1657 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.3022ms | 0.1468ms | 6.8127 KOps/s | 6.6342 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5085ms | 24.9081μs | 40.1475 KOps/s | 37.9919 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2138ms | 0.1409ms | 7.0965 KOps/s | 6.9236 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 52.5210μs | 20.9929μs | 47.6351 KOps/s | 48.7406 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2092ms | 0.1406ms | 7.1133 KOps/s | 7.1447 KOps/s | |
test_compile_indexing[int-pytree-eager] | 56.2500μs | 20.7605μs | 48.1684 KOps/s | 48.6143 KOps/s | |
test_mod_add[eager] | 73.8810μs | 34.2619μs | 29.1869 KOps/s | 30.9765 KOps/s | |
test_mod_add[compile] | 0.2290ms | 79.9246μs | 12.5118 KOps/s | 12.4223 KOps/s | |
test_mod_add[compile-overhead] | 0.3045ms | 0.1489ms | 6.7154 KOps/s | 6.4663 KOps/s | |
test_mod_wrap[eager] | 0.3109ms | 0.2490ms | 4.0158 KOps/s | 4.0841 KOps/s | |
test_mod_wrap[compile] | 1.3315ms | 0.2904ms | 3.4439 KOps/s | 3.4179 KOps/s | |
test_mod_wrap[compile-overhead] | 7.7486ms | 4.0875ms | 244.6475 Ops/s | 241.9572 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4364ms | 1.2963ms | 771.4522 Ops/s | 716.7777 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5502ms | 1.2968ms | 771.1442 Ops/s | 709.8358 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3115ms | 0.8812ms | 1.1348 KOps/s | 1.0026 KOps/s | |
test_seq_add[eager] | 0.1623ms | 98.2292μs | 10.1803 KOps/s | 10.0477 KOps/s | |
test_seq_add[compile] | 0.1479ms | 89.7097μs | 11.1471 KOps/s | 10.4982 KOps/s | |
test_seq_add[compile-overhead] | 0.1672ms | 0.1219ms | 8.2060 KOps/s | 8.1553 KOps/s | |
test_seq_wrap[eager] | 0.4377ms | 0.3674ms | 2.7220 KOps/s | 2.5625 KOps/s | |
test_seq_wrap[compile] | 0.3672ms | 0.3074ms | 3.2532 KOps/s | 3.2171 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2612ms | 0.2150ms | 4.6519 KOps/s | 4.5882 KOps/s | |
test_func_call_runtime[False-eager] | 0.8794ms | 0.7321ms | 1.3659 KOps/s | 1.3978 KOps/s | |
test_func_call_runtime[False-compile] | 0.8584ms | 0.7728ms | 1.2939 KOps/s | 1.2749 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4061ms | 0.3533ms | 2.8306 KOps/s | 2.8387 KOps/s | |
test_func_call_runtime[True-eager] | 0.9513ms | 0.8763ms | 1.1412 KOps/s | 1.1440 KOps/s | |
test_func_call_runtime[True-compile] | 0.8688ms | 0.7946ms | 1.2586 KOps/s | 1.2478 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4198ms | 0.3724ms | 2.6853 KOps/s | 2.6702 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7855ms | 0.7222ms | 1.3847 KOps/s | 1.4131 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8522ms | 0.7718ms | 1.2956 KOps/s | 1.2829 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.3966ms | 0.3528ms | 2.8341 KOps/s | 2.8029 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0859ms | 0.9687ms | 1.0324 KOps/s | 1.0221 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8756ms | 0.8172ms | 1.2236 KOps/s | 1.2106 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5236ms | 0.3974ms | 2.5167 KOps/s | 2.4893 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4836ms | 2.0315ms | 492.2496 Ops/s | 486.2142 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9432ms | 0.8305ms | 1.2041 KOps/s | 1.1601 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4622ms | 0.4046ms | 2.4714 KOps/s | 2.4912 KOps/s | |
test_distributed | 5.3532ms | 0.2268ms | 4.4092 KOps/s | 8.8368 KOps/s | |
test_tdmodule | 0.1266ms | 14.7310μs | 67.8839 KOps/s | 62.6205 KOps/s | |
test_tdmodule_dispatch | 51.5210μs | 28.5277μs | 35.0537 KOps/s | 35.4019 KOps/s | |
test_tdseq | 36.0110μs | 15.7330μs | 63.5607 KOps/s | 61.4916 KOps/s | |
test_tdseq_dispatch | 51.3610μs | 31.5879μs | 31.6577 KOps/s | 30.7304 KOps/s | |
test_instantiation_functorch | 2.0047ms | 1.8253ms | 547.8561 Ops/s | 532.0969 Ops/s | |
test_exec_functorch | 0.2400ms | 0.2018ms | 4.9551 KOps/s | 4.8506 KOps/s | |
test_exec_functional_call | 0.3101ms | 0.1992ms | 5.0203 KOps/s | 4.9600 KOps/s | |
test_exec_td_decorator | 0.4277ms | 0.2521ms | 3.9671 KOps/s | 3.9174 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7782ms | 0.6603ms | 1.5146 KOps/s | 1.5094 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7862ms | 0.6612ms | 1.5124 KOps/s | 1.4979 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6873ms | 0.5799ms | 1.7245 KOps/s | 1.7208 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6803ms | 0.5805ms | 1.7226 KOps/s | 1.7088 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.0034ms | 18.9060ms | 52.8934 Ops/s | 53.0225 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 18.9540ms | 18.8768ms | 52.9752 Ops/s | 52.9012 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.8195ms | 18.7179ms | 53.4248 Ops/s | 53.3415 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.8365ms | 18.7691ms | 53.2791 Ops/s | 53.1122 Ops/s | |
test_to_module_speed[True] | 1.5216ms | 1.0036ms | 996.4160 Ops/s | 1.0004 KOps/s | |
test_to_module_speed[False] | 1.3961ms | 0.9840ms | 1.0163 KOps/s | 1.0267 KOps/s | |
test_tc_init | 68.4910μs | 35.9929μs | 27.7833 KOps/s | 28.8008 KOps/s | |
test_tc_init_nested | 0.1114ms | 75.7604μs | 13.1995 KOps/s | 13.8037 KOps/s | |
test_tc_first_layer_tensor | 4.0771μs | 0.6670μs | 1.4991 MOps/s | 1.4900 MOps/s | |
test_tc_first_layer_nontensor | 17.1200μs | 2.2372μs | 446.9893 KOps/s | 451.0545 KOps/s | |
test_tc_second_layer_tensor | 9.0027μs | 1.3645μs | 732.8806 KOps/s | 737.5249 KOps/s | |
test_tc_second_layer_nontensor | 28.0800μs | 2.9686μs | 336.8642 KOps/s | 342.5078 KOps/s | |
test_unbind | 0.1904s | 9.4685ms | 105.6136 Ops/s | 92.9038 Ops/s | |
test_full_like | 0.6592ms | 0.5756ms | 1.7372 KOps/s | 1.7467 KOps/s | |
test_zeros_like | 0.2779ms | 0.1980ms | 5.0511 KOps/s | 5.0530 KOps/s | |
test_ones_like | 0.2341ms | 0.1978ms | 5.0562 KOps/s | 5.0567 KOps/s | |
test_clone | 0.4437ms | 0.4149ms | 2.4105 KOps/s | 2.4106 KOps/s | |
test_squeeze | 86.7220μs | 9.8818μs | 101.1957 KOps/s | 103.0430 KOps/s | |
test_unsqueeze | 0.2206ms | 73.2976μs | 13.6430 KOps/s | 13.1024 KOps/s | |
test_split | 0.4125ms | 0.1593ms | 6.2764 KOps/s | 6.3708 KOps/s | |
test_permute | 0.2184ms | 0.1773ms | 5.6386 KOps/s | 5.6430 KOps/s | |
test_stack | 1.2524ms | 0.8560ms | 1.1682 KOps/s | 1.1740 KOps/s | |
test_cat | 1.2559ms | 1.2315ms | 812.0459 Ops/s | 812.0900 Ops/s |
vmoens added a commit that referenced this pull request on Oct 2, 2024
ghstack-source-id: 71812497f1efb9d20f67a7561e74d5111c4cc3f0 Pull Request resolved: #1022
vmoens added a commit that referenced this pull request on Oct 16, 2024
ghstack-source-id: 5f84ebc2a01e6dab26fe1d68d67bb166a295e885 Pull Request resolved: #1022
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Stack from ghstack (oldest at bottom):