[Feature] Better list casting in TensorDict.from_any #1108
Merged
This was referenced Nov 24, 2024
vmoens added a commit that referenced this pull request Nov 24, 2024
ghstack-source-id: 6c4991313366cb29d58cc34463422aa3ab80da38 Pull Request resolved: #1108
facebook-github-bot added the CLA Signed label Nov 24, 2024
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
vmoens added a commit that referenced this pull request Nov 24, 2024
ghstack-source-id: f83f49735113d165537a2b9f92e8d0f9b8356187 Pull Request resolved: #1108
vmoens added a commit that referenced this pull request Nov 25, 2024
ghstack-source-id: 427d19d5ef7c0d2779e064e64522fc0094a885af Pull Request resolved: #1108
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 35.3420μs | 10.1986μs | 98.0525 KOps/s | 92.2355 KOps/s | |
test_plain_set_stack_nested | 35.6810μs | 10.2250μs | 97.7996 KOps/s | 91.2907 KOps/s | |
test_plain_set_nested_inplace | 48.7020μs | 11.1020μs | 90.0742 KOps/s | 85.0640 KOps/s | |
test_plain_set_stack_nested_inplace | 38.0520μs | 11.0919μs | 90.1558 KOps/s | 85.3993 KOps/s | |
test_items | 29.1610μs | 2.9116μs | 343.4591 KOps/s | 339.9752 KOps/s | |
test_items_nested | 0.3747ms | 0.3247ms | 3.0802 KOps/s | 3.0875 KOps/s | |
test_items_nested_locked | 0.3629ms | 0.3262ms | 3.0657 KOps/s | 3.0642 KOps/s | |
test_items_nested_leaf | 87.3850μs | 58.1667μs | 17.1920 KOps/s | 17.2714 KOps/s | |
test_items_stack_nested | 0.4131ms | 0.3282ms | 3.0472 KOps/s | 3.0880 KOps/s | |
test_items_stack_nested_leaf | 85.5850μs | 59.3682μs | 16.8440 KOps/s | 16.9839 KOps/s | |
test_items_stack_nested_locked | 0.3967ms | 0.3289ms | 3.0405 KOps/s | 3.0702 KOps/s | |
test_keys | 31.5510μs | 3.4661μs | 288.5113 KOps/s | 288.3811 KOps/s | |
test_keys_nested | 0.1028ms | 70.0410μs | 14.2773 KOps/s | 14.2170 KOps/s | |
test_keys_nested_locked | 0.7233ms | 75.3023μs | 13.2798 KOps/s | 13.1823 KOps/s | |
test_keys_nested_leaf | 0.1002ms | 61.2603μs | 16.3238 KOps/s | 16.2909 KOps/s | |
test_keys_stack_nested | 0.1026ms | 70.8050μs | 14.1233 KOps/s | 14.2684 KOps/s | |
test_keys_stack_nested_leaf | 86.6650μs | 61.8136μs | 16.1777 KOps/s | 16.3370 KOps/s | |
test_keys_stack_nested_locked | 0.1079ms | 75.8844μs | 13.1779 KOps/s | 13.3491 KOps/s | |
test_values | 7.1005μs | 0.8466μs | 1.1813 MOps/s | 1.1873 MOps/s | |
test_values_nested | 61.1730μs | 31.2708μs | 31.9787 KOps/s | 32.1209 KOps/s | |
test_values_nested_locked | 58.7730μs | 32.8836μs | 30.4103 KOps/s | 30.3815 KOps/s | |
test_values_nested_leaf | 63.8130μs | 33.5495μs | 29.8067 KOps/s | 30.0908 KOps/s | |
test_values_stack_nested | 64.9240μs | 31.7656μs | 31.4806 KOps/s | 31.8234 KOps/s | |
test_values_stack_nested_leaf | 64.7430μs | 34.1715μs | 29.2641 KOps/s | 29.6317 KOps/s | |
test_values_stack_nested_locked | 60.9830μs | 33.3780μs | 29.9599 KOps/s | 29.9812 KOps/s | |
test_membership | 1.8121μs | 0.5094μs | 1.9632 MOps/s | 1.9568 MOps/s | |
test_membership_nested | 21.3360μs | 1.8953μs | 527.6310 KOps/s | 520.4857 KOps/s | |
test_membership_nested_leaf | 14.5960μs | 1.8968μs | 527.2140 KOps/s | 520.0750 KOps/s | |
test_membership_stacked_nested | 27.7520μs | 1.9953μs | 501.1877 KOps/s | 495.0602 KOps/s | |
test_membership_stacked_nested_leaf | 27.2020μs | 1.9756μs | 506.1808 KOps/s | 494.5709 KOps/s | |
test_membership_nested_last | 28.7520μs | 2.8441μs | 351.6051 KOps/s | 349.3813 KOps/s | |
test_membership_nested_leaf_last | 31.5610μs | 2.8023μs | 356.8478 KOps/s | 346.9149 KOps/s | |
test_membership_stacked_nested_last | 39.8120μs | 2.8036μs | 356.6848 KOps/s | 125.2757 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.1720μs | 2.8224μs | 354.3114 KOps/s | 126.9189 KOps/s | |
test_nested_getleaf | 24.9010μs | 5.9991μs | 166.6910 KOps/s | 165.8766 KOps/s | |
test_nested_get | 37.6820μs | 5.6719μs | 176.3075 KOps/s | 174.5797 KOps/s | |
test_stacked_getleaf | 35.7220μs | 6.0013μs | 166.6295 KOps/s | 166.4058 KOps/s | |
test_stacked_get | 28.9520μs | 5.7091μs | 175.1603 KOps/s | 176.2046 KOps/s | |
test_nested_getitemleaf | 26.0920μs | 6.0857μs | 164.3187 KOps/s | 164.7334 KOps/s | |
test_nested_getitem | 32.6320μs | 5.7984μs | 172.4607 KOps/s | 172.1268 KOps/s | |
test_stacked_getitemleaf | 36.5420μs | 6.1401μs | 162.8630 KOps/s | 164.0233 KOps/s | |
test_stacked_getitem | 40.6920μs | 5.7960μs | 172.5326 KOps/s | 173.8296 KOps/s | |
test_lock_nested | 9.4184ms | 0.3673ms | 2.7227 KOps/s | 2.6739 KOps/s | |
test_lock_stack_nested | 0.3918ms | 0.3327ms | 3.0056 KOps/s | 3.0126 KOps/s | |
test_unlock_nested | 0.6043ms | 0.3014ms | 3.3178 KOps/s | 3.2688 KOps/s | |
test_unlock_stack_nested | 0.3115ms | 0.2710ms | 3.6904 KOps/s | 3.6798 KOps/s | |
test_flatten_speed | 0.1069ms | 72.4973μs | 13.7936 KOps/s | 13.8275 KOps/s | |
test_unflatten_speed | 0.3500ms | 0.2938ms | 3.4037 KOps/s | 3.3668 KOps/s | |
test_common_ops | 1.6557ms | 0.5552ms | 1.8012 KOps/s | 1.7019 KOps/s | |
test_creation | 95.6850μs | 1.4774μs | 676.8795 KOps/s | 684.6274 KOps/s | |
test_creation_empty | 58.0130μs | 6.5957μs | 151.6144 KOps/s | 124.3860 KOps/s | |
test_creation_nested_1 | 1.7737ms | 8.0972μs | 123.5000 KOps/s | 104.4857 KOps/s | |
test_creation_nested_2 | 44.0320μs | 10.5975μs | 94.3617 KOps/s | 81.8739 KOps/s | |
test_clone | 0.1042ms | 9.9370μs | 100.6336 KOps/s | 100.1039 KOps/s | |
test_getitem[int] | 1.2864ms | 10.7026μs | 93.4354 KOps/s | 90.3272 KOps/s | |
test_getitem[slice_int] | 0.1080ms | 20.2532μs | 49.3750 KOps/s | 49.3623 KOps/s | |
test_getitem[range] | 0.1330ms | 36.6870μs | 27.2576 KOps/s | 27.2162 KOps/s | |
test_getitem[tuple] | 0.1114ms | 17.8640μs | 55.9786 KOps/s | 55.3705 KOps/s | |
test_getitem[list] | 0.1332ms | 32.3840μs | 30.8794 KOps/s | 30.6827 KOps/s | |
test_setitem_dim[int] | 38.0820μs | 17.8542μs | 56.0092 KOps/s | 56.6444 KOps/s | |
test_setitem_dim[slice_int] | 57.3920μs | 36.4241μs | 27.4544 KOps/s | 27.9949 KOps/s | |
test_setitem_dim[range] | 80.5640μs | 52.3101μs | 19.1168 KOps/s | 19.1522 KOps/s | |
test_setitem_dim[tuple] | 50.6430μs | 30.4594μs | 32.8306 KOps/s | 32.3304 KOps/s | |
test_setitem | 0.1007ms | 13.1854μs | 75.8413 KOps/s | 69.9914 KOps/s | |
test_set | 0.1025ms | 13.0829μs | 76.4357 KOps/s | 71.9121 KOps/s | |
test_set_shared | 1.8165ms | 0.1451ms | 6.8940 KOps/s | 6.7244 KOps/s | |
test_update | 0.2365ms | 15.2052μs | 65.7671 KOps/s | 61.1225 KOps/s | |
test_update_nested | 0.1857ms | 19.8664μs | 50.3363 KOps/s | 46.4240 KOps/s | |
test_update__nested | 0.7522ms | 23.2302μs | 43.0473 KOps/s | 42.9144 KOps/s | |
test_set_nested | 0.1007ms | 14.2446μs | 70.2020 KOps/s | 67.5618 KOps/s | |
test_set_nested_new | 0.1052ms | 16.5695μs | 60.3518 KOps/s | 58.8050 KOps/s | |
test_select | 0.1177ms | 28.5590μs | 35.0153 KOps/s | 34.5524 KOps/s | |
test_select_nested | 78.2640μs | 42.2841μs | 23.6496 KOps/s | 23.5999 KOps/s | |
test_exclude_nested | 87.1640μs | 59.9891μs | 16.6697 KOps/s | 16.6555 KOps/s | |
test_empty[True] | 0.3198ms | 0.2575ms | 3.8840 KOps/s | 3.8731 KOps/s | |
test_empty[False] | 3.4212μs | 0.7435μs | 1.3450 MOps/s | 1.3553 MOps/s | |
test_to | 86.4950μs | 54.7271μs | 18.2725 KOps/s | 18.2254 KOps/s | |
test_to_nonblocking | 76.4030μs | 45.3248μs | 22.0630 KOps/s | 22.0440 KOps/s | |
test_unbind_speed | 0.2678ms | 0.2295ms | 4.3574 KOps/s | 4.2464 KOps/s | |
test_unbind_speed_stack0 | 0.3628ms | 0.2329ms | 4.2933 KOps/s | 4.4015 KOps/s | |
test_unbind_speed_stack1 | 94.2187ms | 0.6429ms | 1.5554 KOps/s | 1.5692 KOps/s | |
test_split | 94.5341ms | 1.5916ms | 628.3006 Ops/s | 624.3463 Ops/s | |
test_chunk | 97.3880ms | 1.7357ms | 576.1495 Ops/s | 573.6328 Ops/s | |
test_consolidate[False-None] | 3.2095ms | 2.6025ms | 384.2478 Ops/s | 378.9902 Ops/s | |
test_consolidate[default-None] | 1.7609ms | 1.6683ms | 599.3969 Ops/s | 582.8712 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.7956ms | 1.6984ms | 588.8012 Ops/s | 576.4415 Ops/s | |
test_consolidate_njt[False-None] | 6.6870ms | 6.3515ms | 157.4437 Ops/s | 149.3910 Ops/s | |
test_to[False-False-None] | 1.7558ms | 1.6806ms | 595.0322 Ops/s | 600.9274 Ops/s | |
test_to[True-False-None] | 1.3892ms | 1.2564ms | 795.9236 Ops/s | 751.1028 Ops/s | |
test_to[within-False-None] | 4.2081ms | 3.9427ms | 253.6350 Ops/s | 249.6503 Ops/s | |
test_to[True-default-None] | 5.2820ms | 5.0593ms | 197.6574 Ops/s | 196.3579 Ops/s | |
test_to_njt[False-False-None] | 6.9085ms | 6.7945ms | 147.1778 Ops/s | 144.5409 Ops/s | |
test_to_njt[True-False-None] | 5.4190ms | 5.2802ms | 189.3854 Ops/s | 186.1183 Ops/s | |
test_to_njt[within-False-None] | 11.8403ms | 11.6863ms | 85.5703 Ops/s | 84.6953 Ops/s | |
test_creation[device0] | 0.4686ms | 77.9357μs | 12.8311 KOps/s | 12.8194 KOps/s | |
test_creation_from_tensor | 0.4638ms | 81.2553μs | 12.3069 KOps/s | 12.0831 KOps/s | |
test_add_one[memmap_tensor0] | 0.2306ms | 6.1920μs | 161.4982 KOps/s | 160.2376 KOps/s | |
test_contiguous[memmap_tensor0] | 4.8583μs | 0.4073μs | 2.4555 MOps/s | 2.4204 MOps/s | |
test_stack[memmap_tensor0] | 68.9030μs | 4.6011μs | 217.3409 KOps/s | 218.8325 KOps/s | |
test_memmaptd_index | 1.7058ms | 0.2413ms | 4.1439 KOps/s | 4.0249 KOps/s | |
test_memmaptd_index_astensor | 0.5496ms | 0.2977ms | 3.3590 KOps/s | 3.2695 KOps/s | |
test_memmaptd_index_op | 0.9470ms | 0.5342ms | 1.8721 KOps/s | 1.7693 KOps/s | |
test_serialize_model | 0.1316s | 0.1308s | 7.6428 Ops/s | 7.7052 Ops/s | |
test_serialize_model_pickle | 1.3480s | 1.1901s | 0.8403 Ops/s | 0.8234 Ops/s | |
test_serialize_weights | 0.1311s | 0.1301s | 7.6838 Ops/s | 7.7195 Ops/s | |
test_serialize_weights_returnearly | 0.3101s | 54.1039ms | 18.4829 Ops/s | 23.4816 Ops/s | |
test_serialize_weights_pickle | 1.3906s | 1.1923s | 0.8387 Ops/s | 0.8141 Ops/s | |
test_reshape_pytree | 0.4038ms | 22.2950μs | 44.8530 KOps/s | 43.9165 KOps/s | |
test_reshape_td | 53.5330μs | 26.8382μs | 37.2603 KOps/s | 37.3088 KOps/s | |
test_view_pytree | 49.2730μs | 22.0791μs | 45.2917 KOps/s | 43.5367 KOps/s | |
test_view_td | 0.4118ms | 29.2539μs | 34.1835 KOps/s | 32.2514 KOps/s | |
test_unbind_pytree | 64.4730μs | 27.6746μs | 36.1343 KOps/s | 35.3019 KOps/s | |
test_unbind_td | 0.7148ms | 35.0231μs | 28.5526 KOps/s | 27.0778 KOps/s | |
test_split_pytree | 58.2530μs | 30.0964μs | 33.2265 KOps/s | 32.4673 KOps/s | |
test_split_td | 0.8970ms | 38.3520μs | 26.0743 KOps/s | 25.6561 KOps/s | |
test_add_pytree | 0.4085ms | 32.7869μs | 30.4999 KOps/s | 30.5324 KOps/s | |
test_add_td | 84.2340μs | 43.5715μs | 22.9508 KOps/s | 22.6464 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1721ms | 0.1232ms | 8.1185 KOps/s | 8.1167 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2143ms | 0.1256ms | 7.9635 KOps/s | 8.0804 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1585ms | 99.5829μs | 10.0419 KOps/s | 10.3882 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2178ms | 0.1485ms | 6.7351 KOps/s | 6.7715 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 63.1540μs | 24.0978μs | 41.4976 KOps/s | 46.7769 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1457ms | 27.3278μs | 36.5928 KOps/s | 36.2039 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4334ms | 65.1392μs | 15.3517 KOps/s | 15.4252 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1033ms | 49.6135μs | 20.1558 KOps/s | 20.2506 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2191ms | 0.1448ms | 6.9068 KOps/s | 6.9207 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3143ms | 0.2068ms | 4.8352 KOps/s | 4.8964 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1496ms | 0.1013ms | 9.8716 KOps/s | 9.3782 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1205ms | 53.1502μs | 18.8146 KOps/s | 19.0654 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2026ms | 0.1415ms | 7.0663 KOps/s | 7.0313 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6332ms | 0.4737ms | 2.1110 KOps/s | 2.0582 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4118ms | 0.2487ms | 4.0203 KOps/s | 3.9849 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2208ms | 0.1520ms | 6.5795 KOps/s | 6.7147 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1528ms | 63.3335μs | 15.7894 KOps/s | 15.7079 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1469ms | 0.1004ms | 9.9578 KOps/s | 9.5761 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5145ms | 0.4048ms | 2.4701 KOps/s | 2.4476 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2005ms | 0.1447ms | 6.9132 KOps/s | 7.3884 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 52.1420μs | 20.1594μs | 49.6047 KOps/s | 56.4301 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4286ms | 27.6504μs | 36.1658 KOps/s | 36.5296 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.4521ms | 69.9824μs | 14.2893 KOps/s | 14.1130 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4367ms | 51.8852μs | 19.2733 KOps/s | 19.0572 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6198ms | 0.3915ms | 2.5546 KOps/s | 2.1674 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.0094ms | 2.6054ms | 383.8114 Ops/s | 399.1623 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5750ms | 0.4309ms | 2.3207 KOps/s | 2.2007 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7791ms | 2.6231ms | 381.2308 Ops/s | 384.0557 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.4049ms | 0.1166ms | 8.5765 KOps/s | 8.5670 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5543ms | 79.4848μs | 12.5810 KOps/s | 11.6938 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1726ms | 0.1115ms | 8.9652 KOps/s | 8.9880 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1247ms | 71.5151μs | 13.9831 KOps/s | 14.8898 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1612ms | 0.1067ms | 9.3760 KOps/s | 9.1065 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1325ms | 70.0543μs | 14.2746 KOps/s | 13.9744 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1462ms | 0.1012ms | 9.8771 KOps/s | 9.7811 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1453ms | 17.0673μs | 58.5915 KOps/s | 57.1308 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2121ms | 0.1007ms | 9.9291 KOps/s | 9.8353 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 52.6430μs | 15.9142μs | 62.8368 KOps/s | 60.9139 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1566ms | 0.1016ms | 9.8439 KOps/s | 9.7318 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 50.9530μs | 15.9170μs | 62.8259 KOps/s | 59.8242 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1538ms | 0.1047ms | 9.5479 KOps/s | 9.3102 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5715ms | 17.1423μs | 58.3351 KOps/s | 57.4872 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1482ms | 97.3122μs | 10.2762 KOps/s | 10.0093 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1812ms | 15.9879μs | 62.5472 KOps/s | 60.6034 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1643ms | 96.8544μs | 10.3248 KOps/s | 9.6982 KOps/s | |
test_compile_indexing[int-pytree-eager] | 49.7820μs | 15.9650μs | 62.6370 KOps/s | 60.4070 KOps/s | |
test_mod_add[eager] | 92.2250μs | 34.3966μs | 29.0726 KOps/s | 31.7176 KOps/s | |
test_mod_add[compile] | 0.3696ms | 85.7510μs | 11.6617 KOps/s | 12.8303 KOps/s | |
test_mod_add[compile-overhead] | 0.3273ms | 0.1695ms | 5.9004 KOps/s | 5.7992 KOps/s | |
test_mod_wrap[eager] | 0.3160ms | 0.2378ms | 4.2057 KOps/s | 3.9895 KOps/s | |
test_mod_wrap[compile] | 0.3484ms | 0.2840ms | 3.5214 KOps/s | 3.4555 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1114ms | 3.7673ms | 265.4455 Ops/s | 264.5247 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4476ms | 1.3016ms | 768.3121 Ops/s | 714.8764 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4042ms | 1.2627ms | 791.9403 Ops/s | 726.6895 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3920ms | 0.9226ms | 1.0839 KOps/s | 966.4891 Ops/s | |
test_seq_add[eager] | 0.1566ms | 96.7568μs | 10.3352 KOps/s | 10.2018 KOps/s | |
test_seq_add[compile] | 0.1457ms | 88.4876μs | 11.3010 KOps/s | 11.3106 KOps/s | |
test_seq_add[compile-overhead] | 0.1877ms | 0.1284ms | 7.7891 KOps/s | 7.7454 KOps/s | |
test_seq_wrap[eager] | 0.5043ms | 0.3808ms | 2.6263 KOps/s | 2.6279 KOps/s | |
test_seq_wrap[compile] | 0.3788ms | 0.3010ms | 3.3228 KOps/s | 3.2766 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2774ms | 0.2241ms | 4.4621 KOps/s | 4.4488 KOps/s | |
test_func_call_runtime[False-eager] | 1.4197ms | 0.7115ms | 1.4054 KOps/s | 1.3766 KOps/s | |
test_func_call_runtime[False-compile] | 0.8171ms | 0.7430ms | 1.3459 KOps/s | 1.3114 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4106ms | 0.3634ms | 2.7522 KOps/s | 2.7482 KOps/s | |
test_func_call_runtime[True-eager] | 0.9668ms | 0.8688ms | 1.1510 KOps/s | 1.1132 KOps/s | |
test_func_call_runtime[True-compile] | 0.8393ms | 0.7657ms | 1.3060 KOps/s | 1.2776 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4397ms | 0.3858ms | 2.5920 KOps/s | 2.6126 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8337ms | 0.7135ms | 1.4016 KOps/s | 1.3809 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8222ms | 0.7461ms | 1.3403 KOps/s | 1.3039 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4258ms | 0.3633ms | 2.7523 KOps/s | 2.7368 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0655ms | 0.9631ms | 1.0383 KOps/s | 1.0070 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8900ms | 0.7949ms | 1.2580 KOps/s | 1.2243 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5802ms | 0.4127ms | 2.4233 KOps/s | 2.4260 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4923ms | 2.0133ms | 496.6971 Ops/s | 496.5024 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8734ms | 0.8215ms | 1.2173 KOps/s | 1.2110 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4607ms | 0.4129ms | 2.4220 KOps/s | 2.3777 KOps/s | |
test_distributed | 3.2869ms | 0.1761ms | 5.6790 KOps/s | 8.8502 KOps/s | |
test_tdmodule | 48.6130μs | 14.6634μs | 68.1972 KOps/s | 72.0893 KOps/s | |
test_tdmodule_dispatch | 69.0440μs | 28.3538μs | 35.2686 KOps/s | 34.7372 KOps/s | |
test_tdseq | 33.0620μs | 14.5821μs | 68.5771 KOps/s | 64.7284 KOps/s | |
test_tdseq_dispatch | 57.5830μs | 30.0506μs | 33.2773 KOps/s | 30.4144 KOps/s | |
test_instantiation_functorch | 1.6153ms | 1.5357ms | 651.1558 Ops/s | 643.3343 Ops/s | |
test_exec_functorch | 0.2108ms | 0.1415ms | 7.0692 KOps/s | 7.0814 KOps/s | |
test_exec_functional_call | 0.1945ms | 0.1333ms | 7.5018 KOps/s | 7.5376 KOps/s | |
test_exec_td_decorator | 0.3621ms | 0.1758ms | 5.6894 KOps/s | 5.5808 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7932ms | 0.6527ms | 1.5322 KOps/s | 1.5155 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7621ms | 0.6531ms | 1.5311 KOps/s | 1.5162 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7245ms | 0.5724ms | 1.7471 KOps/s | 1.7412 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7105ms | 0.5773ms | 1.7322 KOps/s | 1.7396 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 18.6805ms | 18.6052ms | 53.7485 Ops/s | 53.7361 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.0465ms | 18.6209ms | 53.7032 Ops/s | 53.2531 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.7172ms | 18.4801ms | 54.1123 Ops/s | 53.8178 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.5381ms | 18.4786ms | 54.1166 Ops/s | 54.1104 Ops/s | |
test_to_module_speed[True] | 1.2738ms | 0.9330ms | 1.0718 KOps/s | 1.0680 KOps/s | |
test_to_module_speed[False] | 1.4649ms | 0.9153ms | 1.0926 KOps/s | 1.0759 KOps/s | |
test_tc_init | 61.4430μs | 33.2197μs | 30.1026 KOps/s | 28.5424 KOps/s | |
test_tc_init_nested | 0.1660ms | 68.6771μs | 14.5609 KOps/s | 13.4893 KOps/s | |
test_tc_first_layer_tensor | 8.6804μs | 0.7073μs | 1.4139 MOps/s | 1.4093 MOps/s | |
test_tc_first_layer_nontensor | 37.7220μs | 2.3388μs | 427.5726 KOps/s | 423.8412 KOps/s | |
test_tc_second_layer_tensor | 11.2173μs | 1.4162μs | 706.1146 KOps/s | 689.2062 KOps/s | |
test_tc_second_layer_nontensor | 33.1720μs | 3.0923μs | 323.3861 KOps/s | 324.9965 KOps/s | |
test_unbind | 0.2292s | 9.9510ms | 100.4926 Ops/s | 153.5423 Ops/s | |
test_full_like | 9.6231ms | 9.1841ms | 108.8837 Ops/s | 108.4763 Ops/s | |
test_zeros_like | 4.9663ms | 4.3256ms | 231.1800 Ops/s | 235.8125 Ops/s | |
test_ones_like | 4.4464ms | 4.3333ms | 230.7711 Ops/s | 230.9275 Ops/s | |
test_clone | 6.9491ms | 6.3322ms | 157.9240 Ops/s | 155.8412 Ops/s | |
test_squeeze | 60.5030μs | 9.5096μs | 105.1566 KOps/s | 107.6312 KOps/s | |
test_unsqueeze | 0.1266ms | 71.1931μs | 14.0463 KOps/s | 14.5741 KOps/s | |
test_split | 0.3987ms | 0.1571ms | 6.3635 KOps/s | 6.4680 KOps/s | |
test_permute | 0.2333ms | 0.1841ms | 5.4312 KOps/s | 5.3876 KOps/s | |
test_stack | 50.7966ms | 50.3951ms | 19.8432 Ops/s | 19.6539 Ops/s | |
test_cat | 50.7289ms | 50.3215ms | 19.8722 Ops/s | 19.5957 Ops/s |
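
The Change column is empty in the table as captured here. As a rough sketch, the relative change for a row could be recomputed from the two throughput columns, assuming `Ops` is the throughput measured on this pull request and `Ops on Repo HEAD` is the baseline. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling.

```python
from typing import Tuple

# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '98.0525 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> Tuple[str, float]:
    """Return (benchmark name, percent change of Ops relative to Ops on Repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    pr_ops = parse_ops(cells[3])    # "Ops" column (assumed: this pull request)
    head_ops = parse_ops(cells[4])  # "Ops on Repo HEAD" column (assumed: baseline)
    return name, 100.0 * (pr_ops - head_ops) / head_ops


# Three rows copied verbatim from the table above.
rows = [
    "test_plain_set_nested | 35.3420μs | 10.1986μs | 98.0525 KOps/s | 92.2355 KOps/s | |",
    "test_mod_wrap_and_backward[compile-overhead] | 1.3920ms | 0.9226ms | 1.0839 KOps/s | 966.4891 Ops/s | |",
    "test_distributed | 3.2869ms | 0.1761ms | 5.6790 KOps/s | 8.8502 KOps/s | |",
]

for row in rows:
    name, pct = relative_change(row)
    print(f"{name}: {pct:+.1f}%")
```

Normalizing units first matters for rows such as `test_mod_wrap_and_backward[compile-overhead]`, where one column is reported in KOps/s and the other in Ops/s. Single-run differences of a few percent are likely within benchmark noise, so only large swings are worth a closer look.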
Stack from ghstack (oldest at bottom):