[Feature] Frozen tensorclass #984
Merged
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: 06d10877115ce1659f200d37c4294fb90b10c342 Pull Request resolved: #984
facebook-github-bot added the CLA Signed label Sep 10, 2024
(This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.4210μs | 13.0639μs | 76.5466 KOps/s | 68.3677 KOps/s | |
test_plain_set_stack_nested | 65.7020μs | 13.4683μs | 74.2482 KOps/s | 68.8072 KOps/s | |
test_plain_set_nested_inplace | 46.1210μs | 14.0829μs | 71.0080 KOps/s | 64.1640 KOps/s | |
test_plain_set_stack_nested_inplace | 49.1010μs | 13.9580μs | 71.6434 KOps/s | 64.9127 KOps/s | |
test_items | 26.8110μs | 2.8030μs | 356.7559 KOps/s | 333.3529 KOps/s | |
test_items_nested | 0.4576ms | 0.3131ms | 3.1938 KOps/s | 3.1970 KOps/s | |
test_items_nested_locked | 0.3833ms | 0.3152ms | 3.1727 KOps/s | 3.1664 KOps/s | |
test_items_nested_leaf | 91.7620μs | 62.9993μs | 15.8732 KOps/s | 15.8528 KOps/s | |
test_items_stack_nested | 0.3690ms | 0.3150ms | 3.1750 KOps/s | 3.2057 KOps/s | |
test_items_stack_nested_leaf | 92.6630μs | 64.9372μs | 15.3995 KOps/s | 15.6401 KOps/s | |
test_items_stack_nested_locked | 0.3893ms | 0.3160ms | 3.1647 KOps/s | 3.2154 KOps/s | |
test_keys | 25.7910μs | 3.3675μs | 296.9560 KOps/s | 295.3800 KOps/s | |
test_keys_nested | 99.6120μs | 53.8958μs | 18.5543 KOps/s | 18.5901 KOps/s | |
test_keys_nested_locked | 2.5133ms | 60.7427μs | 16.4629 KOps/s | 16.3721 KOps/s | |
test_keys_nested_leaf | 82.9920μs | 45.6846μs | 21.8892 KOps/s | 21.0810 KOps/s | |
test_keys_stack_nested | 99.8720μs | 55.6760μs | 17.9611 KOps/s | 17.9923 KOps/s | |
test_keys_stack_nested_leaf | 87.4630μs | 47.5497μs | 21.0306 KOps/s | 20.7427 KOps/s | |
test_keys_stack_nested_locked | 99.2530μs | 60.3689μs | 16.5648 KOps/s | 16.5769 KOps/s | |
test_values | 6.6118μs | 0.8223μs | 1.2161 MOps/s | 1.1952 MOps/s | |
test_values_nested | 53.6110μs | 27.2153μs | 36.7440 KOps/s | 36.4196 KOps/s | |
test_values_nested_locked | 60.9020μs | 29.2427μs | 34.1966 KOps/s | 34.2274 KOps/s | |
test_values_nested_leaf | 51.3510μs | 24.0881μs | 41.5143 KOps/s | 41.1845 KOps/s | |
test_values_stack_nested | 61.7220μs | 28.4904μs | 35.0995 KOps/s | 35.0320 KOps/s | |
test_values_stack_nested_leaf | 47.2710μs | 25.2551μs | 39.5959 KOps/s | 39.9969 KOps/s | |
test_values_stack_nested_locked | 66.1520μs | 30.5470μs | 32.7365 KOps/s | 33.1668 KOps/s | |
test_membership | 1.9446μs | 0.5086μs | 1.9662 MOps/s | 1.9754 MOps/s | |
test_membership_nested | 31.4510μs | 1.8229μs | 548.5759 KOps/s | 576.0999 KOps/s | |
test_membership_nested_leaf | 17.1803μs | 1.7327μs | 577.1213 KOps/s | 585.2605 KOps/s | |
test_membership_stacked_nested | 37.8110μs | 1.7716μs | 564.4700 KOps/s | 568.3712 KOps/s | |
test_membership_stacked_nested_leaf | 95.9720μs | 1.7761μs | 563.0234 KOps/s | 571.5719 KOps/s | |
test_membership_nested_last | 28.6610μs | 2.5963μs | 385.1707 KOps/s | 377.6801 KOps/s | |
test_membership_nested_leaf_last | 40.4710μs | 2.6532μs | 376.9002 KOps/s | 379.3903 KOps/s | |
test_membership_stacked_nested_last | 31.3610μs | 2.9957μs | 333.8094 KOps/s | 386.0015 KOps/s | |
test_membership_stacked_nested_leaf_last | 39.7810μs | 2.9932μs | 334.0959 KOps/s | 381.6811 KOps/s | |
test_nested_getleaf | 31.2610μs | 6.1073μs | 163.7393 KOps/s | 165.1726 KOps/s | |
test_nested_get | 38.7210μs | 5.7194μs | 174.8448 KOps/s | 175.3099 KOps/s | |
test_stacked_getleaf | 34.1710μs | 5.9917μs | 166.8968 KOps/s | 164.7101 KOps/s | |
test_stacked_get | 42.0010μs | 5.6344μs | 177.4805 KOps/s | 177.3249 KOps/s | |
test_nested_getitemleaf | 33.7310μs | 6.1164μs | 163.4950 KOps/s | 162.4870 KOps/s | |
test_nested_getitem | 40.5810μs | 5.6960μs | 175.5622 KOps/s | 173.7685 KOps/s | |
test_stacked_getitemleaf | 32.5310μs | 6.0198μs | 166.1187 KOps/s | 165.5929 KOps/s | |
test_stacked_getitem | 39.4510μs | 5.6323μs | 177.5458 KOps/s | 175.0526 KOps/s | |
test_lock_nested | 5.1150ms | 0.4161ms | 2.4033 KOps/s | 2.3990 KOps/s | |
test_lock_stack_nested | 0.4536ms | 0.3770ms | 2.6525 KOps/s | 2.6842 KOps/s | |
test_unlock_nested | 0.7646ms | 0.3525ms | 2.8371 KOps/s | 2.8403 KOps/s | |
test_unlock_stack_nested | 0.4706ms | 0.3163ms | 3.1617 KOps/s | 3.2037 KOps/s | |
test_flatten_speed | 0.2851ms | 80.8025μs | 12.3759 KOps/s | 12.5466 KOps/s | |
test_unflatten_speed | 0.3529ms | 0.2779ms | 3.5988 KOps/s | 3.5712 KOps/s | |
test_common_ops | 1.3923ms | 1.2092ms | 826.9660 Ops/s | 790.3709 Ops/s | |
test_creation | 24.4000μs | 1.4675μs | 681.4183 KOps/s | 676.8297 KOps/s | |
test_creation_empty | 48.0910μs | 13.6022μs | 73.5176 KOps/s | 60.1735 KOps/s | |
test_creation_nested_1 | 48.3010μs | 15.2274μs | 65.6712 KOps/s | 55.3504 KOps/s | |
test_creation_nested_2 | 43.6710μs | 17.6939μs | 56.5167 KOps/s | 47.9619 KOps/s | |
test_clone | 70.9010μs | 28.9451μs | 34.5482 KOps/s | 35.3381 KOps/s | |
test_getitem[int] | 1.1702ms | 15.4623μs | 64.6734 KOps/s | 63.8507 KOps/s | |
test_getitem[slice_int] | 0.1174ms | 27.2216μs | 36.7355 KOps/s | 36.0967 KOps/s | |
test_getitem[range] | 0.1535ms | 0.1094ms | 9.1367 KOps/s | 8.9668 KOps/s | |
test_getitem[tuple] | 0.1212ms | 23.5785μs | 42.4116 KOps/s | 42.8505 KOps/s | |
test_getitem[list] | 0.2216ms | 98.7253μs | 10.1291 KOps/s | 10.1157 KOps/s | |
test_setitem_dim[int] | 70.3120μs | 48.9028μs | 20.4487 KOps/s | 19.2987 KOps/s | |
test_setitem_dim[slice_int] | 0.1148ms | 71.9333μs | 13.9018 KOps/s | 13.4285 KOps/s | |
test_setitem_dim[range] | 0.1829ms | 0.1337ms | 7.4789 KOps/s | 7.2820 KOps/s | |
test_setitem_dim[tuple] | 96.9320μs | 65.9256μs | 15.1686 KOps/s | 14.5990 KOps/s | |
test_setitem | 77.0820μs | 40.7218μs | 24.5569 KOps/s | 23.8172 KOps/s | |
test_set | 79.3420μs | 39.6310μs | 25.2328 KOps/s | 24.2883 KOps/s | |
test_set_shared | 0.3604ms | 50.6850μs | 19.7297 KOps/s | 20.1251 KOps/s | |
test_update | 91.1330μs | 48.1077μs | 20.7867 KOps/s | 19.9440 KOps/s | |
test_update_nested | 95.4120μs | 54.7539μs | 18.2635 KOps/s | 17.5625 KOps/s | |
test_update__nested | 98.2930μs | 60.4177μs | 16.5515 KOps/s | 16.8878 KOps/s | |
test_set_nested | 86.6930μs | 42.6972μs | 23.4208 KOps/s | 22.8242 KOps/s | |
test_set_nested_new | 0.4363ms | 46.3964μs | 21.5534 KOps/s | 21.3152 KOps/s | |
test_select | 99.4030μs | 58.7694μs | 17.0156 KOps/s | 16.5847 KOps/s | |
test_select_nested | 0.1268ms | 41.7297μs | 23.9637 KOps/s | 23.8127 KOps/s | |
test_exclude_nested | 85.5920μs | 58.4021μs | 17.1227 KOps/s | 16.6215 KOps/s | |
test_empty[True] | 0.3218ms | 0.2403ms | 4.1616 KOps/s | 4.0592 KOps/s | |
test_empty[False] | 3.6781μs | 0.7382μs | 1.3546 MOps/s | 1.3506 MOps/s | |
test_to | 54.7310μs | 25.3393μs | 39.4644 KOps/s | 38.8090 KOps/s | |
test_to_nonblocking | 57.6620μs | 23.3621μs | 42.8044 KOps/s | 42.0438 KOps/s | |
test_unbind_speed | 0.3465ms | 0.2763ms | 3.6197 KOps/s | 3.6345 KOps/s | |
test_unbind_speed_stack0 | 0.3339ms | 0.2690ms | 3.7170 KOps/s | 3.6337 KOps/s | |
test_unbind_speed_stack1 | 93.2698ms | 0.6939ms | 1.4410 KOps/s | 1.4298 KOps/s | |
test_split | 95.1194ms | 2.1033ms | 475.4431 Ops/s | 462.4089 Ops/s | |
test_chunk | 95.3029ms | 2.1021ms | 475.7137 Ops/s | 458.9178 Ops/s | |
test_creation[device0] | 0.3725ms | 0.1261ms | 7.9297 KOps/s | 8.0108 KOps/s | |
test_creation_from_tensor | 0.3541ms | 0.1282ms | 7.8006 KOps/s | 7.5958 KOps/s | |
test_add_one[memmap_tensor0] | 0.2189ms | 8.8767μs | 112.6549 KOps/s | 116.2419 KOps/s | |
test_contiguous[memmap_tensor0] | 36.5610μs | 2.1461μs | 465.9636 KOps/s | 464.5151 KOps/s | |
test_stack[memmap_tensor0] | 29.7610μs | 6.5835μs | 151.8960 KOps/s | 153.3374 KOps/s | |
test_memmaptd_index | 1.0495ms | 0.4091ms | 2.4443 KOps/s | 2.3818 KOps/s | |
test_memmaptd_index_astensor | 0.7423ms | 0.4690ms | 2.1324 KOps/s | 2.0895 KOps/s | |
test_memmaptd_index_op | 1.3895ms | 0.9735ms | 1.0272 KOps/s | 985.2381 Ops/s | |
test_serialize_model | 0.1306s | 0.1289s | 7.7561 Ops/s | 7.7517 Ops/s | |
test_serialize_model_pickle | 1.3470s | 1.2118s | 0.8252 Ops/s | 0.8241 Ops/s | |
test_serialize_weights | 0.2199s | 0.1413s | 7.0749 Ops/s | 6.9847 Ops/s | |
test_serialize_weights_returnearly | 0.2249s | 57.0550ms | 17.5269 Ops/s | 17.8680 Ops/s | |
test_serialize_weights_pickle | 1.3475s | 1.2167s | 0.8219 Ops/s | 0.8190 Ops/s | |
test_reshape_pytree | 79.0120μs | 36.0476μs | 27.7411 KOps/s | 28.2154 KOps/s | |
test_reshape_td | 70.7120μs | 39.9523μs | 25.0299 KOps/s | 23.5005 KOps/s | |
test_view_pytree | 74.2820μs | 35.4154μs | 28.2363 KOps/s | 27.8836 KOps/s | |
test_view_td | 81.1820μs | 44.0092μs | 22.7225 KOps/s | 21.2537 KOps/s | |
test_unbind_pytree | 63.5220μs | 34.4456μs | 29.0313 KOps/s | 27.8992 KOps/s | |
test_unbind_td | 0.3480ms | 41.6381μs | 24.0165 KOps/s | 23.4688 KOps/s | |
test_split_pytree | 82.6520μs | 46.7044μs | 21.4113 KOps/s | 20.9019 KOps/s | |
test_split_td | 0.4151ms | 55.1697μs | 18.1259 KOps/s | 17.2507 KOps/s | |
test_add_pytree | 0.1051ms | 58.0198μs | 17.2355 KOps/s | 16.2619 KOps/s | |
test_add_td | 0.1582ms | 87.2653μs | 11.4593 KOps/s | 10.3483 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4063ms | 0.2075ms | 4.8202 KOps/s | 4.8332 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2438ms | 0.1552ms | 6.4433 KOps/s | 6.4133 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1933ms | 0.1436ms | 6.9629 KOps/s | 6.9770 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2760ms | 0.1871ms | 5.3452 KOps/s | 5.3453 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 69.5410μs | 20.5706μs | 48.6132 KOps/s | 50.5019 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 80.2020μs | 43.3225μs | 23.0827 KOps/s | 23.3797 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2677ms | 65.2447μs | 15.3269 KOps/s | 15.4511 KOps/s | |
test_compile_copy_nested[pytree-eager] | 83.4620μs | 50.6604μs | 19.7393 KOps/s | 19.7982 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3980ms | 0.3148ms | 3.1765 KOps/s | 3.1719 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2536ms | 0.2110ms | 4.7388 KOps/s | 4.7489 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1813ms | 0.1274ms | 7.8466 KOps/s | 7.8580 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1207ms | 60.9978μs | 16.3940 KOps/s | 16.9470 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3672ms | 0.3138ms | 3.1862 KOps/s | 3.1813 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6999ms | 0.6340ms | 1.5773 KOps/s | 1.6217 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3149ms | 0.2495ms | 4.0078 KOps/s | 3.9890 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4220ms | 0.3124ms | 3.2015 KOps/s | 3.1717 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1643ms | 73.6600μs | 13.5759 KOps/s | 13.5555 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2388ms | 0.1339ms | 7.4662 KOps/s | 7.7455 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6185ms | 0.5417ms | 1.8462 KOps/s | 1.7634 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3663ms | 0.3130ms | 3.1950 KOps/s | 3.1694 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 46.9620μs | 18.7708μs | 53.2742 KOps/s | 55.9306 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 60.1410μs | 28.4860μs | 35.1050 KOps/s | 36.4146 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1054ms | 68.7130μs | 14.5533 KOps/s | 14.4382 KOps/s | |
test_compile_copy_flat[pytree-eager] | 83.8020μs | 51.4931μs | 19.4201 KOps/s | 19.4149 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3103ms | 0.8004ms | 1.2494 KOps/s | 1.1601 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4745ms | 3.2202ms | 310.5387 Ops/s | 311.3047 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2927ms | 0.7862ms | 1.2719 KOps/s | 1.1792 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.3866ms | 3.2567ms | 307.0552 Ops/s | 316.2152 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1560ms | 0.1095ms | 9.1339 KOps/s | 9.2889 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1838ms | 60.7826μs | 16.4521 KOps/s | 17.0262 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1647ms | 0.1069ms | 9.3541 KOps/s | 9.7911 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1333ms | 43.8141μs | 22.8237 KOps/s | 23.4041 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1954ms | 0.1093ms | 9.1475 KOps/s | 9.7332 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 95.3330μs | 43.1386μs | 23.1811 KOps/s | 23.4816 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2495ms | 0.1421ms | 7.0349 KOps/s | 7.3562 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1519ms | 25.4462μs | 39.2986 KOps/s | 40.2452 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1774ms | 0.1333ms | 7.5044 KOps/s | 7.7241 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 66.5720μs | 20.9970μs | 47.6259 KOps/s | 48.1990 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1968ms | 0.1378ms | 7.2581 KOps/s | 7.6972 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 57.3810μs | 20.9179μs | 47.8060 KOps/s | 48.4954 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1956ms | 0.1433ms | 6.9788 KOps/s | 7.3400 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5053ms | 25.6304μs | 39.0161 KOps/s | 39.8131 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2244ms | 0.1376ms | 7.2676 KOps/s | 7.6545 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 52.8420μs | 20.7787μs | 48.1263 KOps/s | 48.9205 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1783ms | 0.1382ms | 7.2361 KOps/s | 7.7102 KOps/s | |
test_compile_indexing[int-pytree-eager] | 52.5610μs | 21.9049μs | 45.6519 KOps/s | 48.3035 KOps/s | |
test_mod_add[eager] | 70.5610μs | 30.9904μs | 32.2681 KOps/s | 30.3743 KOps/s | |
test_mod_add[compile] | 0.1278ms | 71.7069μs | 13.9457 KOps/s | 14.3223 KOps/s | |
test_mod_add[compile-overhead] | 0.2643ms | 0.1362ms | 7.3403 KOps/s | 7.1205 KOps/s | |
test_mod_wrap[eager] | 0.3221ms | 0.2398ms | 4.1707 KOps/s | 4.1292 KOps/s | |
test_mod_wrap[compile] | 0.4512ms | 0.2847ms | 3.5122 KOps/s | 3.4721 KOps/s | |
test_mod_wrap[compile-overhead] | 7.5423ms | 4.1109ms | 243.2564 Ops/s | 248.0829 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4568ms | 1.3401ms | 746.2275 Ops/s | 693.3521 Ops/s | |
test_mod_wrap_and_backward[compile] | 2.6719ms | 1.3176ms | 758.9574 Ops/s | 702.3983 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.2954ms | 0.8845ms | 1.1306 KOps/s | 1.0185 KOps/s | |
test_seq_add[eager] | 0.2009ms | 93.8608μs | 10.6541 KOps/s | 10.3174 KOps/s | |
test_seq_add[compile] | 0.3646ms | 80.3147μs | 12.4510 KOps/s | 12.5566 KOps/s | |
test_seq_add[compile-overhead] | 0.1897ms | 0.1137ms | 8.7933 KOps/s | 8.8347 KOps/s | |
test_seq_wrap[eager] | 0.4338ms | 0.3709ms | 2.6963 KOps/s | 2.6099 KOps/s | |
test_seq_wrap[compile] | 0.3518ms | 0.3014ms | 3.3183 KOps/s | 3.2908 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2546ms | 0.2078ms | 4.8113 KOps/s | 4.8428 KOps/s | |
test_func_call_runtime[False-eager] | 0.8294ms | 0.7547ms | 1.3250 KOps/s | 1.3600 KOps/s | |
test_func_call_runtime[False-compile] | 0.8749ms | 0.7734ms | 1.2930 KOps/s | 1.2624 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4052ms | 0.3432ms | 2.9142 KOps/s | 2.9067 KOps/s | |
test_func_call_runtime[True-eager] | 0.9752ms | 0.8939ms | 1.1187 KOps/s | 1.1215 KOps/s | |
test_func_call_runtime[True-compile] | 0.9161ms | 0.8169ms | 1.2241 KOps/s | 1.2107 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4320ms | 0.3798ms | 2.6330 KOps/s | 2.6354 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8583ms | 0.7240ms | 1.3812 KOps/s | 1.3657 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8935ms | 0.7780ms | 1.2853 KOps/s | 1.2620 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.3888ms | 0.3461ms | 2.8897 KOps/s | 2.8679 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0946ms | 0.9872ms | 1.0130 KOps/s | 1.0051 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8896ms | 0.8399ms | 1.1907 KOps/s | 1.1740 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4807ms | 0.4029ms | 2.4823 KOps/s | 2.4659 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6158ms | 2.0817ms | 480.3848 Ops/s | 477.6970 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9220ms | 0.8565ms | 1.1675 KOps/s | 1.1533 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5100ms | 0.4070ms | 2.4570 KOps/s | 2.4473 KOps/s | |
test_distributed | 3.7133ms | 0.2103ms | 4.7545 KOps/s | 8.8302 KOps/s | |
test_tdmodule | 83.1820μs | 13.7728μs | 72.6071 KOps/s | 62.0871 KOps/s | |
test_tdmodule_dispatch | 39.7010μs | 27.0228μs | 37.0058 KOps/s | 32.6484 KOps/s | |
test_tdseq | 36.6910μs | 14.1754μs | 70.5449 KOps/s | 62.0518 KOps/s | |
test_tdseq_dispatch | 52.2810μs | 29.2690μs | 34.1658 KOps/s | 30.0784 KOps/s | |
test_instantiation_functorch | 1.9400ms | 1.8461ms | 541.6883 Ops/s | 534.5258 Ops/s | |
test_instantiation_td | 1.8114ms | 1.2007ms | 832.8471 Ops/s | 829.0192 Ops/s | |
test_exec_functorch | 0.2503ms | 0.2083ms | 4.8010 KOps/s | 4.7967 KOps/s | |
test_exec_functional_call | 0.2642ms | 0.2074ms | 4.8212 KOps/s | 4.7793 KOps/s | |
test_exec_td | 0.2942ms | 0.2119ms | 4.7184 KOps/s | 4.6796 KOps/s | |
test_exec_td_decorator | 1.0890ms | 0.2560ms | 3.9068 KOps/s | 3.9132 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7842ms | 0.6836ms | 1.4629 KOps/s | 1.4450 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7861ms | 0.6814ms | 1.4677 KOps/s | 1.4176 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6298ms | 0.5770ms | 1.7330 KOps/s | 1.6514 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6427ms | 0.5771ms | 1.7328 KOps/s | 1.6506 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2559ms | 0.6703ms | 1.4918 KOps/s | 1.4634 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8416ms | 0.6703ms | 1.4918 KOps/s | 1.4811 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7082ms | 0.5923ms | 1.6882 KOps/s | 1.6758 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7198ms | 0.5933ms | 1.6855 KOps/s | 1.6958 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.4416ms | 8.3040ms | 120.4245 Ops/s | 119.6202 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4544ms | 8.3018ms | 120.4556 Ops/s | 118.9336 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.2343ms | 8.1098ms | 123.3072 Ops/s | 122.3587 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.1878ms | 8.0967ms | 123.5073 Ops/s | 122.9034 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.1487ms | 19.5394ms | 51.1786 Ops/s | 50.8657 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7141ms | 19.5046ms | 51.2699 Ops/s | 51.2235 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.4939ms | 19.3371ms | 51.7140 Ops/s | 51.6014 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.3817ms | 19.3208ms | 51.7576 Ops/s | 50.8817 Ops/s | |
test_to_module_speed[True] | 1.3011ms | 0.9454ms | 1.0577 KOps/s | 1.0880 KOps/s | |
test_to_module_speed[False] | 1.3348ms | 0.9261ms | 1.0798 KOps/s | 1.0952 KOps/s | |
test_tc_init | 59.6920μs | 32.5472μs | 30.7246 KOps/s | 28.8272 KOps/s | |
test_tc_init_nested | 0.1045ms | 66.2089μs | 15.1037 KOps/s | 14.3652 KOps/s | |
test_tc_first_layer_tensor | 4.3944μs | 0.6747μs | 1.4822 MOps/s | 1.4612 MOps/s | |
test_tc_first_layer_nontensor | 23.1610μs | 2.2781μs | 438.9570 KOps/s | 445.1184 KOps/s | |
test_tc_second_layer_tensor | 9.8477μs | 1.3833μs | 722.8894 KOps/s | 722.3961 KOps/s | |
test_tc_second_layer_nontensor | 52.1210μs | 2.9894μs | 334.5182 KOps/s | 336.0588 KOps/s | |
test_unbind | 0.1891s | 11.9835ms | 83.4478 Ops/s | 93.8265 Ops/s | |
test_full_like | 0.6587ms | 0.5740ms | 1.7423 KOps/s | 1.7388 KOps/s | |
test_zeros_like | 0.2611ms | 0.1979ms | 5.0527 KOps/s | 5.0529 KOps/s | |
test_ones_like | 0.2827ms | 0.1977ms | 5.0577 KOps/s | 5.0554 KOps/s | |
test_clone | 0.4360ms | 0.4142ms | 2.4142 KOps/s | 2.4084 KOps/s | |
test_squeeze | 42.0110μs | 9.6536μs | 103.5879 KOps/s | 103.8597 KOps/s | |
test_unsqueeze | 0.3000ms | 73.7500μs | 13.5593 KOps/s | 13.7187 KOps/s | |
test_split | 0.2578ms | 0.1556ms | 6.4275 KOps/s | 6.3257 KOps/s | |
test_permute | 0.2251ms | 0.1760ms | 5.6804 KOps/s | 5.6661 KOps/s | |
test_stack | 1.2675ms | 0.8630ms | 1.1587 KOps/s | 1.1615 KOps/s | |
test_cat | 1.2634ms | 1.2314ms | 812.0546 Ops/s | 811.7569 Ops/s |
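
The Change column is empty in this table, and the two Ops columns sometimes use different units within one row (for example `test_memmaptd_index_op`). As a rough reading aid, the relative change can be recomputed from a row as sketched below. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of this PR or of the benchmark tooling; the unit scale assumes K and M mean 10^3 and 10^6 operations per second.

```python
# Hypothetical helpers for reading the table above; not part of the PR.
# Assumption: "KOps/s" and "MOps/s" mean 1e3 and 1e6 operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '76.5466 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR run versus repo HEAD."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head


# Row test_memmaptd_index_op: the PR column is in KOps/s, HEAD is in Ops/s.
change = relative_change("1.0272 KOps/s", "985.2381 Ops/s")
print(f"test_memmaptd_index_op: {change:+.1%}")
```

A positive value means the PR run was faster than HEAD for that test. Single-run differences of a few percent in a table like this may well be noise, so the result is best treated as indicative only.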
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: a53fb9db23682bea92399dd4cf7dab1ae6aa11f8 Pull Request resolved: #984
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: 09026c1eb275dd0c3584ff0f4035992f7715bb73 Pull Request resolved: #984
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: ae7e6170eebeb7026a230596f39f704971e0fc06 Pull Request resolved: #984
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 44.9840μs | 21.5329μs | 46.4406 KOps/s | 50.3239 KOps/s | |
test_plain_set_stack_nested | 49.1210μs | 21.4354μs | 46.6517 KOps/s | 49.1602 KOps/s | |
test_plain_set_nested_inplace | 77.4840μs | 22.9686μs | 43.5377 KOps/s | 46.0360 KOps/s | |
test_plain_set_stack_nested_inplace | 57.8180μs | 23.0757μs | 43.3357 KOps/s | 46.0490 KOps/s | |
test_items | 37.4100μs | 4.2921μs | 232.9863 KOps/s | 233.5784 KOps/s | |
test_items_nested | 0.7081ms | 0.3359ms | 2.9775 KOps/s | 3.0207 KOps/s | |
test_items_nested_locked | 0.4534ms | 0.3332ms | 3.0016 KOps/s | 2.9998 KOps/s | |
test_items_nested_leaf | 0.1647ms | 86.4791μs | 11.5635 KOps/s | 11.5668 KOps/s | |
test_items_stack_nested | 0.5395ms | 0.3379ms | 2.9591 KOps/s | 2.9838 KOps/s | |
test_items_stack_nested_leaf | 0.1631ms | 86.6088μs | 11.5462 KOps/s | 11.5133 KOps/s | |
test_items_stack_nested_locked | 0.5430ms | 0.3373ms | 2.9648 KOps/s | 2.9437 KOps/s | |
test_keys | 22.2110μs | 3.5111μs | 284.8111 KOps/s | 286.2371 KOps/s | |
test_keys_nested | 0.2151ms | 0.1008ms | 9.9231 KOps/s | 9.9733 KOps/s | |
test_keys_nested_locked | 0.7780ms | 0.1043ms | 9.5905 KOps/s | 9.4500 KOps/s | |
test_keys_nested_leaf | 0.1483ms | 84.3884μs | 11.8500 KOps/s | 11.8921 KOps/s | |
test_keys_stack_nested | 0.1650ms | 99.8027μs | 10.0198 KOps/s | 10.1207 KOps/s | |
test_keys_stack_nested_leaf | 0.1569ms | 84.3045μs | 11.8618 KOps/s | 12.0987 KOps/s | |
test_keys_stack_nested_locked | 0.1851ms | 0.1013ms | 9.8701 KOps/s | 9.6446 KOps/s | |
test_values | 8.2132μs | 1.0809μs | 925.1552 KOps/s | 907.4758 KOps/s | |
test_values_nested | 0.1034ms | 48.3684μs | 20.6747 KOps/s | 21.0064 KOps/s | |
test_values_nested_locked | 0.1020ms | 47.9699μs | 20.8464 KOps/s | 20.2960 KOps/s | |
test_values_nested_leaf | 89.8380μs | 42.6405μs | 23.4519 KOps/s | 23.5565 KOps/s | |
test_values_stack_nested | 98.6340μs | 47.6539μs | 20.9846 KOps/s | 20.8446 KOps/s | |
test_values_stack_nested_leaf | 94.2060μs | 43.3229μs | 23.0825 KOps/s | 23.6945 KOps/s | |
test_values_stack_nested_locked | 97.7620μs | 48.1147μs | 20.7837 KOps/s | 20.9268 KOps/s | |
test_membership | 32.9620μs | 0.8356μs | 1.1968 MOps/s | 1.1796 MOps/s | |
test_membership_nested | 45.6260μs | 2.6359μs | 379.3729 KOps/s | 391.3859 KOps/s | |
test_membership_nested_leaf | 41.6080μs | 2.6316μs | 380.0041 KOps/s | 388.4481 KOps/s | |
test_membership_stacked_nested | 42.8800μs | 2.6198μs | 381.7139 KOps/s | 398.3140 KOps/s | |
test_membership_stacked_nested_leaf | 18.1040μs | 2.6204μs | 381.6149 KOps/s | 391.7684 KOps/s | |
test_membership_nested_last | 46.4470μs | 3.8273μs | 261.2823 KOps/s | 267.5512 KOps/s | |
test_membership_nested_leaf_last | 26.8800μs | 3.7729μs | 265.0475 KOps/s | 266.2409 KOps/s | |
test_membership_stacked_nested_last | 33.9940μs | 3.7482μs | 266.7962 KOps/s | 270.1409 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.7670μs | 3.7711μs | 265.1743 KOps/s | 267.9129 KOps/s | |
test_nested_getleaf | 47.1380μs | 10.8122μs | 92.4882 KOps/s | 93.1485 KOps/s | |
test_nested_get | 31.5290μs | 10.3452μs | 96.6628 KOps/s | 98.8148 KOps/s | |
test_stacked_getleaf | 52.7090μs | 10.7705μs | 92.8461 KOps/s | 93.9737 KOps/s | |
test_stacked_get | 52.3880μs | 10.2453μs | 97.6062 KOps/s | 99.1091 KOps/s | |
test_nested_getitemleaf | 44.4650μs | 11.2105μs | 89.2020 KOps/s | 91.1554 KOps/s | |
test_nested_getitem | 32.2700μs | 10.3990μs | 96.1627 KOps/s | 96.6634 KOps/s | |
test_stacked_getitemleaf | 46.7070μs | 11.1027μs | 90.0684 KOps/s | 91.8153 KOps/s | |
test_stacked_getitem | 57.0370μs | 10.4021μs | 96.1342 KOps/s | 98.1853 KOps/s | |
test_lock_nested | 82.9381ms | 0.5584ms | 1.7909 KOps/s | 2.1233 KOps/s | |
test_lock_stack_nested | 0.6786ms | 0.4482ms | 2.2312 KOps/s | 2.2374 KOps/s | |
test_unlock_nested | 85.8153ms | 0.4834ms | 2.0686 KOps/s | 2.4962 KOps/s | |
test_unlock_stack_nested | 0.5605ms | 0.3663ms | 2.7303 KOps/s | 2.7223 KOps/s | |
test_flatten_speed | 0.1874ms | 0.1034ms | 9.6728 KOps/s | 9.5281 KOps/s | |
test_unflatten_speed | 0.5298ms | 0.4611ms | 2.1689 KOps/s | 2.1654 KOps/s | |
test_common_ops | 6.1934ms | 1.1256ms | 888.3781 Ops/s | 951.0971 Ops/s | |
test_creation | 23.0530μs | 2.0594μs | 485.5781 KOps/s | 468.8565 KOps/s | |
test_creation_empty | 46.9080μs | 19.1137μs | 52.3185 KOps/s | 61.2358 KOps/s | |
test_creation_nested_1 | 79.8090μs | 21.8945μs | 45.6736 KOps/s | 50.9421 KOps/s | |
test_creation_nested_2 | 0.1034ms | 26.1580μs | 38.2292 KOps/s | 42.4081 KOps/s | |
test_clone | 84.3680μs | 17.2682μs | 57.9098 KOps/s | 60.7333 KOps/s | |
test_getitem[int] | 1.0807ms | 16.4829μs | 60.6688 KOps/s | 60.1681 KOps/s | |
test_getitem[slice_int] | 0.1322ms | 29.8292μs | 33.5242 KOps/s | 32.1040 KOps/s | |
test_getitem[range] | 0.2254ms | 60.6393μs | 16.4910 KOps/s | 18.1145 KOps/s | |
test_getitem[tuple] | 0.1677ms | 24.7476μs | 40.4079 KOps/s | 40.3308 KOps/s | |
test_getitem[list] | 0.2247ms | 54.1011μs | 18.4839 KOps/s | 19.5622 KOps/s | |
test_setitem_dim[int] | 86.8930μs | 40.3662μs | 24.7732 KOps/s | 27.1242 KOps/s | |
test_setitem_dim[slice_int] | 0.1203ms | 69.4534μs | 14.3982 KOps/s | 15.3506 KOps/s | |
test_setitem_dim[range] | 0.1747ms | 93.6477μs | 10.6783 KOps/s | 11.2102 KOps/s | |
test_setitem_dim[tuple] | 89.7680μs | 56.3488μs | 17.7466 KOps/s | 18.3554 KOps/s | |
test_setitem | 0.1002ms | 29.8071μs | 33.5490 KOps/s | 35.8209 KOps/s | |
test_set | 95.7990μs | 29.2684μs | 34.1666 KOps/s | 36.9296 KOps/s | |
test_set_shared | 2.4708ms | 0.2114ms | 4.7307 KOps/s | 4.7477 KOps/s | |
test_update | 0.1316ms | 36.3574μs | 27.5047 KOps/s | 29.8577 KOps/s | |
test_update_nested | 0.1249ms | 46.3629μs | 21.5690 KOps/s | 22.3640 KOps/s | |
test_update__nested | 93.2140μs | 34.2769μs | 29.1742 KOps/s | 29.6833 KOps/s | |
test_set_nested | 94.9080μs | 31.3045μs | 31.9443 KOps/s | 33.9977 KOps/s | |
test_set_nested_new | 84.4880μs | 36.4999μs | 27.3974 KOps/s | 29.0288 KOps/s | |
test_select | 1.1552ms | 53.6589μs | 18.6362 KOps/s | 19.2602 KOps/s | |
test_select_nested | 0.1281ms | 60.0279μs | 16.6589 KOps/s | 16.5797 KOps/s | |
test_exclude_nested | 0.1450ms | 76.0233μs | 13.1539 KOps/s | 13.2268 KOps/s | |
test_empty[True] | 0.4542ms | 0.3180ms | 3.1446 KOps/s | 3.1531 KOps/s | |
test_empty[False] | 31.6415μs | 1.2409μs | 805.8638 KOps/s | 816.8211 KOps/s | |
test_unbind_speed | 0.3794ms | 0.2965ms | 3.3725 KOps/s | 3.3844 KOps/s | |
test_unbind_speed_stack0 | 0.5989ms | 0.2958ms | 3.3812 KOps/s | 3.4551 KOps/s | |
test_unbind_speed_stack1 | 87.2133ms | 0.8001ms | 1.2498 KOps/s | 1.3685 KOps/s | |
test_split | 3.1162ms | 1.9936ms | 501.6173 Ops/s | 462.1853 Ops/s | |
test_chunk | 89.9514ms | 2.3404ms | 427.2765 Ops/s | 459.4865 Ops/s | |
test_creation[device0] | 4.1368ms | 0.1186ms | 8.4336 KOps/s | 8.6900 KOps/s | |
test_creation_from_tensor | 0.2388ms | 0.1148ms | 8.7139 KOps/s | 8.5743 KOps/s | |
test_add_one[memmap_tensor0] | 0.1998ms | 7.6249μs | 131.1492 KOps/s | 143.3581 KOps/s | |
test_contiguous[memmap_tensor0] | 22.1920μs | 1.8645μs | 536.3234 KOps/s | 531.2918 KOps/s | |
test_stack[memmap_tensor0] | 37.7010μs | 5.6753μs | 176.2030 KOps/s | 182.6504 KOps/s | |
test_memmaptd_index | 1.1308ms | 0.4049ms | 2.4700 KOps/s | 2.5512 KOps/s | |
test_memmaptd_index_astensor | 1.0327ms | 0.4816ms | 2.0764 KOps/s | 2.1221 KOps/s | |
test_memmaptd_index_op | 1.6901ms | 1.0329ms | 968.1400 Ops/s | 1.0509 KOps/s | |
test_serialize_model | 0.1310s | 0.1162s | 8.6082 Ops/s | 8.2293 Ops/s | |
test_serialize_model_pickle | 0.4759s | 0.4001s | 2.4991 Ops/s | 2.4874 Ops/s | |
test_serialize_weights | 0.1204s | 0.1158s | 8.6368 Ops/s | 7.4165 Ops/s | |
test_serialize_weights_returnearly | 0.1719s | 0.1594s | 6.2740 Ops/s | 6.2210 Ops/s | |
test_serialize_weights_pickle | 1.0558s | 0.7083s | 1.4119 Ops/s | 2.3093 Ops/s | |
test_serialize_weights_filesystem | 0.1469s | 0.1398s | 7.1522 Ops/s | 6.8979 Ops/s | |
test_serialize_model_filesystem | 0.2284s | 0.1523s | 6.5655 Ops/s | 6.0705 Ops/s | |
test_reshape_pytree | 85.9400μs | 37.9464μs | 26.3530 KOps/s | 25.7321 KOps/s | |
test_reshape_td | 0.1495ms | 47.2780μs | 21.1515 KOps/s | 21.5529 KOps/s | |
test_view_pytree | 0.1047ms | 38.3321μs | 26.0878 KOps/s | 26.4140 KOps/s | |
test_view_td | 93.9560μs | 51.8185μs | 19.2981 KOps/s | 19.3828 KOps/s | |
test_unbind_pytree | 98.5220μs | 36.0286μs | 27.7557 KOps/s | 28.4316 KOps/s | |
test_unbind_td | 0.3305ms | 44.4010μs | 22.5220 KOps/s | 22.5152 KOps/s | |
test_split_pytree | 0.1036ms | 37.5538μs | 26.6285 KOps/s | 26.7111 KOps/s | |
test_split_td | 0.2438ms | 56.9980μs | 17.5445 KOps/s | 17.4017 KOps/s | |
test_add_pytree | 0.1017ms | 44.7070μs | 22.3679 KOps/s | 23.0367 KOps/s | |
test_add_td | 0.2229ms | 81.5949μs | 12.2557 KOps/s | 12.8953 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1330ms | 56.3988μs | 17.7309 KOps/s | 17.5690 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3236ms | 0.1816ms | 5.5058 KOps/s | 5.3790 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1283ms | 55.9533μs | 17.8720 KOps/s | 17.5840 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2857ms | 0.1446ms | 6.9179 KOps/s | 7.2493 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 47.2880μs | 20.6710μs | 48.3769 KOps/s | 49.1760 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1321ms | 66.6212μs | 15.0102 KOps/s | 15.0777 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1469ms | 74.4748μs | 13.4274 KOps/s | 13.2389 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1387ms | 67.8661μs | 14.7349 KOps/s | 14.6003 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.3752ms | 0.1717ms | 5.8239 KOps/s | 5.8251 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3100ms | 0.1852ms | 5.3988 KOps/s | 5.2700 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1098ms | 45.8242μs | 21.8225 KOps/s | 21.0290 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.6769ms | 67.5142μs | 14.8117 KOps/s | 14.7493 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3238ms | 0.1726ms | 5.7923 KOps/s | 5.7364 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5873ms | 0.3028ms | 3.3022 KOps/s | 3.5035 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.2986ms | 0.1978ms | 5.0559 KOps/s | 4.8972 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5712ms | 0.1730ms | 5.7813 KOps/s | 5.7635 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1238ms | 59.8905μs | 16.6971 KOps/s | 16.1985 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 92.2420μs | 47.8768μs | 20.8870 KOps/s | 20.4903 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4327ms | 0.2440ms | 4.0991 KOps/s | 4.2878 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2638ms | 0.1742ms | 5.7394 KOps/s | 5.6719 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2365ms | 0.1009ms | 9.9121 KOps/s | 9.7187 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1356ms | 59.8672μs | 16.7036 KOps/s | 16.7114 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1420ms | 74.9491μs | 13.3424 KOps/s | 12.9129 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1349ms | 68.1726μs | 14.6687 KOps/s | 14.3151 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2863ms | 0.1950ms | 5.1290 KOps/s | 5.1195 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8615ms | 1.6761ms | 596.6323 Ops/s | 609.3192 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2803ms | 0.1936ms | 5.1655 KOps/s | 5.1280 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.0577ms | 1.1548ms | 865.9164 Ops/s | 917.0348 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5687ms | 0.4295ms | 2.3285 KOps/s | 2.3725 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.0710ms | 3.8262ms | 261.3531 Ops/s | 283.2649 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1072ms | 34.1381μs | 29.2928 KOps/s | 28.4930 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0641ms | 47.3526μs | 21.1182 KOps/s | 21.2039 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 98.5440μs | 29.6774μs | 33.6957 KOps/s | 32.5581 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 67.9970μs | 29.0346μs | 34.4417 KOps/s | 35.2753 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1009ms | 30.5355μs | 32.7488 KOps/s | 31.9847 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 97.9830μs | 28.7475μs | 34.7856 KOps/s | 35.5823 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1689ms | 73.3614μs | 13.6311 KOps/s | 13.5518 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5376ms | 26.8754μs | 37.2088 KOps/s | 35.3473 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1723ms | 69.1905μs | 14.4529 KOps/s | 14.7629 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 88.8060μs | 22.7526μs | 43.9510 KOps/s | 43.9518 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1422ms | 68.2289μs | 14.6566 KOps/s | 14.7681 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 68.8990μs | 22.6401μs | 44.1694 KOps/s | 43.9827 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.5148ms | 74.6181μs | 13.4016 KOps/s | 13.6217 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1383ms | 26.7516μs | 37.3810 KOps/s | 35.1691 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1528ms | 69.1515μs | 14.4610 KOps/s | 14.7309 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 59.7010μs | 22.6173μs | 44.2139 KOps/s | 44.5768 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1345ms | 68.5514μs | 14.5876 KOps/s | 14.7494 KOps/s | |
test_compile_indexing[int-pytree-eager] | 82.9650μs | 22.6047μs | 44.2385 KOps/s | 44.7015 KOps/s | |
test_mod_add[eager] | 76.6540μs | 24.4125μs | 40.9627 KOps/s | 43.0117 KOps/s | |
test_mod_add[compile] | 0.1056ms | 37.8088μs | 26.4489 KOps/s | 25.2962 KOps/s | |
test_mod_add[compile-overhead] | 0.1048ms | 38.4771μs | 25.9895 KOps/s | 25.2319 KOps/s | |
test_mod_wrap[eager] | 0.4008ms | 0.2127ms | 4.7017 KOps/s | 4.9201 KOps/s | |
test_mod_wrap[compile] | 0.3780ms | 0.2341ms | 4.2713 KOps/s | 4.2306 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4414ms | 0.2371ms | 4.2178 KOps/s | 4.3048 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.0293ms | 10.7565ms | 92.9671 Ops/s | 94.3404 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.5038ms | 10.8942ms | 91.7919 Ops/s | 91.2447 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.0064ms | 10.9209ms | 91.5674 Ops/s | 89.8358 Ops/s | |
test_seq_add[eager] | 0.1731ms | 87.2337μs | 11.4635 KOps/s | 11.7468 KOps/s | |
test_seq_add[compile] | 0.1221ms | 63.2962μs | 15.7987 KOps/s | 15.4943 KOps/s | |
test_seq_add[compile-overhead] | 0.1416ms | 62.5196μs | 15.9950 KOps/s | 15.8586 KOps/s | |
test_seq_wrap[eager] | 0.5732ms | 0.3905ms | 2.5611 KOps/s | 2.7126 KOps/s | |
test_seq_wrap[compile] | 0.5204ms | 0.2731ms | 3.6615 KOps/s | 3.7612 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4923ms | 0.2714ms | 3.6851 KOps/s | 3.7859 KOps/s | |
test_func_call_runtime[False-eager] | 1.0038ms | 0.5396ms | 1.8531 KOps/s | 1.9502 KOps/s | |
test_func_call_runtime[False-compile] | 0.9366ms | 0.5140ms | 1.9454 KOps/s | 2.0358 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 1.3669ms | 0.5188ms | 1.9276 KOps/s | 2.0374 KOps/s | |
test_func_call_runtime[True-eager] | 1.5727ms | 0.7564ms | 1.3221 KOps/s | 1.3798 KOps/s | |
test_func_call_runtime[True-compile] | 0.8986ms | 0.5175ms | 1.9323 KOps/s | 1.9911 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.9358ms | 0.5195ms | 1.9250 KOps/s | 1.9985 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7957ms | 0.5359ms | 1.8661 KOps/s | 1.9743 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6856ms | 0.5201ms | 1.9227 KOps/s | 1.9935 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7030ms | 0.5146ms | 1.9433 KOps/s | 2.0153 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.2203ms | 0.8910ms | 1.1223 KOps/s | 1.1746 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8901ms | 0.7595ms | 1.3167 KOps/s | 1.3721 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0675ms | 0.7602ms | 1.3154 KOps/s | 1.3757 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5629ms | 1.8774ms | 532.6477 Ops/s | 542.4046 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 3.0567ms | 1.9545ms | 511.6321 Ops/s | 529.3164 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 3.0241ms | 1.9378ms | 516.0506 Ops/s | 528.3973 Ops/s | |
test_distributed | 0.2568ms | 0.1239ms | 8.0680 KOps/s | 7.9100 KOps/s | |
test_tdmodule | 36.5680μs | 17.6618μs | 56.6193 KOps/s | 61.0267 KOps/s | |
test_tdmodule_dispatch | 65.7630μs | 36.6749μs | 27.2666 KOps/s | 29.3337 KOps/s | |
test_tdseq | 49.4320μs | 20.2553μs | 49.3698 KOps/s | 50.6109 KOps/s | |
test_tdseq_dispatch | 97.1910μs | 41.5029μs | 24.0947 KOps/s | 25.3998 KOps/s | |
test_instantiation_functorch | 3.2644ms | 1.5743ms | 635.2138 Ops/s | 634.1760 Ops/s | |
test_instantiation_td | 1.8652ms | 1.1554ms | 865.5054 Ops/s | 861.3966 Ops/s | |
test_exec_functorch | 0.4106ms | 0.1874ms | 5.3350 KOps/s | 5.4852 KOps/s | |
test_exec_functional_call | 0.3253ms | 0.1765ms | 5.6668 KOps/s | 5.9728 KOps/s | |
test_exec_td | 0.3120ms | 0.1688ms | 5.9230 KOps/s | 6.0703 KOps/s | |
test_exec_td_decorator | 0.9916ms | 0.2220ms | 4.5035 KOps/s | 4.5975 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8321ms | 0.6390ms | 1.5649 KOps/s | 1.6153 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.0907ms | 0.6464ms | 1.5470 KOps/s | 1.6125 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6648ms | 0.4990ms | 2.0041 KOps/s | 2.0685 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7162ms | 0.4953ms | 2.0189 KOps/s | 2.0552 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0982ms | 0.6282ms | 1.5919 KOps/s | 1.6648 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8509ms | 0.6258ms | 1.5979 KOps/s | 1.6461 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8004ms | 0.5133ms | 1.9482 KOps/s | 2.0080 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7466ms | 0.5141ms | 1.9453 KOps/s | 2.0135 KOps/s | |
test_to_module_speed[True] | 2.0656ms | 1.2834ms | 779.1773 Ops/s | 774.0285 Ops/s | |
test_to_module_speed[False] | 1.7664ms | 1.2304ms | 812.7364 Ops/s | 792.9305 Ops/s | |
test_tc_init | 99.0450μs | 44.9168μs | 22.2634 KOps/s | 23.2965 KOps/s | |
test_tc_init_nested | 0.1521ms | 86.1465μs | 11.6081 KOps/s | 11.6420 KOps/s | |
test_tc_first_layer_tensor | 39.0830μs | 1.5212μs | 657.3601 KOps/s | 660.3301 KOps/s | |
test_tc_first_layer_nontensor | 34.9050μs | 4.7029μs | 212.6343 KOps/s | 207.0421 KOps/s | |
test_tc_second_layer_tensor | 22.9130μs | 2.8298μs | 353.3776 KOps/s | 351.3518 KOps/s | |
test_tc_second_layer_nontensor | 44.5830μs | 6.0754μs | 164.5985 KOps/s | 161.5401 KOps/s | |
test_unbind | 0.4838s | 13.1776ms | 75.8864 Ops/s | 73.7831 Ops/s | |
test_full_like | 8.8281ms | 7.4582ms | 134.0802 Ops/s | 79.1745 Ops/s | |
test_zeros_like | 3.4232ms | 2.9410ms | 340.0236 Ops/s | 138.8001 Ops/s | |
test_ones_like | 3.6908ms | 3.2471ms | 307.9640 Ops/s | 127.7473 Ops/s | |
test_clone | 6.7749ms | 5.5245ms | 181.0127 Ops/s | 102.4663 Ops/s | |
test_squeeze | 64.6400μs | 12.9595μs | 77.1633 KOps/s | 78.2989 KOps/s | |
test_unsqueeze | 0.1688ms | 91.0375μs | 10.9845 KOps/s | 10.7024 KOps/s | |
test_split | 0.5122ms | 0.1920ms | 5.2076 KOps/s | 5.0761 KOps/s | |
test_permute | 0.3722ms | 0.2204ms | 4.5379 KOps/s | 4.5238 KOps/s | |
test_stack | 32.2006ms | 26.8205ms | 37.2850 Ops/s | 40.5113 Ops/s | |
test_cat | 32.3506ms | 26.1258ms | 38.2764 Ops/s | 40.2848 Ops/s |
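The Change column came through empty in the table above. As a rough sketch (not part of this PR or of the benchmark tooling), the relative difference between the Ops and Ops on Repo HEAD columns can be recomputed from the rows themselves. The helper names `parse_ops` and `relative_change` are hypothetical, and the column order is assumed to match the table header (Name, Max, Mean, Ops, Ops on Repo HEAD, Change).

```python
# Hypothetical helpers for reading the benchmark rows; not part of tensordict.

# Multipliers for the two throughput units used in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '340.0236 Ops/s' to operations per second."""
    value, unit = cell.strip().split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> tuple:
    """Return (test name, fractional change of this PR relative to repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    # Assumed column order: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    name = cells[0]
    pr_ops = parse_ops(cells[3])
    head_ops = parse_ops(cells[4])
    return name, (pr_ops - head_ops) / head_ops


# Two rows copied verbatim from the table above.
rows = [
    "test_zeros_like | 3.4232ms | 2.9410ms | 340.0236 Ops/s | 138.8001 Ops/s | |",
    "test_tdmodule | 36.5680μs | 17.6618μs | 56.6193 KOps/s | 61.0267 KOps/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

A positive value means the PR branch measured more operations per second than HEAD. Single-run differences of a few percent in either direction are likely within run-to-run noise for these microbenchmarks.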
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: ae7e6170eebeb7026a230596f39f704971e0fc06 Pull Request resolved: #984
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement
New feature or request
Stack from ghstack (oldest at bottom):