[Feature] TensorDict.record_stream #1016
Merged
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: e5ea6fef54f47304e1a6cafbd15f4bdade5e69b4
Pull Request resolved: #1016
facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 1, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 57.8090μs | 19.7476μs | 50.6391 KOps/s | 47.8349 KOps/s | |
test_plain_set_stack_nested | 50.3040μs | 19.8080μs | 50.4847 KOps/s | 46.9167 KOps/s | |
test_plain_set_nested_inplace | 82.6070μs | 21.5445μs | 46.4156 KOps/s | 43.5681 KOps/s | |
test_plain_set_stack_nested_inplace | 72.1650μs | 21.0420μs | 47.5239 KOps/s | 44.2032 KOps/s | |
test_items | 22.1820μs | 4.2157μs | 237.2103 KOps/s | 246.1842 KOps/s | |
test_items_nested | 0.5955ms | 0.3732ms | 2.6796 KOps/s | 2.7582 KOps/s | |
test_items_nested_locked | 0.7623ms | 0.3733ms | 2.6790 KOps/s | 2.7979 KOps/s | |
test_items_nested_leaf | 0.1215ms | 68.1096μs | 14.6822 KOps/s | 14.5594 KOps/s | |
test_items_stack_nested | 0.7047ms | 0.3814ms | 2.6221 KOps/s | 2.7649 KOps/s | |
test_items_stack_nested_leaf | 0.1342ms | 71.0043μs | 14.0836 KOps/s | 14.0226 KOps/s | |
test_items_stack_nested_locked | 0.4623ms | 0.3762ms | 2.6579 KOps/s | 2.7305 KOps/s | |
test_keys | 42.7200μs | 3.6244μs | 275.9080 KOps/s | 282.9106 KOps/s | |
test_keys_nested | 0.1830ms | 98.9406μs | 10.1071 KOps/s | 10.0450 KOps/s | |
test_keys_nested_locked | 0.6957ms | 0.1047ms | 9.5533 KOps/s | 9.5186 KOps/s | |
test_keys_nested_leaf | 0.1413ms | 82.0120μs | 12.1933 KOps/s | 12.1522 KOps/s | |
test_keys_stack_nested | 0.1654ms | 99.0379μs | 10.0971 KOps/s | 10.0016 KOps/s | |
test_keys_stack_nested_leaf | 0.1494ms | 80.7987μs | 12.3764 KOps/s | 12.1422 KOps/s | |
test_keys_stack_nested_locked | 0.1895ms | 0.1038ms | 9.6312 KOps/s | 9.2383 KOps/s | |
test_values | 6.6904μs | 1.0570μs | 946.1029 KOps/s | 947.0718 KOps/s | |
test_values_nested | 0.1382ms | 74.0404μs | 13.5061 KOps/s | 13.2217 KOps/s | |
test_values_nested_locked | 0.1350ms | 74.9287μs | 13.3460 KOps/s | 13.2011 KOps/s | |
test_values_nested_leaf | 0.1169ms | 62.0084μs | 16.1268 KOps/s | 16.0685 KOps/s | |
test_values_stack_nested | 0.1247ms | 76.1907μs | 13.1250 KOps/s | 12.8613 KOps/s | |
test_values_stack_nested_leaf | 0.1209ms | 61.6146μs | 16.2299 KOps/s | 15.9895 KOps/s | |
test_values_stack_nested_locked | 0.1467ms | 76.1302μs | 13.1354 KOps/s | 12.6032 KOps/s | |
test_membership | 3.3433μs | 0.7420μs | 1.3477 MOps/s | 1.1358 MOps/s | |
test_membership_nested | 43.9320μs | 2.8555μs | 350.2073 KOps/s | 362.1335 KOps/s | |
test_membership_nested_leaf | 35.1850μs | 2.8450μs | 351.4903 KOps/s | 361.6725 KOps/s | |
test_membership_stacked_nested | 47.6390μs | 2.8039μs | 356.6459 KOps/s | 362.0152 KOps/s | |
test_membership_stacked_nested_leaf | 29.0940μs | 2.8472μs | 351.2283 KOps/s | 363.8387 KOps/s | |
test_membership_nested_last | 47.7700μs | 4.1523μs | 240.8303 KOps/s | 242.3871 KOps/s | |
test_membership_nested_leaf_last | 35.5960μs | 4.1715μs | 239.7215 KOps/s | 248.4352 KOps/s | |
test_membership_stacked_nested_last | 25.6080μs | 4.1242μs | 242.4686 KOps/s | 250.0185 KOps/s | |
test_membership_stacked_nested_leaf_last | 28.2430μs | 4.1463μs | 241.1782 KOps/s | 249.7452 KOps/s | |
test_nested_getleaf | 51.0660μs | 10.5551μs | 94.7413 KOps/s | 93.2374 KOps/s | |
test_nested_get | 49.7830μs | 10.1341μs | 98.6770 KOps/s | 96.9506 KOps/s | |
test_stacked_getleaf | 32.8010μs | 10.6812μs | 93.6223 KOps/s | 91.3217 KOps/s | |
test_stacked_get | 56.7160μs | 10.0707μs | 99.2975 KOps/s | 99.0581 KOps/s | |
test_nested_getitemleaf | 41.0970μs | 11.1192μs | 89.9349 KOps/s | 89.8861 KOps/s | |
test_nested_getitem | 56.5450μs | 10.2781μs | 97.2945 KOps/s | 95.9438 KOps/s | |
test_stacked_getitemleaf | 53.9810μs | 11.2260μs | 89.0792 KOps/s | 91.0294 KOps/s | |
test_stacked_getitem | 60.9140μs | 10.2335μs | 97.7187 KOps/s | 96.0904 KOps/s | |
test_lock_nested | 83.1198ms | 0.5697ms | 1.7552 KOps/s | 2.0065 KOps/s | |
test_lock_stack_nested | 0.9177ms | 0.4557ms | 2.1944 KOps/s | 2.1726 KOps/s | |
test_unlock_nested | 90.1293ms | 0.4983ms | 2.0068 KOps/s | 2.3905 KOps/s | |
test_unlock_stack_nested | 0.8186ms | 0.3880ms | 2.5771 KOps/s | 2.6399 KOps/s | |
test_flatten_speed | 0.1610ms | 87.6151μs | 11.4136 KOps/s | 11.4397 KOps/s | |
test_unflatten_speed | 0.6144ms | 0.4700ms | 2.1276 KOps/s | 2.1206 KOps/s | |
test_common_ops | 4.2299ms | 1.1089ms | 901.8237 Ops/s | 868.5351 Ops/s | |
test_creation | 31.0690μs | 2.0809μs | 480.5581 KOps/s | 486.9312 KOps/s | |
test_creation_empty | 53.9510μs | 16.3422μs | 61.1914 KOps/s | 51.5454 KOps/s | |
test_creation_nested_1 | 67.7460μs | 19.4653μs | 51.3735 KOps/s | 43.9063 KOps/s | |
test_creation_nested_2 | 63.5990μs | 23.8593μs | 41.9123 KOps/s | 37.2786 KOps/s | |
test_clone | 0.1867ms | 17.5382μs | 57.0185 KOps/s | 58.0304 KOps/s | |
test_getitem[int] | 1.1681ms | 16.7077μs | 59.8525 KOps/s | 58.2499 KOps/s | |
test_getitem[slice_int] | 0.1441ms | 30.7387μs | 32.5323 KOps/s | 31.9232 KOps/s | |
test_getitem[range] | 0.8185ms | 59.7807μs | 16.7278 KOps/s | 17.1368 KOps/s | |
test_getitem[tuple] | 0.1525ms | 24.7450μs | 40.4122 KOps/s | 39.2698 KOps/s | |
test_getitem[list] | 0.1650ms | 54.2236μs | 18.4422 KOps/s | 18.5126 KOps/s | |
test_setitem_dim[int] | 62.8370μs | 33.4983μs | 29.8522 KOps/s | 29.8697 KOps/s | |
test_setitem_dim[slice_int] | 0.1122ms | 62.4561μs | 16.0112 KOps/s | 16.5342 KOps/s | |
test_setitem_dim[range] | 0.1449ms | 84.7425μs | 11.8005 KOps/s | 11.8681 KOps/s | |
test_setitem_dim[tuple] | 98.3840μs | 50.1206μs | 19.9519 KOps/s | 20.1242 KOps/s | |
test_setitem | 73.6870μs | 28.9742μs | 34.5134 KOps/s | 32.0359 KOps/s | |
test_set | 0.2518ms | 28.1426μs | 35.5333 KOps/s | 32.6689 KOps/s | |
test_set_shared | 6.1110ms | 0.2168ms | 4.6128 KOps/s | 4.6008 KOps/s | |
test_update | 0.2757ms | 34.6408μs | 28.8677 KOps/s | 25.8322 KOps/s | |
test_update_nested | 1.8965ms | 45.6021μs | 21.9288 KOps/s | 20.3513 KOps/s | |
test_update__nested | 81.2820μs | 35.2984μs | 28.3299 KOps/s | 27.1728 KOps/s | |
test_set_nested | 0.3550ms | 31.2292μs | 32.0213 KOps/s | 30.5810 KOps/s | |
test_set_nested_new | 0.1934ms | 35.9136μs | 27.8446 KOps/s | 26.0412 KOps/s | |
test_select | 0.2517ms | 52.9539μs | 18.8843 KOps/s | 18.0132 KOps/s | |
test_select_nested | 0.1288ms | 59.7454μs | 16.7377 KOps/s | 16.5404 KOps/s | |
test_exclude_nested | 0.1751ms | 75.0117μs | 13.3313 KOps/s | 13.2631 KOps/s | |
test_empty[True] | 0.6608ms | 0.3152ms | 3.1724 KOps/s | 3.1370 KOps/s | |
test_empty[False] | 22.9478μs | 1.1958μs | 836.2627 KOps/s | 832.1309 KOps/s | |
test_unbind_speed | 0.5216ms | 0.2999ms | 3.3343 KOps/s | 3.2468 KOps/s | |
test_unbind_speed_stack0 | 0.4290ms | 0.3004ms | 3.3290 KOps/s | 3.3582 KOps/s | |
test_unbind_speed_stack1 | 98.1837ms | 0.8248ms | 1.2125 KOps/s | 1.4532 KOps/s | |
test_split | 92.9165ms | 2.1574ms | 463.5170 Ops/s | 452.7703 Ops/s | |
test_chunk | 3.2902ms | 1.9859ms | 503.5394 Ops/s | 450.5945 Ops/s | |
test_creation[device0] | 3.8242ms | 0.1203ms | 8.3107 KOps/s | 8.4631 KOps/s | |
test_creation_from_tensor | 0.2467ms | 0.1174ms | 8.5153 KOps/s | 8.4477 KOps/s | |
test_add_one[memmap_tensor0] | 0.2535ms | 7.1436μs | 139.9857 KOps/s | 136.1869 KOps/s | |
test_contiguous[memmap_tensor0] | 20.5080μs | 1.9772μs | 505.7571 KOps/s | 536.1249 KOps/s | |
test_stack[memmap_tensor0] | 0.1000ms | 5.7674μs | 173.3884 KOps/s | 177.8316 KOps/s | |
test_memmaptd_index | 1.1715ms | 0.4004ms | 2.4976 KOps/s | 2.5303 KOps/s | |
test_memmaptd_index_astensor | 0.7345ms | 0.4766ms | 2.0981 KOps/s | 2.0901 KOps/s | |
test_memmaptd_index_op | 93.6502ms | 1.0890ms | 918.2857 Ops/s | 955.2303 Ops/s | |
test_serialize_model | 0.1297s | 0.1213s | 8.2442 Ops/s | 8.1407 Ops/s | |
test_serialize_model_pickle | 0.5012s | 0.4089s | 2.4455 Ops/s | 2.5316 Ops/s | |
test_serialize_weights | 0.1232s | 0.1162s | 8.6023 Ops/s | 7.4643 Ops/s | |
test_serialize_weights_returnearly | 0.2744s | 0.1749s | 5.7186 Ops/s | 6.1440 Ops/s | |
test_serialize_weights_pickle | 0.5774s | 0.4299s | 2.3263 Ops/s | 2.4863 Ops/s | |
test_serialize_weights_filesystem | 0.1438s | 0.1413s | 7.0763 Ops/s | 6.7427 Ops/s | |
test_serialize_model_filesystem | 0.1598s | 0.1479s | 6.7603 Ops/s | 6.0689 Ops/s | |
test_reshape_pytree | 79.1080μs | 40.3629μs | 24.7752 KOps/s | 25.7797 KOps/s | |
test_reshape_td | 0.1354ms | 47.2044μs | 21.1845 KOps/s | 20.4412 KOps/s | |
test_view_pytree | 87.8940μs | 39.3747μs | 25.3970 KOps/s | 25.6677 KOps/s | |
test_view_td | 0.1459ms | 52.6707μs | 18.9859 KOps/s | 18.2688 KOps/s | |
test_unbind_pytree | 76.5030μs | 36.1711μs | 27.6464 KOps/s | 27.6723 KOps/s | |
test_unbind_td | 0.3160ms | 43.5398μs | 22.9675 KOps/s | 21.3990 KOps/s | |
test_split_pytree | 0.1039ms | 38.0172μs | 26.3039 KOps/s | 26.1244 KOps/s | |
test_split_td | 0.5229ms | 57.6761μs | 17.3382 KOps/s | 17.4134 KOps/s | |
test_add_pytree | 0.1175ms | 45.7848μs | 21.8413 KOps/s | 22.1389 KOps/s | |
test_add_td | 0.1589ms | 78.9896μs | 12.6599 KOps/s | 11.7198 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1384ms | 59.6206μs | 16.7727 KOps/s | 17.5171 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4267ms | 0.1796ms | 5.5693 KOps/s | 5.6105 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1351ms | 58.3802μs | 17.1291 KOps/s | 17.4221 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3381ms | 0.1436ms | 6.9629 KOps/s | 7.0508 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 72.4050μs | 21.6438μs | 46.2026 KOps/s | 46.3840 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1430ms | 69.1630μs | 14.4586 KOps/s | 15.0714 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1471ms | 77.5081μs | 12.9019 KOps/s | 13.0819 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1841ms | 70.6892μs | 14.1464 KOps/s | 14.4154 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2638ms | 0.1793ms | 5.5767 KOps/s | 5.7635 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4261ms | 0.1905ms | 5.2493 KOps/s | 5.1849 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1230ms | 48.6074μs | 20.5730 KOps/s | 20.6239 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1726ms | 69.8220μs | 14.3221 KOps/s | 14.6700 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4593ms | 0.1831ms | 5.4613 KOps/s | 5.7282 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7172ms | 0.2945ms | 3.3954 KOps/s | 3.4733 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4203ms | 0.2028ms | 4.9317 KOps/s | 4.9163 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.6256ms | 0.1813ms | 5.5143 KOps/s | 5.7970 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2228ms | 62.8211μs | 15.9182 KOps/s | 15.9772 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1173ms | 48.9850μs | 20.4144 KOps/s | 21.2364 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.3750ms | 0.2376ms | 4.2094 KOps/s | 4.2457 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2931ms | 0.1804ms | 5.5419 KOps/s | 5.6779 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2016ms | 0.1078ms | 9.2790 KOps/s | 9.6687 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1348ms | 60.0907μs | 16.6415 KOps/s | 17.1801 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1473ms | 78.0182μs | 12.8175 KOps/s | 12.5326 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1266ms | 69.8018μs | 14.3263 KOps/s | 14.1830 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3854ms | 0.1987ms | 5.0335 KOps/s | 5.1441 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.8350ms | 1.6865ms | 592.9445 Ops/s | 599.6238 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4370ms | 0.1974ms | 5.0657 KOps/s | 5.1955 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.3137ms | 1.1321ms | 883.3349 Ops/s | 915.5837 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5157ms | 0.4212ms | 2.3739 KOps/s | 2.4062 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.9394ms | 3.6982ms | 270.4040 Ops/s | 255.2251 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1170ms | 35.0286μs | 28.5481 KOps/s | 27.8832 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.1510ms | 50.0505μs | 19.9798 KOps/s | 20.4236 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 80.9720μs | 29.7142μs | 33.6539 KOps/s | 31.6485 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 90.1790μs | 30.2068μs | 33.1052 KOps/s | 33.9651 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 77.9160μs | 29.4951μs | 33.9039 KOps/s | 31.2885 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 70.9630μs | 29.8607μs | 33.4888 KOps/s | 33.9741 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1474ms | 74.3924μs | 13.4422 KOps/s | 13.1872 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5850ms | 28.0190μs | 35.6901 KOps/s | 36.2430 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1501ms | 69.0387μs | 14.4846 KOps/s | 14.0261 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 83.8770μs | 24.3036μs | 41.1461 KOps/s | 42.5667 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1449ms | 68.1576μs | 14.6719 KOps/s | 14.2486 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 77.3150μs | 24.1646μs | 41.3828 KOps/s | 42.4649 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1535ms | 74.1688μs | 13.4828 KOps/s | 13.1598 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.2897ms | 27.0832μs | 36.9233 KOps/s | 35.7284 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1329ms | 68.5750μs | 14.5826 KOps/s | 14.0574 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 72.5150μs | 24.0741μs | 41.5385 KOps/s | 42.9430 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1973ms | 70.0825μs | 14.2689 KOps/s | 14.2042 KOps/s | |
test_compile_indexing[int-pytree-eager] | 97.0920μs | 24.0859μs | 41.5181 KOps/s | 42.7735 KOps/s | |
test_mod_add[eager] | 0.1168ms | 25.1950μs | 39.6903 KOps/s | 36.4392 KOps/s | |
test_mod_add[compile] | 0.1045ms | 40.1812μs | 24.8873 KOps/s | 24.1569 KOps/s | |
test_mod_add[compile-overhead] | 0.1057ms | 39.7823μs | 25.1368 KOps/s | 24.8282 KOps/s | |
test_mod_wrap[eager] | 0.3119ms | 0.2066ms | 4.8398 KOps/s | 4.8533 KOps/s | |
test_mod_wrap[compile] | 0.3537ms | 0.2312ms | 4.3257 KOps/s | 4.2738 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3550ms | 0.2288ms | 4.3707 KOps/s | 4.3265 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.1286ms | 10.8329ms | 92.3117 Ops/s | 92.6425 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.8425ms | 11.2454ms | 88.9255 Ops/s | 83.6687 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.5350ms | 12.7258ms | 78.5804 Ops/s | 85.4875 Ops/s | |
test_seq_add[eager] | 0.1922ms | 88.3181μs | 11.3227 KOps/s | 10.4263 KOps/s | |
test_seq_add[compile] | 0.2237ms | 66.9642μs | 14.9333 KOps/s | 15.0302 KOps/s | |
test_seq_add[compile-overhead] | 0.1493ms | 64.7723μs | 15.4387 KOps/s | 15.1796 KOps/s | |
test_seq_wrap[eager] | 0.5945ms | 0.3704ms | 2.6997 KOps/s | 2.5406 KOps/s | |
test_seq_wrap[compile] | 1.2435ms | 0.2742ms | 3.6474 KOps/s | 3.6318 KOps/s | |
test_seq_wrap[compile-overhead] | 1.2234ms | 0.2688ms | 3.7198 KOps/s | 3.6196 KOps/s | |
test_func_call_runtime[False-eager] | 0.7695ms | 0.5187ms | 1.9279 KOps/s | 1.8993 KOps/s | |
test_func_call_runtime[False-compile] | 0.6710ms | 0.5075ms | 1.9704 KOps/s | 2.0137 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6366ms | 0.5026ms | 1.9898 KOps/s | 2.0167 KOps/s | |
test_func_call_runtime[True-eager] | 1.2276ms | 0.7289ms | 1.3719 KOps/s | 1.3458 KOps/s | |
test_func_call_runtime[True-compile] | 0.7101ms | 0.5185ms | 1.9285 KOps/s | 1.9634 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6614ms | 0.5160ms | 1.9379 KOps/s | 1.9700 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0140ms | 0.5250ms | 1.9046 KOps/s | 1.9231 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6242ms | 0.5064ms | 1.9748 KOps/s | 2.0186 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6590ms | 0.5067ms | 1.9735 KOps/s | 2.0247 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0123ms | 0.8600ms | 1.1628 KOps/s | 1.1594 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0015ms | 0.7374ms | 1.3561 KOps/s | 1.3693 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.5220ms | 0.7497ms | 1.3338 KOps/s | 1.3680 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.6595ms | 1.8630ms | 536.7731 Ops/s | 539.3825 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.9442ms | 1.9129ms | 522.7541 Ops/s | 517.4955 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 3.1697ms | 1.9148ms | 522.2461 Ops/s | 517.4568 Ops/s | |
test_distributed | 0.2666ms | 0.1284ms | 7.7874 KOps/s | 7.6841 KOps/s | |
test_tdmodule | 47.7590μs | 17.2314μs | 58.0336 KOps/s | 51.2797 KOps/s | |
test_tdmodule_dispatch | 81.8340μs | 34.4643μs | 29.0155 KOps/s | 25.9414 KOps/s | |
test_tdseq | 39.6250μs | 19.7698μs | 50.5822 KOps/s | 45.0715 KOps/s | |
test_tdseq_dispatch | 69.6310μs | 39.8608μs | 25.0873 KOps/s | 22.5395 KOps/s | |
test_instantiation_functorch | 2.7970ms | 1.5615ms | 640.3988 Ops/s | 626.0928 Ops/s | |
test_instantiation_td | 1.8306ms | 1.1638ms | 859.2528 Ops/s | 862.9671 Ops/s | |
test_exec_functorch | 0.3679ms | 0.1876ms | 5.3292 KOps/s | 5.3453 KOps/s | |
test_exec_functional_call | 0.4049ms | 0.1733ms | 5.7705 KOps/s | 5.5922 KOps/s | |
test_exec_td | 0.2612ms | 0.1661ms | 6.0187 KOps/s | 5.6806 KOps/s | |
test_exec_td_decorator | 1.1586ms | 0.2264ms | 4.4162 KOps/s | 4.3032 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.8772ms | 0.6418ms | 1.5580 KOps/s | 1.5108 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7519ms | 0.6309ms | 1.5850 KOps/s | 1.5371 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7575ms | 0.4944ms | 2.0226 KOps/s | 1.9880 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6885ms | 0.4908ms | 2.0373 KOps/s | 1.9881 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3675ms | 0.6136ms | 1.6296 KOps/s | 1.5905 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0647ms | 0.6210ms | 1.6103 KOps/s | 1.5895 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7666ms | 0.5075ms | 1.9705 KOps/s | 1.9467 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7737ms | 0.5070ms | 1.9723 KOps/s | 1.9430 KOps/s | |
test_to_module_speed[True] | 1.4427ms | 1.2921ms | 773.9208 Ops/s | 758.9256 Ops/s | |
test_to_module_speed[False] | 2.0324ms | 1.2777ms | 782.6337 Ops/s | 784.6731 Ops/s | |
test_tc_init | 75.1420μs | 42.4004μs | 23.5847 KOps/s | 22.2737 KOps/s | |
test_tc_init_nested | 0.1528ms | 83.0681μs | 12.0383 KOps/s | 10.9700 KOps/s | |
test_tc_first_layer_tensor | 47.7500μs | 1.5779μs | 633.7672 KOps/s | 662.6703 KOps/s | |
test_tc_first_layer_nontensor | 22.1920μs | 4.7386μs | 211.0312 KOps/s | 213.6784 KOps/s | |
test_tc_second_layer_tensor | 39.4250μs | 2.8887μs | 346.1820 KOps/s | 359.8134 KOps/s | |
test_tc_second_layer_nontensor | 30.6680μs | 6.1111μs | 163.6356 KOps/s | 164.9867 KOps/s | |
test_unbind | 0.4895s | 14.2859ms | 69.9990 Ops/s | 64.9331 Ops/s | |
test_full_like | 14.4053ms | 8.8629ms | 112.8301 Ops/s | 139.3967 Ops/s | |
test_zeros_like | 4.6895ms | 3.3727ms | 296.4979 Ops/s | 361.2277 Ops/s | |
test_ones_like | 8.2147ms | 3.4775ms | 287.5591 Ops/s | 310.3643 Ops/s | |
test_clone | 6.5593ms | 5.4929ms | 182.0528 Ops/s | 183.9498 Ops/s | |
test_squeeze | 0.1040ms | 13.0432μs | 76.6686 KOps/s | 76.2556 KOps/s | |
test_unsqueeze | 0.2003ms | 93.6518μs | 10.6778 KOps/s | 10.8544 KOps/s | |
test_split | 0.3619ms | 0.1988ms | 5.0293 KOps/s | 5.2740 KOps/s | |
test_permute | 0.3595ms | 0.2206ms | 4.5324 KOps/s | 4.6006 KOps/s | |
test_stack | 45.7423ms | 30.4320ms | 32.8602 Ops/s | 39.5462 Ops/s | |
test_cat | 30.8122ms | 26.0593ms | 38.3740 Ops/s | 39.8274 Ops/s |
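
The Change column is empty in the table above. As a rough sketch, the relative change between Ops and Ops on Repo HEAD can be recomputed from any row. The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling; the code assumes cells are pipe-separated as shown and that units are limited to Ops/s, KOps/s and MOps/s.

```python
# Multipliers for the throughput units that appear in the two Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '50.6391 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> float:
    """Percent difference of 'Ops' (this PR) relative to 'Ops on Repo HEAD'."""
    cells = [c.strip() for c in row.split("|")]
    ops, head = parse_ops(cells[3]), parse_ops(cells[4])
    return (ops / head - 1.0) * 100.0


# One row copied verbatim from the table above.
row = "test_plain_set_nested | 57.8090μs | 19.7476μs | 50.6391 KOps/s | 47.8349 KOps/s | |"
print(f"{relative_change(row):+.2f}%")
```

A positive value means the PR branch ran more operations per second than HEAD. Differences of a few percent on microbenchmarks like these may well be within run-to-run noise.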

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1176ms | 13.7790μs | 72.5744 KOps/s | 72.2996 KOps/s | |
test_plain_set_stack_nested | 43.0800μs | 13.8870μs | 72.0098 KOps/s | 71.7719 KOps/s | |
test_plain_set_nested_inplace | 49.8510μs | 14.7769μs | 67.6732 KOps/s | 66.5799 KOps/s | |
test_plain_set_stack_nested_inplace | 48.7200μs | 14.7000μs | 68.0273 KOps/s | 67.4084 KOps/s | |
test_items | 31.2100μs | 2.8648μs | 349.0703 KOps/s | 345.3667 KOps/s | |
test_items_nested | 0.3572ms | 0.3221ms | 3.1044 KOps/s | 3.0548 KOps/s | |
test_items_nested_locked | 0.3785ms | 0.3242ms | 3.0845 KOps/s | 3.0603 KOps/s | |
test_items_nested_leaf | 76.8710μs | 55.5622μs | 17.9978 KOps/s | 17.9342 KOps/s | |
test_items_stack_nested | 0.3582ms | 0.3259ms | 3.0681 KOps/s | 3.0776 KOps/s | |
test_items_stack_nested_leaf | 81.9310μs | 55.7591μs | 17.9343 KOps/s | 17.5254 KOps/s | |
test_items_stack_nested_locked | 0.3755ms | 0.3239ms | 3.0874 KOps/s | 3.0347 KOps/s | |
test_keys | 27.0900μs | 3.4275μs | 291.7587 KOps/s | 291.3376 KOps/s | |
test_keys_nested | 85.9910μs | 55.3590μs | 18.0639 KOps/s | 18.3757 KOps/s | |
test_keys_nested_locked | 0.8157ms | 61.7944μs | 16.1827 KOps/s | 16.1765 KOps/s | |
test_keys_nested_leaf | 77.0310μs | 47.1619μs | 21.2036 KOps/s | 21.6342 KOps/s | |
test_keys_stack_nested | 99.4610μs | 54.7361μs | 18.2695 KOps/s | 18.2189 KOps/s | |
test_keys_stack_nested_leaf | 82.0410μs | 47.1480μs | 21.2098 KOps/s | 20.8447 KOps/s | |
test_keys_stack_nested_locked | 93.1910μs | 61.1073μs | 16.3647 KOps/s | 16.2116 KOps/s | |
test_values | 5.1585μs | 0.8482μs | 1.1790 MOps/s | 1.1958 MOps/s | |
test_values_nested | 63.8810μs | 40.8805μs | 24.4615 KOps/s | 24.5684 KOps/s | |
test_values_nested_locked | 90.9210μs | 42.7700μs | 23.3809 KOps/s | 23.5395 KOps/s | |
test_values_nested_leaf | 66.4610μs | 35.4402μs | 28.2165 KOps/s | 28.4427 KOps/s | |
test_values_stack_nested | 75.2010μs | 40.8580μs | 24.4750 KOps/s | 23.9962 KOps/s | |
test_values_stack_nested_leaf | 66.9710μs | 35.7244μs | 27.9920 KOps/s | 27.5192 KOps/s | |
test_values_stack_nested_locked | 74.8210μs | 42.7722μs | 23.3797 KOps/s | 22.8676 KOps/s | |
test_membership | 1.7265μs | 0.5071μs | 1.9720 MOps/s | 1.9971 MOps/s | |
test_membership_nested | 17.8800μs | 1.7987μs | 555.9582 KOps/s | 545.2477 KOps/s | |
test_membership_nested_leaf | 16.5400μs | 1.8322μs | 545.7984 KOps/s | 553.3762 KOps/s | |
test_membership_stacked_nested | 24.3310μs | 1.8356μs | 544.7760 KOps/s | 529.8085 KOps/s | |
test_membership_stacked_nested_leaf | 31.6310μs | 1.8704μs | 534.6340 KOps/s | 533.3935 KOps/s | |
test_membership_nested_last | 25.8410μs | 2.7584μs | 362.5239 KOps/s | 366.0320 KOps/s | |
test_membership_nested_leaf_last | 32.3500μs | 2.7947μs | 357.8153 KOps/s | 362.3194 KOps/s | |
test_membership_stacked_nested_last | 23.5500μs | 2.7689μs | 361.1538 KOps/s | 367.5362 KOps/s | |
test_membership_stacked_nested_leaf_last | 19.4900μs | 2.7190μs | 367.7805 KOps/s | 366.8846 KOps/s | |
test_nested_getleaf | 34.1600μs | 6.0528μs | 165.2126 KOps/s | 164.3478 KOps/s | |
test_nested_get | 42.8410μs | 5.7056μs | 175.2660 KOps/s | 176.2142 KOps/s | |
test_stacked_getleaf | 31.3310μs | 6.1039μs | 163.8296 KOps/s | 164.3479 KOps/s | |
test_stacked_get | 31.7000μs | 5.7007μs | 175.4157 KOps/s | 179.3316 KOps/s | |
test_nested_getitemleaf | 29.7710μs | 6.0892μs | 164.2255 KOps/s | 166.0118 KOps/s | |
test_nested_getitem | 41.7400μs | 5.7600μs | 173.6122 KOps/s | 176.4821 KOps/s | |
test_stacked_getitemleaf | 28.7400μs | 6.0505μs | 165.2768 KOps/s | 166.2272 KOps/s | |
test_stacked_getitem | 33.4310μs | 5.7168μs | 174.9227 KOps/s | 176.4234 KOps/s | |
test_lock_nested | 7.7919ms | 0.4201ms | 2.3803 KOps/s | 2.3949 KOps/s | |
test_lock_stack_nested | 0.4727ms | 0.3782ms | 2.6441 KOps/s | 2.6406 KOps/s | |
test_unlock_nested | 0.7557ms | 0.3519ms | 2.8418 KOps/s | 2.8476 KOps/s | |
test_unlock_stack_nested | 0.3927ms | 0.3180ms | 3.1448 KOps/s | 3.1501 KOps/s | |
test_flatten_speed | 0.1452ms | 68.1000μs | 14.6843 KOps/s | 14.5190 KOps/s | |
test_unflatten_speed | 0.3343ms | 0.2795ms | 3.5773 KOps/s | 3.5736 KOps/s | |
test_common_ops | 1.5411ms | 1.1970ms | 835.4067 Ops/s | 821.4684 Ops/s | |
test_creation | 24.9300μs | 1.4361μs | 696.3242 KOps/s | 692.6138 KOps/s | |
test_creation_empty | 49.9200μs | 14.8824μs | 67.1935 KOps/s | 66.0165 KOps/s | |
test_creation_nested_1 | 41.4610μs | 16.8141μs | 59.4740 KOps/s | 59.3690 KOps/s | |
test_creation_nested_2 | 47.7300μs | 19.3268μs | 51.7417 KOps/s | 50.6401 KOps/s | |
test_clone | 59.7200μs | 27.4162μs | 36.4748 KOps/s | 35.6328 KOps/s | |
test_getitem[int] | 1.4211ms | 15.4877μs | 64.5672 KOps/s | 63.1719 KOps/s | |
test_getitem[slice_int] | 0.1176ms | 26.4725μs | 37.7750 KOps/s | 37.2154 KOps/s | |
test_getitem[range] | 0.2205ms | 0.1061ms | 9.4285 KOps/s | 9.3029 KOps/s | |
test_getitem[tuple] | 0.1174ms | 22.7077μs | 44.0379 KOps/s | 43.7303 KOps/s | |
test_getitem[list] | 0.1902ms | 95.3700μs | 10.4855 KOps/s | 10.5459 KOps/s | |
test_setitem_dim[int] | 64.6200μs | 42.3337μs | 23.6218 KOps/s | 23.2940 KOps/s | |
test_setitem_dim[slice_int] | 0.1098ms | 63.3431μs | 15.7870 KOps/s | 15.6827 KOps/s | |
test_setitem_dim[range] | 0.1494ms | 0.1218ms | 8.2094 KOps/s | 8.1837 KOps/s | |
test_setitem_dim[tuple] | 83.9210μs | 57.5588μs | 17.3735 KOps/s | 16.2956 KOps/s | |
test_setitem | 70.3110μs | 39.9180μs | 25.0513 KOps/s | 23.1617 KOps/s | |
test_set | 81.4910μs | 38.9541μs | 25.6712 KOps/s | 25.0018 KOps/s | |
test_set_shared | 0.3608ms | 48.8086μs | 20.4882 KOps/s | 20.2925 KOps/s | |
test_update | 0.2522ms | 47.1652μs | 21.2021 KOps/s | 20.6528 KOps/s | |
test_update_nested | 0.1165ms | 53.7761μs | 18.5956 KOps/s | 17.2912 KOps/s | |
test_update__nested | 98.2510μs | 56.3069μs | 17.7598 KOps/s | 16.2919 KOps/s | |
test_set_nested | 78.4410μs | 40.7538μs | 24.5376 KOps/s | 23.5386 KOps/s | |
test_set_nested_new | 94.7010μs | 44.8013μs | 22.3208 KOps/s | 21.6472 KOps/s | |
test_select | 0.1184ms | 57.4844μs | 17.3960 KOps/s | 15.9761 KOps/s | |
test_select_nested | 66.8010μs | 41.3196μs | 24.2016 KOps/s | 24.0380 KOps/s | |
test_exclude_nested | 82.4610μs | 57.7623μs | 17.3123 KOps/s | 17.5050 KOps/s | |
test_empty[True] | 0.2977ms | 0.2380ms | 4.2019 KOps/s | 4.1268 KOps/s | |
test_empty[False] | 2.9680μs | 0.7392μs | 1.3529 MOps/s | 1.3651 MOps/s | |
test_to | 47.6200μs | 24.2129μs | 41.3003 KOps/s | 40.2506 KOps/s | |
test_to_nonblocking | 58.8010μs | 23.8795μs | 41.8769 KOps/s | 41.2012 KOps/s | |
test_unbind_speed | 0.3265ms | 0.2732ms | 3.6608 KOps/s | 3.6524 KOps/s | |
test_unbind_speed_stack0 | 0.4426ms | 0.2708ms | 3.6929 KOps/s | 3.7284 KOps/s | |
test_unbind_speed_stack1 | 92.6469ms | 0.7065ms | 1.4154 KOps/s | 1.4380 KOps/s | |
test_split | 94.9279ms | 2.1279ms | 469.9515 Ops/s | 467.8443 Ops/s | |
test_chunk | 96.8783ms | 2.1358ms | 468.2064 Ops/s | 468.5588 Ops/s | |
test_creation[device0] | 0.3531ms | 0.1240ms | 8.0624 KOps/s | 8.0406 KOps/s | |
test_creation_from_tensor | 0.4203ms | 0.1265ms | 7.9076 KOps/s | 7.8665 KOps/s | |
test_add_one[memmap_tensor0] | 0.2879ms | 8.3054μs | 120.4033 KOps/s | 120.4991 KOps/s | |
test_contiguous[memmap_tensor0] | 32.9800μs | 2.1045μs | 475.1701 KOps/s | 482.0211 KOps/s | |
test_stack[memmap_tensor0] | 33.1700μs | 6.5790μs | 151.9991 KOps/s | 152.6706 KOps/s | |
test_memmaptd_index | 1.1003ms | 0.4137ms | 2.4172 KOps/s | 2.4585 KOps/s | |
test_memmaptd_index_astensor | 0.7328ms | 0.4710ms | 2.1231 KOps/s | 2.1556 KOps/s | |
test_memmaptd_index_op | 1.3586ms | 0.9866ms | 1.0136 KOps/s | 1.0107 KOps/s | |
test_serialize_model | 0.1309s | 0.1301s | 7.6862 Ops/s | 7.6817 Ops/s | |
test_serialize_model_pickle | 1.3509s | 1.2130s | 0.8244 Ops/s | 0.8245 Ops/s | |
test_serialize_weights | 0.2242s | 0.1428s | 7.0031 Ops/s | 7.6650 Ops/s | |
test_serialize_weights_returnearly | 0.2241s | 55.6103ms | 17.9823 Ops/s | 18.3614 Ops/s | |
test_serialize_weights_pickle | 1.3467s | 1.2161s | 0.8223 Ops/s | 0.8184 Ops/s | |
test_reshape_pytree | 82.7710μs | 34.5255μs | 28.9641 KOps/s | 28.9173 KOps/s | |
test_reshape_td | 74.6510μs | 41.2811μs | 24.2241 KOps/s | 24.8497 KOps/s | |
test_view_pytree | 61.8910μs | 34.8783μs | 28.6711 KOps/s | 28.8789 KOps/s | |
test_view_td | 86.0610μs | 47.5371μs | 21.0362 KOps/s | 21.7438 KOps/s | |
test_unbind_pytree | 74.5500μs | 33.6865μs | 29.6855 KOps/s | 29.8926 KOps/s | |
test_unbind_td | 0.4912ms | 42.1444μs | 23.7279 KOps/s | 23.5985 KOps/s | |
test_split_pytree | 81.1800μs | 44.6298μs | 22.4066 KOps/s | 21.7930 KOps/s | |
test_split_td | 0.6557ms | 55.2846μs | 18.0882 KOps/s | 15.7018 KOps/s | |
test_add_pytree | 0.1048ms | 54.4377μs | 18.3696 KOps/s | 18.7224 KOps/s | |
test_add_td | 0.1327ms | 87.7166μs | 11.4004 KOps/s | 11.4209 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4047ms | 0.2051ms | 4.8748 KOps/s | 4.9030 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2184ms | 0.1468ms | 6.8125 KOps/s | 6.6814 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1969ms | 0.1412ms | 7.0842 KOps/s | 7.1989 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2327ms | 0.1771ms | 5.6472 KOps/s | 5.6750 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 57.9310μs | 21.1983μs | 47.1735 KOps/s | 48.0951 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1744ms | 42.8284μs | 23.3490 KOps/s | 22.8150 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2173ms | 64.1601μs | 15.5860 KOps/s | 15.4849 KOps/s | |
test_compile_copy_nested[pytree-eager] | 87.9810μs | 48.9435μs | 20.4317 KOps/s | 20.3726 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4059ms | 0.3089ms | 3.2370 KOps/s | 3.2476 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2594ms | 0.2062ms | 4.8500 KOps/s | 4.8785 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1714ms | 0.1269ms | 7.8812 KOps/s | 8.0870 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1044ms | 61.3613μs | 16.2969 KOps/s | 17.0373 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3902ms | 0.3064ms | 3.2638 KOps/s | 3.2870 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7105ms | 0.6212ms | 1.6099 KOps/s | 1.6935 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3027ms | 0.2455ms | 4.0739 KOps/s | 4.0522 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3490ms | 0.3099ms | 3.2271 KOps/s | 3.2541 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1169ms | 71.8140μs | 13.9249 KOps/s | 14.4556 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1852ms | 0.1320ms | 7.5730 KOps/s | 8.0080 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6793ms | 0.5211ms | 1.9192 KOps/s | 1.9306 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3564ms | 0.3084ms | 3.2426 KOps/s | 3.2599 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1046ms | 18.7146μs | 53.4341 KOps/s | 55.6178 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1865ms | 26.1437μs | 38.2501 KOps/s | 35.9410 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1133ms | 70.1830μs | 14.2485 KOps/s | 14.1584 KOps/s | |
test_compile_copy_flat[pytree-eager] | 82.9410μs | 51.1855μs | 19.5368 KOps/s | 19.2930 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.2993ms | 0.8044ms | 1.2431 KOps/s | 1.1633 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.4213ms | 3.2001ms | 312.4867 Ops/s | 312.6084 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2182ms | 0.7893ms | 1.2669 KOps/s | 1.1679 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.1431ms | 3.0944ms | 323.1673 Ops/s | 323.2633 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1774ms | 0.1061ms | 9.4211 KOps/s | 9.5113 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1878ms | 56.9767μs | 17.5510 KOps/s | 16.7283 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1731ms | 0.1043ms | 9.5920 KOps/s | 10.0123 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 85.3410μs | 43.1984μs | 23.1490 KOps/s | 24.6009 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1623ms | 0.1056ms | 9.4678 KOps/s | 9.9411 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 89.4410μs | 42.9741μs | 23.2698 KOps/s | 24.4453 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1749ms | 0.1344ms | 7.4396 KOps/s | 7.5289 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1630ms | 24.6604μs | 40.5508 KOps/s | 40.6209 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1663ms | 0.1270ms | 7.8729 KOps/s | 7.9254 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 54.9100μs | 20.5070μs | 48.7639 KOps/s | 49.6705 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1761ms | 0.1337ms | 7.4799 KOps/s | 7.9184 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 56.7610μs | 20.1896μs | 49.5304 KOps/s | 49.4215 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2149ms | 0.1409ms | 7.0977 KOps/s | 7.5186 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5268ms | 24.7900μs | 40.3388 KOps/s | 40.9161 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1715ms | 0.1285ms | 7.7834 KOps/s | 7.6482 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 54.3810μs | 19.8653μs | 50.3390 KOps/s | 48.9264 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1802ms | 0.1282ms | 7.7977 KOps/s | 7.7242 KOps/s | |
test_compile_indexing[int-pytree-eager] | 62.3200μs | 20.4527μs | 48.8933 KOps/s | 48.7962 KOps/s | |
test_mod_add[eager] | 75.9710μs | 30.8925μs | 32.3704 KOps/s | 32.3568 KOps/s | |
test_mod_add[compile] | 0.1788ms | 70.4368μs | 14.1971 KOps/s | 14.2779 KOps/s | |
test_mod_add[compile-overhead] | 0.2696ms | 0.1344ms | 7.4385 KOps/s | 7.1644 KOps/s | |
test_mod_wrap[eager] | 0.3634ms | 0.2399ms | 4.1691 KOps/s | 4.2885 KOps/s | |
test_mod_wrap[compile] | 1.4441ms | 0.2900ms | 3.4478 KOps/s | 3.4338 KOps/s | |
test_mod_wrap[compile-overhead] | 7.6253ms | 4.1141ms | 243.0688 Ops/s | 246.5231 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7021ms | 1.2967ms | 771.1804 Ops/s | 721.8503 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5036ms | 1.2793ms | 781.6865 Ops/s | 712.8300 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3274ms | 0.8859ms | 1.1288 KOps/s | 1.0144 KOps/s | |
test_seq_add[eager] | 0.1540ms | 97.1850μs | 10.2897 KOps/s | 10.2427 KOps/s | |
test_seq_add[compile] | 0.1506ms | 81.8419μs | 12.2187 KOps/s | 12.1808 KOps/s | |
test_seq_add[compile-overhead] | 0.1879ms | 0.1161ms | 8.6167 KOps/s | 8.9553 KOps/s | |
test_seq_wrap[eager] | 0.4607ms | 0.3826ms | 2.6136 KOps/s | 2.6713 KOps/s | |
test_seq_wrap[compile] | 0.3695ms | 0.3119ms | 3.2064 KOps/s | 3.1023 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2670ms | 0.2153ms | 4.6452 KOps/s | 4.6130 KOps/s | |
test_func_call_runtime[False-eager] | 0.8561ms | 0.7395ms | 1.3522 KOps/s | 1.4000 KOps/s | |
test_func_call_runtime[False-compile] | 1.0381ms | 0.7553ms | 1.3240 KOps/s | 1.2827 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4159ms | 0.3499ms | 2.8579 KOps/s | 2.8591 KOps/s | |
test_func_call_runtime[True-eager] | 1.0051ms | 0.8683ms | 1.1517 KOps/s | 1.1420 KOps/s | |
test_func_call_runtime[True-compile] | 0.8974ms | 0.7825ms | 1.2779 KOps/s | 1.2638 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4390ms | 0.3728ms | 2.6824 KOps/s | 2.7001 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7994ms | 0.7247ms | 1.3799 KOps/s | 1.4163 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8580ms | 0.7660ms | 1.3055 KOps/s | 1.2961 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4900ms | 0.3521ms | 2.8404 KOps/s | 2.8583 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0536ms | 0.9582ms | 1.0436 KOps/s | 1.0262 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8610ms | 0.8100ms | 1.2346 KOps/s | 1.2165 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4840ms | 0.3942ms | 2.5368 KOps/s | 2.5365 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4322ms | 1.9708ms | 507.4013 Ops/s | 505.1941 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9930ms | 0.8197ms | 1.2199 KOps/s | 1.1744 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4469ms | 0.3989ms | 2.5066 KOps/s | 2.5181 KOps/s | |
test_distributed | 3.0114ms | 0.1804ms | 5.5434 KOps/s | 8.8804 KOps/s | |
test_tdmodule | 0.4129ms | 15.0639μs | 66.3840 KOps/s | 66.6909 KOps/s | |
test_tdmodule_dispatch | 56.0510μs | 28.2576μs | 35.3887 KOps/s | 34.3009 KOps/s | |
test_tdseq | 37.6310μs | 15.5780μs | 64.1932 KOps/s | 64.4216 KOps/s | |
test_tdseq_dispatch | 52.3700μs | 30.9920μs | 32.2664 KOps/s | 31.9886 KOps/s | |
test_instantiation_functorch | 1.9668ms | 1.8083ms | 552.9976 Ops/s | 550.2546 Ops/s | |
test_instantiation_td | 1.7809ms | 1.1649ms | 858.4295 Ops/s | 854.0178 Ops/s | |
test_exec_functorch | 0.3371ms | 0.2049ms | 4.8803 KOps/s | 4.8751 KOps/s | |
test_exec_functional_call | 0.2698ms | 0.2023ms | 4.9434 KOps/s | 4.8975 KOps/s | |
test_exec_td | 0.3091ms | 0.2060ms | 4.8551 KOps/s | 4.8361 KOps/s | |
test_exec_td_decorator | 1.1083ms | 0.2463ms | 4.0608 KOps/s | 3.9821 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7563ms | 0.6578ms | 1.5201 KOps/s | 1.4961 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7183ms | 0.6555ms | 1.5257 KOps/s | 1.5208 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7190ms | 0.5499ms | 1.8184 KOps/s | 1.8187 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6105ms | 0.5473ms | 1.8272 KOps/s | 1.8085 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3675ms | 0.6462ms | 1.5474 KOps/s | 1.5577 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7662ms | 0.6438ms | 1.5533 KOps/s | 1.5489 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6847ms | 0.5629ms | 1.7764 KOps/s | 1.7755 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6846ms | 0.5664ms | 1.7657 KOps/s | 1.7206 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.7600ms | 7.9734ms | 125.4171 Ops/s | 124.4053 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.1053ms | 7.9390ms | 125.9603 Ops/s | 124.7698 Ops/s | |
test_vmap_transformer_speed[False-True] | 7.9270ms | 7.7614ms | 128.8425 Ops/s | 126.7856 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.1094ms | 7.7860ms | 128.4351 Ops/s | 127.5118 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.5412ms | 18.6590ms | 53.5935 Ops/s | 53.8603 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.1354ms | 18.7188ms | 53.4224 Ops/s | 53.5788 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.6280ms | 18.5589ms | 53.8824 Ops/s | 54.1023 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.8194ms | 18.5835ms | 53.8113 Ops/s | 54.0899 Ops/s | |
test_to_module_speed[True] | 1.9749ms | 0.9313ms | 1.0738 KOps/s | 1.0691 KOps/s | |
test_to_module_speed[False] | 0.9841ms | 0.8997ms | 1.1115 KOps/s | 1.0860 KOps/s | |
test_tc_init | 61.0710μs | 34.5744μs | 28.9231 KOps/s | 29.3490 KOps/s | |
test_tc_init_nested | 0.1102ms | 71.6088μs | 13.9648 KOps/s | 14.8388 KOps/s | |
test_tc_first_layer_tensor | 16.0401μs | 0.6635μs | 1.5072 MOps/s | 1.5031 MOps/s | |
test_tc_first_layer_nontensor | 30.1900μs | 2.2020μs | 454.1286 KOps/s | 453.3373 KOps/s | |
test_tc_second_layer_tensor | 8.7575μs | 1.3379μs | 747.4442 KOps/s | 740.5515 KOps/s | |
test_tc_second_layer_nontensor | 33.1900μs | 2.8723μs | 348.1505 KOps/s | 345.4798 KOps/s | |
test_unbind | 0.1961s | 10.8551ms | 92.1223 Ops/s | 94.1153 Ops/s | |
test_full_like | 0.6404ms | 0.5762ms | 1.7356 KOps/s | 1.7382 KOps/s | |
test_zeros_like | 0.2620ms | 0.1980ms | 5.0495 KOps/s | 5.0518 KOps/s | |
test_ones_like | 0.2334ms | 0.1979ms | 5.0535 KOps/s | 5.0559 KOps/s | |
test_clone | 0.4411ms | 0.4145ms | 2.4126 KOps/s | 2.4189 KOps/s | |
test_squeeze | 41.3600μs | 9.7764μs | 102.2871 KOps/s | 100.8873 KOps/s | |
test_unsqueeze | 0.2320ms | 74.4689μs | 13.4284 KOps/s | 13.5519 KOps/s | |
test_split | 0.4442ms | 0.1571ms | 6.3662 KOps/s | 6.2998 KOps/s | |
test_permute | 0.2936ms | 0.1767ms | 5.6578 KOps/s | 5.7232 KOps/s | |
test_stack | 1.2537ms | 0.8564ms | 1.1676 KOps/s | 1.1573 KOps/s | |
test_cat | 1.2548ms | 1.2315ms | 812.0121 Ops/s | 811.6978 Ops/s |
vmoens added a commit that referenced this pull request, Oct 1, 2024: ghstack-source-id: e5ea6fef54f47304e1a6cafbd15f4bdade5e69b4 Pull Request resolved: #1016
vmoens added a commit that referenced this pull request, Oct 1, 2024: ghstack-source-id: 36379bbed4125713f00e115dcc66c14fa439c12f Pull Request resolved: #1016
vmoens added a commit that referenced this pull request, Oct 1, 2024: ghstack-source-id: 1d9bcff8e4f6e308d8f8e9fa06b3da4eca8905f1 Pull Request resolved: #1016
vmoens added a commit that referenced this pull request, Oct 1, 2024: ghstack-source-id: 36d694db1da278fb84f36419b1b978de817ca453 Pull Request resolved: #1016
vmoens added a commit that referenced this pull request, Oct 1, 2024: ghstack-source-id: 02c247e8fa01ba3d71ce61d48d141b7bafd064f5 Pull Request resolved: #1016
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Stack from ghstack (oldest at bottom):