[Feature] TensorDict.record_stream #1016

Merged Oct 1, 2024 (5 commits)

vmoens (Contributor) commented Oct 1, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: e5ea6fef54f47304e1a6cafbd15f4bdade5e69b4
Pull Request resolved: #1016
facebook-github-bot added the CLA Signed label Oct 1, 2024

github-actions bot commented Oct 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}33$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 57.8090μs 19.7476μs 50.6391 KOps/s 47.8349 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_plain_set_stack_nested 50.3040μs 19.8080μs 50.4847 KOps/s 46.9167 KOps/s $\textbf{\color{#35bf28}+7.61\%}$
test_plain_set_nested_inplace 82.6070μs 21.5445μs 46.4156 KOps/s 43.5681 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_plain_set_stack_nested_inplace 72.1650μs 21.0420μs 47.5239 KOps/s 44.2032 KOps/s $\textbf{\color{#35bf28}+7.51\%}$
test_items 22.1820μs 4.2157μs 237.2103 KOps/s 246.1842 KOps/s $\color{#d91a1a}-3.65\%$
test_items_nested 0.5955ms 0.3732ms 2.6796 KOps/s 2.7582 KOps/s $\color{#d91a1a}-2.85\%$
test_items_nested_locked 0.7623ms 0.3733ms 2.6790 KOps/s 2.7979 KOps/s $\color{#d91a1a}-4.25\%$
test_items_nested_leaf 0.1215ms 68.1096μs 14.6822 KOps/s 14.5594 KOps/s $\color{#35bf28}+0.84\%$
test_items_stack_nested 0.7047ms 0.3814ms 2.6221 KOps/s 2.7649 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_items_stack_nested_leaf 0.1342ms 71.0043μs 14.0836 KOps/s 14.0226 KOps/s $\color{#35bf28}+0.44\%$
test_items_stack_nested_locked 0.4623ms 0.3762ms 2.6579 KOps/s 2.7305 KOps/s $\color{#d91a1a}-2.66\%$
test_keys 42.7200μs 3.6244μs 275.9080 KOps/s 282.9106 KOps/s $\color{#d91a1a}-2.48\%$
test_keys_nested 0.1830ms 98.9406μs 10.1071 KOps/s 10.0450 KOps/s $\color{#35bf28}+0.62\%$
test_keys_nested_locked 0.6957ms 0.1047ms 9.5533 KOps/s 9.5186 KOps/s $\color{#35bf28}+0.36\%$
test_keys_nested_leaf 0.1413ms 82.0120μs 12.1933 KOps/s 12.1522 KOps/s $\color{#35bf28}+0.34\%$
test_keys_stack_nested 0.1654ms 99.0379μs 10.0971 KOps/s 10.0016 KOps/s $\color{#35bf28}+0.96\%$
test_keys_stack_nested_leaf 0.1494ms 80.7987μs 12.3764 KOps/s 12.1422 KOps/s $\color{#35bf28}+1.93\%$
test_keys_stack_nested_locked 0.1895ms 0.1038ms 9.6312 KOps/s 9.2383 KOps/s $\color{#35bf28}+4.25\%$
test_values 6.6904μs 1.0570μs 946.1029 KOps/s 947.0718 KOps/s $\color{#d91a1a}-0.10\%$
test_values_nested 0.1382ms 74.0404μs 13.5061 KOps/s 13.2217 KOps/s $\color{#35bf28}+2.15\%$
test_values_nested_locked 0.1350ms 74.9287μs 13.3460 KOps/s 13.2011 KOps/s $\color{#35bf28}+1.10\%$
test_values_nested_leaf 0.1169ms 62.0084μs 16.1268 KOps/s 16.0685 KOps/s $\color{#35bf28}+0.36\%$
test_values_stack_nested 0.1247ms 76.1907μs 13.1250 KOps/s 12.8613 KOps/s $\color{#35bf28}+2.05\%$
test_values_stack_nested_leaf 0.1209ms 61.6146μs 16.2299 KOps/s 15.9895 KOps/s $\color{#35bf28}+1.50\%$
test_values_stack_nested_locked 0.1467ms 76.1302μs 13.1354 KOps/s 12.6032 KOps/s $\color{#35bf28}+4.22\%$
test_membership 3.3433μs 0.7420μs 1.3477 MOps/s 1.1358 MOps/s $\textbf{\color{#35bf28}+18.66\%}$
test_membership_nested 43.9320μs 2.8555μs 350.2073 KOps/s 362.1335 KOps/s $\color{#d91a1a}-3.29\%$
test_membership_nested_leaf 35.1850μs 2.8450μs 351.4903 KOps/s 361.6725 KOps/s $\color{#d91a1a}-2.82\%$
test_membership_stacked_nested 47.6390μs 2.8039μs 356.6459 KOps/s 362.0152 KOps/s $\color{#d91a1a}-1.48\%$
test_membership_stacked_nested_leaf 29.0940μs 2.8472μs 351.2283 KOps/s 363.8387 KOps/s $\color{#d91a1a}-3.47\%$
test_membership_nested_last 47.7700μs 4.1523μs 240.8303 KOps/s 242.3871 KOps/s $\color{#d91a1a}-0.64\%$
test_membership_nested_leaf_last 35.5960μs 4.1715μs 239.7215 KOps/s 248.4352 KOps/s $\color{#d91a1a}-3.51\%$
test_membership_stacked_nested_last 25.6080μs 4.1242μs 242.4686 KOps/s 250.0185 KOps/s $\color{#d91a1a}-3.02\%$
test_membership_stacked_nested_leaf_last 28.2430μs 4.1463μs 241.1782 KOps/s 249.7452 KOps/s $\color{#d91a1a}-3.43\%$
test_nested_getleaf 51.0660μs 10.5551μs 94.7413 KOps/s 93.2374 KOps/s $\color{#35bf28}+1.61\%$
test_nested_get 49.7830μs 10.1341μs 98.6770 KOps/s 96.9506 KOps/s $\color{#35bf28}+1.78\%$
test_stacked_getleaf 32.8010μs 10.6812μs 93.6223 KOps/s 91.3217 KOps/s $\color{#35bf28}+2.52\%$
test_stacked_get 56.7160μs 10.0707μs 99.2975 KOps/s 99.0581 KOps/s $\color{#35bf28}+0.24\%$
test_nested_getitemleaf 41.0970μs 11.1192μs 89.9349 KOps/s 89.8861 KOps/s $\color{#35bf28}+0.05\%$
test_nested_getitem 56.5450μs 10.2781μs 97.2945 KOps/s 95.9438 KOps/s $\color{#35bf28}+1.41\%$
test_stacked_getitemleaf 53.9810μs 11.2260μs 89.0792 KOps/s 91.0294 KOps/s $\color{#d91a1a}-2.14\%$
test_stacked_getitem 60.9140μs 10.2335μs 97.7187 KOps/s 96.0904 KOps/s $\color{#35bf28}+1.69\%$
test_lock_nested 83.1198ms 0.5697ms 1.7552 KOps/s 2.0065 KOps/s $\textbf{\color{#d91a1a}-12.53\%}$
test_lock_stack_nested 0.9177ms 0.4557ms 2.1944 KOps/s 2.1726 KOps/s $\color{#35bf28}+1.01\%$
test_unlock_nested 90.1293ms 0.4983ms 2.0068 KOps/s 2.3905 KOps/s $\textbf{\color{#d91a1a}-16.05\%}$
test_unlock_stack_nested 0.8186ms 0.3880ms 2.5771 KOps/s 2.6399 KOps/s $\color{#d91a1a}-2.38\%$
test_flatten_speed 0.1610ms 87.6151μs 11.4136 KOps/s 11.4397 KOps/s $\color{#d91a1a}-0.23\%$
test_unflatten_speed 0.6144ms 0.4700ms 2.1276 KOps/s 2.1206 KOps/s $\color{#35bf28}+0.33\%$
test_common_ops 4.2299ms 1.1089ms 901.8237 Ops/s 868.5351 Ops/s $\color{#35bf28}+3.83\%$
test_creation 31.0690μs 2.0809μs 480.5581 KOps/s 486.9312 KOps/s $\color{#d91a1a}-1.31\%$
test_creation_empty 53.9510μs 16.3422μs 61.1914 KOps/s 51.5454 KOps/s $\textbf{\color{#35bf28}+18.71\%}$
test_creation_nested_1 67.7460μs 19.4653μs 51.3735 KOps/s 43.9063 KOps/s $\textbf{\color{#35bf28}+17.01\%}$
test_creation_nested_2 63.5990μs 23.8593μs 41.9123 KOps/s 37.2786 KOps/s $\textbf{\color{#35bf28}+12.43\%}$
test_clone 0.1867ms 17.5382μs 57.0185 KOps/s 58.0304 KOps/s $\color{#d91a1a}-1.74\%$
test_getitem[int] 1.1681ms 16.7077μs 59.8525 KOps/s 58.2499 KOps/s $\color{#35bf28}+2.75\%$
test_getitem[slice_int] 0.1441ms 30.7387μs 32.5323 KOps/s 31.9232 KOps/s $\color{#35bf28}+1.91\%$
test_getitem[range] 0.8185ms 59.7807μs 16.7278 KOps/s 17.1368 KOps/s $\color{#d91a1a}-2.39\%$
test_getitem[tuple] 0.1525ms 24.7450μs 40.4122 KOps/s 39.2698 KOps/s $\color{#35bf28}+2.91\%$
test_getitem[list] 0.1650ms 54.2236μs 18.4422 KOps/s 18.5126 KOps/s $\color{#d91a1a}-0.38\%$
test_setitem_dim[int] 62.8370μs 33.4983μs 29.8522 KOps/s 29.8697 KOps/s $\color{#d91a1a}-0.06\%$
test_setitem_dim[slice_int] 0.1122ms 62.4561μs 16.0112 KOps/s 16.5342 KOps/s $\color{#d91a1a}-3.16\%$
test_setitem_dim[range] 0.1449ms 84.7425μs 11.8005 KOps/s 11.8681 KOps/s $\color{#d91a1a}-0.57\%$
test_setitem_dim[tuple] 98.3840μs 50.1206μs 19.9519 KOps/s 20.1242 KOps/s $\color{#d91a1a}-0.86\%$
test_setitem 73.6870μs 28.9742μs 34.5134 KOps/s 32.0359 KOps/s $\textbf{\color{#35bf28}+7.73\%}$
test_set 0.2518ms 28.1426μs 35.5333 KOps/s 32.6689 KOps/s $\textbf{\color{#35bf28}+8.77\%}$
test_set_shared 6.1110ms 0.2168ms 4.6128 KOps/s 4.6008 KOps/s $\color{#35bf28}+0.26\%$
test_update 0.2757ms 34.6408μs 28.8677 KOps/s 25.8322 KOps/s $\textbf{\color{#35bf28}+11.75\%}$
test_update_nested 1.8965ms 45.6021μs 21.9288 KOps/s 20.3513 KOps/s $\textbf{\color{#35bf28}+7.75\%}$
test_update__nested 81.2820μs 35.2984μs 28.3299 KOps/s 27.1728 KOps/s $\color{#35bf28}+4.26\%$
test_set_nested 0.3550ms 31.2292μs 32.0213 KOps/s 30.5810 KOps/s $\color{#35bf28}+4.71\%$
test_set_nested_new 0.1934ms 35.9136μs 27.8446 KOps/s 26.0412 KOps/s $\textbf{\color{#35bf28}+6.92\%}$
test_select 0.2517ms 52.9539μs 18.8843 KOps/s 18.0132 KOps/s $\color{#35bf28}+4.84\%$
test_select_nested 0.1288ms 59.7454μs 16.7377 KOps/s 16.5404 KOps/s $\color{#35bf28}+1.19\%$
test_exclude_nested 0.1751ms 75.0117μs 13.3313 KOps/s 13.2631 KOps/s $\color{#35bf28}+0.51\%$
test_empty[True] 0.6608ms 0.3152ms 3.1724 KOps/s 3.1370 KOps/s $\color{#35bf28}+1.13\%$
test_empty[False] 22.9478μs 1.1958μs 836.2627 KOps/s 832.1309 KOps/s $\color{#35bf28}+0.50\%$
test_unbind_speed 0.5216ms 0.2999ms 3.3343 KOps/s 3.2468 KOps/s $\color{#35bf28}+2.70\%$
test_unbind_speed_stack0 0.4290ms 0.3004ms 3.3290 KOps/s 3.3582 KOps/s $\color{#d91a1a}-0.87\%$
test_unbind_speed_stack1 98.1837ms 0.8248ms 1.2125 KOps/s 1.4532 KOps/s $\textbf{\color{#d91a1a}-16.56\%}$
test_split 92.9165ms 2.1574ms 463.5170 Ops/s 452.7703 Ops/s $\color{#35bf28}+2.37\%$
test_chunk 3.2902ms 1.9859ms 503.5394 Ops/s 450.5945 Ops/s $\textbf{\color{#35bf28}+11.75\%}$
test_creation[device0] 3.8242ms 0.1203ms 8.3107 KOps/s 8.4631 KOps/s $\color{#d91a1a}-1.80\%$
test_creation_from_tensor 0.2467ms 0.1174ms 8.5153 KOps/s 8.4477 KOps/s $\color{#35bf28}+0.80\%$
test_add_one[memmap_tensor0] 0.2535ms 7.1436μs 139.9857 KOps/s 136.1869 KOps/s $\color{#35bf28}+2.79\%$
test_contiguous[memmap_tensor0] 20.5080μs 1.9772μs 505.7571 KOps/s 536.1249 KOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_stack[memmap_tensor0] 0.1000ms 5.7674μs 173.3884 KOps/s 177.8316 KOps/s $\color{#d91a1a}-2.50\%$
test_memmaptd_index 1.1715ms 0.4004ms 2.4976 KOps/s 2.5303 KOps/s $\color{#d91a1a}-1.29\%$
test_memmaptd_index_astensor 0.7345ms 0.4766ms 2.0981 KOps/s 2.0901 KOps/s $\color{#35bf28}+0.38\%$
test_memmaptd_index_op 93.6502ms 1.0890ms 918.2857 Ops/s 955.2303 Ops/s $\color{#d91a1a}-3.87\%$
test_serialize_model 0.1297s 0.1213s 8.2442 Ops/s 8.1407 Ops/s $\color{#35bf28}+1.27\%$
test_serialize_model_pickle 0.5012s 0.4089s 2.4455 Ops/s 2.5316 Ops/s $\color{#d91a1a}-3.40\%$
test_serialize_weights 0.1232s 0.1162s 8.6023 Ops/s 7.4643 Ops/s $\textbf{\color{#35bf28}+15.25\%}$
test_serialize_weights_returnearly 0.2744s 0.1749s 5.7186 Ops/s 6.1440 Ops/s $\textbf{\color{#d91a1a}-6.92\%}$
test_serialize_weights_pickle 0.5774s 0.4299s 2.3263 Ops/s 2.4863 Ops/s $\textbf{\color{#d91a1a}-6.44\%}$
test_serialize_weights_filesystem 0.1438s 0.1413s 7.0763 Ops/s 6.7427 Ops/s $\color{#35bf28}+4.95\%$
test_serialize_model_filesystem 0.1598s 0.1479s 6.7603 Ops/s 6.0689 Ops/s $\textbf{\color{#35bf28}+11.39\%}$
test_reshape_pytree 79.1080μs 40.3629μs 24.7752 KOps/s 25.7797 KOps/s $\color{#d91a1a}-3.90\%$
test_reshape_td 0.1354ms 47.2044μs 21.1845 KOps/s 20.4412 KOps/s $\color{#35bf28}+3.64\%$
test_view_pytree 87.8940μs 39.3747μs 25.3970 KOps/s 25.6677 KOps/s $\color{#d91a1a}-1.05\%$
test_view_td 0.1459ms 52.6707μs 18.9859 KOps/s 18.2688 KOps/s $\color{#35bf28}+3.93\%$
test_unbind_pytree 76.5030μs 36.1711μs 27.6464 KOps/s 27.6723 KOps/s $\color{#d91a1a}-0.09\%$
test_unbind_td 0.3160ms 43.5398μs 22.9675 KOps/s 21.3990 KOps/s $\textbf{\color{#35bf28}+7.33\%}$
test_split_pytree 0.1039ms 38.0172μs 26.3039 KOps/s 26.1244 KOps/s $\color{#35bf28}+0.69\%$
test_split_td 0.5229ms 57.6761μs 17.3382 KOps/s 17.4134 KOps/s $\color{#d91a1a}-0.43\%$
test_add_pytree 0.1175ms 45.7848μs 21.8413 KOps/s 22.1389 KOps/s $\color{#d91a1a}-1.34\%$
test_add_td 0.1589ms 78.9896μs 12.6599 KOps/s 11.7198 KOps/s $\textbf{\color{#35bf28}+8.02\%}$
test_compile_add_one_nested[tensordict-compile] 0.1384ms 59.6206μs 16.7727 KOps/s 17.5171 KOps/s $\color{#d91a1a}-4.25\%$
test_compile_add_one_nested[tensordict-eager] 0.4267ms 0.1796ms 5.5693 KOps/s 5.6105 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_add_one_nested[pytree-compile] 0.1351ms 58.3802μs 17.1291 KOps/s 17.4221 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_add_one_nested[pytree-eager] 0.3381ms 0.1436ms 6.9629 KOps/s 7.0508 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_copy_nested[tensordict-compile] 72.4050μs 21.6438μs 46.2026 KOps/s 46.3840 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_copy_nested[tensordict-eager] 0.1430ms 69.1630μs 14.4586 KOps/s 15.0714 KOps/s $\color{#d91a1a}-4.07\%$
test_compile_copy_nested[pytree-compile] 0.1471ms 77.5081μs 12.9019 KOps/s 13.0819 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_copy_nested[pytree-eager] 0.1841ms 70.6892μs 14.1464 KOps/s 14.4154 KOps/s $\color{#d91a1a}-1.87\%$
test_compile_add_one_flat[tensordict-compile] 0.2638ms 0.1793ms 5.5767 KOps/s 5.7635 KOps/s $\color{#d91a1a}-3.24\%$
test_compile_add_one_flat[tensordict-eager] 0.4261ms 0.1905ms 5.2493 KOps/s 5.1849 KOps/s $\color{#35bf28}+1.24\%$
test_compile_add_one_flat[tensorclass-compile] 0.1230ms 48.6074μs 20.5730 KOps/s 20.6239 KOps/s $\color{#d91a1a}-0.25\%$
test_compile_add_one_flat[tensorclass-eager] 0.1726ms 69.8220μs 14.3221 KOps/s 14.6700 KOps/s $\color{#d91a1a}-2.37\%$
test_compile_add_one_flat[pytree-compile] 0.4593ms 0.1831ms 5.4613 KOps/s 5.7282 KOps/s $\color{#d91a1a}-4.66\%$
test_compile_add_one_flat[pytree-eager] 0.7172ms 0.2945ms 3.3954 KOps/s 3.4733 KOps/s $\color{#d91a1a}-2.24\%$
test_compile_add_self_flat[tensordict-eager] 0.4203ms 0.2028ms 4.9317 KOps/s 4.9163 KOps/s $\color{#35bf28}+0.31\%$
test_compile_add_self_flat[tensordict-compile] 0.6256ms 0.1813ms 5.5143 KOps/s 5.7970 KOps/s $\color{#d91a1a}-4.88\%$
test_compile_add_self_flat[tensorclass-eager] 0.2228ms 62.8211μs 15.9182 KOps/s 15.9772 KOps/s $\color{#d91a1a}-0.37\%$
test_compile_add_self_flat[tensorclass-compile] 0.1173ms 48.9850μs 20.4144 KOps/s 21.2364 KOps/s $\color{#d91a1a}-3.87\%$
test_compile_add_self_flat[pytree-eager] 0.3750ms 0.2376ms 4.2094 KOps/s 4.2457 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_add_self_flat[pytree-compile] 0.2931ms 0.1804ms 5.5419 KOps/s 5.6779 KOps/s $\color{#d91a1a}-2.39\%$
test_compile_copy_flat[tensordict-compile] 0.2016ms 0.1078ms 9.2790 KOps/s 9.6687 KOps/s $\color{#d91a1a}-4.03\%$
test_compile_copy_flat[tensordict-eager] 0.1348ms 60.0907μs 16.6415 KOps/s 17.1801 KOps/s $\color{#d91a1a}-3.13\%$
test_compile_copy_flat[pytree-compile] 0.1473ms 78.0182μs 12.8175 KOps/s 12.5326 KOps/s $\color{#35bf28}+2.27\%$
test_compile_copy_flat[pytree-eager] 0.1266ms 69.8018μs 14.3263 KOps/s 14.1830 KOps/s $\color{#35bf28}+1.01\%$
test_compile_assign_and_add[tensordict-compile] 0.3854ms 0.1987ms 5.0335 KOps/s 5.1441 KOps/s $\color{#d91a1a}-2.15\%$
test_compile_assign_and_add[tensordict-eager] 1.8350ms 1.6865ms 592.9445 Ops/s 599.6238 Ops/s $\color{#d91a1a}-1.11\%$
test_compile_assign_and_add[pytree-compile] 0.4370ms 0.1974ms 5.0657 KOps/s 5.1955 KOps/s $\color{#d91a1a}-2.50\%$
test_compile_assign_and_add[pytree-eager] 1.3137ms 1.1321ms 883.3349 Ops/s 915.5837 Ops/s $\color{#d91a1a}-3.52\%$
test_compile_assign_and_add_stack[compile] 0.5157ms 0.4212ms 2.3739 KOps/s 2.4062 KOps/s $\color{#d91a1a}-1.34\%$
test_compile_assign_and_add_stack[eager] 3.9394ms 3.6982ms 270.4040 Ops/s 255.2251 Ops/s $\textbf{\color{#35bf28}+5.95\%}$
test_compile_indexing[tensor-tensordict-compile] 0.1170ms 35.0286μs 28.5481 KOps/s 27.8832 KOps/s $\color{#35bf28}+2.38\%$
test_compile_indexing[tensor-tensordict-eager] 1.1510ms 50.0505μs 19.9798 KOps/s 20.4236 KOps/s $\color{#d91a1a}-2.17\%$
test_compile_indexing[tensor-tensorclass-compile] 80.9720μs 29.7142μs 33.6539 KOps/s 31.6485 KOps/s $\textbf{\color{#35bf28}+6.34\%}$
test_compile_indexing[tensor-tensorclass-eager] 90.1790μs 30.2068μs 33.1052 KOps/s 33.9651 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_indexing[tensor-pytree-compile] 77.9160μs 29.4951μs 33.9039 KOps/s 31.2885 KOps/s $\textbf{\color{#35bf28}+8.36\%}$
test_compile_indexing[tensor-pytree-eager] 70.9630μs 29.8607μs 33.4888 KOps/s 33.9741 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_indexing[slice-tensordict-compile] 0.1474ms 74.3924μs 13.4422 KOps/s 13.1872 KOps/s $\color{#35bf28}+1.93\%$
test_compile_indexing[slice-tensordict-eager] 0.5850ms 28.0190μs 35.6901 KOps/s 36.2430 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_indexing[slice-tensorclass-compile] 0.1501ms 69.0387μs 14.4846 KOps/s 14.0261 KOps/s $\color{#35bf28}+3.27\%$
test_compile_indexing[slice-tensorclass-eager] 83.8770μs 24.3036μs 41.1461 KOps/s 42.5667 KOps/s $\color{#d91a1a}-3.34\%$
test_compile_indexing[slice-pytree-compile] 0.1449ms 68.1576μs 14.6719 KOps/s 14.2486 KOps/s $\color{#35bf28}+2.97\%$
test_compile_indexing[slice-pytree-eager] 77.3150μs 24.1646μs 41.3828 KOps/s 42.4649 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_indexing[int-tensordict-compile] 0.1535ms 74.1688μs 13.4828 KOps/s 13.1598 KOps/s $\color{#35bf28}+2.45\%$
test_compile_indexing[int-tensordict-eager] 1.2897ms 27.0832μs 36.9233 KOps/s 35.7284 KOps/s $\color{#35bf28}+3.34\%$
test_compile_indexing[int-tensorclass-compile] 0.1329ms 68.5750μs 14.5826 KOps/s 14.0574 KOps/s $\color{#35bf28}+3.74\%$
test_compile_indexing[int-tensorclass-eager] 72.5150μs 24.0741μs 41.5385 KOps/s 42.9430 KOps/s $\color{#d91a1a}-3.27\%$
test_compile_indexing[int-pytree-compile] 0.1973ms 70.0825μs 14.2689 KOps/s 14.2042 KOps/s $\color{#35bf28}+0.46\%$
test_compile_indexing[int-pytree-eager] 97.0920μs 24.0859μs 41.5181 KOps/s 42.7735 KOps/s $\color{#d91a1a}-2.93\%$
test_mod_add[eager] 0.1168ms 25.1950μs 39.6903 KOps/s 36.4392 KOps/s $\textbf{\color{#35bf28}+8.92\%}$
test_mod_add[compile] 0.1045ms 40.1812μs 24.8873 KOps/s 24.1569 KOps/s $\color{#35bf28}+3.02\%$
test_mod_add[compile-overhead] 0.1057ms 39.7823μs 25.1368 KOps/s 24.8282 KOps/s $\color{#35bf28}+1.24\%$
test_mod_wrap[eager] 0.3119ms 0.2066ms 4.8398 KOps/s 4.8533 KOps/s $\color{#d91a1a}-0.28\%$
test_mod_wrap[compile] 0.3537ms 0.2312ms 4.3257 KOps/s 4.2738 KOps/s $\color{#35bf28}+1.21\%$
test_mod_wrap[compile-overhead] 0.3550ms 0.2288ms 4.3707 KOps/s 4.3265 KOps/s $\color{#35bf28}+1.02\%$
test_mod_wrap_and_backward[eager] 12.1286ms 10.8329ms 92.3117 Ops/s 92.6425 Ops/s $\color{#d91a1a}-0.36\%$
test_mod_wrap_and_backward[compile] 13.8425ms 11.2454ms 88.9255 Ops/s 83.6687 Ops/s $\textbf{\color{#35bf28}+6.28\%}$
test_mod_wrap_and_backward[compile-overhead] 17.5350ms 12.7258ms 78.5804 Ops/s 85.4875 Ops/s $\textbf{\color{#d91a1a}-8.08\%}$
test_seq_add[eager] 0.1922ms 88.3181μs 11.3227 KOps/s 10.4263 KOps/s $\textbf{\color{#35bf28}+8.60\%}$
test_seq_add[compile] 0.2237ms 66.9642μs 14.9333 KOps/s 15.0302 KOps/s $\color{#d91a1a}-0.64\%$
test_seq_add[compile-overhead] 0.1493ms 64.7723μs 15.4387 KOps/s 15.1796 KOps/s $\color{#35bf28}+1.71\%$
test_seq_wrap[eager] 0.5945ms 0.3704ms 2.6997 KOps/s 2.5406 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_seq_wrap[compile] 1.2435ms 0.2742ms 3.6474 KOps/s 3.6318 KOps/s $\color{#35bf28}+0.43\%$
test_seq_wrap[compile-overhead] 1.2234ms 0.2688ms 3.7198 KOps/s 3.6196 KOps/s $\color{#35bf28}+2.77\%$
test_func_call_runtime[False-eager] 0.7695ms 0.5187ms 1.9279 KOps/s 1.8993 KOps/s $\color{#35bf28}+1.51\%$
test_func_call_runtime[False-compile] 0.6710ms 0.5075ms 1.9704 KOps/s 2.0137 KOps/s $\color{#d91a1a}-2.15\%$
test_func_call_runtime[False-compile-overhead] 0.6366ms 0.5026ms 1.9898 KOps/s 2.0167 KOps/s $\color{#d91a1a}-1.34\%$
test_func_call_runtime[True-eager] 1.2276ms 0.7289ms 1.3719 KOps/s 1.3458 KOps/s $\color{#35bf28}+1.94\%$
test_func_call_runtime[True-compile] 0.7101ms 0.5185ms 1.9285 KOps/s 1.9634 KOps/s $\color{#d91a1a}-1.77\%$
test_func_call_runtime[True-compile-overhead] 0.6614ms 0.5160ms 1.9379 KOps/s 1.9700 KOps/s $\color{#d91a1a}-1.63\%$
test_func_call_cm_runtime[False-eager] 1.0140ms 0.5250ms 1.9046 KOps/s 1.9231 KOps/s $\color{#d91a1a}-0.96\%$
test_func_call_cm_runtime[False-compile] 0.6242ms 0.5064ms 1.9748 KOps/s 2.0186 KOps/s $\color{#d91a1a}-2.17\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6590ms 0.5067ms 1.9735 KOps/s 2.0247 KOps/s $\color{#d91a1a}-2.53\%$
test_func_call_cm_runtime[True-eager] 1.0123ms 0.8600ms 1.1628 KOps/s 1.1594 KOps/s $\color{#35bf28}+0.29\%$
test_func_call_cm_runtime[True-compile] 1.0015ms 0.7374ms 1.3561 KOps/s 1.3693 KOps/s $\color{#d91a1a}-0.97\%$
test_func_call_cm_runtime[True-compile-overhead] 1.5220ms 0.7497ms 1.3338 KOps/s 1.3680 KOps/s $\color{#d91a1a}-2.50\%$
test_vmap_func_call_cm_runtime[eager] 2.6595ms 1.8630ms 536.7731 Ops/s 539.3825 Ops/s $\color{#d91a1a}-0.48\%$
test_vmap_func_call_cm_runtime[compile] 2.9442ms 1.9129ms 522.7541 Ops/s 517.4955 Ops/s $\color{#35bf28}+1.02\%$
test_vmap_func_call_cm_runtime[compile-overhead] 3.1697ms 1.9148ms 522.2461 Ops/s 517.4568 Ops/s $\color{#35bf28}+0.93\%$
test_distributed 0.2666ms 0.1284ms 7.7874 KOps/s 7.6841 KOps/s $\color{#35bf28}+1.34\%$
test_tdmodule 47.7590μs 17.2314μs 58.0336 KOps/s 51.2797 KOps/s $\textbf{\color{#35bf28}+13.17\%}$
test_tdmodule_dispatch 81.8340μs 34.4643μs 29.0155 KOps/s 25.9414 KOps/s $\textbf{\color{#35bf28}+11.85\%}$
test_tdseq 39.6250μs 19.7698μs 50.5822 KOps/s 45.0715 KOps/s $\textbf{\color{#35bf28}+12.23\%}$
test_tdseq_dispatch 69.6310μs 39.8608μs 25.0873 KOps/s 22.5395 KOps/s $\textbf{\color{#35bf28}+11.30\%}$
test_instantiation_functorch 2.7970ms 1.5615ms 640.3988 Ops/s 626.0928 Ops/s $\color{#35bf28}+2.28\%$
test_instantiation_td 1.8306ms 1.1638ms 859.2528 Ops/s 862.9671 Ops/s $\color{#d91a1a}-0.43\%$
test_exec_functorch 0.3679ms 0.1876ms 5.3292 KOps/s 5.3453 KOps/s $\color{#d91a1a}-0.30\%$
test_exec_functional_call 0.4049ms 0.1733ms 5.7705 KOps/s 5.5922 KOps/s $\color{#35bf28}+3.19\%$
test_exec_td 0.2612ms 0.1661ms 6.0187 KOps/s 5.6806 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_exec_td_decorator 1.1586ms 0.2264ms 4.4162 KOps/s 4.3032 KOps/s $\color{#35bf28}+2.63\%$
test_vmap_mlp_speed[True-True] 0.8772ms 0.6418ms 1.5580 KOps/s 1.5108 KOps/s $\color{#35bf28}+3.12\%$
test_vmap_mlp_speed[True-False] 0.7519ms 0.6309ms 1.5850 KOps/s 1.5371 KOps/s $\color{#35bf28}+3.11\%$
test_vmap_mlp_speed[False-True] 0.7575ms 0.4944ms 2.0226 KOps/s 1.9880 KOps/s $\color{#35bf28}+1.74\%$
test_vmap_mlp_speed[False-False] 0.6885ms 0.4908ms 2.0373 KOps/s 1.9881 KOps/s $\color{#35bf28}+2.48\%$
test_vmap_mlp_speed_decorator[True-True] 1.3675ms 0.6136ms 1.6296 KOps/s 1.5905 KOps/s $\color{#35bf28}+2.46\%$
test_vmap_mlp_speed_decorator[True-False] 1.0647ms 0.6210ms 1.6103 KOps/s 1.5895 KOps/s $\color{#35bf28}+1.31\%$
test_vmap_mlp_speed_decorator[False-True] 0.7666ms 0.5075ms 1.9705 KOps/s 1.9467 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed_decorator[False-False] 0.7737ms 0.5070ms 1.9723 KOps/s 1.9430 KOps/s $\color{#35bf28}+1.51\%$
test_to_module_speed[True] 1.4427ms 1.2921ms 773.9208 Ops/s 758.9256 Ops/s $\color{#35bf28}+1.98\%$
test_to_module_speed[False] 2.0324ms 1.2777ms 782.6337 Ops/s 784.6731 Ops/s $\color{#d91a1a}-0.26\%$
test_tc_init 75.1420μs 42.4004μs 23.5847 KOps/s 22.2737 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_tc_init_nested 0.1528ms 83.0681μs 12.0383 KOps/s 10.9700 KOps/s $\textbf{\color{#35bf28}+9.74\%}$
test_tc_first_layer_tensor 47.7500μs 1.5779μs 633.7672 KOps/s 662.6703 KOps/s $\color{#d91a1a}-4.36\%$
test_tc_first_layer_nontensor 22.1920μs 4.7386μs 211.0312 KOps/s 213.6784 KOps/s $\color{#d91a1a}-1.24\%$
test_tc_second_layer_tensor 39.4250μs 2.8887μs 346.1820 KOps/s 359.8134 KOps/s $\color{#d91a1a}-3.79\%$
test_tc_second_layer_nontensor 30.6680μs 6.1111μs 163.6356 KOps/s 164.9867 KOps/s $\color{#d91a1a}-0.82\%$
test_unbind 0.4895s 14.2859ms 69.9990 Ops/s 64.9331 Ops/s $\textbf{\color{#35bf28}+7.80\%}$
test_full_like 14.4053ms 8.8629ms 112.8301 Ops/s 139.3967 Ops/s $\textbf{\color{#d91a1a}-19.06\%}$
test_zeros_like 4.6895ms 3.3727ms 296.4979 Ops/s 361.2277 Ops/s $\textbf{\color{#d91a1a}-17.92\%}$
test_ones_like 8.2147ms 3.4775ms 287.5591 Ops/s 310.3643 Ops/s $\textbf{\color{#d91a1a}-7.35\%}$
test_clone 6.5593ms 5.4929ms 182.0528 Ops/s 183.9498 Ops/s $\color{#d91a1a}-1.03\%$
test_squeeze 0.1040ms 13.0432μs 76.6686 KOps/s 76.2556 KOps/s $\color{#35bf28}+0.54\%$
test_unsqueeze 0.2003ms 93.6518μs 10.6778 KOps/s 10.8544 KOps/s $\color{#d91a1a}-1.63\%$
test_split 0.3619ms 0.1988ms 5.0293 KOps/s 5.2740 KOps/s $\color{#d91a1a}-4.64\%$
test_permute 0.3595ms 0.2206ms 4.5324 KOps/s 4.6006 KOps/s $\color{#d91a1a}-1.48\%$
test_stack 45.7423ms 30.4320ms 32.8602 Ops/s 39.5462 Ops/s $\textbf{\color{#d91a1a}-16.91\%}$
test_cat 30.8122ms 26.0593ms 38.3740 Ops/s 39.8274 Ops/s $\color{#d91a1a}-3.65\%$
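
The Change column in the table above appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The benchmark bot's actual formula is not shown on this page, so the sketch below is an assumption, checked against two rows of the table. The function name `percent_change` is hypothetical, and because the published values are rounded, some rows may differ in the last digit.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change versus repo HEAD, in percent.

    Assumed formula: (ops - ops_head) / ops_head * 100.
    Both arguments must use the same unit (Ops/s, KOps/s or MOps/s).
    """
    if ops_head <= 0:
        raise ValueError("ops_head must be positive")
    return (ops - ops_head) / ops_head * 100.0


# Values copied from two rows of the CPU benchmark table.
rows = {
    "test_plain_set_nested": (50.6391, 47.8349),  # KOps/s
    "test_membership": (1.3477, 1.1358),  # MOps/s
}

changes = {name: percent_change(ops, head) for name, (ops, head) in rows.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

A positive value means higher throughput than repo HEAD, which matches the green entries in the table; a negative value matches the red ones.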


github-actions bot commented Oct 1, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1176ms 13.7790μs 72.5744 KOps/s 72.2996 KOps/s $\color{#35bf28}+0.38\%$
test_plain_set_stack_nested 43.0800μs 13.8870μs 72.0098 KOps/s 71.7719 KOps/s $\color{#35bf28}+0.33\%$
test_plain_set_nested_inplace 49.8510μs 14.7769μs 67.6732 KOps/s 66.5799 KOps/s $\color{#35bf28}+1.64\%$
test_plain_set_stack_nested_inplace 48.7200μs 14.7000μs 68.0273 KOps/s 67.4084 KOps/s $\color{#35bf28}+0.92\%$
test_items 31.2100μs 2.8648μs 349.0703 KOps/s 345.3667 KOps/s $\color{#35bf28}+1.07\%$
test_items_nested 0.3572ms 0.3221ms 3.1044 KOps/s 3.0548 KOps/s $\color{#35bf28}+1.62\%$
test_items_nested_locked 0.3785ms 0.3242ms 3.0845 KOps/s 3.0603 KOps/s $\color{#35bf28}+0.79\%$
test_items_nested_leaf 76.8710μs 55.5622μs 17.9978 KOps/s 17.9342 KOps/s $\color{#35bf28}+0.35\%$
test_items_stack_nested 0.3582ms 0.3259ms 3.0681 KOps/s 3.0776 KOps/s $\color{#d91a1a}-0.31\%$
test_items_stack_nested_leaf 81.9310μs 55.7591μs 17.9343 KOps/s 17.5254 KOps/s $\color{#35bf28}+2.33\%$
test_items_stack_nested_locked 0.3755ms 0.3239ms 3.0874 KOps/s 3.0347 KOps/s $\color{#35bf28}+1.74\%$
test_keys 27.0900μs 3.4275μs 291.7587 KOps/s 291.3376 KOps/s $\color{#35bf28}+0.14\%$
test_keys_nested 85.9910μs 55.3590μs 18.0639 KOps/s 18.3757 KOps/s $\color{#d91a1a}-1.70\%$
test_keys_nested_locked 0.8157ms 61.7944μs 16.1827 KOps/s 16.1765 KOps/s $\color{#35bf28}+0.04\%$
test_keys_nested_leaf 77.0310μs 47.1619μs 21.2036 KOps/s 21.6342 KOps/s $\color{#d91a1a}-1.99\%$
test_keys_stack_nested 99.4610μs 54.7361μs 18.2695 KOps/s 18.2189 KOps/s $\color{#35bf28}+0.28\%$
test_keys_stack_nested_leaf 82.0410μs 47.1480μs 21.2098 KOps/s 20.8447 KOps/s $\color{#35bf28}+1.75\%$
test_keys_stack_nested_locked 93.1910μs 61.1073μs 16.3647 KOps/s 16.2116 KOps/s $\color{#35bf28}+0.94\%$
test_values 5.1585μs 0.8482μs 1.1790 MOps/s 1.1958 MOps/s $\color{#d91a1a}-1.40\%$
test_values_nested 63.8810μs 40.8805μs 24.4615 KOps/s 24.5684 KOps/s $\color{#d91a1a}-0.43\%$
test_values_nested_locked 90.9210μs 42.7700μs 23.3809 KOps/s 23.5395 KOps/s $\color{#d91a1a}-0.67\%$
test_values_nested_leaf 66.4610μs 35.4402μs 28.2165 KOps/s 28.4427 KOps/s $\color{#d91a1a}-0.80\%$
test_values_stack_nested 75.2010μs 40.8580μs 24.4750 KOps/s 23.9962 KOps/s $\color{#35bf28}+2.00\%$
test_values_stack_nested_leaf 66.9710μs 35.7244μs 27.9920 KOps/s 27.5192 KOps/s $\color{#35bf28}+1.72\%$
test_values_stack_nested_locked 74.8210μs 42.7722μs 23.3797 KOps/s 22.8676 KOps/s $\color{#35bf28}+2.24\%$
test_membership 1.7265μs 0.5071μs 1.9720 MOps/s 1.9971 MOps/s $\color{#d91a1a}-1.25\%$
test_membership_nested 17.8800μs 1.7987μs 555.9582 KOps/s 545.2477 KOps/s $\color{#35bf28}+1.96\%$
test_membership_nested_leaf 16.5400μs 1.8322μs 545.7984 KOps/s 553.3762 KOps/s $\color{#d91a1a}-1.37\%$
test_membership_stacked_nested 24.3310μs 1.8356μs 544.7760 KOps/s 529.8085 KOps/s $\color{#35bf28}+2.83\%$
test_membership_stacked_nested_leaf 31.6310μs 1.8704μs 534.6340 KOps/s 533.3935 KOps/s $\color{#35bf28}+0.23\%$
test_membership_nested_last 25.8410μs 2.7584μs 362.5239 KOps/s 366.0320 KOps/s $\color{#d91a1a}-0.96\%$
test_membership_nested_leaf_last 32.3500μs 2.7947μs 357.8153 KOps/s 362.3194 KOps/s $\color{#d91a1a}-1.24\%$
test_membership_stacked_nested_last 23.5500μs 2.7689μs 361.1538 KOps/s 367.5362 KOps/s $\color{#d91a1a}-1.74\%$
test_membership_stacked_nested_leaf_last 19.4900μs 2.7190μs 367.7805 KOps/s 366.8846 KOps/s $\color{#35bf28}+0.24\%$
test_nested_getleaf 34.1600μs 6.0528μs 165.2126 KOps/s 164.3478 KOps/s $\color{#35bf28}+0.53\%$
test_nested_get 42.8410μs 5.7056μs 175.2660 KOps/s 176.2142 KOps/s $\color{#d91a1a}-0.54\%$
test_stacked_getleaf 31.3310μs 6.1039μs 163.8296 KOps/s 164.3479 KOps/s $\color{#d91a1a}-0.32\%$
test_stacked_get 31.7000μs 5.7007μs 175.4157 KOps/s 179.3316 KOps/s $\color{#d91a1a}-2.18\%$
test_nested_getitemleaf 29.7710μs 6.0892μs 164.2255 KOps/s 166.0118 KOps/s $\color{#d91a1a}-1.08\%$
test_nested_getitem 41.7400μs 5.7600μs 173.6122 KOps/s 176.4821 KOps/s $\color{#d91a1a}-1.63\%$
test_stacked_getitemleaf 28.7400μs 6.0505μs 165.2768 KOps/s 166.2272 KOps/s $\color{#d91a1a}-0.57\%$
test_stacked_getitem 33.4310μs 5.7168μs 174.9227 KOps/s 176.4234 KOps/s $\color{#d91a1a}-0.85\%$
test_lock_nested 7.7919ms 0.4201ms 2.3803 KOps/s 2.3949 KOps/s $\color{#d91a1a}-0.61\%$
test_lock_stack_nested 0.4727ms 0.3782ms 2.6441 KOps/s 2.6406 KOps/s $\color{#35bf28}+0.13\%$
test_unlock_nested 0.7557ms 0.3519ms 2.8418 KOps/s 2.8476 KOps/s $\color{#d91a1a}-0.21\%$
test_unlock_stack_nested 0.3927ms 0.3180ms 3.1448 KOps/s 3.1501 KOps/s $\color{#d91a1a}-0.17\%$
test_flatten_speed 0.1452ms 68.1000μs 14.6843 KOps/s 14.5190 KOps/s $\color{#35bf28}+1.14\%$
test_unflatten_speed 0.3343ms 0.2795ms 3.5773 KOps/s 3.5736 KOps/s $\color{#35bf28}+0.11\%$
test_common_ops 1.5411ms 1.1970ms 835.4067 Ops/s 821.4684 Ops/s $\color{#35bf28}+1.70\%$
test_creation 24.9300μs 1.4361μs 696.3242 KOps/s 692.6138 KOps/s $\color{#35bf28}+0.54\%$
test_creation_empty 49.9200μs 14.8824μs 67.1935 KOps/s 66.0165 KOps/s $\color{#35bf28}+1.78\%$
test_creation_nested_1 41.4610μs 16.8141μs 59.4740 KOps/s 59.3690 KOps/s $\color{#35bf28}+0.18\%$
test_creation_nested_2 47.7300μs 19.3268μs 51.7417 KOps/s 50.6401 KOps/s $\color{#35bf28}+2.18\%$
test_clone 59.7200μs 27.4162μs 36.4748 KOps/s 35.6328 KOps/s $\color{#35bf28}+2.36\%$
test_getitem[int] 1.4211ms 15.4877μs 64.5672 KOps/s 63.1719 KOps/s $\color{#35bf28}+2.21\%$
test_getitem[slice_int] 0.1176ms 26.4725μs 37.7750 KOps/s 37.2154 KOps/s $\color{#35bf28}+1.50\%$
test_getitem[range] 0.2205ms 0.1061ms 9.4285 KOps/s 9.3029 KOps/s $\color{#35bf28}+1.35\%$
test_getitem[tuple] 0.1174ms 22.7077μs 44.0379 KOps/s 43.7303 KOps/s $\color{#35bf28}+0.70\%$
test_getitem[list] 0.1902ms 95.3700μs 10.4855 KOps/s 10.5459 KOps/s $\color{#d91a1a}-0.57\%$
test_setitem_dim[int] 64.6200μs 42.3337μs 23.6218 KOps/s 23.2940 KOps/s $\color{#35bf28}+1.41\%$
test_setitem_dim[slice_int] 0.1098ms 63.3431μs 15.7870 KOps/s 15.6827 KOps/s $\color{#35bf28}+0.67\%$
test_setitem_dim[range] 0.1494ms 0.1218ms 8.2094 KOps/s 8.1837 KOps/s $\color{#35bf28}+0.31\%$
test_setitem_dim[tuple] 83.9210μs 57.5588μs 17.3735 KOps/s 16.2956 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_setitem 70.3110μs 39.9180μs 25.0513 KOps/s 23.1617 KOps/s $\textbf{\color{#35bf28}+8.16\%}$
test_set 81.4910μs 38.9541μs 25.6712 KOps/s 25.0018 KOps/s $\color{#35bf28}+2.68\%$
test_set_shared 0.3608ms 48.8086μs 20.4882 KOps/s 20.2925 KOps/s $\color{#35bf28}+0.96\%$
test_update 0.2522ms 47.1652μs 21.2021 KOps/s 20.6528 KOps/s $\color{#35bf28}+2.66\%$
test_update_nested 0.1165ms 53.7761μs 18.5956 KOps/s 17.2912 KOps/s $\textbf{\color{#35bf28}+7.54\%}$
test_update__nested 98.2510μs 56.3069μs 17.7598 KOps/s 16.2919 KOps/s $\textbf{\color{#35bf28}+9.01\%}$
test_set_nested 78.4410μs 40.7538μs 24.5376 KOps/s 23.5386 KOps/s $\color{#35bf28}+4.24\%$
test_set_nested_new 94.7010μs 44.8013μs 22.3208 KOps/s 21.6472 KOps/s $\color{#35bf28}+3.11\%$
test_select 0.1184ms 57.4844μs 17.3960 KOps/s 15.9761 KOps/s $\textbf{\color{#35bf28}+8.89\%}$
test_select_nested 66.8010μs 41.3196μs 24.2016 KOps/s 24.0380 KOps/s $\color{#35bf28}+0.68\%$
test_exclude_nested 82.4610μs 57.7623μs 17.3123 KOps/s 17.5050 KOps/s $\color{#d91a1a}-1.10\%$
test_empty[True] 0.2977ms 0.2380ms 4.2019 KOps/s 4.1268 KOps/s $\color{#35bf28}+1.82\%$
test_empty[False] 2.9680μs 0.7392μs 1.3529 MOps/s 1.3651 MOps/s $\color{#d91a1a}-0.89\%$
test_to 47.6200μs 24.2129μs 41.3003 KOps/s 40.2506 KOps/s $\color{#35bf28}+2.61\%$
test_to_nonblocking 58.8010μs 23.8795μs 41.8769 KOps/s 41.2012 KOps/s $\color{#35bf28}+1.64\%$
test_unbind_speed 0.3265ms 0.2732ms 3.6608 KOps/s 3.6524 KOps/s $\color{#35bf28}+0.23\%$
test_unbind_speed_stack0 0.4426ms 0.2708ms 3.6929 KOps/s 3.7284 KOps/s $\color{#d91a1a}-0.95\%$
test_unbind_speed_stack1 92.6469ms 0.7065ms 1.4154 KOps/s 1.4380 KOps/s $\color{#d91a1a}-1.57\%$
test_split 94.9279ms 2.1279ms 469.9515 Ops/s 467.8443 Ops/s $\color{#35bf28}+0.45\%$
test_chunk 96.8783ms 2.1358ms 468.2064 Ops/s 468.5588 Ops/s $\color{#d91a1a}-0.08\%$
test_creation[device0] 0.3531ms 0.1240ms 8.0624 KOps/s 8.0406 KOps/s $\color{#35bf28}+0.27\%$
test_creation_from_tensor 0.4203ms 0.1265ms 7.9076 KOps/s 7.8665 KOps/s $\color{#35bf28}+0.52\%$
test_add_one[memmap_tensor0] 0.2879ms 8.3054μs 120.4033 KOps/s 120.4991 KOps/s $\color{#d91a1a}-0.08\%$
test_contiguous[memmap_tensor0] 32.9800μs 2.1045μs 475.1701 KOps/s 482.0211 KOps/s $\color{#d91a1a}-1.42\%$
test_stack[memmap_tensor0] 33.1700μs 6.5790μs 151.9991 KOps/s 152.6706 KOps/s $\color{#d91a1a}-0.44\%$
test_memmaptd_index 1.1003ms 0.4137ms 2.4172 KOps/s 2.4585 KOps/s $\color{#d91a1a}-1.68\%$
test_memmaptd_index_astensor 0.7328ms 0.4710ms 2.1231 KOps/s 2.1556 KOps/s $\color{#d91a1a}-1.51\%$
test_memmaptd_index_op 1.3586ms 0.9866ms 1.0136 KOps/s 1.0107 KOps/s $\color{#35bf28}+0.29\%$
test_serialize_model 0.1309s 0.1301s 7.6862 Ops/s 7.6817 Ops/s $\color{#35bf28}+0.06\%$
test_serialize_model_pickle 1.3509s 1.2130s 0.8244 Ops/s 0.8245 Ops/s $\color{#d91a1a}-0.01\%$
test_serialize_weights 0.2242s 0.1428s 7.0031 Ops/s 7.6650 Ops/s $\textbf{\color{#d91a1a}-8.63\%}$
test_serialize_weights_returnearly 0.2241s 55.6103ms 17.9823 Ops/s 18.3614 Ops/s $\color{#d91a1a}-2.06\%$
test_serialize_weights_pickle 1.3467s 1.2161s 0.8223 Ops/s 0.8184 Ops/s $\color{#35bf28}+0.48\%$
test_reshape_pytree 82.7710μs 34.5255μs 28.9641 KOps/s 28.9173 KOps/s $\color{#35bf28}+0.16\%$
test_reshape_td 74.6510μs 41.2811μs 24.2241 KOps/s 24.8497 KOps/s $\color{#d91a1a}-2.52\%$
test_view_pytree 61.8910μs 34.8783μs 28.6711 KOps/s 28.8789 KOps/s $\color{#d91a1a}-0.72\%$
test_view_td 86.0610μs 47.5371μs 21.0362 KOps/s 21.7438 KOps/s $\color{#d91a1a}-3.25\%$
test_unbind_pytree 74.5500μs 33.6865μs 29.6855 KOps/s 29.8926 KOps/s $\color{#d91a1a}-0.69\%$
test_unbind_td 0.4912ms 42.1444μs 23.7279 KOps/s 23.5985 KOps/s $\color{#35bf28}+0.55\%$
test_split_pytree 81.1800μs 44.6298μs 22.4066 KOps/s 21.7930 KOps/s $\color{#35bf28}+2.82\%$
test_split_td 0.6557ms 55.2846μs 18.0882 KOps/s 15.7018 KOps/s $\textbf{\color{#35bf28}+15.20\%}$
test_add_pytree 0.1048ms 54.4377μs 18.3696 KOps/s 18.7224 KOps/s $\color{#d91a1a}-1.88\%$
test_add_td 0.1327ms 87.7166μs 11.4004 KOps/s 11.4209 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_add_one_nested[tensordict-compile] 0.4047ms 0.2051ms 4.8748 KOps/s 4.9030 KOps/s $\color{#d91a1a}-0.58\%$
test_compile_add_one_nested[tensordict-eager] 0.2184ms 0.1468ms 6.8125 KOps/s 6.6814 KOps/s $\color{#35bf28}+1.96\%$
test_compile_add_one_nested[pytree-compile] 0.1969ms 0.1412ms 7.0842 KOps/s 7.1989 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_add_one_nested[pytree-eager] 0.2327ms 0.1771ms 5.6472 KOps/s 5.6750 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_copy_nested[tensordict-compile] 57.9310μs 21.1983μs 47.1735 KOps/s 48.0951 KOps/s $\color{#d91a1a}-1.92\%$
test_compile_copy_nested[tensordict-eager] 0.1744ms 42.8284μs 23.3490 KOps/s 22.8150 KOps/s $\color{#35bf28}+2.34\%$
test_compile_copy_nested[pytree-compile] 0.2173ms 64.1601μs 15.5860 KOps/s 15.4849 KOps/s $\color{#35bf28}+0.65\%$
test_compile_copy_nested[pytree-eager] 87.9810μs 48.9435μs 20.4317 KOps/s 20.3726 KOps/s $\color{#35bf28}+0.29\%$
test_compile_add_one_flat[tensordict-compile] 0.4059ms 0.3089ms 3.2370 KOps/s 3.2476 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_add_one_flat[tensordict-eager] 0.2594ms 0.2062ms 4.8500 KOps/s 4.8785 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_one_flat[tensorclass-compile] 0.1714ms 0.1269ms 7.8812 KOps/s 8.0870 KOps/s $\color{#d91a1a}-2.54\%$
test_compile_add_one_flat[tensorclass-eager] 0.1044ms 61.3613μs 16.2969 KOps/s 17.0373 KOps/s $\color{#d91a1a}-4.35\%$
test_compile_add_one_flat[pytree-compile] 0.3902ms 0.3064ms 3.2638 KOps/s 3.2870 KOps/s $\color{#d91a1a}-0.71\%$
test_compile_add_one_flat[pytree-eager] 0.7105ms 0.6212ms 1.6099 KOps/s 1.6935 KOps/s $\color{#d91a1a}-4.94\%$
test_compile_add_self_flat[tensordict-eager] 0.3027ms 0.2455ms 4.0739 KOps/s 4.0522 KOps/s $\color{#35bf28}+0.54\%$
test_compile_add_self_flat[tensordict-compile] 0.3490ms 0.3099ms 3.2271 KOps/s 3.2541 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_add_self_flat[tensorclass-eager] 0.1169ms 71.8140μs 13.9249 KOps/s 14.4556 KOps/s $\color{#d91a1a}-3.67\%$
test_compile_add_self_flat[tensorclass-compile] 0.1852ms 0.1320ms 7.5730 KOps/s 8.0080 KOps/s $\textbf{\color{#d91a1a}-5.43\%}$
test_compile_add_self_flat[pytree-eager] 0.6793ms 0.5211ms 1.9192 KOps/s 1.9306 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_self_flat[pytree-compile] 0.3564ms 0.3084ms 3.2426 KOps/s 3.2599 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_copy_flat[tensordict-compile] 0.1046ms 18.7146μs 53.4341 KOps/s 55.6178 KOps/s $\color{#d91a1a}-3.93\%$
test_compile_copy_flat[tensordict-eager] 0.1865ms 26.1437μs 38.2501 KOps/s 35.9410 KOps/s $\textbf{\color{#35bf28}+6.42\%}$
test_compile_copy_flat[pytree-compile] 0.1133ms 70.1830μs 14.2485 KOps/s 14.1584 KOps/s $\color{#35bf28}+0.64\%$
test_compile_copy_flat[pytree-eager] 82.9410μs 51.1855μs 19.5368 KOps/s 19.2930 KOps/s $\color{#35bf28}+1.26\%$
test_compile_assign_and_add[tensordict-compile] 2.2993ms 0.8044ms 1.2431 KOps/s 1.1633 KOps/s $\textbf{\color{#35bf28}+6.86\%}$
test_compile_assign_and_add[tensordict-eager] 3.4213ms 3.2001ms 312.4867 Ops/s 312.6084 Ops/s $\color{#d91a1a}-0.04\%$
test_compile_assign_and_add[pytree-compile] 2.2182ms 0.7893ms 1.2669 KOps/s 1.1679 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_compile_assign_and_add[pytree-eager] 3.1431ms 3.0944ms 323.1673 Ops/s 323.2633 Ops/s $\color{#d91a1a}-0.03\%$
test_compile_indexing[tensor-tensordict-compile] 0.1774ms 0.1061ms 9.4211 KOps/s 9.5113 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_indexing[tensor-tensordict-eager] 0.1878ms 56.9767μs 17.5510 KOps/s 16.7283 KOps/s $\color{#35bf28}+4.92\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1731ms 0.1043ms 9.5920 KOps/s 10.0123 KOps/s $\color{#d91a1a}-4.20\%$
test_compile_indexing[tensor-tensorclass-eager] 85.3410μs 43.1984μs 23.1490 KOps/s 24.6009 KOps/s $\textbf{\color{#d91a1a}-5.90\%}$
test_compile_indexing[tensor-pytree-compile] 0.1623ms 0.1056ms 9.4678 KOps/s 9.9411 KOps/s $\color{#d91a1a}-4.76\%$
test_compile_indexing[tensor-pytree-eager] 89.4410μs 42.9741μs 23.2698 KOps/s 24.4453 KOps/s $\color{#d91a1a}-4.81\%$
test_compile_indexing[slice-tensordict-compile] 0.1749ms 0.1344ms 7.4396 KOps/s 7.5289 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_indexing[slice-tensordict-eager] 0.1630ms 24.6604μs 40.5508 KOps/s 40.6209 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_indexing[slice-tensorclass-compile] 0.1663ms 0.1270ms 7.8729 KOps/s 7.9254 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_indexing[slice-tensorclass-eager] 54.9100μs 20.5070μs 48.7639 KOps/s 49.6705 KOps/s $\color{#d91a1a}-1.83\%$
test_compile_indexing[slice-pytree-compile] 0.1761ms 0.1337ms 7.4799 KOps/s 7.9184 KOps/s $\textbf{\color{#d91a1a}-5.54\%}$
test_compile_indexing[slice-pytree-eager] 56.7610μs 20.1896μs 49.5304 KOps/s 49.4215 KOps/s $\color{#35bf28}+0.22\%$
test_compile_indexing[int-tensordict-compile] 0.2149ms 0.1409ms 7.0977 KOps/s 7.5186 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_compile_indexing[int-tensordict-eager] 0.5268ms 24.7900μs 40.3388 KOps/s 40.9161 KOps/s $\color{#d91a1a}-1.41\%$
test_compile_indexing[int-tensorclass-compile] 0.1715ms 0.1285ms 7.7834 KOps/s 7.6482 KOps/s $\color{#35bf28}+1.77\%$
test_compile_indexing[int-tensorclass-eager] 54.3810μs 19.8653μs 50.3390 KOps/s 48.9264 KOps/s $\color{#35bf28}+2.89\%$
test_compile_indexing[int-pytree-compile] 0.1802ms 0.1282ms 7.7977 KOps/s 7.7242 KOps/s $\color{#35bf28}+0.95\%$
test_compile_indexing[int-pytree-eager] 62.3200μs 20.4527μs 48.8933 KOps/s 48.7962 KOps/s $\color{#35bf28}+0.20\%$
test_mod_add[eager] 75.9710μs 30.8925μs 32.3704 KOps/s 32.3568 KOps/s $\color{#35bf28}+0.04\%$
test_mod_add[compile] 0.1788ms 70.4368μs 14.1971 KOps/s 14.2779 KOps/s $\color{#d91a1a}-0.57\%$
test_mod_add[compile-overhead] 0.2696ms 0.1344ms 7.4385 KOps/s 7.1644 KOps/s $\color{#35bf28}+3.83\%$
test_mod_wrap[eager] 0.3634ms 0.2399ms 4.1691 KOps/s 4.2885 KOps/s $\color{#d91a1a}-2.78\%$
test_mod_wrap[compile] 1.4441ms 0.2900ms 3.4478 KOps/s 3.4338 KOps/s $\color{#35bf28}+0.41\%$
test_mod_wrap[compile-overhead] 7.6253ms 4.1141ms 243.0688 Ops/s 246.5231 Ops/s $\color{#d91a1a}-1.40\%$
test_mod_wrap_and_backward[eager] 1.7021ms 1.2967ms 771.1804 Ops/s 721.8503 Ops/s $\textbf{\color{#35bf28}+6.83\%}$
test_mod_wrap_and_backward[compile] 1.5036ms 1.2793ms 781.6865 Ops/s 712.8300 Ops/s $\textbf{\color{#35bf28}+9.66\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3274ms 0.8859ms 1.1288 KOps/s 1.0144 KOps/s $\textbf{\color{#35bf28}+11.28\%}$
test_seq_add[eager] 0.1540ms 97.1850μs 10.2897 KOps/s 10.2427 KOps/s $\color{#35bf28}+0.46\%$
test_seq_add[compile] 0.1506ms 81.8419μs 12.2187 KOps/s 12.1808 KOps/s $\color{#35bf28}+0.31\%$
test_seq_add[compile-overhead] 0.1879ms 0.1161ms 8.6167 KOps/s 8.9553 KOps/s $\color{#d91a1a}-3.78\%$
test_seq_wrap[eager] 0.4607ms 0.3826ms 2.6136 KOps/s 2.6713 KOps/s $\color{#d91a1a}-2.16\%$
test_seq_wrap[compile] 0.3695ms 0.3119ms 3.2064 KOps/s 3.1023 KOps/s $\color{#35bf28}+3.35\%$
test_seq_wrap[compile-overhead] 0.2670ms 0.2153ms 4.6452 KOps/s 4.6130 KOps/s $\color{#35bf28}+0.70\%$
test_func_call_runtime[False-eager] 0.8561ms 0.7395ms 1.3522 KOps/s 1.4000 KOps/s $\color{#d91a1a}-3.41\%$
test_func_call_runtime[False-compile] 1.0381ms 0.7553ms 1.3240 KOps/s 1.2827 KOps/s $\color{#35bf28}+3.22\%$
test_func_call_runtime[False-compile-overhead] 0.4159ms 0.3499ms 2.8579 KOps/s 2.8591 KOps/s $\color{#d91a1a}-0.04\%$
test_func_call_runtime[True-eager] 1.0051ms 0.8683ms 1.1517 KOps/s 1.1420 KOps/s $\color{#35bf28}+0.86\%$
test_func_call_runtime[True-compile] 0.8974ms 0.7825ms 1.2779 KOps/s 1.2638 KOps/s $\color{#35bf28}+1.12\%$
test_func_call_runtime[True-compile-overhead] 0.4390ms 0.3728ms 2.6824 KOps/s 2.7001 KOps/s $\color{#d91a1a}-0.65\%$
test_func_call_cm_runtime[False-eager] 0.7994ms 0.7247ms 1.3799 KOps/s 1.4163 KOps/s $\color{#d91a1a}-2.57\%$
test_func_call_cm_runtime[False-compile] 0.8580ms 0.7660ms 1.3055 KOps/s 1.2961 KOps/s $\color{#35bf28}+0.73\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4900ms 0.3521ms 2.8404 KOps/s 2.8583 KOps/s $\color{#d91a1a}-0.62\%$
test_func_call_cm_runtime[True-eager] 1.0536ms 0.9582ms 1.0436 KOps/s 1.0262 KOps/s $\color{#35bf28}+1.69\%$
test_func_call_cm_runtime[True-compile] 0.8610ms 0.8100ms 1.2346 KOps/s 1.2165 KOps/s $\color{#35bf28}+1.49\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4840ms 0.3942ms 2.5368 KOps/s 2.5365 KOps/s $\color{#35bf28}+0.01\%$
test_vmap_func_call_cm_runtime[eager] 2.4322ms 1.9708ms 507.4013 Ops/s 505.1941 Ops/s $\color{#35bf28}+0.44\%$
test_vmap_func_call_cm_runtime[compile] 0.9930ms 0.8197ms 1.2199 KOps/s 1.1744 KOps/s $\color{#35bf28}+3.87\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4469ms 0.3989ms 2.5066 KOps/s 2.5181 KOps/s $\color{#d91a1a}-0.46\%$
test_distributed 3.0114ms 0.1804ms 5.5434 KOps/s 8.8804 KOps/s $\textbf{\color{#d91a1a}-37.58\%}$
test_tdmodule 0.4129ms 15.0639μs 66.3840 KOps/s 66.6909 KOps/s $\color{#d91a1a}-0.46\%$
test_tdmodule_dispatch 56.0510μs 28.2576μs 35.3887 KOps/s 34.3009 KOps/s $\color{#35bf28}+3.17\%$
test_tdseq 37.6310μs 15.5780μs 64.1932 KOps/s 64.4216 KOps/s $\color{#d91a1a}-0.35\%$
test_tdseq_dispatch 52.3700μs 30.9920μs 32.2664 KOps/s 31.9886 KOps/s $\color{#35bf28}+0.87\%$
test_instantiation_functorch 1.9668ms 1.8083ms 552.9976 Ops/s 550.2546 Ops/s $\color{#35bf28}+0.50\%$
test_instantiation_td 1.7809ms 1.1649ms 858.4295 Ops/s 854.0178 Ops/s $\color{#35bf28}+0.52\%$
test_exec_functorch 0.3371ms 0.2049ms 4.8803 KOps/s 4.8751 KOps/s $\color{#35bf28}+0.11\%$
test_exec_functional_call 0.2698ms 0.2023ms 4.9434 KOps/s 4.8975 KOps/s $\color{#35bf28}+0.94\%$
test_exec_td 0.3091ms 0.2060ms 4.8551 KOps/s 4.8361 KOps/s $\color{#35bf28}+0.39\%$
test_exec_td_decorator 1.1083ms 0.2463ms 4.0608 KOps/s 3.9821 KOps/s $\color{#35bf28}+1.98\%$
test_vmap_mlp_speed[True-True] 0.7563ms 0.6578ms 1.5201 KOps/s 1.4961 KOps/s $\color{#35bf28}+1.61\%$
test_vmap_mlp_speed[True-False] 0.7183ms 0.6555ms 1.5257 KOps/s 1.5208 KOps/s $\color{#35bf28}+0.32\%$
test_vmap_mlp_speed[False-True] 0.7190ms 0.5499ms 1.8184 KOps/s 1.8187 KOps/s $\color{#d91a1a}-0.02\%$
test_vmap_mlp_speed[False-False] 0.6105ms 0.5473ms 1.8272 KOps/s 1.8085 KOps/s $\color{#35bf28}+1.04\%$
test_vmap_mlp_speed_decorator[True-True] 1.3675ms 0.6462ms 1.5474 KOps/s 1.5577 KOps/s $\color{#d91a1a}-0.66\%$
test_vmap_mlp_speed_decorator[True-False] 0.7662ms 0.6438ms 1.5533 KOps/s 1.5489 KOps/s $\color{#35bf28}+0.29\%$
test_vmap_mlp_speed_decorator[False-True] 0.6847ms 0.5629ms 1.7764 KOps/s 1.7755 KOps/s $\color{#35bf28}+0.05\%$
test_vmap_mlp_speed_decorator[False-False] 0.6846ms 0.5664ms 1.7657 KOps/s 1.7206 KOps/s $\color{#35bf28}+2.62\%$
test_vmap_transformer_speed[True-True] 8.7600ms 7.9734ms 125.4171 Ops/s 124.4053 Ops/s $\color{#35bf28}+0.81\%$
test_vmap_transformer_speed[True-False] 8.1053ms 7.9390ms 125.9603 Ops/s 124.7698 Ops/s $\color{#35bf28}+0.95\%$
test_vmap_transformer_speed[False-True] 7.9270ms 7.7614ms 128.8425 Ops/s 126.7856 Ops/s $\color{#35bf28}+1.62\%$
test_vmap_transformer_speed[False-False] 8.1094ms 7.7860ms 128.4351 Ops/s 127.5118 Ops/s $\color{#35bf28}+0.72\%$
test_vmap_transformer_speed_decorator[True-True] 19.5412ms 18.6590ms 53.5935 Ops/s 53.8603 Ops/s $\color{#d91a1a}-0.50\%$
test_vmap_transformer_speed_decorator[True-False] 19.1354ms 18.7188ms 53.4224 Ops/s 53.5788 Ops/s $\color{#d91a1a}-0.29\%$
test_vmap_transformer_speed_decorator[False-True] 18.6280ms 18.5589ms 53.8824 Ops/s 54.1023 Ops/s $\color{#d91a1a}-0.41\%$
test_vmap_transformer_speed_decorator[False-False] 19.8194ms 18.5835ms 53.8113 Ops/s 54.0899 Ops/s $\color{#d91a1a}-0.52\%$
test_to_module_speed[True] 1.9749ms 0.9313ms 1.0738 KOps/s 1.0691 KOps/s $\color{#35bf28}+0.44\%$
test_to_module_speed[False] 0.9841ms 0.8997ms 1.1115 KOps/s 1.0860 KOps/s $\color{#35bf28}+2.35\%$
test_tc_init 61.0710μs 34.5744μs 28.9231 KOps/s 29.3490 KOps/s $\color{#d91a1a}-1.45\%$
test_tc_init_nested 0.1102ms 71.6088μs 13.9648 KOps/s 14.8388 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_tc_first_layer_tensor 16.0401μs 0.6635μs 1.5072 MOps/s 1.5031 MOps/s $\color{#35bf28}+0.27\%$
test_tc_first_layer_nontensor 30.1900μs 2.2020μs 454.1286 KOps/s 453.3373 KOps/s $\color{#35bf28}+0.17\%$
test_tc_second_layer_tensor 8.7575μs 1.3379μs 747.4442 KOps/s 740.5515 KOps/s $\color{#35bf28}+0.93\%$
test_tc_second_layer_nontensor 33.1900μs 2.8723μs 348.1505 KOps/s 345.4798 KOps/s $\color{#35bf28}+0.77\%$
test_unbind 0.1961s 10.8551ms 92.1223 Ops/s 94.1153 Ops/s $\color{#d91a1a}-2.12\%$
test_full_like 0.6404ms 0.5762ms 1.7356 KOps/s 1.7382 KOps/s $\color{#d91a1a}-0.15\%$
test_zeros_like 0.2620ms 0.1980ms 5.0495 KOps/s 5.0518 KOps/s $\color{#d91a1a}-0.04\%$
test_ones_like 0.2334ms 0.1979ms 5.0535 KOps/s 5.0559 KOps/s $\color{#d91a1a}-0.05\%$
test_clone 0.4411ms 0.4145ms 2.4126 KOps/s 2.4189 KOps/s $\color{#d91a1a}-0.26\%$
test_squeeze 41.3600μs 9.7764μs 102.2871 KOps/s 100.8873 KOps/s $\color{#35bf28}+1.39\%$
test_unsqueeze 0.2320ms 74.4689μs 13.4284 KOps/s 13.5519 KOps/s $\color{#d91a1a}-0.91\%$
test_split 0.4442ms 0.1571ms 6.3662 KOps/s 6.2998 KOps/s $\color{#35bf28}+1.05\%$
test_permute 0.2936ms 0.1767ms 5.6578 KOps/s 5.7232 KOps/s $\color{#d91a1a}-1.14\%$
test_stack 1.2537ms 0.8564ms 1.1676 KOps/s 1.1573 KOps/s $\color{#35bf28}+0.89\%$
test_cat 1.2548ms 1.2315ms 812.0121 Ops/s 811.6978 Ops/s $\color{#35bf28}+0.04\%$

@vmoens vmoens added the enhancement New feature or request label Oct 1, 2024
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: e5ea6fef54f47304e1a6cafbd15f4bdade5e69b4
Pull Request resolved: #1016
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: 36379bbed4125713f00e115dcc66c14fa439c12f
Pull Request resolved: #1016
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: 1d9bcff8e4f6e308d8f8e9fa06b3da4eca8905f1
Pull Request resolved: #1016
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: 36d694db1da278fb84f36419b1b978de817ca453
Pull Request resolved: #1016
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: 02c247e8fa01ba3d71ce61d48d141b7bafd064f5
Pull Request resolved: #1016
@vmoens vmoens merged commit 5e59b9f into gh/vmoens/20/base Oct 1, 2024
11 of 24 checks passed
@vmoens vmoens deleted the gh/vmoens/20/head branch October 1, 2024 14:55