[Feature] Frozen tensorclass #984

Merged
merged 4 commits into from
Sep 10, 2024

@vmoens (Contributor) commented Sep 10, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: 06d10877115ce1659f200d37c4294fb90b10c342
Pull Request resolved: #984
@facebook-github-bot added the CLA Signed label Sep 10, 2024
@vmoens added the enhancement label Sep 10, 2024

github-actions bot commented Sep 10, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}26$. Worsened: $\large\color{#d91a1a}9$.
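
The Improved and Worsened counts appear to match the number of rows whose Change value is rendered in bold, which is consistent with a cutoff of about ±5%. That cutoff is inferred from the reported numbers and is not stated by the bot, so the sketch below treats it as an assumption; `THRESHOLD_PCT` and `classify` are hypothetical names, and whether the real comparison is inclusive at exactly 5% is unknown.

```python
# Assumption: a row counts as improved/worsened when |change| reaches 5%.
# This threshold is inferred from the table, not taken from the CI config.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a benchmark by its relative change in throughput (percent)."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# A few Change values copied from the GPU results.
sample = {
    "test_plain_set_nested": 11.96,
    "test_items_nested": -0.10,
    "test_split_td": 5.07,
    "test_vmap_mlp_speed[False-False]": 4.98,
    "test_distributed": -46.16,
}

labels = {name: classify(pct) for name, pct in sample.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Under this reading, small fluctuations such as -0.10% are not counted in either total.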

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 34.4210μs 13.0639μs 76.5466 KOps/s 68.3677 KOps/s $\textbf{\color{#35bf28}+11.96\%}$
test_plain_set_stack_nested 65.7020μs 13.4683μs 74.2482 KOps/s 68.8072 KOps/s $\textbf{\color{#35bf28}+7.91\%}$
test_plain_set_nested_inplace 46.1210μs 14.0829μs 71.0080 KOps/s 64.1640 KOps/s $\textbf{\color{#35bf28}+10.67\%}$
test_plain_set_stack_nested_inplace 49.1010μs 13.9580μs 71.6434 KOps/s 64.9127 KOps/s $\textbf{\color{#35bf28}+10.37\%}$
test_items 26.8110μs 2.8030μs 356.7559 KOps/s 333.3529 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_items_nested 0.4576ms 0.3131ms 3.1938 KOps/s 3.1970 KOps/s $\color{#d91a1a}-0.10\%$
test_items_nested_locked 0.3833ms 0.3152ms 3.1727 KOps/s 3.1664 KOps/s $\color{#35bf28}+0.20\%$
test_items_nested_leaf 91.7620μs 62.9993μs 15.8732 KOps/s 15.8528 KOps/s $\color{#35bf28}+0.13\%$
test_items_stack_nested 0.3690ms 0.3150ms 3.1750 KOps/s 3.2057 KOps/s $\color{#d91a1a}-0.96\%$
test_items_stack_nested_leaf 92.6630μs 64.9372μs 15.3995 KOps/s 15.6401 KOps/s $\color{#d91a1a}-1.54\%$
test_items_stack_nested_locked 0.3893ms 0.3160ms 3.1647 KOps/s 3.2154 KOps/s $\color{#d91a1a}-1.58\%$
test_keys 25.7910μs 3.3675μs 296.9560 KOps/s 295.3800 KOps/s $\color{#35bf28}+0.53\%$
test_keys_nested 99.6120μs 53.8958μs 18.5543 KOps/s 18.5901 KOps/s $\color{#d91a1a}-0.19\%$
test_keys_nested_locked 2.5133ms 60.7427μs 16.4629 KOps/s 16.3721 KOps/s $\color{#35bf28}+0.55\%$
test_keys_nested_leaf 82.9920μs 45.6846μs 21.8892 KOps/s 21.0810 KOps/s $\color{#35bf28}+3.83\%$
test_keys_stack_nested 99.8720μs 55.6760μs 17.9611 KOps/s 17.9923 KOps/s $\color{#d91a1a}-0.17\%$
test_keys_stack_nested_leaf 87.4630μs 47.5497μs 21.0306 KOps/s 20.7427 KOps/s $\color{#35bf28}+1.39\%$
test_keys_stack_nested_locked 99.2530μs 60.3689μs 16.5648 KOps/s 16.5769 KOps/s $\color{#d91a1a}-0.07\%$
test_values 6.6118μs 0.8223μs 1.2161 MOps/s 1.1952 MOps/s $\color{#35bf28}+1.75\%$
test_values_nested 53.6110μs 27.2153μs 36.7440 KOps/s 36.4196 KOps/s $\color{#35bf28}+0.89\%$
test_values_nested_locked 60.9020μs 29.2427μs 34.1966 KOps/s 34.2274 KOps/s $\color{#d91a1a}-0.09\%$
test_values_nested_leaf 51.3510μs 24.0881μs 41.5143 KOps/s 41.1845 KOps/s $\color{#35bf28}+0.80\%$
test_values_stack_nested 61.7220μs 28.4904μs 35.0995 KOps/s 35.0320 KOps/s $\color{#35bf28}+0.19\%$
test_values_stack_nested_leaf 47.2710μs 25.2551μs 39.5959 KOps/s 39.9969 KOps/s $\color{#d91a1a}-1.00\%$
test_values_stack_nested_locked 66.1520μs 30.5470μs 32.7365 KOps/s 33.1668 KOps/s $\color{#d91a1a}-1.30\%$
test_membership 1.9446μs 0.5086μs 1.9662 MOps/s 1.9754 MOps/s $\color{#d91a1a}-0.47\%$
test_membership_nested 31.4510μs 1.8229μs 548.5759 KOps/s 576.0999 KOps/s $\color{#d91a1a}-4.78\%$
test_membership_nested_leaf 17.1803μs 1.7327μs 577.1213 KOps/s 585.2605 KOps/s $\color{#d91a1a}-1.39\%$
test_membership_stacked_nested 37.8110μs 1.7716μs 564.4700 KOps/s 568.3712 KOps/s $\color{#d91a1a}-0.69\%$
test_membership_stacked_nested_leaf 95.9720μs 1.7761μs 563.0234 KOps/s 571.5719 KOps/s $\color{#d91a1a}-1.50\%$
test_membership_nested_last 28.6610μs 2.5963μs 385.1707 KOps/s 377.6801 KOps/s $\color{#35bf28}+1.98\%$
test_membership_nested_leaf_last 40.4710μs 2.6532μs 376.9002 KOps/s 379.3903 KOps/s $\color{#d91a1a}-0.66\%$
test_membership_stacked_nested_last 31.3610μs 2.9957μs 333.8094 KOps/s 386.0015 KOps/s $\textbf{\color{#d91a1a}-13.52\%}$
test_membership_stacked_nested_leaf_last 39.7810μs 2.9932μs 334.0959 KOps/s 381.6811 KOps/s $\textbf{\color{#d91a1a}-12.47\%}$
test_nested_getleaf 31.2610μs 6.1073μs 163.7393 KOps/s 165.1726 KOps/s $\color{#d91a1a}-0.87\%$
test_nested_get 38.7210μs 5.7194μs 174.8448 KOps/s 175.3099 KOps/s $\color{#d91a1a}-0.27\%$
test_stacked_getleaf 34.1710μs 5.9917μs 166.8968 KOps/s 164.7101 KOps/s $\color{#35bf28}+1.33\%$
test_stacked_get 42.0010μs 5.6344μs 177.4805 KOps/s 177.3249 KOps/s $\color{#35bf28}+0.09\%$
test_nested_getitemleaf 33.7310μs 6.1164μs 163.4950 KOps/s 162.4870 KOps/s $\color{#35bf28}+0.62\%$
test_nested_getitem 40.5810μs 5.6960μs 175.5622 KOps/s 173.7685 KOps/s $\color{#35bf28}+1.03\%$
test_stacked_getitemleaf 32.5310μs 6.0198μs 166.1187 KOps/s 165.5929 KOps/s $\color{#35bf28}+0.32\%$
test_stacked_getitem 39.4510μs 5.6323μs 177.5458 KOps/s 175.0526 KOps/s $\color{#35bf28}+1.42\%$
test_lock_nested 5.1150ms 0.4161ms 2.4033 KOps/s 2.3990 KOps/s $\color{#35bf28}+0.18\%$
test_lock_stack_nested 0.4536ms 0.3770ms 2.6525 KOps/s 2.6842 KOps/s $\color{#d91a1a}-1.18\%$
test_unlock_nested 0.7646ms 0.3525ms 2.8371 KOps/s 2.8403 KOps/s $\color{#d91a1a}-0.11\%$
test_unlock_stack_nested 0.4706ms 0.3163ms 3.1617 KOps/s 3.2037 KOps/s $\color{#d91a1a}-1.31\%$
test_flatten_speed 0.2851ms 80.8025μs 12.3759 KOps/s 12.5466 KOps/s $\color{#d91a1a}-1.36\%$
test_unflatten_speed 0.3529ms 0.2779ms 3.5988 KOps/s 3.5712 KOps/s $\color{#35bf28}+0.77\%$
test_common_ops 1.3923ms 1.2092ms 826.9660 Ops/s 790.3709 Ops/s $\color{#35bf28}+4.63\%$
test_creation 24.4000μs 1.4675μs 681.4183 KOps/s 676.8297 KOps/s $\color{#35bf28}+0.68\%$
test_creation_empty 48.0910μs 13.6022μs 73.5176 KOps/s 60.1735 KOps/s $\textbf{\color{#35bf28}+22.18\%}$
test_creation_nested_1 48.3010μs 15.2274μs 65.6712 KOps/s 55.3504 KOps/s $\textbf{\color{#35bf28}+18.65\%}$
test_creation_nested_2 43.6710μs 17.6939μs 56.5167 KOps/s 47.9619 KOps/s $\textbf{\color{#35bf28}+17.84\%}$
test_clone 70.9010μs 28.9451μs 34.5482 KOps/s 35.3381 KOps/s $\color{#d91a1a}-2.24\%$
test_getitem[int] 1.1702ms 15.4623μs 64.6734 KOps/s 63.8507 KOps/s $\color{#35bf28}+1.29\%$
test_getitem[slice_int] 0.1174ms 27.2216μs 36.7355 KOps/s 36.0967 KOps/s $\color{#35bf28}+1.77\%$
test_getitem[range] 0.1535ms 0.1094ms 9.1367 KOps/s 8.9668 KOps/s $\color{#35bf28}+1.89\%$
test_getitem[tuple] 0.1212ms 23.5785μs 42.4116 KOps/s 42.8505 KOps/s $\color{#d91a1a}-1.02\%$
test_getitem[list] 0.2216ms 98.7253μs 10.1291 KOps/s 10.1157 KOps/s $\color{#35bf28}+0.13\%$
test_setitem_dim[int] 70.3120μs 48.9028μs 20.4487 KOps/s 19.2987 KOps/s $\textbf{\color{#35bf28}+5.96\%}$
test_setitem_dim[slice_int] 0.1148ms 71.9333μs 13.9018 KOps/s 13.4285 KOps/s $\color{#35bf28}+3.52\%$
test_setitem_dim[range] 0.1829ms 0.1337ms 7.4789 KOps/s 7.2820 KOps/s $\color{#35bf28}+2.70\%$
test_setitem_dim[tuple] 96.9320μs 65.9256μs 15.1686 KOps/s 14.5990 KOps/s $\color{#35bf28}+3.90\%$
test_setitem 77.0820μs 40.7218μs 24.5569 KOps/s 23.8172 KOps/s $\color{#35bf28}+3.11\%$
test_set 79.3420μs 39.6310μs 25.2328 KOps/s 24.2883 KOps/s $\color{#35bf28}+3.89\%$
test_set_shared 0.3604ms 50.6850μs 19.7297 KOps/s 20.1251 KOps/s $\color{#d91a1a}-1.96\%$
test_update 91.1330μs 48.1077μs 20.7867 KOps/s 19.9440 KOps/s $\color{#35bf28}+4.23\%$
test_update_nested 95.4120μs 54.7539μs 18.2635 KOps/s 17.5625 KOps/s $\color{#35bf28}+3.99\%$
test_update__nested 98.2930μs 60.4177μs 16.5515 KOps/s 16.8878 KOps/s $\color{#d91a1a}-1.99\%$
test_set_nested 86.6930μs 42.6972μs 23.4208 KOps/s 22.8242 KOps/s $\color{#35bf28}+2.61\%$
test_set_nested_new 0.4363ms 46.3964μs 21.5534 KOps/s 21.3152 KOps/s $\color{#35bf28}+1.12\%$
test_select 99.4030μs 58.7694μs 17.0156 KOps/s 16.5847 KOps/s $\color{#35bf28}+2.60\%$
test_select_nested 0.1268ms 41.7297μs 23.9637 KOps/s 23.8127 KOps/s $\color{#35bf28}+0.63\%$
test_exclude_nested 85.5920μs 58.4021μs 17.1227 KOps/s 16.6215 KOps/s $\color{#35bf28}+3.02\%$
test_empty[True] 0.3218ms 0.2403ms 4.1616 KOps/s 4.0592 KOps/s $\color{#35bf28}+2.52\%$
test_empty[False] 3.6781μs 0.7382μs 1.3546 MOps/s 1.3506 MOps/s $\color{#35bf28}+0.30\%$
test_to 54.7310μs 25.3393μs 39.4644 KOps/s 38.8090 KOps/s $\color{#35bf28}+1.69\%$
test_to_nonblocking 57.6620μs 23.3621μs 42.8044 KOps/s 42.0438 KOps/s $\color{#35bf28}+1.81\%$
test_unbind_speed 0.3465ms 0.2763ms 3.6197 KOps/s 3.6345 KOps/s $\color{#d91a1a}-0.41\%$
test_unbind_speed_stack0 0.3339ms 0.2690ms 3.7170 KOps/s 3.6337 KOps/s $\color{#35bf28}+2.29\%$
test_unbind_speed_stack1 93.2698ms 0.6939ms 1.4410 KOps/s 1.4298 KOps/s $\color{#35bf28}+0.79\%$
test_split 95.1194ms 2.1033ms 475.4431 Ops/s 462.4089 Ops/s $\color{#35bf28}+2.82\%$
test_chunk 95.3029ms 2.1021ms 475.7137 Ops/s 458.9178 Ops/s $\color{#35bf28}+3.66\%$
test_creation[device0] 0.3725ms 0.1261ms 7.9297 KOps/s 8.0108 KOps/s $\color{#d91a1a}-1.01\%$
test_creation_from_tensor 0.3541ms 0.1282ms 7.8006 KOps/s 7.5958 KOps/s $\color{#35bf28}+2.70\%$
test_add_one[memmap_tensor0] 0.2189ms 8.8767μs 112.6549 KOps/s 116.2419 KOps/s $\color{#d91a1a}-3.09\%$
test_contiguous[memmap_tensor0] 36.5610μs 2.1461μs 465.9636 KOps/s 464.5151 KOps/s $\color{#35bf28}+0.31\%$
test_stack[memmap_tensor0] 29.7610μs 6.5835μs 151.8960 KOps/s 153.3374 KOps/s $\color{#d91a1a}-0.94\%$
test_memmaptd_index 1.0495ms 0.4091ms 2.4443 KOps/s 2.3818 KOps/s $\color{#35bf28}+2.62\%$
test_memmaptd_index_astensor 0.7423ms 0.4690ms 2.1324 KOps/s 2.0895 KOps/s $\color{#35bf28}+2.05\%$
test_memmaptd_index_op 1.3895ms 0.9735ms 1.0272 KOps/s 985.2381 Ops/s $\color{#35bf28}+4.26\%$
test_serialize_model 0.1306s 0.1289s 7.7561 Ops/s 7.7517 Ops/s $\color{#35bf28}+0.06\%$
test_serialize_model_pickle 1.3470s 1.2118s 0.8252 Ops/s 0.8241 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_weights 0.2199s 0.1413s 7.0749 Ops/s 6.9847 Ops/s $\color{#35bf28}+1.29\%$
test_serialize_weights_returnearly 0.2249s 57.0550ms 17.5269 Ops/s 17.8680 Ops/s $\color{#d91a1a}-1.91\%$
test_serialize_weights_pickle 1.3475s 1.2167s 0.8219 Ops/s 0.8190 Ops/s $\color{#35bf28}+0.35\%$
test_reshape_pytree 79.0120μs 36.0476μs 27.7411 KOps/s 28.2154 KOps/s $\color{#d91a1a}-1.68\%$
test_reshape_td 70.7120μs 39.9523μs 25.0299 KOps/s 23.5005 KOps/s $\textbf{\color{#35bf28}+6.51\%}$
test_view_pytree 74.2820μs 35.4154μs 28.2363 KOps/s 27.8836 KOps/s $\color{#35bf28}+1.27\%$
test_view_td 81.1820μs 44.0092μs 22.7225 KOps/s 21.2537 KOps/s $\textbf{\color{#35bf28}+6.91\%}$
test_unbind_pytree 63.5220μs 34.4456μs 29.0313 KOps/s 27.8992 KOps/s $\color{#35bf28}+4.06\%$
test_unbind_td 0.3480ms 41.6381μs 24.0165 KOps/s 23.4688 KOps/s $\color{#35bf28}+2.33\%$
test_split_pytree 82.6520μs 46.7044μs 21.4113 KOps/s 20.9019 KOps/s $\color{#35bf28}+2.44\%$
test_split_td 0.4151ms 55.1697μs 18.1259 KOps/s 17.2507 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_add_pytree 0.1051ms 58.0198μs 17.2355 KOps/s 16.2619 KOps/s $\textbf{\color{#35bf28}+5.99\%}$
test_add_td 0.1582ms 87.2653μs 11.4593 KOps/s 10.3483 KOps/s $\textbf{\color{#35bf28}+10.74\%}$
test_compile_add_one_nested[tensordict-compile] 0.4063ms 0.2075ms 4.8202 KOps/s 4.8332 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_add_one_nested[tensordict-eager] 0.2438ms 0.1552ms 6.4433 KOps/s 6.4133 KOps/s $\color{#35bf28}+0.47\%$
test_compile_add_one_nested[pytree-compile] 0.1933ms 0.1436ms 6.9629 KOps/s 6.9770 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_one_nested[pytree-eager] 0.2760ms 0.1871ms 5.3452 KOps/s 5.3453 KOps/s $-0.00\%$
test_compile_copy_nested[tensordict-compile] 69.5410μs 20.5706μs 48.6132 KOps/s 50.5019 KOps/s $\color{#d91a1a}-3.74\%$
test_compile_copy_nested[tensordict-eager] 80.2020μs 43.3225μs 23.0827 KOps/s 23.3797 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_copy_nested[pytree-compile] 0.2677ms 65.2447μs 15.3269 KOps/s 15.4511 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_copy_nested[pytree-eager] 83.4620μs 50.6604μs 19.7393 KOps/s 19.7982 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_add_one_flat[tensordict-compile] 0.3980ms 0.3148ms 3.1765 KOps/s 3.1719 KOps/s $\color{#35bf28}+0.15\%$
test_compile_add_one_flat[tensordict-eager] 0.2536ms 0.2110ms 4.7388 KOps/s 4.7489 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_one_flat[tensorclass-compile] 0.1813ms 0.1274ms 7.8466 KOps/s 7.8580 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[tensorclass-eager] 0.1207ms 60.9978μs 16.3940 KOps/s 16.9470 KOps/s $\color{#d91a1a}-3.26\%$
test_compile_add_one_flat[pytree-compile] 0.3672ms 0.3138ms 3.1862 KOps/s 3.1813 KOps/s $\color{#35bf28}+0.16\%$
test_compile_add_one_flat[pytree-eager] 0.6999ms 0.6340ms 1.5773 KOps/s 1.6217 KOps/s $\color{#d91a1a}-2.74\%$
test_compile_add_self_flat[tensordict-eager] 0.3149ms 0.2495ms 4.0078 KOps/s 3.9890 KOps/s $\color{#35bf28}+0.47\%$
test_compile_add_self_flat[tensordict-compile] 0.4220ms 0.3124ms 3.2015 KOps/s 3.1717 KOps/s $\color{#35bf28}+0.94\%$
test_compile_add_self_flat[tensorclass-eager] 0.1643ms 73.6600μs 13.5759 KOps/s 13.5555 KOps/s $\color{#35bf28}+0.15\%$
test_compile_add_self_flat[tensorclass-compile] 0.2388ms 0.1339ms 7.4662 KOps/s 7.7455 KOps/s $\color{#d91a1a}-3.61\%$
test_compile_add_self_flat[pytree-eager] 0.6185ms 0.5417ms 1.8462 KOps/s 1.7634 KOps/s $\color{#35bf28}+4.70\%$
test_compile_add_self_flat[pytree-compile] 0.3663ms 0.3130ms 3.1950 KOps/s 3.1694 KOps/s $\color{#35bf28}+0.81\%$
test_compile_copy_flat[tensordict-compile] 46.9620μs 18.7708μs 53.2742 KOps/s 55.9306 KOps/s $\color{#d91a1a}-4.75\%$
test_compile_copy_flat[tensordict-eager] 60.1410μs 28.4860μs 35.1050 KOps/s 36.4146 KOps/s $\color{#d91a1a}-3.60\%$
test_compile_copy_flat[pytree-compile] 0.1054ms 68.7130μs 14.5533 KOps/s 14.4382 KOps/s $\color{#35bf28}+0.80\%$
test_compile_copy_flat[pytree-eager] 83.8020μs 51.4931μs 19.4201 KOps/s 19.4149 KOps/s $\color{#35bf28}+0.03\%$
test_compile_assign_and_add[tensordict-compile] 2.3103ms 0.8004ms 1.2494 KOps/s 1.1601 KOps/s $\textbf{\color{#35bf28}+7.69\%}$
test_compile_assign_and_add[tensordict-eager] 3.4745ms 3.2202ms 310.5387 Ops/s 311.3047 Ops/s $\color{#d91a1a}-0.25\%$
test_compile_assign_and_add[pytree-compile] 2.2927ms 0.7862ms 1.2719 KOps/s 1.1792 KOps/s $\textbf{\color{#35bf28}+7.86\%}$
test_compile_assign_and_add[pytree-eager] 3.3866ms 3.2567ms 307.0552 Ops/s 316.2152 Ops/s $\color{#d91a1a}-2.90\%$
test_compile_indexing[tensor-tensordict-compile] 0.1560ms 0.1095ms 9.1339 KOps/s 9.2889 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_indexing[tensor-tensordict-eager] 0.1838ms 60.7826μs 16.4521 KOps/s 17.0262 KOps/s $\color{#d91a1a}-3.37\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1647ms 0.1069ms 9.3541 KOps/s 9.7911 KOps/s $\color{#d91a1a}-4.46\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1333ms 43.8141μs 22.8237 KOps/s 23.4041 KOps/s $\color{#d91a1a}-2.48\%$
test_compile_indexing[tensor-pytree-compile] 0.1954ms 0.1093ms 9.1475 KOps/s 9.7332 KOps/s $\textbf{\color{#d91a1a}-6.02\%}$
test_compile_indexing[tensor-pytree-eager] 95.3330μs 43.1386μs 23.1811 KOps/s 23.4816 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_indexing[slice-tensordict-compile] 0.2495ms 0.1421ms 7.0349 KOps/s 7.3562 KOps/s $\color{#d91a1a}-4.37\%$
test_compile_indexing[slice-tensordict-eager] 0.1519ms 25.4462μs 39.2986 KOps/s 40.2452 KOps/s $\color{#d91a1a}-2.35\%$
test_compile_indexing[slice-tensorclass-compile] 0.1774ms 0.1333ms 7.5044 KOps/s 7.7241 KOps/s $\color{#d91a1a}-2.84\%$
test_compile_indexing[slice-tensorclass-eager] 66.5720μs 20.9970μs 47.6259 KOps/s 48.1990 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_indexing[slice-pytree-compile] 0.1968ms 0.1378ms 7.2581 KOps/s 7.6972 KOps/s $\textbf{\color{#d91a1a}-5.70\%}$
test_compile_indexing[slice-pytree-eager] 57.3810μs 20.9179μs 47.8060 KOps/s 48.4954 KOps/s $\color{#d91a1a}-1.42\%$
test_compile_indexing[int-tensordict-compile] 0.1956ms 0.1433ms 6.9788 KOps/s 7.3400 KOps/s $\color{#d91a1a}-4.92\%$
test_compile_indexing[int-tensordict-eager] 0.5053ms 25.6304μs 39.0161 KOps/s 39.8131 KOps/s $\color{#d91a1a}-2.00\%$
test_compile_indexing[int-tensorclass-compile] 0.2244ms 0.1376ms 7.2676 KOps/s 7.6545 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_compile_indexing[int-tensorclass-eager] 52.8420μs 20.7787μs 48.1263 KOps/s 48.9205 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[int-pytree-compile] 0.1783ms 0.1382ms 7.2361 KOps/s 7.7102 KOps/s $\textbf{\color{#d91a1a}-6.15\%}$
test_compile_indexing[int-pytree-eager] 52.5610μs 21.9049μs 45.6519 KOps/s 48.3035 KOps/s $\textbf{\color{#d91a1a}-5.49\%}$
test_mod_add[eager] 70.5610μs 30.9904μs 32.2681 KOps/s 30.3743 KOps/s $\textbf{\color{#35bf28}+6.23\%}$
test_mod_add[compile] 0.1278ms 71.7069μs 13.9457 KOps/s 14.3223 KOps/s $\color{#d91a1a}-2.63\%$
test_mod_add[compile-overhead] 0.2643ms 0.1362ms 7.3403 KOps/s 7.1205 KOps/s $\color{#35bf28}+3.09\%$
test_mod_wrap[eager] 0.3221ms 0.2398ms 4.1707 KOps/s 4.1292 KOps/s $\color{#35bf28}+1.00\%$
test_mod_wrap[compile] 0.4512ms 0.2847ms 3.5122 KOps/s 3.4721 KOps/s $\color{#35bf28}+1.15\%$
test_mod_wrap[compile-overhead] 7.5423ms 4.1109ms 243.2564 Ops/s 248.0829 Ops/s $\color{#d91a1a}-1.95\%$
test_mod_wrap_and_backward[eager] 1.4568ms 1.3401ms 746.2275 Ops/s 693.3521 Ops/s $\textbf{\color{#35bf28}+7.63\%}$
test_mod_wrap_and_backward[compile] 2.6719ms 1.3176ms 758.9574 Ops/s 702.3983 Ops/s $\textbf{\color{#35bf28}+8.05\%}$
test_mod_wrap_and_backward[compile-overhead] 1.2954ms 0.8845ms 1.1306 KOps/s 1.0185 KOps/s $\textbf{\color{#35bf28}+11.00\%}$
test_seq_add[eager] 0.2009ms 93.8608μs 10.6541 KOps/s 10.3174 KOps/s $\color{#35bf28}+3.26\%$
test_seq_add[compile] 0.3646ms 80.3147μs 12.4510 KOps/s 12.5566 KOps/s $\color{#d91a1a}-0.84\%$
test_seq_add[compile-overhead] 0.1897ms 0.1137ms 8.7933 KOps/s 8.8347 KOps/s $\color{#d91a1a}-0.47\%$
test_seq_wrap[eager] 0.4338ms 0.3709ms 2.6963 KOps/s 2.6099 KOps/s $\color{#35bf28}+3.31\%$
test_seq_wrap[compile] 0.3518ms 0.3014ms 3.3183 KOps/s 3.2908 KOps/s $\color{#35bf28}+0.84\%$
test_seq_wrap[compile-overhead] 0.2546ms 0.2078ms 4.8113 KOps/s 4.8428 KOps/s $\color{#d91a1a}-0.65\%$
test_func_call_runtime[False-eager] 0.8294ms 0.7547ms 1.3250 KOps/s 1.3600 KOps/s $\color{#d91a1a}-2.58\%$
test_func_call_runtime[False-compile] 0.8749ms 0.7734ms 1.2930 KOps/s 1.2624 KOps/s $\color{#35bf28}+2.42\%$
test_func_call_runtime[False-compile-overhead] 0.4052ms 0.3432ms 2.9142 KOps/s 2.9067 KOps/s $\color{#35bf28}+0.26\%$
test_func_call_runtime[True-eager] 0.9752ms 0.8939ms 1.1187 KOps/s 1.1215 KOps/s $\color{#d91a1a}-0.25\%$
test_func_call_runtime[True-compile] 0.9161ms 0.8169ms 1.2241 KOps/s 1.2107 KOps/s $\color{#35bf28}+1.11\%$
test_func_call_runtime[True-compile-overhead] 0.4320ms 0.3798ms 2.6330 KOps/s 2.6354 KOps/s $\color{#d91a1a}-0.09\%$
test_func_call_cm_runtime[False-eager] 0.8583ms 0.7240ms 1.3812 KOps/s 1.3657 KOps/s $\color{#35bf28}+1.13\%$
test_func_call_cm_runtime[False-compile] 0.8935ms 0.7780ms 1.2853 KOps/s 1.2620 KOps/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[False-compile-overhead] 0.3888ms 0.3461ms 2.8897 KOps/s 2.8679 KOps/s $\color{#35bf28}+0.76\%$
test_func_call_cm_runtime[True-eager] 1.0946ms 0.9872ms 1.0130 KOps/s 1.0051 KOps/s $\color{#35bf28}+0.79\%$
test_func_call_cm_runtime[True-compile] 0.8896ms 0.8399ms 1.1907 KOps/s 1.1740 KOps/s $\color{#35bf28}+1.42\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4807ms 0.4029ms 2.4823 KOps/s 2.4659 KOps/s $\color{#35bf28}+0.66\%$
test_vmap_func_call_cm_runtime[eager] 2.6158ms 2.0817ms 480.3848 Ops/s 477.6970 Ops/s $\color{#35bf28}+0.56\%$
test_vmap_func_call_cm_runtime[compile] 0.9220ms 0.8565ms 1.1675 KOps/s 1.1533 KOps/s $\color{#35bf28}+1.23\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5100ms 0.4070ms 2.4570 KOps/s 2.4473 KOps/s $\color{#35bf28}+0.40\%$
test_distributed 3.7133ms 0.2103ms 4.7545 KOps/s 8.8302 KOps/s $\textbf{\color{#d91a1a}-46.16\%}$
test_tdmodule 83.1820μs 13.7728μs 72.6071 KOps/s 62.0871 KOps/s $\textbf{\color{#35bf28}+16.94\%}$
test_tdmodule_dispatch 39.7010μs 27.0228μs 37.0058 KOps/s 32.6484 KOps/s $\textbf{\color{#35bf28}+13.35\%}$
test_tdseq 36.6910μs 14.1754μs 70.5449 KOps/s 62.0518 KOps/s $\textbf{\color{#35bf28}+13.69\%}$
test_tdseq_dispatch 52.2810μs 29.2690μs 34.1658 KOps/s 30.0784 KOps/s $\textbf{\color{#35bf28}+13.59\%}$
test_instantiation_functorch 1.9400ms 1.8461ms 541.6883 Ops/s 534.5258 Ops/s $\color{#35bf28}+1.34\%$
test_instantiation_td 1.8114ms 1.2007ms 832.8471 Ops/s 829.0192 Ops/s $\color{#35bf28}+0.46\%$
test_exec_functorch 0.2503ms 0.2083ms 4.8010 KOps/s 4.7967 KOps/s $\color{#35bf28}+0.09\%$
test_exec_functional_call 0.2642ms 0.2074ms 4.8212 KOps/s 4.7793 KOps/s $\color{#35bf28}+0.88\%$
test_exec_td 0.2942ms 0.2119ms 4.7184 KOps/s 4.6796 KOps/s $\color{#35bf28}+0.83\%$
test_exec_td_decorator 1.0890ms 0.2560ms 3.9068 KOps/s 3.9132 KOps/s $\color{#d91a1a}-0.16\%$
test_vmap_mlp_speed[True-True] 0.7842ms 0.6836ms 1.4629 KOps/s 1.4450 KOps/s $\color{#35bf28}+1.24\%$
test_vmap_mlp_speed[True-False] 0.7861ms 0.6814ms 1.4677 KOps/s 1.4176 KOps/s $\color{#35bf28}+3.53\%$
test_vmap_mlp_speed[False-True] 0.6298ms 0.5770ms 1.7330 KOps/s 1.6514 KOps/s $\color{#35bf28}+4.94\%$
test_vmap_mlp_speed[False-False] 0.6427ms 0.5771ms 1.7328 KOps/s 1.6506 KOps/s $\color{#35bf28}+4.98\%$
test_vmap_mlp_speed_decorator[True-True] 1.2559ms 0.6703ms 1.4918 KOps/s 1.4634 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_mlp_speed_decorator[True-False] 0.8416ms 0.6703ms 1.4918 KOps/s 1.4811 KOps/s $\color{#35bf28}+0.73\%$
test_vmap_mlp_speed_decorator[False-True] 0.7082ms 0.5923ms 1.6882 KOps/s 1.6758 KOps/s $\color{#35bf28}+0.74\%$
test_vmap_mlp_speed_decorator[False-False] 0.7198ms 0.5933ms 1.6855 KOps/s 1.6958 KOps/s $\color{#d91a1a}-0.60\%$
test_vmap_transformer_speed[True-True] 8.4416ms 8.3040ms 120.4245 Ops/s 119.6202 Ops/s $\color{#35bf28}+0.67\%$
test_vmap_transformer_speed[True-False] 8.4544ms 8.3018ms 120.4556 Ops/s 118.9336 Ops/s $\color{#35bf28}+1.28\%$
test_vmap_transformer_speed[False-True] 8.2343ms 8.1098ms 123.3072 Ops/s 122.3587 Ops/s $\color{#35bf28}+0.78\%$
test_vmap_transformer_speed[False-False] 8.1878ms 8.0967ms 123.5073 Ops/s 122.9034 Ops/s $\color{#35bf28}+0.49\%$
test_vmap_transformer_speed_decorator[True-True] 20.1487ms 19.5394ms 51.1786 Ops/s 50.8657 Ops/s $\color{#35bf28}+0.62\%$
test_vmap_transformer_speed_decorator[True-False] 19.7141ms 19.5046ms 51.2699 Ops/s 51.2235 Ops/s $\color{#35bf28}+0.09\%$
test_vmap_transformer_speed_decorator[False-True] 19.4939ms 19.3371ms 51.7140 Ops/s 51.6014 Ops/s $\color{#35bf28}+0.22\%$
test_vmap_transformer_speed_decorator[False-False] 19.3817ms 19.3208ms 51.7576 Ops/s 50.8817 Ops/s $\color{#35bf28}+1.72\%$
test_to_module_speed[True] 1.3011ms 0.9454ms 1.0577 KOps/s 1.0880 KOps/s $\color{#d91a1a}-2.78\%$
test_to_module_speed[False] 1.3348ms 0.9261ms 1.0798 KOps/s 1.0952 KOps/s $\color{#d91a1a}-1.40\%$
test_tc_init 59.6920μs 32.5472μs 30.7246 KOps/s 28.8272 KOps/s $\textbf{\color{#35bf28}+6.58\%}$
test_tc_init_nested 0.1045ms 66.2089μs 15.1037 KOps/s 14.3652 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_tc_first_layer_tensor 4.3944μs 0.6747μs 1.4822 MOps/s 1.4612 MOps/s $\color{#35bf28}+1.44\%$
test_tc_first_layer_nontensor 23.1610μs 2.2781μs 438.9570 KOps/s 445.1184 KOps/s $\color{#d91a1a}-1.38\%$
test_tc_second_layer_tensor 9.8477μs 1.3833μs 722.8894 KOps/s 722.3961 KOps/s $\color{#35bf28}+0.07\%$
test_tc_second_layer_nontensor 52.1210μs 2.9894μs 334.5182 KOps/s 336.0588 KOps/s $\color{#d91a1a}-0.46\%$
test_unbind 0.1891s 11.9835ms 83.4478 Ops/s 93.8265 Ops/s $\textbf{\color{#d91a1a}-11.06\%}$
test_full_like 0.6587ms 0.5740ms 1.7423 KOps/s 1.7388 KOps/s $\color{#35bf28}+0.20\%$
test_zeros_like 0.2611ms 0.1979ms 5.0527 KOps/s 5.0529 KOps/s $-0.00\%$
test_ones_like 0.2827ms 0.1977ms 5.0577 KOps/s 5.0554 KOps/s $\color{#35bf28}+0.05\%$
test_clone 0.4360ms 0.4142ms 2.4142 KOps/s 2.4084 KOps/s $\color{#35bf28}+0.24\%$
test_squeeze 42.0110μs 9.6536μs 103.5879 KOps/s 103.8597 KOps/s $\color{#d91a1a}-0.26\%$
test_unsqueeze 0.3000ms 73.7500μs 13.5593 KOps/s 13.7187 KOps/s $\color{#d91a1a}-1.16\%$
test_split 0.2578ms 0.1556ms 6.4275 KOps/s 6.3257 KOps/s $\color{#35bf28}+1.61\%$
test_permute 0.2251ms 0.1760ms 5.6804 KOps/s 5.6661 KOps/s $\color{#35bf28}+0.25\%$
test_stack 1.2675ms 0.8630ms 1.1587 KOps/s 1.1615 KOps/s $\color{#d91a1a}-0.24\%$
test_cat 1.2634ms 1.2314ms 812.0546 Ops/s 811.7569 Ops/s $\color{#35bf28}+0.04\%$
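
For reading the columns above: the reported numbers are consistent with Ops being the reciprocal of Mean, and with Change being the relative difference between Ops and Ops on Repo HEAD. The sketch below checks this against two rows of the GPU results. The function names are hypothetical, and the formulas are inferred from the values rather than taken from the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean time per call."""
    return 1.0 / mean_seconds


def relative_change_pct(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


# test_plain_set_nested: Mean 13.0639 us, 76.5466 KOps/s vs 68.3677 KOps/s on HEAD
kops = ops_per_second(13.0639e-6) / 1e3
change = relative_change_pct(76.5466, 68.3677)
print(f"{kops:.2f} KOps/s, {change:+.2f}%")  # close to the reported 76.5466 KOps/s and +11.96%

# test_distributed: 4.7545 KOps/s vs 8.8302 KOps/s on HEAD
change_distributed = relative_change_pct(4.7545, 8.8302)
print(f"{change_distributed:+.2f}%")  # close to the reported -46.16%
```

Small mismatches in the last digit are expected, since the table values are already rounded.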

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: a53fb9db23682bea92399dd4cf7dab1ae6aa11f8
Pull Request resolved: #984
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: 09026c1eb275dd0c3584ff0f4035992f7715bb73
Pull Request resolved: #984
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: ae7e6170eebeb7026a230596f39f704971e0fc06
Pull Request resolved: #984

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}35$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 44.9840μs 21.5329μs 46.4406 KOps/s 50.3239 KOps/s $\textbf{\color{#d91a1a}-7.72\%}$
test_plain_set_stack_nested 49.1210μs 21.4354μs 46.6517 KOps/s 49.1602 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_plain_set_nested_inplace 77.4840μs 22.9686μs 43.5377 KOps/s 46.0360 KOps/s $\textbf{\color{#d91a1a}-5.43\%}$
test_plain_set_stack_nested_inplace 57.8180μs 23.0757μs 43.3357 KOps/s 46.0490 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_items 37.4100μs 4.2921μs 232.9863 KOps/s 233.5784 KOps/s $\color{#d91a1a}-0.25\%$
test_items_nested 0.7081ms 0.3359ms 2.9775 KOps/s 3.0207 KOps/s $\color{#d91a1a}-1.43\%$
test_items_nested_locked 0.4534ms 0.3332ms 3.0016 KOps/s 2.9998 KOps/s $\color{#35bf28}+0.06\%$
test_items_nested_leaf 0.1647ms 86.4791μs 11.5635 KOps/s 11.5668 KOps/s $\color{#d91a1a}-0.03\%$
test_items_stack_nested 0.5395ms 0.3379ms 2.9591 KOps/s 2.9838 KOps/s $\color{#d91a1a}-0.83\%$
test_items_stack_nested_leaf 0.1631ms 86.6088μs 11.5462 KOps/s 11.5133 KOps/s $\color{#35bf28}+0.29\%$
test_items_stack_nested_locked 0.5430ms 0.3373ms 2.9648 KOps/s 2.9437 KOps/s $\color{#35bf28}+0.72\%$
test_keys 22.2110μs 3.5111μs 284.8111 KOps/s 286.2371 KOps/s $\color{#d91a1a}-0.50\%$
test_keys_nested 0.2151ms 0.1008ms 9.9231 KOps/s 9.9733 KOps/s $\color{#d91a1a}-0.50\%$
test_keys_nested_locked 0.7780ms 0.1043ms 9.5905 KOps/s 9.4500 KOps/s $\color{#35bf28}+1.49\%$
test_keys_nested_leaf 0.1483ms 84.3884μs 11.8500 KOps/s 11.8921 KOps/s $\color{#d91a1a}-0.35\%$
test_keys_stack_nested 0.1650ms 99.8027μs 10.0198 KOps/s 10.1207 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_stack_nested_leaf 0.1569ms 84.3045μs 11.8618 KOps/s 12.0987 KOps/s $\color{#d91a1a}-1.96\%$
test_keys_stack_nested_locked 0.1851ms 0.1013ms 9.8701 KOps/s 9.6446 KOps/s $\color{#35bf28}+2.34\%$
test_values 8.2132μs 1.0809μs 925.1552 KOps/s 907.4758 KOps/s $\color{#35bf28}+1.95\%$
test_values_nested 0.1034ms 48.3684μs 20.6747 KOps/s 21.0064 KOps/s $\color{#d91a1a}-1.58\%$
test_values_nested_locked 0.1020ms 47.9699μs 20.8464 KOps/s 20.2960 KOps/s $\color{#35bf28}+2.71\%$
test_values_nested_leaf 89.8380μs 42.6405μs 23.4519 KOps/s 23.5565 KOps/s $\color{#d91a1a}-0.44\%$
test_values_stack_nested 98.6340μs 47.6539μs 20.9846 KOps/s 20.8446 KOps/s $\color{#35bf28}+0.67\%$
test_values_stack_nested_leaf 94.2060μs 43.3229μs 23.0825 KOps/s 23.6945 KOps/s $\color{#d91a1a}-2.58\%$
test_values_stack_nested_locked 97.7620μs 48.1147μs 20.7837 KOps/s 20.9268 KOps/s $\color{#d91a1a}-0.68\%$
test_membership 32.9620μs 0.8356μs 1.1968 MOps/s 1.1796 MOps/s $\color{#35bf28}+1.46\%$
test_membership_nested 45.6260μs 2.6359μs 379.3729 KOps/s 391.3859 KOps/s $\color{#d91a1a}-3.07\%$
test_membership_nested_leaf 41.6080μs 2.6316μs 380.0041 KOps/s 388.4481 KOps/s $\color{#d91a1a}-2.17\%$
test_membership_stacked_nested 42.8800μs 2.6198μs 381.7139 KOps/s 398.3140 KOps/s $\color{#d91a1a}-4.17\%$
test_membership_stacked_nested_leaf 18.1040μs 2.6204μs 381.6149 KOps/s 391.7684 KOps/s $\color{#d91a1a}-2.59\%$
test_membership_nested_last 46.4470μs 3.8273μs 261.2823 KOps/s 267.5512 KOps/s $\color{#d91a1a}-2.34\%$
test_membership_nested_leaf_last 26.8800μs 3.7729μs 265.0475 KOps/s 266.2409 KOps/s $\color{#d91a1a}-0.45\%$
test_membership_stacked_nested_last 33.9940μs 3.7482μs 266.7962 KOps/s 270.1409 KOps/s $\color{#d91a1a}-1.24\%$
test_membership_stacked_nested_leaf_last 24.7670μs 3.7711μs 265.1743 KOps/s 267.9129 KOps/s $\color{#d91a1a}-1.02\%$
test_nested_getleaf 47.1380μs 10.8122μs 92.4882 KOps/s 93.1485 KOps/s $\color{#d91a1a}-0.71\%$
test_nested_get 31.5290μs 10.3452μs 96.6628 KOps/s 98.8148 KOps/s $\color{#d91a1a}-2.18\%$
test_stacked_getleaf 52.7090μs 10.7705μs 92.8461 KOps/s 93.9737 KOps/s $\color{#d91a1a}-1.20\%$
test_stacked_get 52.3880μs 10.2453μs 97.6062 KOps/s 99.1091 KOps/s $\color{#d91a1a}-1.52\%$
test_nested_getitemleaf 44.4650μs 11.2105μs 89.2020 KOps/s 91.1554 KOps/s $\color{#d91a1a}-2.14\%$
test_nested_getitem 32.2700μs 10.3990μs 96.1627 KOps/s 96.6634 KOps/s $\color{#d91a1a}-0.52\%$
test_stacked_getitemleaf 46.7070μs 11.1027μs 90.0684 KOps/s 91.8153 KOps/s $\color{#d91a1a}-1.90\%$
test_stacked_getitem 57.0370μs 10.4021μs 96.1342 KOps/s 98.1853 KOps/s $\color{#d91a1a}-2.09\%$
test_lock_nested 82.9381ms 0.5584ms 1.7909 KOps/s 2.1233 KOps/s $\textbf{\color{#d91a1a}-15.66\%}$
test_lock_stack_nested 0.6786ms 0.4482ms 2.2312 KOps/s 2.2374 KOps/s $\color{#d91a1a}-0.27\%$
test_unlock_nested 85.8153ms 0.4834ms 2.0686 KOps/s 2.4962 KOps/s $\textbf{\color{#d91a1a}-17.13\%}$
test_unlock_stack_nested 0.5605ms 0.3663ms 2.7303 KOps/s 2.7223 KOps/s $\color{#35bf28}+0.29\%$
test_flatten_speed 0.1874ms 0.1034ms 9.6728 KOps/s 9.5281 KOps/s $\color{#35bf28}+1.52\%$
test_unflatten_speed 0.5298ms 0.4611ms 2.1689 KOps/s 2.1654 KOps/s $\color{#35bf28}+0.16\%$
test_common_ops 6.1934ms 1.1256ms 888.3781 Ops/s 951.0971 Ops/s $\textbf{\color{#d91a1a}-6.59\%}$
test_creation 23.0530μs 2.0594μs 485.5781 KOps/s 468.8565 KOps/s $\color{#35bf28}+3.57\%$
test_creation_empty 46.9080μs 19.1137μs 52.3185 KOps/s 61.2358 KOps/s $\textbf{\color{#d91a1a}-14.56\%}$
test_creation_nested_1 79.8090μs 21.8945μs 45.6736 KOps/s 50.9421 KOps/s $\textbf{\color{#d91a1a}-10.34\%}$
test_creation_nested_2 0.1034ms 26.1580μs 38.2292 KOps/s 42.4081 KOps/s $\textbf{\color{#d91a1a}-9.85\%}$
test_clone 84.3680μs 17.2682μs 57.9098 KOps/s 60.7333 KOps/s $\color{#d91a1a}-4.65\%$
test_getitem[int] 1.0807ms 16.4829μs 60.6688 KOps/s 60.1681 KOps/s $\color{#35bf28}+0.83\%$
test_getitem[slice_int] 0.1322ms 29.8292μs 33.5242 KOps/s 32.1040 KOps/s $\color{#35bf28}+4.42\%$
test_getitem[range] 0.2254ms 60.6393μs 16.4910 KOps/s 18.1145 KOps/s $\textbf{\color{#d91a1a}-8.96\%}$
test_getitem[tuple] 0.1677ms 24.7476μs 40.4079 KOps/s 40.3308 KOps/s $\color{#35bf28}+0.19\%$
test_getitem[list] 0.2247ms 54.1011μs 18.4839 KOps/s 19.5622 KOps/s $\textbf{\color{#d91a1a}-5.51\%}$
test_setitem_dim[int] 86.8930μs 40.3662μs 24.7732 KOps/s 27.1242 KOps/s $\textbf{\color{#d91a1a}-8.67\%}$
test_setitem_dim[slice_int] 0.1203ms 69.4534μs 14.3982 KOps/s 15.3506 KOps/s $\textbf{\color{#d91a1a}-6.20\%}$
test_setitem_dim[range] 0.1747ms 93.6477μs 10.6783 KOps/s 11.2102 KOps/s $\color{#d91a1a}-4.74\%$
test_setitem_dim[tuple] 89.7680μs 56.3488μs 17.7466 KOps/s 18.3554 KOps/s $\color{#d91a1a}-3.32\%$
test_setitem 0.1002ms 29.8071μs 33.5490 KOps/s 35.8209 KOps/s $\textbf{\color{#d91a1a}-6.34\%}$
test_set 95.7990μs 29.2684μs 34.1666 KOps/s 36.9296 KOps/s $\textbf{\color{#d91a1a}-7.48\%}$
test_set_shared 2.4708ms 0.2114ms 4.7307 KOps/s 4.7477 KOps/s $\color{#d91a1a}-0.36\%$
test_update 0.1316ms 36.3574μs 27.5047 KOps/s 29.8577 KOps/s $\textbf{\color{#d91a1a}-7.88\%}$
test_update_nested 0.1249ms 46.3629μs 21.5690 KOps/s 22.3640 KOps/s $\color{#d91a1a}-3.55\%$
test_update__nested 93.2140μs 34.2769μs 29.1742 KOps/s 29.6833 KOps/s $\color{#d91a1a}-1.72\%$
test_set_nested 94.9080μs 31.3045μs 31.9443 KOps/s 33.9977 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_set_nested_new 84.4880μs 36.4999μs 27.3974 KOps/s 29.0288 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_select 1.1552ms 53.6589μs 18.6362 KOps/s 19.2602 KOps/s $\color{#d91a1a}-3.24\%$
test_select_nested 0.1281ms 60.0279μs 16.6589 KOps/s 16.5797 KOps/s $\color{#35bf28}+0.48\%$
test_exclude_nested 0.1450ms 76.0233μs 13.1539 KOps/s 13.2268 KOps/s $\color{#d91a1a}-0.55\%$
test_empty[True] 0.4542ms 0.3180ms 3.1446 KOps/s 3.1531 KOps/s $\color{#d91a1a}-0.27\%$
test_empty[False] 31.6415μs 1.2409μs 805.8638 KOps/s 816.8211 KOps/s $\color{#d91a1a}-1.34\%$
test_unbind_speed 0.3794ms 0.2965ms 3.3725 KOps/s 3.3844 KOps/s $\color{#d91a1a}-0.35\%$
test_unbind_speed_stack0 0.5989ms 0.2958ms 3.3812 KOps/s 3.4551 KOps/s $\color{#d91a1a}-2.14\%$
test_unbind_speed_stack1 87.2133ms 0.8001ms 1.2498 KOps/s 1.3685 KOps/s $\textbf{\color{#d91a1a}-8.68\%}$
test_split 3.1162ms 1.9936ms 501.6173 Ops/s 462.1853 Ops/s $\textbf{\color{#35bf28}+8.53\%}$
test_chunk 89.9514ms 2.3404ms 427.2765 Ops/s 459.4865 Ops/s $\textbf{\color{#d91a1a}-7.01\%}$
test_creation[device0] 4.1368ms 0.1186ms 8.4336 KOps/s 8.6900 KOps/s $\color{#d91a1a}-2.95\%$
test_creation_from_tensor 0.2388ms 0.1148ms 8.7139 KOps/s 8.5743 KOps/s $\color{#35bf28}+1.63\%$
test_add_one[memmap_tensor0] 0.1998ms 7.6249μs 131.1492 KOps/s 143.3581 KOps/s $\textbf{\color{#d91a1a}-8.52\%}$
test_contiguous[memmap_tensor0] 22.1920μs 1.8645μs 536.3234 KOps/s 531.2918 KOps/s $\color{#35bf28}+0.95\%$
test_stack[memmap_tensor0] 37.7010μs 5.6753μs 176.2030 KOps/s 182.6504 KOps/s $\color{#d91a1a}-3.53\%$
test_memmaptd_index 1.1308ms 0.4049ms 2.4700 KOps/s 2.5512 KOps/s $\color{#d91a1a}-3.18\%$
test_memmaptd_index_astensor 1.0327ms 0.4816ms 2.0764 KOps/s 2.1221 KOps/s $\color{#d91a1a}-2.16\%$
test_memmaptd_index_op 1.6901ms 1.0329ms 968.1400 Ops/s 1.0509 KOps/s $\textbf{\color{#d91a1a}-7.88\%}$
test_serialize_model 0.1310s 0.1162s 8.6082 Ops/s 8.2293 Ops/s $\color{#35bf28}+4.60\%$
test_serialize_model_pickle 0.4759s 0.4001s 2.4991 Ops/s 2.4874 Ops/s $\color{#35bf28}+0.47\%$
test_serialize_weights 0.1204s 0.1158s 8.6368 Ops/s 7.4165 Ops/s $\textbf{\color{#35bf28}+16.45\%}$
test_serialize_weights_returnearly 0.1719s 0.1594s 6.2740 Ops/s 6.2210 Ops/s $\color{#35bf28}+0.85\%$
test_serialize_weights_pickle 1.0558s 0.7083s 1.4119 Ops/s 2.3093 Ops/s $\textbf{\color{#d91a1a}-38.86\%}$
test_serialize_weights_filesystem 0.1469s 0.1398s 7.1522 Ops/s 6.8979 Ops/s $\color{#35bf28}+3.69\%$
test_serialize_model_filesystem 0.2284s 0.1523s 6.5655 Ops/s 6.0705 Ops/s $\textbf{\color{#35bf28}+8.15\%}$
test_reshape_pytree 85.9400μs 37.9464μs 26.3530 KOps/s 25.7321 KOps/s $\color{#35bf28}+2.41\%$
test_reshape_td 0.1495ms 47.2780μs 21.1515 KOps/s 21.5529 KOps/s $\color{#d91a1a}-1.86\%$
test_view_pytree 0.1047ms 38.3321μs 26.0878 KOps/s 26.4140 KOps/s $\color{#d91a1a}-1.23\%$
test_view_td 93.9560μs 51.8185μs 19.2981 KOps/s 19.3828 KOps/s $\color{#d91a1a}-0.44\%$
test_unbind_pytree 98.5220μs 36.0286μs 27.7557 KOps/s 28.4316 KOps/s $\color{#d91a1a}-2.38\%$
test_unbind_td 0.3305ms 44.4010μs 22.5220 KOps/s 22.5152 KOps/s $\color{#35bf28}+0.03\%$
test_split_pytree 0.1036ms 37.5538μs 26.6285 KOps/s 26.7111 KOps/s $\color{#d91a1a}-0.31\%$
test_split_td 0.2438ms 56.9980μs 17.5445 KOps/s 17.4017 KOps/s $\color{#35bf28}+0.82\%$
test_add_pytree 0.1017ms 44.7070μs 22.3679 KOps/s 23.0367 KOps/s $\color{#d91a1a}-2.90\%$
test_add_td 0.2229ms 81.5949μs 12.2557 KOps/s 12.8953 KOps/s $\color{#d91a1a}-4.96\%$
test_compile_add_one_nested[tensordict-compile] 0.1330ms 56.3988μs 17.7309 KOps/s 17.5690 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_nested[tensordict-eager] 0.3236ms 0.1816ms 5.5058 KOps/s 5.3790 KOps/s $\color{#35bf28}+2.36\%$
test_compile_add_one_nested[pytree-compile] 0.1283ms 55.9533μs 17.8720 KOps/s 17.5840 KOps/s $\color{#35bf28}+1.64\%$
test_compile_add_one_nested[pytree-eager] 0.2857ms 0.1446ms 6.9179 KOps/s 7.2493 KOps/s $\color{#d91a1a}-4.57\%$
test_compile_copy_nested[tensordict-compile] 47.2880μs 20.6710μs 48.3769 KOps/s 49.1760 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_copy_nested[tensordict-eager] 0.1321ms 66.6212μs 15.0102 KOps/s 15.0777 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_copy_nested[pytree-compile] 0.1469ms 74.4748μs 13.4274 KOps/s 13.2389 KOps/s $\color{#35bf28}+1.42\%$
test_compile_copy_nested[pytree-eager] 0.1387ms 67.8661μs 14.7349 KOps/s 14.6003 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[tensordict-compile] 0.3752ms 0.1717ms 5.8239 KOps/s 5.8251 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_add_one_flat[tensordict-eager] 0.3100ms 0.1852ms 5.3988 KOps/s 5.2700 KOps/s $\color{#35bf28}+2.44\%$
test_compile_add_one_flat[tensorclass-compile] 0.1098ms 45.8242μs 21.8225 KOps/s 21.0290 KOps/s $\color{#35bf28}+3.77\%$
test_compile_add_one_flat[tensorclass-eager] 0.6769ms 67.5142μs 14.8117 KOps/s 14.7493 KOps/s $\color{#35bf28}+0.42\%$
test_compile_add_one_flat[pytree-compile] 0.3238ms 0.1726ms 5.7923 KOps/s 5.7364 KOps/s $\color{#35bf28}+0.97\%$
test_compile_add_one_flat[pytree-eager] 0.5873ms 0.3028ms 3.3022 KOps/s 3.5035 KOps/s $\textbf{\color{#d91a1a}-5.75\%}$
test_compile_add_self_flat[tensordict-eager] 0.2986ms 0.1978ms 5.0559 KOps/s 4.8972 KOps/s $\color{#35bf28}+3.24\%$
test_compile_add_self_flat[tensordict-compile] 0.5712ms 0.1730ms 5.7813 KOps/s 5.7635 KOps/s $\color{#35bf28}+0.31\%$
test_compile_add_self_flat[tensorclass-eager] 0.1238ms 59.8905μs 16.6971 KOps/s 16.1985 KOps/s $\color{#35bf28}+3.08\%$
test_compile_add_self_flat[tensorclass-compile] 92.2420μs 47.8768μs 20.8870 KOps/s 20.4903 KOps/s $\color{#35bf28}+1.94\%$
test_compile_add_self_flat[pytree-eager] 0.4327ms 0.2440ms 4.0991 KOps/s 4.2878 KOps/s $\color{#d91a1a}-4.40\%$
test_compile_add_self_flat[pytree-compile] 0.2638ms 0.1742ms 5.7394 KOps/s 5.6719 KOps/s $\color{#35bf28}+1.19\%$
test_compile_copy_flat[tensordict-compile] 0.2365ms 0.1009ms 9.9121 KOps/s 9.7187 KOps/s $\color{#35bf28}+1.99\%$
test_compile_copy_flat[tensordict-eager] 0.1356ms 59.8672μs 16.7036 KOps/s 16.7114 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_flat[pytree-compile] 0.1420ms 74.9491μs 13.3424 KOps/s 12.9129 KOps/s $\color{#35bf28}+3.33\%$
test_compile_copy_flat[pytree-eager] 0.1349ms 68.1726μs 14.6687 KOps/s 14.3151 KOps/s $\color{#35bf28}+2.47\%$
test_compile_assign_and_add[tensordict-compile] 0.2863ms 0.1950ms 5.1290 KOps/s 5.1195 KOps/s $\color{#35bf28}+0.19\%$
test_compile_assign_and_add[tensordict-eager] 2.8615ms 1.6761ms 596.6323 Ops/s 609.3192 Ops/s $\color{#d91a1a}-2.08\%$
test_compile_assign_and_add[pytree-compile] 0.2803ms 0.1936ms 5.1655 KOps/s 5.1280 KOps/s $\color{#35bf28}+0.73\%$
test_compile_assign_and_add[pytree-eager] 2.0577ms 1.1548ms 865.9164 Ops/s 917.0348 Ops/s $\textbf{\color{#d91a1a}-5.57\%}$
test_compile_assign_and_add_stack[compile] 0.5687ms 0.4295ms 2.3285 KOps/s 2.3725 KOps/s $\color{#d91a1a}-1.86\%$
test_compile_assign_and_add_stack[eager] 4.0710ms 3.8262ms 261.3531 Ops/s 283.2649 Ops/s $\textbf{\color{#d91a1a}-7.74\%}$
test_compile_indexing[tensor-tensordict-compile] 0.1072ms 34.1381μs 29.2928 KOps/s 28.4930 KOps/s $\color{#35bf28}+2.81\%$
test_compile_indexing[tensor-tensordict-eager] 1.0641ms 47.3526μs 21.1182 KOps/s 21.2039 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_indexing[tensor-tensorclass-compile] 98.5440μs 29.6774μs 33.6957 KOps/s 32.5581 KOps/s $\color{#35bf28}+3.49\%$
test_compile_indexing[tensor-tensorclass-eager] 67.9970μs 29.0346μs 34.4417 KOps/s 35.2753 KOps/s $\color{#d91a1a}-2.36\%$
test_compile_indexing[tensor-pytree-compile] 0.1009ms 30.5355μs 32.7488 KOps/s 31.9847 KOps/s $\color{#35bf28}+2.39\%$
test_compile_indexing[tensor-pytree-eager] 97.9830μs 28.7475μs 34.7856 KOps/s 35.5823 KOps/s $\color{#d91a1a}-2.24\%$
test_compile_indexing[slice-tensordict-compile] 0.1689ms 73.3614μs 13.6311 KOps/s 13.5518 KOps/s $\color{#35bf28}+0.59\%$
test_compile_indexing[slice-tensordict-eager] 0.5376ms 26.8754μs 37.2088 KOps/s 35.3473 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1723ms 69.1905μs 14.4529 KOps/s 14.7629 KOps/s $\color{#d91a1a}-2.10\%$
test_compile_indexing[slice-tensorclass-eager] 88.8060μs 22.7526μs 43.9510 KOps/s 43.9518 KOps/s $-0.00\%$
test_compile_indexing[slice-pytree-compile] 0.1422ms 68.2289μs 14.6566 KOps/s 14.7681 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_indexing[slice-pytree-eager] 68.8990μs 22.6401μs 44.1694 KOps/s 43.9827 KOps/s $\color{#35bf28}+0.42\%$
test_compile_indexing[int-tensordict-compile] 0.5148ms 74.6181μs 13.4016 KOps/s 13.6217 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[int-tensordict-eager] 1.1383ms 26.7516μs 37.3810 KOps/s 35.1691 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_compile_indexing[int-tensorclass-compile] 0.1528ms 69.1515μs 14.4610 KOps/s 14.7309 KOps/s $\color{#d91a1a}-1.83\%$
test_compile_indexing[int-tensorclass-eager] 59.7010μs 22.6173μs 44.2139 KOps/s 44.5768 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_indexing[int-pytree-compile] 0.1345ms 68.5514μs 14.5876 KOps/s 14.7494 KOps/s $\color{#d91a1a}-1.10\%$
test_compile_indexing[int-pytree-eager] 82.9650μs 22.6047μs 44.2385 KOps/s 44.7015 KOps/s $\color{#d91a1a}-1.04\%$
test_mod_add[eager] 76.6540μs 24.4125μs 40.9627 KOps/s 43.0117 KOps/s $\color{#d91a1a}-4.76\%$
test_mod_add[compile] 0.1056ms 37.8088μs 26.4489 KOps/s 25.2962 KOps/s $\color{#35bf28}+4.56\%$
test_mod_add[compile-overhead] 0.1048ms 38.4771μs 25.9895 KOps/s 25.2319 KOps/s $\color{#35bf28}+3.00\%$
test_mod_wrap[eager] 0.4008ms 0.2127ms 4.7017 KOps/s 4.9201 KOps/s $\color{#d91a1a}-4.44\%$
test_mod_wrap[compile] 0.3780ms 0.2341ms 4.2713 KOps/s 4.2306 KOps/s $\color{#35bf28}+0.96\%$
test_mod_wrap[compile-overhead] 0.4414ms 0.2371ms 4.2178 KOps/s 4.3048 KOps/s $\color{#d91a1a}-2.02\%$
test_mod_wrap_and_backward[eager] 12.0293ms 10.7565ms 92.9671 Ops/s 94.3404 Ops/s $\color{#d91a1a}-1.46\%$
test_mod_wrap_and_backward[compile] 12.5038ms 10.8942ms 91.7919 Ops/s 91.2447 Ops/s $\color{#35bf28}+0.60\%$
test_mod_wrap_and_backward[compile-overhead] 13.0064ms 10.9209ms 91.5674 Ops/s 89.8358 Ops/s $\color{#35bf28}+1.93\%$
test_seq_add[eager] 0.1731ms 87.2337μs 11.4635 KOps/s 11.7468 KOps/s $\color{#d91a1a}-2.41\%$
test_seq_add[compile] 0.1221ms 63.2962μs 15.7987 KOps/s 15.4943 KOps/s $\color{#35bf28}+1.96\%$
test_seq_add[compile-overhead] 0.1416ms 62.5196μs 15.9950 KOps/s 15.8586 KOps/s $\color{#35bf28}+0.86\%$
test_seq_wrap[eager] 0.5732ms 0.3905ms 2.5611 KOps/s 2.7126 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_seq_wrap[compile] 0.5204ms 0.2731ms 3.6615 KOps/s 3.7612 KOps/s $\color{#d91a1a}-2.65\%$
test_seq_wrap[compile-overhead] 0.4923ms 0.2714ms 3.6851 KOps/s 3.7859 KOps/s $\color{#d91a1a}-2.66\%$
test_func_call_runtime[False-eager] 1.0038ms 0.5396ms 1.8531 KOps/s 1.9502 KOps/s $\color{#d91a1a}-4.98\%$
test_func_call_runtime[False-compile] 0.9366ms 0.5140ms 1.9454 KOps/s 2.0358 KOps/s $\color{#d91a1a}-4.44\%$
test_func_call_runtime[False-compile-overhead] 1.3669ms 0.5188ms 1.9276 KOps/s 2.0374 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_func_call_runtime[True-eager] 1.5727ms 0.7564ms 1.3221 KOps/s 1.3798 KOps/s $\color{#d91a1a}-4.18\%$
test_func_call_runtime[True-compile] 0.8986ms 0.5175ms 1.9323 KOps/s 1.9911 KOps/s $\color{#d91a1a}-2.95\%$
test_func_call_runtime[True-compile-overhead] 0.9358ms 0.5195ms 1.9250 KOps/s 1.9985 KOps/s $\color{#d91a1a}-3.68\%$
test_func_call_cm_runtime[False-eager] 0.7957ms 0.5359ms 1.8661 KOps/s 1.9743 KOps/s $\textbf{\color{#d91a1a}-5.48\%}$
test_func_call_cm_runtime[False-compile] 0.6856ms 0.5201ms 1.9227 KOps/s 1.9935 KOps/s $\color{#d91a1a}-3.55\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7030ms 0.5146ms 1.9433 KOps/s 2.0153 KOps/s $\color{#d91a1a}-3.57\%$
test_func_call_cm_runtime[True-eager] 1.2203ms 0.8910ms 1.1223 KOps/s 1.1746 KOps/s $\color{#d91a1a}-4.46\%$
test_func_call_cm_runtime[True-compile] 0.8901ms 0.7595ms 1.3167 KOps/s 1.3721 KOps/s $\color{#d91a1a}-4.04\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0675ms 0.7602ms 1.3154 KOps/s 1.3757 KOps/s $\color{#d91a1a}-4.38\%$
test_vmap_func_call_cm_runtime[eager] 2.5629ms 1.8774ms 532.6477 Ops/s 542.4046 Ops/s $\color{#d91a1a}-1.80\%$
test_vmap_func_call_cm_runtime[compile] 3.0567ms 1.9545ms 511.6321 Ops/s 529.3164 Ops/s $\color{#d91a1a}-3.34\%$
test_vmap_func_call_cm_runtime[compile-overhead] 3.0241ms 1.9378ms 516.0506 Ops/s 528.3973 Ops/s $\color{#d91a1a}-2.34\%$
test_distributed 0.2568ms 0.1239ms 8.0680 KOps/s 7.9100 KOps/s $\color{#35bf28}+2.00\%$
test_tdmodule 36.5680μs 17.6618μs 56.6193 KOps/s 61.0267 KOps/s $\textbf{\color{#d91a1a}-7.22\%}$
test_tdmodule_dispatch 65.7630μs 36.6749μs 27.2666 KOps/s 29.3337 KOps/s $\textbf{\color{#d91a1a}-7.05\%}$
test_tdseq 49.4320μs 20.2553μs 49.3698 KOps/s 50.6109 KOps/s $\color{#d91a1a}-2.45\%$
test_tdseq_dispatch 97.1910μs 41.5029μs 24.0947 KOps/s 25.3998 KOps/s $\textbf{\color{#d91a1a}-5.14\%}$
test_instantiation_functorch 3.2644ms 1.5743ms 635.2138 Ops/s 634.1760 Ops/s $\color{#35bf28}+0.16\%$
test_instantiation_td 1.8652ms 1.1554ms 865.5054 Ops/s 861.3966 Ops/s $\color{#35bf28}+0.48\%$
test_exec_functorch 0.4106ms 0.1874ms 5.3350 KOps/s 5.4852 KOps/s $\color{#d91a1a}-2.74\%$
test_exec_functional_call 0.3253ms 0.1765ms 5.6668 KOps/s 5.9728 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_exec_td 0.3120ms 0.1688ms 5.9230 KOps/s 6.0703 KOps/s $\color{#d91a1a}-2.43\%$
test_exec_td_decorator 0.9916ms 0.2220ms 4.5035 KOps/s 4.5975 KOps/s $\color{#d91a1a}-2.04\%$
test_vmap_mlp_speed[True-True] 0.8321ms 0.6390ms 1.5649 KOps/s 1.6153 KOps/s $\color{#d91a1a}-3.12\%$
test_vmap_mlp_speed[True-False] 1.0907ms 0.6464ms 1.5470 KOps/s 1.6125 KOps/s $\color{#d91a1a}-4.06\%$
test_vmap_mlp_speed[False-True] 0.6648ms 0.4990ms 2.0041 KOps/s 2.0685 KOps/s $\color{#d91a1a}-3.11\%$
test_vmap_mlp_speed[False-False] 0.7162ms 0.4953ms 2.0189 KOps/s 2.0552 KOps/s $\color{#d91a1a}-1.77\%$
test_vmap_mlp_speed_decorator[True-True] 1.0982ms 0.6282ms 1.5919 KOps/s 1.6648 KOps/s $\color{#d91a1a}-4.38\%$
test_vmap_mlp_speed_decorator[True-False] 0.8509ms 0.6258ms 1.5979 KOps/s 1.6461 KOps/s $\color{#d91a1a}-2.93\%$
test_vmap_mlp_speed_decorator[False-True] 0.8004ms 0.5133ms 1.9482 KOps/s 2.0080 KOps/s $\color{#d91a1a}-2.97\%$
test_vmap_mlp_speed_decorator[False-False] 0.7466ms 0.5141ms 1.9453 KOps/s 2.0135 KOps/s $\color{#d91a1a}-3.39\%$
test_to_module_speed[True] 2.0656ms 1.2834ms 779.1773 Ops/s 774.0285 Ops/s $\color{#35bf28}+0.67\%$
test_to_module_speed[False] 1.7664ms 1.2304ms 812.7364 Ops/s 792.9305 Ops/s $\color{#35bf28}+2.50\%$
test_tc_init 99.0450μs 44.9168μs 22.2634 KOps/s 23.2965 KOps/s $\color{#d91a1a}-4.43\%$
test_tc_init_nested 0.1521ms 86.1465μs 11.6081 KOps/s 11.6420 KOps/s $\color{#d91a1a}-0.29\%$
test_tc_first_layer_tensor 39.0830μs 1.5212μs 657.3601 KOps/s 660.3301 KOps/s $\color{#d91a1a}-0.45\%$
test_tc_first_layer_nontensor 34.9050μs 4.7029μs 212.6343 KOps/s 207.0421 KOps/s $\color{#35bf28}+2.70\%$
test_tc_second_layer_tensor 22.9130μs 2.8298μs 353.3776 KOps/s 351.3518 KOps/s $\color{#35bf28}+0.58\%$
test_tc_second_layer_nontensor 44.5830μs 6.0754μs 164.5985 KOps/s 161.5401 KOps/s $\color{#35bf28}+1.89\%$
test_unbind 0.4838s 13.1776ms 75.8864 Ops/s 73.7831 Ops/s $\color{#35bf28}+2.85\%$
test_full_like 8.8281ms 7.4582ms 134.0802 Ops/s 79.1745 Ops/s $\textbf{\color{#35bf28}+69.35\%}$
test_zeros_like 3.4232ms 2.9410ms 340.0236 Ops/s 138.8001 Ops/s $\textbf{\color{#35bf28}+144.97\%}$
test_ones_like 3.6908ms 3.2471ms 307.9640 Ops/s 127.7473 Ops/s $\textbf{\color{#35bf28}+141.07\%}$
test_clone 6.7749ms 5.5245ms 181.0127 Ops/s 102.4663 Ops/s $\textbf{\color{#35bf28}+76.66\%}$
test_squeeze 64.6400μs 12.9595μs 77.1633 KOps/s 78.2989 KOps/s $\color{#d91a1a}-1.45\%$
test_unsqueeze 0.1688ms 91.0375μs 10.9845 KOps/s 10.7024 KOps/s $\color{#35bf28}+2.64\%$
test_split 0.5122ms 0.1920ms 5.2076 KOps/s 5.0761 KOps/s $\color{#35bf28}+2.59\%$
test_permute 0.3722ms 0.2204ms 4.5379 KOps/s 4.5238 KOps/s $\color{#35bf28}+0.31\%$
test_stack 32.2006ms 26.8205ms 37.2850 Ops/s 40.5113 Ops/s $\textbf{\color{#d91a1a}-7.96\%}$
test_cat 32.3506ms 26.1258ms 38.2764 Ops/s 40.2848 Ops/s $\color{#d91a1a}-4.99\%$

@vmoens vmoens merged commit cf235c1 into gh/vmoens/18/base Sep 10, 2024
44 of 48 checks passed
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: ae7e6170eebeb7026a230596f39f704971e0fc06
Pull Request resolved: #984
@vmoens vmoens deleted the gh/vmoens/18/head branch September 10, 2024 16:43
2 participants