[BugFix] smarter check in set_interaction_type #1088

Merged: 2 commits merged on Nov 14, 2024

Conversation

vmoens (Contributor) commented Nov 14, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 4aff4d2671f1a296c806064c8b65b22252ed52f6
Pull Request resolved: #1088
facebook-github-bot added the CLA Signed label Nov 14, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 1821309ad24827c22c40c41f3544e7a768325f72
Pull Request resolved: #1088
vmoens merged commit 8fe1b84 into gh/vmoens/35/base Nov 14, 2024
5 checks passed
vmoens deleted the gh/vmoens/35/head branch November 14, 2024 06:34
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 1821309ad24827c22c40c41f3544e7a768325f72
Pull Request resolved: #1088

(cherry picked from commit db2b5e6)

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}24$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 35.8980μs 17.4099μs 57.4384 KOps/s 59.5026 KOps/s $\color{#d91a1a}-3.47\%$
test_plain_set_stack_nested 51.4680μs 17.5791μs 56.8858 KOps/s 59.6125 KOps/s $\color{#d91a1a}-4.57\%$
test_plain_set_nested_inplace 52.2180μs 19.3406μs 51.7047 KOps/s 53.2251 KOps/s $\color{#d91a1a}-2.86\%$
test_plain_set_stack_nested_inplace 59.7520μs 19.3289μs 51.7360 KOps/s 52.6789 KOps/s $\color{#d91a1a}-1.79\%$
test_items 30.1560μs 4.1204μs 242.6957 KOps/s 240.8965 KOps/s $\color{#35bf28}+0.75\%$
test_items_nested 0.6654ms 0.3400ms 2.9409 KOps/s 2.8936 KOps/s $\color{#35bf28}+1.64\%$
test_items_nested_locked 0.7330ms 0.3423ms 2.9218 KOps/s 2.9181 KOps/s $\color{#35bf28}+0.13\%$
test_items_nested_leaf 0.1419ms 71.6045μs 13.9656 KOps/s 13.9872 KOps/s $\color{#d91a1a}-0.15\%$
test_items_stack_nested 0.5569ms 0.3435ms 2.9114 KOps/s 2.8615 KOps/s $\color{#35bf28}+1.74\%$
test_items_stack_nested_leaf 0.1566ms 73.8220μs 13.5461 KOps/s 13.6130 KOps/s $\color{#d91a1a}-0.49\%$
test_items_stack_nested_locked 0.5542ms 0.3445ms 2.9027 KOps/s 2.7588 KOps/s $\textbf{\color{#35bf28}+5.21\%}$
test_keys 32.5900μs 3.4914μs 286.4205 KOps/s 287.2921 KOps/s $\color{#d91a1a}-0.30\%$
test_keys_nested 0.2594ms 0.1379ms 7.2520 KOps/s 7.3547 KOps/s $\color{#d91a1a}-1.40\%$
test_keys_nested_locked 1.8391ms 0.1431ms 6.9885 KOps/s 7.0778 KOps/s $\color{#d91a1a}-1.26\%$
test_keys_nested_leaf 0.2300ms 0.1182ms 8.4596 KOps/s 8.5553 KOps/s $\color{#d91a1a}-1.12\%$
test_keys_stack_nested 0.2277ms 0.1382ms 7.2358 KOps/s 7.3932 KOps/s $\color{#d91a1a}-2.13\%$
test_keys_stack_nested_leaf 0.2442ms 0.1182ms 8.4623 KOps/s 8.6740 KOps/s $\color{#d91a1a}-2.44\%$
test_keys_stack_nested_locked 0.2705ms 0.1443ms 6.9316 KOps/s 7.0676 KOps/s $\color{#d91a1a}-1.92\%$
test_values 5.6828μs 1.0276μs 973.1560 KOps/s 956.6345 KOps/s $\color{#35bf28}+1.73\%$
test_values_nested 0.1246ms 54.7153μs 18.2764 KOps/s 17.5363 KOps/s $\color{#35bf28}+4.22\%$
test_values_nested_locked 0.1091ms 54.5701μs 18.3251 KOps/s 17.6756 KOps/s $\color{#35bf28}+3.67\%$
test_values_nested_leaf 0.1199ms 59.7767μs 16.7289 KOps/s 15.9658 KOps/s $\color{#35bf28}+4.78\%$
test_values_stack_nested 0.1094ms 55.7614μs 17.9336 KOps/s 17.3180 KOps/s $\color{#35bf28}+3.55\%$
test_values_stack_nested_leaf 0.1172ms 59.9909μs 16.6692 KOps/s 16.2058 KOps/s $\color{#35bf28}+2.86\%$
test_values_stack_nested_locked 0.1073ms 55.9325μs 17.8787 KOps/s 17.3356 KOps/s $\color{#35bf28}+3.13\%$
test_membership 5.2727μs 0.7247μs 1.3798 MOps/s 1.1426 MOps/s $\textbf{\color{#35bf28}+20.76\%}$
test_membership_nested 23.1340μs 2.7181μs 367.8992 KOps/s 360.9438 KOps/s $\color{#35bf28}+1.93\%$
test_membership_nested_leaf 39.7330μs 2.7443μs 364.3874 KOps/s 364.4274 KOps/s $\color{#d91a1a}-0.01\%$
test_membership_stacked_nested 28.7140μs 2.7354μs 365.5795 KOps/s 363.9033 KOps/s $\color{#35bf28}+0.46\%$
test_membership_stacked_nested_leaf 27.3530μs 2.7189μs 367.8025 KOps/s 350.3703 KOps/s $\color{#35bf28}+4.98\%$
test_membership_nested_last 44.7360μs 3.9182μs 255.2189 KOps/s 245.3872 KOps/s $\color{#35bf28}+4.01\%$
test_membership_nested_leaf_last 21.3790μs 3.9693μs 251.9321 KOps/s 242.8348 KOps/s $\color{#35bf28}+3.75\%$
test_membership_stacked_nested_last 30.6070μs 3.9557μs 252.8016 KOps/s 173.4037 KOps/s $\textbf{\color{#35bf28}+45.79\%}$
test_membership_stacked_nested_leaf_last 37.2990μs 3.9337μs 254.2151 KOps/s 172.3584 KOps/s $\textbf{\color{#35bf28}+47.49\%}$
test_nested_getleaf 48.3100μs 10.8186μs 92.4336 KOps/s 91.3867 KOps/s $\color{#35bf28}+1.15\%$
test_nested_get 40.7960μs 10.3190μs 96.9082 KOps/s 96.0105 KOps/s $\color{#35bf28}+0.94\%$
test_stacked_getleaf 28.4420μs 10.7205μs 93.2792 KOps/s 92.2501 KOps/s $\color{#35bf28}+1.12\%$
test_stacked_get 45.6050μs 10.0502μs 99.5005 KOps/s 96.4684 KOps/s $\color{#35bf28}+3.14\%$
test_nested_getitemleaf 36.1670μs 11.2156μs 89.1618 KOps/s 88.2029 KOps/s $\color{#35bf28}+1.09\%$
test_nested_getitem 37.8410μs 10.4592μs 95.6099 KOps/s 94.0497 KOps/s $\color{#35bf28}+1.66\%$
test_stacked_getitemleaf 38.1510μs 11.1493μs 89.6920 KOps/s 88.3432 KOps/s $\color{#35bf28}+1.53\%$
test_stacked_getitem 32.5110μs 10.4399μs 95.7862 KOps/s 93.6906 KOps/s $\color{#35bf28}+2.24\%$
test_lock_nested 3.0507ms 0.4455ms 2.2445 KOps/s 1.8813 KOps/s $\textbf{\color{#35bf28}+19.31\%}$
test_lock_stack_nested 0.5661ms 0.4083ms 2.4491 KOps/s 2.4981 KOps/s $\color{#d91a1a}-1.96\%$
test_unlock_nested 0.6840ms 0.3604ms 2.7746 KOps/s 2.8090 KOps/s $\color{#d91a1a}-1.23\%$
test_unlock_stack_nested 0.4368ms 0.3274ms 3.0545 KOps/s 3.0989 KOps/s $\color{#d91a1a}-1.43\%$
test_flatten_speed 0.1902ms 91.5098μs 10.9278 KOps/s 10.9602 KOps/s $\color{#d91a1a}-0.30\%$
test_unflatten_speed 0.9850ms 0.4760ms 2.1008 KOps/s 2.1080 KOps/s $\color{#d91a1a}-0.34\%$
test_common_ops 1.4814ms 0.7468ms 1.3390 KOps/s 1.3874 KOps/s $\color{#d91a1a}-3.49\%$
test_creation 24.6960μs 2.2234μs 449.7543 KOps/s 502.1704 KOps/s $\textbf{\color{#d91a1a}-10.44\%}$
test_creation_empty 33.3030μs 10.4013μs 96.1415 KOps/s 117.4298 KOps/s $\textbf{\color{#d91a1a}-18.13\%}$
test_creation_nested_1 39.0030μs 13.3121μs 75.1197 KOps/s 87.4421 KOps/s $\textbf{\color{#d91a1a}-14.09\%}$
test_creation_nested_2 43.5620μs 17.7576μs 56.3139 KOps/s 64.3052 KOps/s $\textbf{\color{#d91a1a}-12.43\%}$
test_clone 1.4160ms 12.8087μs 78.0720 KOps/s 75.0901 KOps/s $\color{#35bf28}+3.97\%$
test_getitem[int] 1.0470ms 12.2020μs 81.9538 KOps/s 76.6822 KOps/s $\textbf{\color{#35bf28}+6.87\%}$
test_getitem[slice_int] 0.1728ms 23.2889μs 42.9389 KOps/s 41.7165 KOps/s $\color{#35bf28}+2.93\%$
test_getitem[range] 0.1725ms 46.9888μs 21.2817 KOps/s 20.5196 KOps/s $\color{#35bf28}+3.71\%$
test_getitem[tuple] 0.1458ms 19.1490μs 52.2222 KOps/s 48.9783 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_getitem[list] 0.1850ms 41.8030μs 23.9217 KOps/s 22.7269 KOps/s $\textbf{\color{#35bf28}+5.26\%}$
test_setitem_dim[int] 53.9310μs 24.6717μs 40.5322 KOps/s 40.2229 KOps/s $\color{#35bf28}+0.77\%$
test_setitem_dim[slice_int] 85.6900μs 50.4789μs 19.8103 KOps/s 19.8099 KOps/s $+0.00\%$
test_setitem_dim[range] 0.1341ms 72.1190μs 13.8660 KOps/s 13.8601 KOps/s $\color{#35bf28}+0.04\%$
test_setitem_dim[tuple] 93.6150μs 39.5271μs 25.2991 KOps/s 25.3483 KOps/s $\color{#d91a1a}-0.19\%$
test_setitem 75.8110μs 19.7138μs 50.7258 KOps/s 53.0860 KOps/s $\color{#d91a1a}-4.45\%$
test_set 82.1330μs 18.7210μs 53.4159 KOps/s 55.1527 KOps/s $\color{#d91a1a}-3.15\%$
test_set_shared 5.0482ms 0.1670ms 5.9871 KOps/s 6.0644 KOps/s $\color{#d91a1a}-1.28\%$
test_update 0.1092ms 20.7850μs 48.1115 KOps/s 50.8792 KOps/s $\textbf{\color{#d91a1a}-5.44\%}$
test_update_nested 84.9490μs 29.4012μs 34.0122 KOps/s 34.2626 KOps/s $\color{#d91a1a}-0.73\%$
test_update__nested 0.6534ms 31.1270μs 32.1265 KOps/s 29.6842 KOps/s $\textbf{\color{#35bf28}+8.23\%}$
test_set_nested 75.2310μs 20.3620μs 49.1111 KOps/s 49.0407 KOps/s $\color{#35bf28}+0.14\%$
test_set_nested_new 73.6270μs 24.8718μs 40.2062 KOps/s 40.3065 KOps/s $\color{#d91a1a}-0.25\%$
test_select 0.1003ms 42.1021μs 23.7518 KOps/s 24.0179 KOps/s $\color{#d91a1a}-1.11\%$
test_select_nested 0.1276ms 59.9147μs 16.6904 KOps/s 16.5380 KOps/s $\color{#35bf28}+0.92\%$
test_exclude_nested 0.1522ms 74.5800μs 13.4084 KOps/s 13.2387 KOps/s $\color{#35bf28}+1.28\%$
test_empty[True] 0.6609ms 0.3518ms 2.8429 KOps/s 2.8311 KOps/s $\color{#35bf28}+0.42\%$
test_empty[False] 17.4502μs 1.2381μs 807.7113 KOps/s 822.5416 KOps/s $\color{#d91a1a}-1.80\%$
test_unbind_speed 0.3726ms 0.2576ms 3.8823 KOps/s 3.8531 KOps/s $\color{#35bf28}+0.76\%$
test_unbind_speed_stack0 0.5714ms 0.2533ms 3.9486 KOps/s 3.9940 KOps/s $\color{#d91a1a}-1.14\%$
test_unbind_speed_stack1 98.8125ms 0.7465ms 1.3397 KOps/s 1.4832 KOps/s $\textbf{\color{#d91a1a}-9.68\%}$
test_split 2.3996ms 1.5599ms 641.0665 Ops/s 581.4415 Ops/s $\textbf{\color{#35bf28}+10.25\%}$
test_chunk 97.4279ms 1.8591ms 537.8844 Ops/s 576.7528 Ops/s $\textbf{\color{#d91a1a}-6.74\%}$
test_consolidate_njt[False-None] 8.5297ms 8.0856ms 123.6766 Ops/s 123.5580 Ops/s $\color{#35bf28}+0.10\%$
test_creation[device0] 3.3240ms 89.2132μs 11.2091 KOps/s 10.6211 KOps/s $\textbf{\color{#35bf28}+5.54\%}$
test_creation_from_tensor 0.2216ms 91.7588μs 10.8981 KOps/s 10.5641 KOps/s $\color{#35bf28}+3.16\%$
test_add_one[memmap_tensor0] 0.2054ms 4.7266μs 211.5698 KOps/s 210.8429 KOps/s $\color{#35bf28}+0.34\%$
test_contiguous[memmap_tensor0] 16.8020μs 0.5285μs 1.8922 MOps/s 1.9099 MOps/s $\color{#d91a1a}-0.93\%$
test_stack[memmap_tensor0] 39.0730μs 3.2637μs 306.3962 KOps/s 305.0429 KOps/s $\color{#35bf28}+0.44\%$
test_memmaptd_index 0.9105ms 0.2319ms 4.3125 KOps/s 4.3231 KOps/s $\color{#d91a1a}-0.25\%$
test_memmaptd_index_astensor 0.7928ms 0.3105ms 3.2204 KOps/s 3.1806 KOps/s $\color{#35bf28}+1.25\%$
test_memmaptd_index_op 1.3215ms 0.5637ms 1.7739 KOps/s 1.8615 KOps/s $\color{#d91a1a}-4.71\%$
test_serialize_model 0.1261s 0.1133s 8.8245 Ops/s 7.5610 Ops/s $\textbf{\color{#35bf28}+16.71\%}$
test_serialize_model_pickle 0.4450s 0.3838s 2.6055 Ops/s 2.4440 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_serialize_weights 0.2223s 0.1284s 7.7852 Ops/s 8.7199 Ops/s $\textbf{\color{#d91a1a}-10.72\%}$
test_serialize_weights_returnearly 0.1650s 0.1577s 6.3430 Ops/s 6.3438 Ops/s $\color{#d91a1a}-0.01\%$
test_serialize_weights_pickle 0.5532s 0.4622s 2.1637 Ops/s 2.5521 Ops/s $\textbf{\color{#d91a1a}-15.22\%}$
test_serialize_weights_filesystem 0.1501s 0.1384s 7.2250 Ops/s 6.4586 Ops/s $\textbf{\color{#35bf28}+11.87\%}$
test_serialize_model_filesystem 0.2532s 0.1557s 6.4209 Ops/s 6.7851 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_reshape_pytree 72.3950μs 26.5996μs 37.5946 KOps/s 36.1634 KOps/s $\color{#35bf28}+3.96\%$
test_reshape_td 79.7890μs 31.7706μs 31.4757 KOps/s 28.4247 KOps/s $\textbf{\color{#35bf28}+10.73\%}$
test_view_pytree 84.0570μs 27.0280μs 36.9986 KOps/s 36.9471 KOps/s $\color{#35bf28}+0.14\%$
test_view_td 93.5140μs 36.4375μs 27.4443 KOps/s 25.6623 KOps/s $\textbf{\color{#35bf28}+6.94\%}$
test_unbind_pytree 74.6890μs 29.7999μs 33.5571 KOps/s 33.2385 KOps/s $\color{#35bf28}+0.96\%$
test_unbind_td 0.3582ms 37.8947μs 26.3889 KOps/s 25.4541 KOps/s $\color{#35bf28}+3.67\%$
test_split_pytree 86.2530μs 30.0286μs 33.3016 KOps/s 33.7539 KOps/s $\color{#d91a1a}-1.34\%$
test_split_td 0.5496ms 43.3783μs 23.0530 KOps/s 22.0282 KOps/s $\color{#35bf28}+4.65\%$
test_add_pytree 0.1164ms 34.9718μs 28.5945 KOps/s 28.1110 KOps/s $\color{#35bf28}+1.72\%$
test_add_td 0.1360ms 51.1460μs 19.5519 KOps/s 18.8706 KOps/s $\color{#35bf28}+3.61\%$
test_compile_add_one_nested[tensordict-compile] 0.1431ms 60.9761μs 16.3999 KOps/s 16.2846 KOps/s $\color{#35bf28}+0.71\%$
test_compile_add_one_nested[tensordict-eager] 0.3061ms 0.1602ms 6.2419 KOps/s 6.0759 KOps/s $\color{#35bf28}+2.73\%$
test_compile_add_one_nested[pytree-compile] 0.1118ms 46.0344μs 21.7229 KOps/s 22.2642 KOps/s $\color{#d91a1a}-2.43\%$
test_compile_add_one_nested[pytree-eager] 0.2503ms 0.1168ms 8.5603 KOps/s 8.3252 KOps/s $\color{#35bf28}+2.82\%$
test_compile_copy_nested[tensordict-compile] 72.4040μs 25.9649μs 38.5136 KOps/s 38.5554 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_copy_nested[tensordict-eager] 0.1152ms 52.8982μs 18.9042 KOps/s 18.1901 KOps/s $\color{#35bf28}+3.93\%$
test_compile_copy_nested[pytree-compile] 0.1707ms 80.8384μs 12.3704 KOps/s 12.2207 KOps/s $\color{#35bf28}+1.22\%$
test_compile_copy_nested[pytree-eager] 0.1389ms 68.5397μs 14.5901 KOps/s 14.5089 KOps/s $\color{#35bf28}+0.56\%$
test_compile_add_one_flat[tensordict-compile] 0.2399ms 0.1039ms 9.6215 KOps/s 9.5147 KOps/s $\color{#35bf28}+1.12\%$
test_compile_add_one_flat[tensordict-eager] 0.3261ms 0.1984ms 5.0412 KOps/s 4.9961 KOps/s $\color{#35bf28}+0.90\%$
test_compile_add_one_flat[tensorclass-compile] 0.1120ms 45.3264μs 22.0622 KOps/s 22.9908 KOps/s $\color{#d91a1a}-4.04\%$
test_compile_add_one_flat[tensorclass-eager] 0.3560ms 61.2062μs 16.3382 KOps/s 16.1087 KOps/s $\color{#35bf28}+1.42\%$
test_compile_add_one_flat[pytree-compile] 0.1561ms 0.1027ms 9.7417 KOps/s 9.8213 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_add_one_flat[pytree-eager] 0.3450ms 0.1977ms 5.0586 KOps/s 4.9116 KOps/s $\color{#35bf28}+2.99\%$
test_compile_add_self_flat[tensordict-eager] 0.4523ms 0.2090ms 4.7836 KOps/s 4.6776 KOps/s $\color{#35bf28}+2.27\%$
test_compile_add_self_flat[tensordict-compile] 0.2061ms 0.1094ms 9.1427 KOps/s 9.4914 KOps/s $\color{#d91a1a}-3.67\%$
test_compile_add_self_flat[tensorclass-eager] 1.1265ms 54.2871μs 18.4206 KOps/s 18.3427 KOps/s $\color{#35bf28}+0.42\%$
test_compile_add_self_flat[tensorclass-compile] 0.1342ms 47.2044μs 21.1845 KOps/s 22.1154 KOps/s $\color{#d91a1a}-4.21\%$
test_compile_add_self_flat[pytree-eager] 0.5619ms 0.1568ms 6.3774 KOps/s 6.1957 KOps/s $\color{#35bf28}+2.93\%$
test_compile_add_self_flat[pytree-compile] 0.1809ms 0.1031ms 9.6953 KOps/s 9.7046 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_copy_flat[tensordict-compile] 50.3340μs 21.0136μs 47.5883 KOps/s 48.6919 KOps/s $\color{#d91a1a}-2.27\%$
test_compile_copy_flat[tensordict-eager] 0.1258ms 58.7718μs 17.0150 KOps/s 16.7179 KOps/s $\color{#35bf28}+1.78\%$
test_compile_copy_flat[pytree-compile] 0.1781ms 80.9412μs 12.3547 KOps/s 11.7140 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_compile_copy_flat[pytree-eager] 0.1549ms 68.8372μs 14.5270 KOps/s 13.9109 KOps/s $\color{#35bf28}+4.43\%$
test_compile_assign_and_add[tensordict-compile] 0.3029ms 0.2069ms 4.8322 KOps/s 4.9000 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_assign_and_add[tensordict-eager] 2.0244ms 1.2755ms 784.0297 Ops/s 778.3010 Ops/s $\color{#35bf28}+0.74\%$
test_compile_assign_and_add[pytree-compile] 0.3636ms 0.1998ms 5.0040 KOps/s 4.9632 KOps/s $\color{#35bf28}+0.82\%$
test_compile_assign_and_add[pytree-eager] 0.9519ms 0.7768ms 1.2874 KOps/s 1.2674 KOps/s $\color{#35bf28}+1.58\%$
test_compile_assign_and_add_stack[compile] 0.5952ms 0.4572ms 2.1872 KOps/s 2.2444 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_assign_and_add_stack[eager] 3.9381ms 2.5073ms 398.8396 Ops/s 416.6871 Ops/s $\color{#d91a1a}-4.28\%$
test_compile_indexing[tensor-tensordict-compile] 0.1171ms 35.2161μs 28.3961 KOps/s 29.8570 KOps/s $\color{#d91a1a}-4.89\%$
test_compile_indexing[tensor-tensordict-eager] 0.4474ms 31.0716μs 32.1838 KOps/s 30.2805 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_compile_indexing[tensor-tensorclass-compile] 72.6550μs 28.9036μs 34.5977 KOps/s 35.0973 KOps/s $\color{#d91a1a}-1.42\%$
test_compile_indexing[tensor-tensorclass-eager] 63.9800μs 23.0559μs 43.3728 KOps/s 43.1566 KOps/s $\color{#35bf28}+0.50\%$
test_compile_indexing[tensor-pytree-compile] 76.7740μs 29.5986μs 33.7854 KOps/s 34.3171 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_indexing[tensor-pytree-eager] 71.9340μs 22.9704μs 43.5342 KOps/s 43.6304 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_indexing[slice-tensordict-compile] 0.1313ms 50.8184μs 19.6779 KOps/s 20.0025 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_indexing[slice-tensordict-eager] 0.4278ms 18.7060μs 53.4587 KOps/s 48.1827 KOps/s $\textbf{\color{#35bf28}+10.95\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1043ms 44.0033μs 22.7256 KOps/s 22.9140 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_indexing[slice-tensorclass-eager] 56.5960μs 18.7048μs 53.4621 KOps/s 51.9357 KOps/s $\color{#35bf28}+2.94\%$
test_compile_indexing[slice-pytree-compile] 0.1153ms 44.6741μs 22.3844 KOps/s 22.5715 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[slice-pytree-eager] 55.4830μs 18.8395μs 53.0800 KOps/s 52.2337 KOps/s $\color{#35bf28}+1.62\%$
test_compile_indexing[int-tensordict-compile] 0.1152ms 51.9602μs 19.2455 KOps/s 19.4042 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_indexing[int-tensordict-eager] 0.8443ms 18.6217μs 53.7007 KOps/s 48.4827 KOps/s $\textbf{\color{#35bf28}+10.76\%}$
test_compile_indexing[int-tensorclass-compile] 0.1034ms 44.8743μs 22.2845 KOps/s 22.7496 KOps/s $\color{#d91a1a}-2.04\%$
test_compile_indexing[int-tensorclass-eager] 53.8900μs 18.7509μs 53.3307 KOps/s 52.7792 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[int-pytree-compile] 0.1030ms 45.0894μs 22.1782 KOps/s 22.7636 KOps/s $\color{#d91a1a}-2.57\%$
test_compile_indexing[int-pytree-eager] 61.3340μs 18.9183μs 52.8588 KOps/s 53.0530 KOps/s $\color{#d91a1a}-0.37\%$
test_mod_add[eager] 87.0750μs 25.3163μs 39.5003 KOps/s 42.3380 KOps/s $\textbf{\color{#d91a1a}-6.70\%}$
test_mod_add[compile] 0.1055ms 44.6326μs 22.4051 KOps/s 23.1459 KOps/s $\color{#d91a1a}-3.20\%$
test_mod_add[compile-overhead] 0.1067ms 45.2310μs 22.1087 KOps/s 22.6220 KOps/s $\color{#d91a1a}-2.27\%$
test_mod_wrap[eager] 0.3573ms 0.2110ms 4.7387 KOps/s 4.7893 KOps/s $\color{#d91a1a}-1.06\%$
test_mod_wrap[compile] 1.2516ms 0.2036ms 4.9124 KOps/s 4.9648 KOps/s $\color{#d91a1a}-1.05\%$
test_mod_wrap[compile-overhead] 1.2395ms 0.2026ms 4.9362 KOps/s 4.9758 KOps/s $\color{#d91a1a}-0.80\%$
test_mod_wrap_and_backward[eager] 15.2617ms 10.9357ms 91.4435 Ops/s 81.0193 Ops/s $\textbf{\color{#35bf28}+12.87\%}$
test_mod_wrap_and_backward[compile] 12.2738ms 11.2457ms 88.9226 Ops/s 70.9064 Ops/s $\textbf{\color{#35bf28}+25.41\%}$
test_mod_wrap_and_backward[compile-overhead] 19.5706ms 12.1533ms 82.2818 Ops/s 79.4250 Ops/s $\color{#35bf28}+3.60\%$
test_seq_add[eager] 0.2233ms 91.0127μs 10.9875 KOps/s 11.3638 KOps/s $\color{#d91a1a}-3.31\%$
test_seq_add[compile] 0.1179ms 60.0787μs 16.6448 KOps/s 17.0088 KOps/s $\color{#d91a1a}-2.14\%$
test_seq_add[compile-overhead] 0.1293ms 58.1478μs 17.1975 KOps/s 17.2803 KOps/s $\color{#d91a1a}-0.48\%$
test_seq_wrap[eager] 0.4903ms 0.3755ms 2.6632 KOps/s 2.6550 KOps/s $\color{#35bf28}+0.31\%$
test_seq_wrap[compile] 0.3730ms 0.2235ms 4.4744 KOps/s 4.4726 KOps/s $\color{#35bf28}+0.04\%$
test_seq_wrap[compile-overhead] 0.3403ms 0.2229ms 4.4868 KOps/s 4.5349 KOps/s $\color{#d91a1a}-1.06\%$
test_func_call_runtime[False-eager] 0.9464ms 0.5355ms 1.8674 KOps/s 1.7853 KOps/s $\color{#35bf28}+4.60\%$
test_func_call_runtime[False-compile] 0.4984ms 0.4150ms 2.4095 KOps/s 2.3990 KOps/s $\color{#35bf28}+0.44\%$
test_func_call_runtime[False-compile-overhead] 0.5425ms 0.4170ms 2.3979 KOps/s 2.3887 KOps/s $\color{#35bf28}+0.38\%$
test_func_call_runtime[True-eager] 0.8786ms 0.7416ms 1.3484 KOps/s 1.3183 KOps/s $\color{#35bf28}+2.28\%$
test_func_call_runtime[True-compile] 0.8126ms 0.4613ms 2.1677 KOps/s 2.1744 KOps/s $\color{#d91a1a}-0.31\%$
test_func_call_runtime[True-compile-overhead] 0.5800ms 0.4630ms 2.1598 KOps/s 2.1677 KOps/s $\color{#d91a1a}-0.36\%$
test_func_call_cm_runtime[False-eager] 0.8765ms 0.5345ms 1.8709 KOps/s 1.8525 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_cm_runtime[False-compile] 0.6834ms 0.4159ms 2.4044 KOps/s 2.3559 KOps/s $\color{#35bf28}+2.06\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5499ms 0.4188ms 2.3876 KOps/s 2.3643 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_cm_runtime[True-eager] 1.9608ms 0.8996ms 1.1117 KOps/s 1.1221 KOps/s $\color{#d91a1a}-0.93\%$
test_func_call_cm_runtime[True-compile] 0.6240ms 0.4840ms 2.0659 KOps/s 2.0600 KOps/s $\color{#35bf28}+0.29\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8676ms 0.4867ms 2.0548 KOps/s 2.0703 KOps/s $\color{#d91a1a}-0.75\%$
test_vmap_func_call_cm_runtime[eager] 3.3883ms 1.8580ms 538.2103 Ops/s 536.2717 Ops/s $\color{#35bf28}+0.36\%$
test_vmap_func_call_cm_runtime[compile] 0.8815ms 0.5098ms 1.9615 KOps/s 1.9600 KOps/s $\color{#35bf28}+0.08\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.6193ms 0.5088ms 1.9655 KOps/s 1.9635 KOps/s $\color{#35bf28}+0.10\%$
test_distributed 0.2950ms 0.1254ms 7.9754 KOps/s 7.7688 KOps/s $\color{#35bf28}+2.66\%$
test_tdmodule 50.9350μs 18.4057μs 54.3310 KOps/s 56.9155 KOps/s $\color{#d91a1a}-4.54\%$
test_tdmodule_dispatch 65.0310μs 36.1947μs 27.6283 KOps/s 29.0057 KOps/s $\color{#d91a1a}-4.75\%$
test_tdseq 46.5970μs 20.5039μs 48.7712 KOps/s 51.3487 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_tdseq_dispatch 61.7850μs 40.3769μs 24.7667 KOps/s 25.9198 KOps/s $\color{#d91a1a}-4.45\%$
test_instantiation_functorch 2.1923ms 1.5491ms 645.5543 Ops/s 640.3584 Ops/s $\color{#35bf28}+0.81\%$
test_exec_functorch 0.3050ms 0.1793ms 5.5777 KOps/s 5.6244 KOps/s $\color{#d91a1a}-0.83\%$
test_exec_functional_call 0.2604ms 0.1707ms 5.8582 KOps/s 5.8937 KOps/s $\color{#d91a1a}-0.60\%$
test_exec_td_decorator 0.5191ms 0.2260ms 4.4250 KOps/s 4.2813 KOps/s $\color{#35bf28}+3.36\%$
test_vmap_mlp_speed_decorator[True-True] 1.0019ms 0.6348ms 1.5754 KOps/s 1.6126 KOps/s $\color{#d91a1a}-2.31\%$
test_vmap_mlp_speed_decorator[True-False] 0.9653ms 0.6329ms 1.5799 KOps/s 1.6163 KOps/s $\color{#d91a1a}-2.25\%$
test_vmap_mlp_speed_decorator[False-True] 0.8380ms 0.5200ms 1.9231 KOps/s 1.9318 KOps/s $\color{#d91a1a}-0.45\%$
test_vmap_mlp_speed_decorator[False-False] 0.6595ms 0.5172ms 1.9335 KOps/s 1.9312 KOps/s $\color{#35bf28}+0.12\%$
test_to_module_speed[True] 1.5102ms 1.3014ms 768.4017 Ops/s 755.4232 Ops/s $\color{#35bf28}+1.72\%$
test_to_module_speed[False] 1.6533ms 1.2791ms 781.8266 Ops/s 792.1184 Ops/s $\color{#d91a1a}-1.30\%$
test_tc_init 72.7460μs 44.2578μs 22.5949 KOps/s 21.9265 KOps/s $\color{#35bf28}+3.05\%$
test_tc_init_nested 0.1821ms 91.8115μs 10.8919 KOps/s 10.9410 KOps/s $\color{#d91a1a}-0.45\%$
test_tc_first_layer_tensor 18.3850μs 1.5711μs 636.4942 KOps/s 631.0430 KOps/s $\color{#35bf28}+0.86\%$
test_tc_first_layer_nontensor 24.3750μs 4.6636μs 214.4251 KOps/s 204.7612 KOps/s $\color{#35bf28}+4.72\%$
test_tc_second_layer_tensor 30.0360μs 2.8133μs 355.4499 KOps/s 339.1885 KOps/s $\color{#35bf28}+4.79\%$
test_tc_second_layer_nontensor 27.6720μs 6.0259μs 165.9505 KOps/s 157.6228 KOps/s $\textbf{\color{#35bf28}+5.28\%}$
test_unbind 0.2251s 12.3892ms 80.7156 Ops/s 77.5525 Ops/s $\color{#35bf28}+4.08\%$
test_full_like 16.6616ms 12.1302ms 82.4388 Ops/s 138.9978 Ops/s $\textbf{\color{#d91a1a}-40.69\%}$
test_zeros_like 12.0422ms 7.0696ms 141.4505 Ops/s 362.8178 Ops/s $\textbf{\color{#d91a1a}-61.01\%}$
test_ones_like 13.6793ms 7.3462ms 136.1245 Ops/s 305.5424 Ops/s $\textbf{\color{#d91a1a}-55.45\%}$
test_clone 12.0543ms 8.9079ms 112.2593 Ops/s 197.8508 Ops/s $\textbf{\color{#d91a1a}-43.26\%}$
test_squeeze 65.6720μs 11.9402μs 83.7503 KOps/s 83.8878 KOps/s $\color{#d91a1a}-0.16\%$
test_unsqueeze 0.1637ms 87.7094μs 11.4013 KOps/s 11.3718 KOps/s $\color{#35bf28}+0.26\%$
test_split 0.4832ms 0.1862ms 5.3703 KOps/s 5.3333 KOps/s $\color{#35bf28}+0.69\%$
test_permute 0.3596ms 0.2108ms 4.7427 KOps/s 4.4723 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_stack 27.1289ms 24.8636ms 40.2194 Ops/s 40.1610 Ops/s $\color{#35bf28}+0.15\%$
test_cat 27.2721ms 24.7036ms 40.4799 Ops/s 40.6633 Ops/s $\color{#d91a1a}-0.45\%$
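
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The bold entries, which match the Improved (24) and Worsened (16) totals reported above, are the ones whose change is at least 5% in either direction. The sketch below reproduces that arithmetic for a few rows copied from the table. The helper names (`parse_ops`, `relative_change`, `classify`) are hypothetical, and the 5% threshold is inferred from the table rather than taken from the benchmark tooling.

```python
# Minimal sketch of how the "Change" column can be recomputed from the table.
# Helper names are illustrative; the 5% cut-off is an assumption inferred
# from which rows are bold in the report.

UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a throughput string such as '1.3798 MOps/s' to ops per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (parse_ops(ops_pr) / parse_ops(ops_head) - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row using the assumed 5% significance threshold."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# Rows copied from the CPU table: (Ops, Ops on Repo HEAD).
rows = {
    "test_membership": ("1.3798 MOps/s", "1.1426 MOps/s"),
    "test_creation_empty": ("96.1415 KOps/s", "117.4298 KOps/s"),
    "test_items": ("242.6957 KOps/s", "240.8965 KOps/s"),
}

for name, (ops_pr, ops_head) in rows.items():
    pct = relative_change(ops_pr, ops_head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because throughput is the reciprocal of the mean time per call, small absolute timing shifts on microsecond-scale benchmarks can produce percentage swings of this size, so individual rows are best read alongside the Max and Mean columns.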


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 33.7500μs 10.7623μs 92.9172 KOps/s 94.3459 KOps/s $\color{#d91a1a}-1.51\%$
test_plain_set_stack_nested 36.2410μs 10.8752μs 91.9524 KOps/s 94.1488 KOps/s $\color{#d91a1a}-2.33\%$
test_plain_set_nested_inplace 41.6300μs 11.6720μs 85.6750 KOps/s 88.1718 KOps/s $\color{#d91a1a}-2.83\%$
test_plain_set_stack_nested_inplace 36.7010μs 11.6223μs 86.0413 KOps/s 87.2667 KOps/s $\color{#d91a1a}-1.40\%$
test_items 34.2500μs 2.8892μs 346.1217 KOps/s 343.4461 KOps/s $\color{#35bf28}+0.78\%$
test_items_nested 0.3850ms 0.3227ms 3.0989 KOps/s 3.1243 KOps/s $\color{#d91a1a}-0.81\%$
test_items_nested_locked 0.3932ms 0.3266ms 3.0618 KOps/s 3.1268 KOps/s $\color{#d91a1a}-2.08\%$
test_items_nested_leaf 90.8420μs 59.3650μs 16.8449 KOps/s 17.0574 KOps/s $\color{#d91a1a}-1.25\%$
test_items_stack_nested 0.5001ms 0.3253ms 3.0737 KOps/s 3.1446 KOps/s $\color{#d91a1a}-2.25\%$
test_items_stack_nested_leaf 78.5520μs 58.8136μs 17.0029 KOps/s 16.8485 KOps/s $\color{#35bf28}+0.92\%$
test_items_stack_nested_locked 0.3829ms 0.3234ms 3.0924 KOps/s 3.1065 KOps/s $\color{#d91a1a}-0.45\%$
test_keys 38.6710μs 3.5111μs 284.8080 KOps/s 285.9702 KOps/s $\color{#d91a1a}-0.41\%$
test_keys_nested 0.1023ms 71.1768μs 14.0495 KOps/s 14.1962 KOps/s $\color{#d91a1a}-1.03\%$
test_keys_nested_locked 0.6883ms 76.4162μs 13.0862 KOps/s 13.1516 KOps/s $\color{#d91a1a}-0.50\%$
test_keys_nested_leaf 0.1080ms 62.4521μs 16.0123 KOps/s 16.1736 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_stack_nested 0.1000ms 71.1642μs 14.0520 KOps/s 14.1424 KOps/s $\color{#d91a1a}-0.64\%$
test_keys_stack_nested_leaf 96.0720μs 61.7908μs 16.1836 KOps/s 16.1973 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_stack_nested_locked 0.1350ms 76.0533μs 13.1487 KOps/s 13.2588 KOps/s $\color{#d91a1a}-0.83\%$
test_values 4.8818μs 0.8570μs 1.1669 MOps/s 1.1697 MOps/s $\color{#d91a1a}-0.24\%$
test_values_nested 58.8810μs 31.3523μs 31.8956 KOps/s 31.9798 KOps/s $\color{#d91a1a}-0.26\%$
test_values_nested_locked 66.1110μs 33.0194μs 30.2852 KOps/s 30.5057 KOps/s $\color{#d91a1a}-0.72\%$
test_values_nested_leaf 70.7110μs 34.0637μs 29.3568 KOps/s 29.6686 KOps/s $\color{#d91a1a}-1.05\%$
test_values_stack_nested 76.4520μs 31.6860μs 31.5596 KOps/s 31.4709 KOps/s $\color{#35bf28}+0.28\%$
test_values_stack_nested_leaf 81.4220μs 34.2122μs 29.2293 KOps/s 29.1831 KOps/s $\color{#35bf28}+0.16\%$
test_values_stack_nested_locked 71.1510μs 33.0992μs 30.2123 KOps/s 30.1312 KOps/s $\color{#35bf28}+0.27\%$
test_membership 2.4145μs 0.5295μs 1.8885 MOps/s 1.9652 MOps/s $\color{#d91a1a}-3.90\%$
test_membership_nested 24.4555μs 1.8643μs 536.3816 KOps/s 506.8249 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_membership_nested_leaf 11.7937μs 1.8556μs 538.9002 KOps/s 522.9324 KOps/s $\color{#35bf28}+3.05\%$
test_membership_stacked_nested 35.6010μs 1.9440μs 514.4162 KOps/s 493.9891 KOps/s $\color{#35bf28}+4.14\%$
test_membership_stacked_nested_leaf 25.6700μs 1.9552μs 511.4659 KOps/s 493.9102 KOps/s $\color{#35bf28}+3.55\%$
test_membership_nested_last 45.3910μs 2.7871μs 358.7929 KOps/s 350.4614 KOps/s $\color{#35bf28}+2.38\%$
test_membership_nested_leaf_last 26.7300μs 2.7962μs 357.6262 KOps/s 353.3583 KOps/s $\color{#35bf28}+1.21\%$
test_membership_stacked_nested_last 29.9710μs 2.8018μs 356.9130 KOps/s 126.4670 KOps/s $\textbf{\color{#35bf28}+182.22\%}$
test_membership_stacked_nested_leaf_last 41.9210μs 2.8017μs 356.9322 KOps/s 126.8591 KOps/s $\textbf{\color{#35bf28}+181.36\%}$
test_nested_getleaf 49.9510μs 6.0254μs 165.9641 KOps/s 164.8005 KOps/s $\color{#35bf28}+0.71\%$
test_nested_get 27.5400μs 5.7537μs 173.8011 KOps/s 175.6067 KOps/s $\color{#d91a1a}-1.03\%$
test_stacked_getleaf 42.0500μs 6.0572μs 165.0926 KOps/s 166.3710 KOps/s $\color{#d91a1a}-0.77\%$
test_stacked_get 36.2800μs 5.7496μs 173.9236 KOps/s 174.8010 KOps/s $\color{#d91a1a}-0.50\%$
test_nested_getitemleaf 40.4310μs 6.1289μs 163.1626 KOps/s 163.2880 KOps/s $\color{#d91a1a}-0.08\%$
test_nested_getitem 29.1100μs 5.8552μs 170.7894 KOps/s 171.9674 KOps/s $\color{#d91a1a}-0.69\%$
test_stacked_getitemleaf 36.0710μs 6.0750μs 164.6087 KOps/s 164.2231 KOps/s $\color{#35bf28}+0.23\%$
test_stacked_getitem 34.1510μs 5.8472μs 171.0216 KOps/s 173.1542 KOps/s $\color{#d91a1a}-1.23\%$
test_lock_nested 9.6395ms 0.3803ms 2.6298 KOps/s 2.6612 KOps/s $\color{#d91a1a}-1.18\%$
test_lock_stack_nested 0.4431ms 0.3404ms 2.9377 KOps/s 2.9813 KOps/s $\color{#d91a1a}-1.46\%$
test_unlock_nested 0.6603ms 0.3130ms 3.1953 KOps/s 3.2356 KOps/s $\color{#d91a1a}-1.25\%$
test_unlock_stack_nested 0.3399ms 0.2818ms 3.5484 KOps/s 3.6744 KOps/s $\color{#d91a1a}-3.43\%$
test_flatten_speed 0.1068ms 73.5638μs 13.5937 KOps/s 13.7696 KOps/s $\color{#d91a1a}-1.28\%$
test_unflatten_speed 0.3500ms 0.2954ms 3.3852 KOps/s 3.3745 KOps/s $\color{#35bf28}+0.32\%$
test_common_ops 1.7912ms 0.6009ms 1.6641 KOps/s 1.6789 KOps/s $\color{#d91a1a}-0.88\%$
test_creation 0.1025ms 1.4889μs 671.6513 KOps/s 672.4776 KOps/s $\color{#d91a1a}-0.12\%$
test_creation_empty 30.1100μs 7.6515μs 130.6941 KOps/s 136.7355 KOps/s $\color{#d91a1a}-4.42\%$
test_creation_nested_1 31.7800μs 9.1503μs 109.2860 KOps/s 113.6074 KOps/s $\color{#d91a1a}-3.80\%$
test_creation_nested_2 39.2510μs 11.6677μs 85.7064 KOps/s 88.3408 KOps/s $\color{#d91a1a}-2.98\%$
test_clone 59.1810μs 11.2016μs 89.2731 KOps/s 89.8928 KOps/s $\color{#d91a1a}-0.69\%$
test_getitem[int] 93.5310ms 16.4671μs 60.7271 KOps/s 91.1164 KOps/s $\textbf{\color{#d91a1a}-33.35\%}$
test_getitem[slice_int] 0.1043ms 21.4566μs 46.6057 KOps/s 46.8429 KOps/s $\color{#d91a1a}-0.51\%$
test_getitem[range] 0.1377ms 38.7654μs 25.7962 KOps/s 25.7847 KOps/s $\color{#35bf28}+0.04\%$
test_getitem[tuple] 0.1221ms 18.6106μs 53.7329 KOps/s 53.4467 KOps/s $\color{#35bf28}+0.54\%$
test_getitem[list] 0.2125ms 34.5301μs 28.9602 KOps/s 28.9516 KOps/s $\color{#35bf28}+0.03\%$
test_setitem_dim[int] 42.8810μs 19.9163μs 50.2101 KOps/s 49.8760 KOps/s $\color{#35bf28}+0.67\%$
test_setitem_dim[slice_int] 61.0610μs 38.5563μs 25.9361 KOps/s 25.7876 KOps/s $\color{#35bf28}+0.58\%$
test_setitem_dim[range] 86.6220μs 54.9463μs 18.1996 KOps/s 18.6757 KOps/s $\color{#d91a1a}-2.55\%$
test_setitem_dim[tuple] 52.3710μs 32.4482μs 30.8183 KOps/s 30.8353 KOps/s $\color{#d91a1a}-0.05\%$
test_setitem 89.8820μs 15.5334μs 64.3774 KOps/s 63.9287 KOps/s $\color{#35bf28}+0.70\%$
test_set 93.2420μs 14.8652μs 67.2710 KOps/s 67.4911 KOps/s $\color{#d91a1a}-0.33\%$
test_set_shared 1.6582ms 0.1482ms 6.7494 KOps/s 6.7278 KOps/s $\color{#35bf28}+0.32\%$
test_update 53.0210μs 17.8934μs 55.8864 KOps/s 56.6615 KOps/s $\color{#d91a1a}-1.37\%$
test_update_nested 0.2090ms 22.9991μs 43.4800 KOps/s 45.4355 KOps/s $\color{#d91a1a}-4.30\%$
test_update__nested 0.1298ms 24.8517μs 40.2387 KOps/s 39.6895 KOps/s $\color{#35bf28}+1.38\%$
test_set_nested 75.9120μs 15.9931μs 62.5270 KOps/s 61.1379 KOps/s $\color{#35bf28}+2.27\%$
test_set_nested_new 84.9220μs 18.3412μs 54.5220 KOps/s 54.0751 KOps/s $\color{#35bf28}+0.83\%$
test_select 85.6320μs 30.6303μs 32.6474 KOps/s 31.2942 KOps/s $\color{#35bf28}+4.32\%$
test_select_nested 70.6120μs 42.0171μs 23.7998 KOps/s 23.7503 KOps/s $\color{#35bf28}+0.21\%$
test_exclude_nested 0.1076ms 59.5642μs 16.7886 KOps/s 16.8401 KOps/s $\color{#d91a1a}-0.31\%$
test_empty[True] 0.3288ms 0.2578ms 3.8785 KOps/s 3.9434 KOps/s $\color{#d91a1a}-1.64\%$
test_empty[False] 3.9251μs 0.7580μs 1.3192 MOps/s 1.3421 MOps/s $\color{#d91a1a}-1.71\%$
test_to 87.4620μs 56.4897μs 17.7023 KOps/s 17.8787 KOps/s $\color{#d91a1a}-0.99\%$
test_to_nonblocking 96.0820μs 46.7373μs 21.3962 KOps/s 20.9780 KOps/s $\color{#35bf28}+1.99\%$
test_unbind_speed 1.5702ms 0.2390ms 4.1849 KOps/s 4.2262 KOps/s $\color{#d91a1a}-0.98\%$
test_unbind_speed_stack0 0.3103ms 0.2391ms 4.1819 KOps/s 4.3364 KOps/s $\color{#d91a1a}-3.56\%$
test_unbind_speed_stack1 93.8259ms 0.6557ms 1.5250 KOps/s 1.6902 KOps/s $\textbf{\color{#d91a1a}-9.78\%}$
test_split 96.7470ms 1.7718ms 564.3994 Ops/s 605.8706 Ops/s $\textbf{\color{#d91a1a}-6.84\%}$
test_chunk 97.9202ms 1.6416ms 609.1538 Ops/s 609.2520 Ops/s $\color{#d91a1a}-0.02\%$
test_consolidate[False-None] 2.7656ms 2.6445ms 378.1372 Ops/s 347.5363 Ops/s $\textbf{\color{#35bf28}+8.81\%}$
test_consolidate[default-None] 1.8503ms 1.7537ms 570.2139 Ops/s 571.6644 Ops/s $\color{#d91a1a}-0.25\%$
test_consolidate[reduce-overhead-None] 1.9233ms 1.7998ms 555.6267 Ops/s 554.5658 Ops/s $\color{#35bf28}+0.19\%$
test_consolidate_njt[False-None] 6.9521ms 6.7030ms 149.1868 Ops/s 152.0320 Ops/s $\color{#d91a1a}-1.87\%$
test_to[False-False-None] 1.8172ms 1.7098ms 584.8551 Ops/s 575.9916 Ops/s $\color{#35bf28}+1.54\%$
test_to[True-False-None] 1.5669ms 1.3300ms 751.9054 Ops/s 743.7323 Ops/s $\color{#35bf28}+1.10\%$
test_to[within-False-None] 4.2369ms 4.1041ms 243.6601 Ops/s 244.8456 Ops/s $\color{#d91a1a}-0.48\%$
test_to[True-default-None] 5.7436ms 5.3412ms 187.2244 Ops/s 186.6455 Ops/s $\color{#35bf28}+0.31\%$
test_to_njt[False-False-None] 7.2876ms 7.0243ms 142.3636 Ops/s 141.9494 Ops/s $\color{#35bf28}+0.29\%$
test_to_njt[True-False-None] 5.8436ms 5.6123ms 178.1794 Ops/s 178.1171 Ops/s $\color{#35bf28}+0.04\%$
test_to_njt[within-False-None] 12.7395ms 12.3442ms 81.0095 Ops/s 82.7041 Ops/s $\color{#d91a1a}-2.05\%$
test_creation[device0] 0.4694ms 80.2694μs 12.4581 KOps/s 12.2100 KOps/s $\color{#35bf28}+2.03\%$
test_creation_from_tensor 0.5360ms 84.9980μs 11.7650 KOps/s 11.3722 KOps/s $\color{#35bf28}+3.45\%$
test_add_one[memmap_tensor0] 0.3331ms 7.2827μs 137.3110 KOps/s 134.5047 KOps/s $\color{#35bf28}+2.09\%$
test_contiguous[memmap_tensor0] 2.0200μs 0.4153μs 2.4081 MOps/s 2.3796 MOps/s $\color{#35bf28}+1.20\%$
test_stack[memmap_tensor0] 42.1610μs 4.8438μs 206.4492 KOps/s 210.5523 KOps/s $\color{#d91a1a}-1.95\%$
test_memmaptd_index 1.9716ms 0.2590ms 3.8608 KOps/s 3.8935 KOps/s $\color{#d91a1a}-0.84\%$
test_memmaptd_index_astensor 0.6046ms 0.3176ms 3.1484 KOps/s 3.1779 KOps/s $\color{#d91a1a}-0.93\%$
test_memmaptd_index_op 1.0307ms 0.6048ms 1.6534 KOps/s 1.6371 KOps/s $\color{#35bf28}+1.00\%$
test_serialize_model 0.1326s 0.1317s 7.5909 Ops/s 7.5175 Ops/s $\color{#35bf28}+0.98\%$
test_serialize_model_pickle 1.3465s 1.2186s 0.8206 Ops/s 0.8413 Ops/s $\color{#d91a1a}-2.46\%$
test_serialize_weights 0.1331s 0.1314s 7.6111 Ops/s 7.6193 Ops/s $\color{#d91a1a}-0.11\%$
test_serialize_weights_returnearly 0.3269s 55.0275ms 18.1727 Ops/s 12.5207 Ops/s $\textbf{\color{#35bf28}+45.14\%}$
test_serialize_weights_pickle 1.3737s 1.1994s 0.8337 Ops/s 0.8198 Ops/s $\color{#35bf28}+1.70\%$
test_reshape_pytree 66.1010μs 22.7706μs 43.9163 KOps/s 43.6830 KOps/s $\color{#35bf28}+0.53\%$
test_reshape_td 55.1910μs 26.9526μs 37.1022 KOps/s 37.3473 KOps/s $\color{#d91a1a}-0.66\%$
test_view_pytree 52.3110μs 22.4515μs 44.5405 KOps/s 44.5742 KOps/s $\color{#d91a1a}-0.08\%$
test_view_td 70.1110μs 30.9671μs 32.2923 KOps/s 31.7645 KOps/s $\color{#35bf28}+1.66\%$
test_unbind_pytree 53.9310μs 28.5524μs 35.0234 KOps/s 35.0781 KOps/s $\color{#d91a1a}-0.16\%$
test_unbind_td 0.5969ms 36.7331μs 27.2234 KOps/s 27.7694 KOps/s $\color{#d91a1a}-1.97\%$
test_split_pytree 75.4020μs 30.8515μs 32.4133 KOps/s 32.2672 KOps/s $\color{#35bf28}+0.45\%$
test_split_td 0.5558ms 39.5271μs 25.2991 KOps/s 24.8802 KOps/s $\color{#35bf28}+1.68\%$
test_add_pytree 72.0210μs 36.0855μs 27.7120 KOps/s 27.2830 KOps/s $\color{#35bf28}+1.57\%$
test_add_td 84.1620μs 48.3141μs 20.6979 KOps/s 20.9309 KOps/s $\color{#d91a1a}-1.11\%$
test_compile_add_one_nested[tensordict-compile] 0.2508ms 0.1221ms 8.1911 KOps/s 8.0022 KOps/s $\color{#35bf28}+2.36\%$
test_compile_add_one_nested[tensordict-eager] 0.2145ms 0.1257ms 7.9536 KOps/s 7.6805 KOps/s $\color{#35bf28}+3.56\%$
test_compile_add_one_nested[pytree-compile] 0.1577ms 0.1003ms 9.9728 KOps/s 10.0860 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_add_one_nested[pytree-eager] 1.4397ms 0.1541ms 6.4883 KOps/s 6.3393 KOps/s $\color{#35bf28}+2.35\%$
test_compile_copy_nested[tensordict-compile] 89.8220μs 25.7582μs 38.8226 KOps/s 43.4929 KOps/s $\textbf{\color{#d91a1a}-10.74\%}$
test_compile_copy_nested[tensordict-eager] 58.7210μs 26.9453μs 37.1122 KOps/s 36.3689 KOps/s $\color{#35bf28}+2.04\%$
test_compile_copy_nested[pytree-compile] 99.9220μs 64.6933μs 15.4576 KOps/s 15.2020 KOps/s $\color{#35bf28}+1.68\%$
test_compile_copy_nested[pytree-eager] 76.7810μs 49.7177μs 20.1136 KOps/s 19.8415 KOps/s $\color{#35bf28}+1.37\%$
test_compile_add_one_flat[tensordict-compile] 0.1891ms 0.1458ms 6.8606 KOps/s 6.7870 KOps/s $\color{#35bf28}+1.08\%$
test_compile_add_one_flat[tensordict-eager] 0.3209ms 0.2090ms 4.7846 KOps/s 4.7279 KOps/s $\color{#35bf28}+1.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.1543ms 0.1013ms 9.8677 KOps/s 9.8911 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_add_one_flat[tensorclass-eager] 0.1096ms 50.8540μs 19.6641 KOps/s 19.1639 KOps/s $\color{#35bf28}+2.61\%$
test_compile_add_one_flat[pytree-compile] 0.1824ms 0.1400ms 7.1444 KOps/s 7.1959 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_add_one_flat[pytree-eager] 0.7114ms 0.5025ms 1.9900 KOps/s 1.9678 KOps/s $\color{#35bf28}+1.13\%$
test_compile_add_self_flat[tensordict-eager] 0.4087ms 0.2474ms 4.0414 KOps/s 3.9382 KOps/s $\color{#35bf28}+2.62\%$
test_compile_add_self_flat[tensordict-compile] 0.2045ms 0.1476ms 6.7736 KOps/s 6.9139 KOps/s $\color{#d91a1a}-2.03\%$
test_compile_add_self_flat[tensorclass-eager] 0.1459ms 62.3067μs 16.0496 KOps/s 16.1479 KOps/s $\color{#d91a1a}-0.61\%$
test_compile_add_self_flat[tensorclass-compile] 0.1594ms 0.1011ms 9.8934 KOps/s 9.8562 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_self_flat[pytree-eager] 0.5525ms 0.4248ms 2.3540 KOps/s 2.3319 KOps/s $\color{#35bf28}+0.95\%$
test_compile_add_self_flat[pytree-compile] 0.1912ms 0.1392ms 7.1848 KOps/s 7.2919 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_copy_flat[tensordict-compile] 53.8110μs 20.1780μs 49.5588 KOps/s 54.1490 KOps/s $\textbf{\color{#d91a1a}-8.48\%}$
test_compile_copy_flat[tensordict-eager] 55.9910μs 27.0975μs 36.9037 KOps/s 37.2673 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_copy_flat[pytree-compile] 0.2515ms 71.2109μs 14.0428 KOps/s 14.1742 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_copy_flat[pytree-eager] 85.2410μs 51.6491μs 19.3614 KOps/s 19.2454 KOps/s $\color{#35bf28}+0.60\%$
test_compile_assign_and_add[tensordict-compile] 1.6599ms 0.4056ms 2.4655 KOps/s 2.1732 KOps/s $\textbf{\color{#35bf28}+13.45\%}$
test_compile_assign_and_add[tensordict-eager] 2.7789ms 2.6282ms 380.4955 Ops/s 373.7844 Ops/s $\color{#35bf28}+1.80\%$
test_compile_assign_and_add[pytree-compile] 1.6445ms 0.4461ms 2.2416 KOps/s 2.1622 KOps/s $\color{#35bf28}+3.67\%$
test_compile_assign_and_add[pytree-eager] 2.8831ms 2.7447ms 364.3346 Ops/s 358.9467 Ops/s $\color{#35bf28}+1.50\%$
test_compile_indexing[tensor-tensordict-compile] 0.2509ms 0.1191ms 8.3978 KOps/s 8.3493 KOps/s $\color{#35bf28}+0.58\%$
test_compile_indexing[tensor-tensordict-eager] 0.5916ms 81.5861μs 12.2570 KOps/s 11.3601 KOps/s $\textbf{\color{#35bf28}+7.89\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1836ms 0.1103ms 9.0668 KOps/s 8.7364 KOps/s $\color{#35bf28}+3.78\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1092ms 69.8144μs 14.3237 KOps/s 13.8874 KOps/s $\color{#35bf28}+3.14\%$
test_compile_indexing[tensor-pytree-compile] 0.1825ms 0.1094ms 9.1393 KOps/s 8.7772 KOps/s $\color{#35bf28}+4.13\%$
test_compile_indexing[tensor-pytree-eager] 0.1292ms 70.4171μs 14.2011 KOps/s 14.0270 KOps/s $\color{#35bf28}+1.24\%$
test_compile_indexing[slice-tensordict-compile] 0.1535ms 0.1031ms 9.6972 KOps/s 9.7970 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_indexing[slice-tensordict-eager] 0.1655ms 17.0428μs 58.6759 KOps/s 53.5134 KOps/s $\textbf{\color{#35bf28}+9.65\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1345ms 98.9034μs 10.1109 KOps/s 10.1978 KOps/s $\color{#d91a1a}-0.85\%$
test_compile_indexing[slice-tensorclass-eager] 47.7610μs 16.5459μs 60.4381 KOps/s 60.7227 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_indexing[slice-pytree-compile] 0.1485ms 99.1287μs 10.0879 KOps/s 10.0549 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[slice-pytree-eager] 44.4710μs 16.2565μs 61.5138 KOps/s 61.0925 KOps/s $\color{#35bf28}+0.69\%$
test_compile_indexing[int-tensordict-compile] 0.1675ms 0.1029ms 9.7151 KOps/s 9.6516 KOps/s $\color{#35bf28}+0.66\%$
test_compile_indexing[int-tensordict-eager] 0.5590ms 17.3865μs 57.5158 KOps/s 55.5459 KOps/s $\color{#35bf28}+3.55\%$
test_compile_indexing[int-tensorclass-compile] 0.1400ms 99.5370μs 10.0465 KOps/s 10.0543 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[int-tensorclass-eager] 53.7210μs 16.3263μs 61.2510 KOps/s 60.6490 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[int-pytree-compile] 0.1463ms 99.3950μs 10.0609 KOps/s 10.1683 KOps/s $\color{#d91a1a}-1.06\%$
test_compile_indexing[int-pytree-eager] 0.1629ms 16.2237μs 61.6382 KOps/s 61.3353 KOps/s $\color{#35bf28}+0.49\%$
test_mod_add[eager] 73.4610μs 32.3178μs 30.9427 KOps/s 29.1566 KOps/s $\textbf{\color{#35bf28}+6.13\%}$
test_mod_add[compile] 0.1419ms 78.2576μs 12.7783 KOps/s 12.4485 KOps/s $\color{#35bf28}+2.65\%$
test_mod_add[compile-overhead] 0.3213ms 0.1680ms 5.9524 KOps/s 5.3197 KOps/s $\textbf{\color{#35bf28}+11.89\%}$
test_mod_wrap[eager] 0.3275ms 0.2498ms 4.0024 KOps/s 3.8193 KOps/s $\color{#35bf28}+4.79\%$
test_mod_wrap[compile] 1.5766ms 0.2884ms 3.4672 KOps/s 3.4021 KOps/s $\color{#35bf28}+1.92\%$
test_mod_wrap[compile-overhead] 6.9695ms 3.6735ms 272.2188 Ops/s 265.3132 Ops/s $\color{#35bf28}+2.60\%$
test_mod_wrap_and_backward[eager] 1.6886ms 1.4050ms 711.7633 Ops/s 673.9979 Ops/s $\textbf{\color{#35bf28}+5.60\%}$
test_mod_wrap_and_backward[compile] 1.5902ms 1.2889ms 775.8538 Ops/s 709.8176 Ops/s $\textbf{\color{#35bf28}+9.30\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3768ms 0.9335ms 1.0712 KOps/s 952.2861 Ops/s $\textbf{\color{#35bf28}+12.49\%}$
test_seq_add[eager] 0.1666ms 0.1040ms 9.6120 KOps/s 9.9522 KOps/s $\color{#d91a1a}-3.42\%$
test_seq_add[compile] 0.5616ms 92.5478μs 10.8052 KOps/s 11.1444 KOps/s $\color{#d91a1a}-3.04\%$
test_seq_add[compile-overhead] 0.1955ms 0.1340ms 7.4641 KOps/s 7.6642 KOps/s $\color{#d91a1a}-2.61\%$
test_seq_wrap[eager] 0.8132ms 0.4043ms 2.4736 KOps/s 2.5700 KOps/s $\color{#d91a1a}-3.75\%$
test_seq_wrap[compile] 0.4015ms 0.3051ms 3.2777 KOps/s 3.2272 KOps/s $\color{#35bf28}+1.56\%$
test_seq_wrap[compile-overhead] 0.6640ms 0.2271ms 4.4027 KOps/s 4.3856 KOps/s $\color{#35bf28}+0.39\%$
test_func_call_runtime[False-eager] 1.1891ms 0.7628ms 1.3110 KOps/s 1.2813 KOps/s $\color{#35bf28}+2.31\%$
test_func_call_runtime[False-compile] 1.2136ms 0.7635ms 1.3098 KOps/s 1.3064 KOps/s $\color{#35bf28}+0.26\%$
test_func_call_runtime[False-compile-overhead] 0.4646ms 0.3726ms 2.6836 KOps/s 2.6796 KOps/s $\color{#35bf28}+0.15\%$
test_func_call_runtime[True-eager] 1.3812ms 0.9279ms 1.0777 KOps/s 1.0587 KOps/s $\color{#35bf28}+1.79\%$
test_func_call_runtime[True-compile] 1.2411ms 0.7846ms 1.2746 KOps/s 1.2651 KOps/s $\color{#35bf28}+0.75\%$
test_func_call_runtime[True-compile-overhead] 0.8285ms 0.3918ms 2.5526 KOps/s 2.5445 KOps/s $\color{#35bf28}+0.32\%$
test_func_call_cm_runtime[False-eager] 1.2021ms 0.7611ms 1.3139 KOps/s 1.2845 KOps/s $\color{#35bf28}+2.29\%$
test_func_call_cm_runtime[False-compile] 1.2322ms 0.7855ms 1.2731 KOps/s 1.2952 KOps/s $\color{#d91a1a}-1.70\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4295ms 0.3722ms 2.6869 KOps/s 2.6835 KOps/s $\color{#35bf28}+0.13\%$
test_func_call_cm_runtime[True-eager] 1.1865ms 1.0147ms 985.5607 Ops/s 959.9797 Ops/s $\color{#35bf28}+2.66\%$
test_func_call_cm_runtime[True-compile] 1.2869ms 0.8108ms 1.2333 KOps/s 1.2264 KOps/s $\color{#35bf28}+0.56\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5390ms 0.4254ms 2.3507 KOps/s 2.3747 KOps/s $\color{#d91a1a}-1.01\%$
test_vmap_func_call_cm_runtime[eager] 2.6534ms 2.1621ms 462.5229 Ops/s 467.0256 Ops/s $\color{#d91a1a}-0.96\%$
test_vmap_func_call_cm_runtime[compile] 1.3475ms 0.8428ms 1.1866 KOps/s 1.1883 KOps/s $\color{#d91a1a}-0.15\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4858ms 0.4220ms 2.3697 KOps/s 2.3526 KOps/s $\color{#35bf28}+0.73\%$
test_distributed 3.0622ms 0.1799ms 5.5581 KOps/s 8.4847 KOps/s $\textbf{\color{#d91a1a}-34.49\%}$
test_tdmodule 0.1282ms 13.8087μs 72.4179 KOps/s 69.6237 KOps/s $\color{#35bf28}+4.01\%$
test_tdmodule_dispatch 47.1010μs 27.1238μs 36.8681 KOps/s 35.4598 KOps/s $\color{#35bf28}+3.97\%$
test_tdseq 33.9600μs 15.1551μs 65.9845 KOps/s 63.4530 KOps/s $\color{#35bf28}+3.99\%$
test_tdseq_dispatch 52.5220μs 30.1610μs 33.1554 KOps/s 31.9017 KOps/s $\color{#35bf28}+3.93\%$
test_instantiation_functorch 1.7651ms 1.5590ms 641.4203 Ops/s 616.5528 Ops/s $\color{#35bf28}+4.03\%$
test_exec_functorch 0.2492ms 0.1510ms 6.6209 KOps/s 6.6035 KOps/s $\color{#35bf28}+0.26\%$
test_exec_functional_call 0.1939ms 0.1425ms 7.0173 KOps/s 6.8553 KOps/s $\color{#35bf28}+2.36\%$
test_exec_td_decorator 0.3760ms 0.1863ms 5.3686 KOps/s 5.2104 KOps/s $\color{#35bf28}+3.04\%$
test_vmap_mlp_speed_decorator[True-True] 1.1269ms 0.6850ms 1.4598 KOps/s 1.4503 KOps/s $\color{#35bf28}+0.66\%$
test_vmap_mlp_speed_decorator[True-False] 1.0993ms 0.6837ms 1.4626 KOps/s 1.4464 KOps/s $\color{#35bf28}+1.12\%$
test_vmap_mlp_speed_decorator[False-True] 1.0784ms 0.6058ms 1.6506 KOps/s 1.5869 KOps/s $\color{#35bf28}+4.02\%$
test_vmap_mlp_speed_decorator[False-False] 1.0319ms 0.6035ms 1.6570 KOps/s 1.5886 KOps/s $\color{#35bf28}+4.31\%$
test_vmap_transformer_speed_decorator[True-True] 20.5019ms 19.5734ms 51.0896 Ops/s 50.5036 Ops/s $\color{#35bf28}+1.16\%$
test_vmap_transformer_speed_decorator[True-False] 20.0139ms 19.6333ms 50.9339 Ops/s 50.6989 Ops/s $\color{#35bf28}+0.46\%$
test_vmap_transformer_speed_decorator[False-True] 19.8487ms 19.4924ms 51.3020 Ops/s 50.7647 Ops/s $\color{#35bf28}+1.06\%$
test_vmap_transformer_speed_decorator[False-False] 20.0441ms 19.4771ms 51.3425 Ops/s 50.6562 Ops/s $\color{#35bf28}+1.35\%$
test_to_module_speed[True] 2.2927ms 0.9400ms 1.0639 KOps/s 1.0635 KOps/s $\color{#35bf28}+0.04\%$
test_to_module_speed[False] 1.0583ms 0.9215ms 1.0852 KOps/s 1.0855 KOps/s $\color{#d91a1a}-0.02\%$
test_tc_init 72.8610μs 35.9081μs 27.8489 KOps/s 27.5689 KOps/s $\color{#35bf28}+1.02\%$
test_tc_init_nested 0.1254ms 70.3635μs 14.2119 KOps/s 14.2986 KOps/s $\color{#d91a1a}-0.61\%$
test_tc_first_layer_tensor 13.6360μs 0.7007μs 1.4271 MOps/s 1.4132 MOps/s $\color{#35bf28}+0.98\%$
test_tc_first_layer_nontensor 91.3620μs 2.2828μs 438.0667 KOps/s 432.7547 KOps/s $\color{#35bf28}+1.23\%$
test_tc_second_layer_tensor 97.3667μs 1.4289μs 699.8313 KOps/s 688.3363 KOps/s $\color{#35bf28}+1.67\%$
test_tc_second_layer_nontensor 28.6210μs 3.0622μs 326.5575 KOps/s 332.3500 KOps/s $\color{#d91a1a}-1.74\%$
test_unbind 6.9423ms 6.6896ms 149.4860 Ops/s 150.7109 Ops/s $\color{#d91a1a}-0.81\%$
test_full_like 12.1765ms 9.1553ms 109.2259 Ops/s 107.0234 Ops/s $\color{#35bf28}+2.06\%$
test_zeros_like 5.8828ms 4.2183ms 237.0643 Ops/s 114.2577 Ops/s $\textbf{\color{#35bf28}+107.48\%}$
test_ones_like 4.4575ms 4.2324ms 236.2753 Ops/s 231.3651 Ops/s $\color{#35bf28}+2.12\%$
test_clone 11.3526ms 9.0458ms 110.5485 Ops/s 157.1430 Ops/s $\textbf{\color{#d91a1a}-29.65\%}$
test_squeeze 0.4395ms 9.2157μs 108.5105 KOps/s 107.7182 KOps/s $\color{#35bf28}+0.74\%$
test_unsqueeze 0.1183ms 69.7808μs 14.3306 KOps/s 13.1858 KOps/s $\textbf{\color{#35bf28}+8.68\%}$
test_split 0.5771ms 0.1572ms 6.3604 KOps/s 6.2373 KOps/s $\color{#35bf28}+1.97\%$
test_permute 0.2550ms 0.1839ms 5.4367 KOps/s 5.1887 KOps/s $\color{#35bf28}+4.78\%$
test_stack 51.2410ms 50.7933ms 19.6876 Ops/s 19.5819 Ops/s $\color{#35bf28}+0.54\%$
test_cat 50.8395ms 50.4707ms 19.8135 Ops/s 19.7042 Ops/s $\color{#35bf28}+0.55\%$
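
The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. This is inferred from the rows above, not taken from the benchmark tooling itself, and the helper name `relative_change` is hypothetical. A minimal sketch to re-derive a few of the flagged rows, assuming both throughputs are first converted to the same unit:

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD.

    Assumed formula (inferred from the table): (ops / ops_head - 1) * 100.
    Both arguments must be expressed in the same unit (e.g. Ops/s).
    """
    return (ops / ops_head - 1.0) * 100.0


# Values copied from the table rows above, normalized to Ops/s.
rows = {
    "test_zeros_like": (237.0643, 114.2577),
    "test_distributed": (5.5581e3, 8.4847e3),  # both reported in KOps/s
    # Mixed units in the table: 1.0712 KOps/s vs 952.2861 Ops/s.
    "test_mod_wrap_and_backward[compile-overhead]": (1.0712e3, 952.2861),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

Because the table prints rounded throughputs, values recomputed this way can differ from the reported Change in the last digit.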
