[Feature] Better typing for tensorclass #983

Merged: 4 commits merged on Sep 10, 2024

@vmoens (Contributor) commented Sep 10, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: a9bfe99c84cc33d4fd0fed8fcf29bac827f3f129
Pull Request resolved: #983
@facebook-github-bot added the CLA Signed label Sep 10, 2024

github-actions bot commented Sep 10, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}29$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1239ms 13.1415μs 76.0947 KOps/s 68.7821 KOps/s $\textbf{\color{#35bf28}+10.63\%}$
test_plain_set_stack_nested 46.2320μs 13.2986μs 75.1960 KOps/s 67.9668 KOps/s $\textbf{\color{#35bf28}+10.64\%}$
test_plain_set_nested_inplace 65.8530μs 13.8701μs 72.0977 KOps/s 64.4589 KOps/s $\textbf{\color{#35bf28}+11.85\%}$
test_plain_set_stack_nested_inplace 36.6320μs 13.9168μs 71.8557 KOps/s 65.1915 KOps/s $\textbf{\color{#35bf28}+10.22\%}$
test_items 25.0610μs 2.8163μs 355.0803 KOps/s 353.0562 KOps/s $\color{#35bf28}+0.57\%$
test_items_nested 0.4944ms 0.3136ms 3.1891 KOps/s 3.1796 KOps/s $\color{#35bf28}+0.30\%$
test_items_nested_locked 0.3493ms 0.3130ms 3.1951 KOps/s 3.1413 KOps/s $\color{#35bf28}+1.71\%$
test_items_nested_leaf 85.0140μs 62.6781μs 15.9545 KOps/s 15.8642 KOps/s $\color{#35bf28}+0.57\%$
test_items_stack_nested 0.3475ms 0.3122ms 3.2029 KOps/s 3.2061 KOps/s $\color{#d91a1a}-0.10\%$
test_items_stack_nested_leaf 89.5130μs 63.1871μs 15.8260 KOps/s 15.5779 KOps/s $\color{#35bf28}+1.59\%$
test_items_stack_nested_locked 0.3707ms 0.3117ms 3.2085 KOps/s 3.1568 KOps/s $\color{#35bf28}+1.64\%$
test_keys 27.2410μs 3.3749μs 296.3037 KOps/s 295.3248 KOps/s $\color{#35bf28}+0.33\%$
test_keys_nested 85.1540μs 55.0834μs 18.1543 KOps/s 18.8260 KOps/s $\color{#d91a1a}-3.57\%$
test_keys_nested_locked 2.6343ms 60.1669μs 16.6204 KOps/s 16.5520 KOps/s $\color{#35bf28}+0.41\%$
test_keys_nested_leaf 78.7240μs 46.4832μs 21.5131 KOps/s 21.4408 KOps/s $\color{#35bf28}+0.34\%$
test_keys_stack_nested 81.9830μs 54.4676μs 18.3595 KOps/s 17.9678 KOps/s $\color{#35bf28}+2.18\%$
test_keys_stack_nested_leaf 78.9830μs 46.0441μs 21.7183 KOps/s 21.2261 KOps/s $\color{#35bf28}+2.32\%$
test_keys_stack_nested_locked 96.0650μs 59.5332μs 16.7973 KOps/s 16.7499 KOps/s $\color{#35bf28}+0.28\%$
test_values 4.9646μs 0.8267μs 1.2097 MOps/s 1.2034 MOps/s $\color{#35bf28}+0.52\%$
test_values_nested 52.4820μs 27.2069μs 36.7554 KOps/s 36.5665 KOps/s $\color{#35bf28}+0.52\%$
test_values_nested_locked 54.6520μs 28.9797μs 34.5069 KOps/s 34.1139 KOps/s $\color{#35bf28}+1.15\%$
test_values_nested_leaf 46.1820μs 23.9844μs 41.6938 KOps/s 41.3097 KOps/s $\color{#35bf28}+0.93\%$
test_values_stack_nested 71.8530μs 27.5477μs 36.3007 KOps/s 35.1668 KOps/s $\color{#35bf28}+3.22\%$
test_values_stack_nested_leaf 50.7220μs 24.0530μs 41.5749 KOps/s 39.9040 KOps/s $\color{#35bf28}+4.19\%$
test_values_stack_nested_locked 57.0920μs 29.1826μs 34.2670 KOps/s 33.0163 KOps/s $\color{#35bf28}+3.79\%$
test_membership 1.5906μs 0.4692μs 2.1314 MOps/s 2.1169 MOps/s $\color{#35bf28}+0.69\%$
test_membership_nested 22.2200μs 1.7975μs 556.3246 KOps/s 549.8856 KOps/s $\color{#35bf28}+1.17\%$
test_membership_nested_leaf 17.2507μs 1.7193μs 581.6464 KOps/s 575.0417 KOps/s $\color{#35bf28}+1.15\%$
test_membership_stacked_nested 23.4510μs 1.7840μs 560.5477 KOps/s 568.0304 KOps/s $\color{#d91a1a}-1.32\%$
test_membership_stacked_nested_leaf 23.0010μs 1.7669μs 565.9526 KOps/s 566.3434 KOps/s $\color{#d91a1a}-0.07\%$
test_membership_nested_last 26.3810μs 2.6086μs 383.3490 KOps/s 388.2769 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_nested_leaf_last 32.2910μs 2.5821μs 387.2860 KOps/s 388.4820 KOps/s $\color{#d91a1a}-0.31\%$
test_membership_stacked_nested_last 31.7020μs 2.5680μs 389.4145 KOps/s 339.7404 KOps/s $\textbf{\color{#35bf28}+14.62\%}$
test_membership_stacked_nested_leaf_last 23.3510μs 2.5577μs 390.9821 KOps/s 341.6913 KOps/s $\textbf{\color{#35bf28}+14.43\%}$
test_nested_getleaf 32.9710μs 6.0859μs 164.3140 KOps/s 165.7879 KOps/s $\color{#d91a1a}-0.89\%$
test_nested_get 34.5420μs 5.7387μs 174.2544 KOps/s 174.7792 KOps/s $\color{#d91a1a}-0.30\%$
test_stacked_getleaf 46.6520μs 6.0361μs 165.6705 KOps/s 166.9420 KOps/s $\color{#d91a1a}-0.76\%$
test_stacked_get 25.4110μs 5.5822μs 179.1422 KOps/s 179.0888 KOps/s $\color{#35bf28}+0.03\%$
test_nested_getitemleaf 35.4810μs 6.0786μs 164.5113 KOps/s 162.5466 KOps/s $\color{#35bf28}+1.21\%$
test_nested_getitem 46.0820μs 5.7323μs 174.4511 KOps/s 175.6317 KOps/s $\color{#d91a1a}-0.67\%$
test_stacked_getitemleaf 40.0120μs 6.1366μs 162.9568 KOps/s 166.0539 KOps/s $\color{#d91a1a}-1.87\%$
test_stacked_getitem 27.7310μs 5.6569μs 176.7741 KOps/s 175.6515 KOps/s $\color{#35bf28}+0.64\%$
test_lock_nested 7.3591ms 0.4256ms 2.3495 KOps/s 2.3890 KOps/s $\color{#d91a1a}-1.66\%$
test_lock_stack_nested 0.4644ms 0.3829ms 2.6115 KOps/s 2.6209 KOps/s $\color{#d91a1a}-0.36\%$
test_unlock_nested 0.8017ms 0.3595ms 2.7820 KOps/s 2.8078 KOps/s $\color{#d91a1a}-0.92\%$
test_unlock_stack_nested 0.3766ms 0.3242ms 3.0843 KOps/s 3.1114 KOps/s $\color{#d91a1a}-0.87\%$
test_flatten_speed 0.3237ms 77.9898μs 12.8222 KOps/s 12.8129 KOps/s $\color{#35bf28}+0.07\%$
test_unflatten_speed 0.3315ms 0.2810ms 3.5585 KOps/s 3.6318 KOps/s $\color{#d91a1a}-2.02\%$
test_common_ops 1.4783ms 1.2615ms 792.6998 Ops/s 797.1247 Ops/s $\color{#d91a1a}-0.56\%$
test_creation 23.4510μs 1.4639μs 683.1200 KOps/s 678.5927 KOps/s $\color{#35bf28}+0.67\%$
test_creation_empty 0.7839ms 13.8960μs 71.9630 KOps/s 60.5744 KOps/s $\textbf{\color{#35bf28}+18.80\%}$
test_creation_nested_1 51.6820μs 15.5909μs 64.1400 KOps/s 55.1220 KOps/s $\textbf{\color{#35bf28}+16.36\%}$
test_creation_nested_2 49.7020μs 18.1911μs 54.9720 KOps/s 47.7224 KOps/s $\textbf{\color{#35bf28}+15.19\%}$
test_clone 66.3420μs 29.3154μs 34.1117 KOps/s 34.4717 KOps/s $\color{#d91a1a}-1.04\%$
test_getitem[int] 1.1627ms 16.1256μs 62.0132 KOps/s 62.7840 KOps/s $\color{#d91a1a}-1.23\%$
test_getitem[slice_int] 0.1203ms 28.6031μs 34.9612 KOps/s 35.9733 KOps/s $\color{#d91a1a}-2.81\%$
test_getitem[range] 0.2154ms 0.1097ms 9.1166 KOps/s 8.8720 KOps/s $\color{#35bf28}+2.76\%$
test_getitem[tuple] 0.1142ms 24.0058μs 41.6565 KOps/s 42.5481 KOps/s $\color{#d91a1a}-2.10\%$
test_getitem[list] 0.1897ms 97.7115μs 10.2342 KOps/s 9.9691 KOps/s $\color{#35bf28}+2.66\%$
test_setitem_dim[int] 72.5930μs 49.2761μs 20.2938 KOps/s 19.1948 KOps/s $\textbf{\color{#35bf28}+5.73\%}$
test_setitem_dim[slice_int] 0.1009ms 72.7973μs 13.7368 KOps/s 13.2020 KOps/s $\color{#35bf28}+4.05\%$
test_setitem_dim[range] 0.1819ms 0.1332ms 7.5093 KOps/s 7.2110 KOps/s $\color{#35bf28}+4.14\%$
test_setitem_dim[tuple] 90.2940μs 65.9705μs 15.1583 KOps/s 14.4167 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_setitem 80.0140μs 40.1170μs 24.9271 KOps/s 23.7324 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_set 86.0840μs 39.2657μs 25.4675 KOps/s 24.2232 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_set_shared 0.3588ms 50.0950μs 19.9621 KOps/s 19.6496 KOps/s $\color{#35bf28}+1.59\%$
test_update 82.6430μs 47.5096μs 21.0484 KOps/s 20.0350 KOps/s $\textbf{\color{#35bf28}+5.06\%}$
test_update_nested 0.1124ms 54.5197μs 18.3420 KOps/s 17.4607 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_update__nested 92.3850μs 59.8915μs 16.6969 KOps/s 16.6846 KOps/s $\color{#35bf28}+0.07\%$
test_set_nested 0.4727ms 42.5167μs 23.5202 KOps/s 22.7589 KOps/s $\color{#35bf28}+3.35\%$
test_set_nested_new 87.3940μs 45.0539μs 22.1956 KOps/s 21.1811 KOps/s $\color{#35bf28}+4.79\%$
test_select 93.6340μs 57.2916μs 17.4546 KOps/s 16.7251 KOps/s $\color{#35bf28}+4.36\%$
test_select_nested 70.3830μs 41.4516μs 24.1245 KOps/s 23.7560 KOps/s $\color{#35bf28}+1.55\%$
test_exclude_nested 90.6940μs 58.9790μs 16.9552 KOps/s 16.9637 KOps/s $\color{#d91a1a}-0.05\%$
test_empty[True] 0.3157ms 0.2408ms 4.1533 KOps/s 4.1685 KOps/s $\color{#d91a1a}-0.36\%$
test_empty[False] 3.1792μs 0.7535μs 1.3272 MOps/s 1.3575 MOps/s $\color{#d91a1a}-2.23\%$
test_to 59.6620μs 25.2687μs 39.5747 KOps/s 38.8833 KOps/s $\color{#35bf28}+1.78\%$
test_to_nonblocking 66.8320μs 24.1635μs 41.3847 KOps/s 40.9927 KOps/s $\color{#35bf28}+0.96\%$
test_unbind_speed 0.3128ms 0.2790ms 3.5844 KOps/s 3.5949 KOps/s $\color{#d91a1a}-0.29\%$
test_unbind_speed_stack0 0.3240ms 0.2782ms 3.5939 KOps/s 3.6279 KOps/s $\color{#d91a1a}-0.94\%$
test_unbind_speed_stack1 92.9038ms 0.7083ms 1.4119 KOps/s 1.4028 KOps/s $\color{#35bf28}+0.65\%$
test_split 93.8652ms 2.1938ms 455.8246 Ops/s 456.2332 Ops/s $\color{#d91a1a}-0.09\%$
test_chunk 94.6946ms 2.2054ms 453.4385 Ops/s 455.4933 Ops/s $\color{#d91a1a}-0.45\%$
test_creation[device0] 0.4310ms 0.1252ms 7.9877 KOps/s 7.8930 KOps/s $\color{#35bf28}+1.20\%$
test_creation_from_tensor 0.3803ms 0.1290ms 7.7501 KOps/s 7.6920 KOps/s $\color{#35bf28}+0.75\%$
test_add_one[memmap_tensor0] 0.1405ms 9.2347μs 108.2878 KOps/s 111.0567 KOps/s $\color{#d91a1a}-2.49\%$
test_contiguous[memmap_tensor0] 29.6820μs 2.2006μs 454.4177 KOps/s 448.8761 KOps/s $\color{#35bf28}+1.23\%$
test_stack[memmap_tensor0] 48.0720μs 6.9704μs 143.4647 KOps/s 142.7763 KOps/s $\color{#35bf28}+0.48\%$
test_memmaptd_index 1.2627ms 0.4259ms 2.3478 KOps/s 2.3416 KOps/s $\color{#35bf28}+0.27\%$
test_memmaptd_index_astensor 0.7384ms 0.4815ms 2.0770 KOps/s 2.0654 KOps/s $\color{#35bf28}+0.56\%$
test_memmaptd_index_op 1.3985ms 1.0070ms 993.0408 Ops/s 952.0881 Ops/s $\color{#35bf28}+4.30\%$
test_serialize_model 0.1303s 0.1291s 7.7484 Ops/s 7.7457 Ops/s $\color{#35bf28}+0.04\%$
test_serialize_model_pickle 1.3516s 1.2134s 0.8241 Ops/s 0.8243 Ops/s $\color{#d91a1a}-0.02\%$
test_serialize_weights 0.2255s 0.1426s 7.0109 Ops/s 7.0065 Ops/s $\color{#35bf28}+0.06\%$
test_serialize_weights_returnearly 0.2108s 55.0934ms 18.1510 Ops/s 18.0029 Ops/s $\color{#35bf28}+0.82\%$
test_serialize_weights_pickle 1.3756s 1.2177s 0.8213 Ops/s 0.8221 Ops/s $\color{#d91a1a}-0.11\%$
test_reshape_pytree 97.5940μs 36.3002μs 27.5480 KOps/s 28.2496 KOps/s $\color{#d91a1a}-2.48\%$
test_reshape_td 76.5530μs 42.2151μs 23.6882 KOps/s 24.1841 KOps/s $\color{#d91a1a}-2.05\%$
test_view_pytree 61.3930μs 36.1587μs 27.6559 KOps/s 28.2490 KOps/s $\color{#d91a1a}-2.10\%$
test_view_td 92.4440μs 46.3306μs 21.5840 KOps/s 21.2188 KOps/s $\color{#35bf28}+1.72\%$
test_unbind_pytree 88.7840μs 35.2879μs 28.3383 KOps/s 28.6958 KOps/s $\color{#d91a1a}-1.25\%$
test_unbind_td 0.4427ms 43.5026μs 22.9871 KOps/s 23.2745 KOps/s $\color{#d91a1a}-1.23\%$
test_split_pytree 81.5130μs 46.8575μs 21.3413 KOps/s 21.5550 KOps/s $\color{#d91a1a}-0.99\%$
test_split_td 94.7003ms 69.4287μs 14.4033 KOps/s 17.4991 KOps/s $\textbf{\color{#d91a1a}-17.69\%}$
test_add_pytree 0.1070ms 63.5781μs 15.7287 KOps/s 16.0698 KOps/s $\color{#d91a1a}-2.12\%$
test_add_td 0.1562ms 97.4840μs 10.2581 KOps/s 10.3653 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_add_one_nested[tensordict-compile] 0.4039ms 0.2060ms 4.8552 KOps/s 4.6986 KOps/s $\color{#35bf28}+3.33\%$
test_compile_add_one_nested[tensordict-eager] 0.2548ms 0.1599ms 6.2536 KOps/s 6.2254 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_one_nested[pytree-compile] 0.2082ms 0.1452ms 6.8892 KOps/s 6.6176 KOps/s $\color{#35bf28}+4.10\%$
test_compile_add_one_nested[pytree-eager] 0.2564ms 0.1883ms 5.3110 KOps/s 5.2061 KOps/s $\color{#35bf28}+2.02\%$
test_compile_copy_nested[tensordict-compile] 64.4030μs 21.1321μs 47.3214 KOps/s 47.8639 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_copy_nested[tensordict-eager] 77.5730μs 44.3346μs 22.5558 KOps/s 22.6453 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_copy_nested[pytree-compile] 0.2311ms 64.3039μs 15.5512 KOps/s 15.9251 KOps/s $\color{#d91a1a}-2.35\%$
test_compile_copy_nested[pytree-eager] 89.1340μs 49.7379μs 20.1054 KOps/s 20.5104 KOps/s $\color{#d91a1a}-1.97\%$
test_compile_add_one_flat[tensordict-compile] 0.4426ms 0.3160ms 3.1649 KOps/s 3.0978 KOps/s $\color{#35bf28}+2.17\%$
test_compile_add_one_flat[tensordict-eager] 0.2555ms 0.2093ms 4.7786 KOps/s 4.7540 KOps/s $\color{#35bf28}+0.52\%$
test_compile_add_one_flat[tensorclass-compile] 0.2201ms 0.1287ms 7.7727 KOps/s 7.6639 KOps/s $\color{#35bf28}+1.42\%$
test_compile_add_one_flat[tensorclass-eager] 0.1346ms 60.5607μs 16.5124 KOps/s 16.2531 KOps/s $\color{#35bf28}+1.60\%$
test_compile_add_one_flat[pytree-compile] 0.3663ms 0.3163ms 3.1613 KOps/s 3.1152 KOps/s $\color{#35bf28}+1.48\%$
test_compile_add_one_flat[pytree-eager] 0.7412ms 0.6448ms 1.5508 KOps/s 1.5437 KOps/s $\color{#35bf28}+0.46\%$
test_compile_add_self_flat[tensordict-eager] 0.2961ms 0.2484ms 4.0265 KOps/s 3.9333 KOps/s $\color{#35bf28}+2.37\%$
test_compile_add_self_flat[tensordict-compile] 0.4027ms 0.3166ms 3.1589 KOps/s 3.1025 KOps/s $\color{#35bf28}+1.82\%$
test_compile_add_self_flat[tensorclass-eager] 0.1630ms 71.4965μs 13.9867 KOps/s 14.1483 KOps/s $\color{#d91a1a}-1.14\%$
test_compile_add_self_flat[tensorclass-compile] 0.1829ms 0.1364ms 7.3336 KOps/s 7.6909 KOps/s $\color{#d91a1a}-4.65\%$
test_compile_add_self_flat[pytree-eager] 0.8125ms 0.5485ms 1.8232 KOps/s 1.8116 KOps/s $\color{#35bf28}+0.64\%$
test_compile_add_self_flat[pytree-compile] 0.3751ms 0.3183ms 3.1413 KOps/s 3.1120 KOps/s $\color{#35bf28}+0.94\%$
test_compile_copy_flat[tensordict-compile] 88.3030μs 17.9100μs 55.8347 KOps/s 55.2167 KOps/s $\color{#35bf28}+1.12\%$
test_compile_copy_flat[tensordict-eager] 64.3330μs 26.9208μs 37.1460 KOps/s 36.2550 KOps/s $\color{#35bf28}+2.46\%$
test_compile_copy_flat[pytree-compile] 0.1086ms 67.8375μs 14.7411 KOps/s 14.6658 KOps/s $\color{#35bf28}+0.51\%$
test_compile_copy_flat[pytree-eager] 82.9130μs 50.9090μs 19.6429 KOps/s 19.5172 KOps/s $\color{#35bf28}+0.64\%$
test_compile_assign_and_add[tensordict-compile] 2.3430ms 0.8127ms 1.2304 KOps/s 1.1236 KOps/s $\textbf{\color{#35bf28}+9.51\%}$
test_compile_assign_and_add[tensordict-eager] 3.3262ms 3.1979ms 312.7006 Ops/s 312.9067 Ops/s $\color{#d91a1a}-0.07\%$
test_compile_assign_and_add[pytree-compile] 2.2757ms 0.7977ms 1.2535 KOps/s 1.1557 KOps/s $\textbf{\color{#35bf28}+8.46\%}$
test_compile_assign_and_add[pytree-eager] 3.3455ms 3.2812ms 304.7678 Ops/s 297.9161 Ops/s $\color{#35bf28}+2.30\%$
test_compile_indexing[tensor-tensordict-compile] 0.1589ms 0.1100ms 9.0923 KOps/s 8.9168 KOps/s $\color{#35bf28}+1.97\%$
test_compile_indexing[tensor-tensordict-eager] 0.1905ms 60.3379μs 16.5733 KOps/s 16.0342 KOps/s $\color{#35bf28}+3.36\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1799ms 0.1033ms 9.6832 KOps/s 9.5808 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1783ms 44.5518μs 22.4458 KOps/s 22.4683 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_indexing[tensor-pytree-compile] 0.1925ms 0.1066ms 9.3791 KOps/s 9.2080 KOps/s $\color{#35bf28}+1.86\%$
test_compile_indexing[tensor-pytree-eager] 85.6040μs 44.3898μs 22.5277 KOps/s 21.6304 KOps/s $\color{#35bf28}+4.15\%$
test_compile_indexing[slice-tensordict-compile] 0.1900ms 0.1396ms 7.1643 KOps/s 6.9628 KOps/s $\color{#35bf28}+2.89\%$
test_compile_indexing[slice-tensordict-eager] 0.1583ms 25.7006μs 38.9096 KOps/s 38.9052 KOps/s $\color{#35bf28}+0.01\%$
test_compile_indexing[slice-tensorclass-compile] 0.1735ms 0.1321ms 7.5717 KOps/s 7.5395 KOps/s $\color{#35bf28}+0.43\%$
test_compile_indexing[slice-tensorclass-eager] 64.5620μs 21.3205μs 46.9033 KOps/s 47.9385 KOps/s $\color{#d91a1a}-2.16\%$
test_compile_indexing[slice-pytree-compile] 0.1856ms 0.1389ms 7.2006 KOps/s 7.3073 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_indexing[slice-pytree-eager] 59.7430μs 21.0349μs 47.5400 KOps/s 48.9586 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_indexing[int-tensordict-compile] 0.1856ms 0.1398ms 7.1516 KOps/s 7.0461 KOps/s $\color{#35bf28}+1.50\%$
test_compile_indexing[int-tensordict-eager] 0.4674ms 25.2879μs 39.5446 KOps/s 39.6652 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_indexing[int-tensorclass-compile] 0.1797ms 0.1335ms 7.4905 KOps/s 7.4839 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[int-tensorclass-eager] 0.1595ms 21.8489μs 45.7688 KOps/s 47.7649 KOps/s $\color{#d91a1a}-4.18\%$
test_compile_indexing[int-pytree-compile] 0.1980ms 0.1333ms 7.5003 KOps/s 7.2277 KOps/s $\color{#35bf28}+3.77\%$
test_compile_indexing[int-pytree-eager] 0.3377ms 21.3110μs 46.9241 KOps/s 36.6938 KOps/s $\textbf{\color{#35bf28}+27.88\%}$
test_mod_add[eager] 75.9830μs 29.5353μs 33.8577 KOps/s 31.0207 KOps/s $\textbf{\color{#35bf28}+9.15\%}$
test_mod_add[compile] 0.1163ms 72.1520μs 13.8596 KOps/s 14.2716 KOps/s $\color{#d91a1a}-2.89\%$
test_mod_add[compile-overhead] 0.2610ms 0.1353ms 7.3936 KOps/s 7.0005 KOps/s $\textbf{\color{#35bf28}+5.61\%}$
test_mod_wrap[eager] 0.3514ms 0.2413ms 4.1447 KOps/s 4.0186 KOps/s $\color{#35bf28}+3.14\%$
test_mod_wrap[compile] 1.2211ms 0.3023ms 3.3083 KOps/s 3.4570 KOps/s $\color{#d91a1a}-4.30\%$
test_mod_wrap[compile-overhead] 7.8455ms 4.1134ms 243.1061 Ops/s 255.8213 Ops/s $\color{#d91a1a}-4.97\%$
test_mod_wrap_and_backward[eager] 1.5098ms 1.3690ms 730.4559 Ops/s 684.5274 Ops/s $\textbf{\color{#35bf28}+6.71\%}$
test_mod_wrap_and_backward[compile] 2.7277ms 1.3166ms 759.5365 Ops/s 700.8757 Ops/s $\textbf{\color{#35bf28}+8.37\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3024ms 0.8853ms 1.1295 KOps/s 1.0091 KOps/s $\textbf{\color{#35bf28}+11.94\%}$
test_seq_add[eager] 0.1854ms 99.1858μs 10.0821 KOps/s 10.0729 KOps/s $\color{#35bf28}+0.09\%$
test_seq_add[compile] 0.1474ms 80.4463μs 12.4307 KOps/s 12.5328 KOps/s $\color{#d91a1a}-0.82\%$
test_seq_add[compile-overhead] 0.1629ms 0.1150ms 8.6992 KOps/s 8.7377 KOps/s $\color{#d91a1a}-0.44\%$
test_seq_wrap[eager] 0.4331ms 0.3730ms 2.6808 KOps/s 2.6033 KOps/s $\color{#35bf28}+2.98\%$
test_seq_wrap[compile] 0.3559ms 0.3022ms 3.3088 KOps/s 3.2722 KOps/s $\color{#35bf28}+1.12\%$
test_seq_wrap[compile-overhead] 0.2639ms 0.2088ms 4.7883 KOps/s 4.7609 KOps/s $\color{#35bf28}+0.58\%$
test_func_call_runtime[False-eager] 0.8278ms 0.7318ms 1.3664 KOps/s 1.3340 KOps/s $\color{#35bf28}+2.43\%$
test_func_call_runtime[False-compile] 1.1575ms 0.7889ms 1.2675 KOps/s 1.2624 KOps/s $\color{#35bf28}+0.40\%$
test_func_call_runtime[False-compile-overhead] 0.3871ms 0.3478ms 2.8750 KOps/s 2.8604 KOps/s $\color{#35bf28}+0.51\%$
test_func_call_runtime[True-eager] 0.9481ms 0.8965ms 1.1154 KOps/s 1.0990 KOps/s $\color{#35bf28}+1.50\%$
test_func_call_runtime[True-compile] 0.8907ms 0.8222ms 1.2163 KOps/s 1.2143 KOps/s $\color{#35bf28}+0.16\%$
test_func_call_runtime[True-compile-overhead] 0.4381ms 0.3820ms 2.6176 KOps/s 2.6102 KOps/s $\color{#35bf28}+0.28\%$
test_func_call_cm_runtime[False-eager] 0.8330ms 0.7216ms 1.3858 KOps/s 1.3437 KOps/s $\color{#35bf28}+3.14\%$
test_func_call_cm_runtime[False-compile] 0.8275ms 0.7828ms 1.2774 KOps/s 1.2615 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4067ms 0.3501ms 2.8567 KOps/s 2.8529 KOps/s $\color{#35bf28}+0.13\%$
test_func_call_cm_runtime[True-eager] 1.0827ms 0.9876ms 1.0125 KOps/s 997.5445 Ops/s $\color{#35bf28}+1.50\%$
test_func_call_cm_runtime[True-compile] 0.9032ms 0.8465ms 1.1814 KOps/s 1.1733 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4668ms 0.4064ms 2.4605 KOps/s 2.4438 KOps/s $\color{#35bf28}+0.68\%$
test_vmap_func_call_cm_runtime[eager] 2.6595ms 2.0825ms 480.1857 Ops/s 476.2045 Ops/s $\color{#35bf28}+0.84\%$
test_vmap_func_call_cm_runtime[compile] 0.9562ms 0.8614ms 1.1609 KOps/s 1.1537 KOps/s $\color{#35bf28}+0.63\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4677ms 0.4099ms 2.4398 KOps/s 2.4183 KOps/s $\color{#35bf28}+0.89\%$
test_distributed 3.0670ms 0.2320ms 4.3097 KOps/s 8.8004 KOps/s $\textbf{\color{#d91a1a}-51.03\%}$
test_tdmodule 98.3340μs 13.3407μs 74.9588 KOps/s 66.9329 KOps/s $\textbf{\color{#35bf28}+11.99\%}$
test_tdmodule_dispatch 49.2820μs 26.9485μs 37.1078 KOps/s 32.7019 KOps/s $\textbf{\color{#35bf28}+13.47\%}$
test_tdseq 21.3010μs 13.7890μs 72.5214 KOps/s 60.9089 KOps/s $\textbf{\color{#35bf28}+19.07\%}$
test_tdseq_dispatch 49.9220μs 29.2884μs 34.1432 KOps/s 29.8503 KOps/s $\textbf{\color{#35bf28}+14.38\%}$
test_instantiation_functorch 1.9363ms 1.8408ms 543.2423 Ops/s 533.7081 Ops/s $\color{#35bf28}+1.79\%$
test_instantiation_td 1.7878ms 1.1900ms 840.3530 Ops/s 835.9706 Ops/s $\color{#35bf28}+0.52\%$
test_exec_functorch 0.2534ms 0.2097ms 4.7676 KOps/s 4.7285 KOps/s $\color{#35bf28}+0.83\%$
test_exec_functional_call 0.2503ms 0.2098ms 4.7662 KOps/s 4.7275 KOps/s $\color{#35bf28}+0.82\%$
test_exec_td 0.2717ms 0.2130ms 4.6958 KOps/s 4.6576 KOps/s $\color{#35bf28}+0.82\%$
test_exec_td_decorator 0.3406ms 0.2576ms 3.8826 KOps/s 3.8801 KOps/s $\color{#35bf28}+0.06\%$
test_vmap_mlp_speed[True-True] 0.7894ms 0.6798ms 1.4710 KOps/s 1.4419 KOps/s $\color{#35bf28}+2.02\%$
test_vmap_mlp_speed[True-False] 0.7879ms 0.6866ms 1.4564 KOps/s 1.4499 KOps/s $\color{#35bf28}+0.45\%$
test_vmap_mlp_speed[False-True] 0.6615ms 0.5917ms 1.6901 KOps/s 1.7211 KOps/s $\color{#d91a1a}-1.80\%$
test_vmap_mlp_speed[False-False] 0.6829ms 0.6127ms 1.6320 KOps/s 1.7216 KOps/s $\textbf{\color{#d91a1a}-5.21\%}$
test_vmap_mlp_speed_decorator[True-True] 1.4363ms 0.7056ms 1.4173 KOps/s 1.4764 KOps/s $\color{#d91a1a}-4.00\%$
test_vmap_mlp_speed_decorator[True-False] 0.8353ms 0.7080ms 1.4124 KOps/s 1.4725 KOps/s $\color{#d91a1a}-4.08\%$
test_vmap_mlp_speed_decorator[False-True] 0.7641ms 0.6240ms 1.6025 KOps/s 1.6849 KOps/s $\color{#d91a1a}-4.89\%$
test_vmap_mlp_speed_decorator[False-False] 0.7248ms 0.6229ms 1.6055 KOps/s 1.6890 KOps/s $\color{#d91a1a}-4.94\%$
test_vmap_transformer_speed[True-True] 8.8306ms 8.3845ms 119.2674 Ops/s 118.7906 Ops/s $\color{#35bf28}+0.40\%$
test_vmap_transformer_speed[True-False] 8.8243ms 8.4629ms 118.1622 Ops/s 119.2528 Ops/s $\color{#d91a1a}-0.91\%$
test_vmap_transformer_speed[False-True] 8.5438ms 8.3070ms 120.3805 Ops/s 121.7488 Ops/s $\color{#d91a1a}-1.12\%$
test_vmap_transformer_speed[False-False] 8.5733ms 8.1573ms 122.5896 Ops/s 121.9764 Ops/s $\color{#35bf28}+0.50\%$
test_vmap_transformer_speed_decorator[True-True] 19.9654ms 19.4751ms 51.3477 Ops/s 51.1282 Ops/s $\color{#35bf28}+0.43\%$
test_vmap_transformer_speed_decorator[True-False] 19.5393ms 19.4438ms 51.4304 Ops/s 51.0383 Ops/s $\color{#35bf28}+0.77\%$
test_vmap_transformer_speed_decorator[False-True] 20.1904ms 19.3129ms 51.7787 Ops/s 51.4998 Ops/s $\color{#35bf28}+0.54\%$
test_vmap_transformer_speed_decorator[False-False] 19.3924ms 19.2928ms 51.8329 Ops/s 51.0499 Ops/s $\color{#35bf28}+1.53\%$
test_to_module_speed[True] 1.1949ms 0.9242ms 1.0821 KOps/s 1.0828 KOps/s $\color{#d91a1a}-0.07\%$
test_to_module_speed[False] 1.3312ms 0.9181ms 1.0892 KOps/s 1.1050 KOps/s $\color{#d91a1a}-1.44\%$
test_tc_init 71.0930μs 32.9217μs 30.3751 KOps/s 28.7439 KOps/s $\textbf{\color{#35bf28}+5.67\%}$
test_tc_init_nested 0.1147ms 67.9026μs 14.7270 KOps/s 13.5831 KOps/s $\textbf{\color{#35bf28}+8.42\%}$
test_tc_first_layer_tensor 5.4868μs 0.6885μs 1.4525 MOps/s 1.4709 MOps/s $\color{#d91a1a}-1.26\%$
test_tc_first_layer_nontensor 48.2420μs 2.2615μs 442.1873 KOps/s 455.5218 KOps/s $\color{#d91a1a}-2.93\%$
test_tc_second_layer_tensor 10.2230μs 1.3584μs 736.1811 KOps/s 727.2560 KOps/s $\color{#35bf28}+1.23\%$
test_tc_second_layer_nontensor 39.8120μs 2.9626μs 337.5376 KOps/s 341.5610 KOps/s $\color{#d91a1a}-1.18\%$
test_unbind 0.1873s 12.0304ms 83.1226 Ops/s 94.5385 Ops/s $\textbf{\color{#d91a1a}-12.08\%}$
test_full_like 0.6652ms 0.5757ms 1.7371 KOps/s 1.7393 KOps/s $\color{#d91a1a}-0.13\%$
test_zeros_like 0.2696ms 0.1978ms 5.0560 KOps/s 5.0527 KOps/s $\color{#35bf28}+0.07\%$
test_ones_like 0.2953ms 0.1977ms 5.0588 KOps/s 5.0554 KOps/s $\color{#35bf28}+0.07\%$
test_clone 0.4520ms 0.4150ms 2.4098 KOps/s 2.4140 KOps/s $\color{#d91a1a}-0.17\%$
test_squeeze 42.3420μs 10.8171μs 92.4466 KOps/s 101.5576 KOps/s $\textbf{\color{#d91a1a}-8.97\%}$
test_unsqueeze 0.2449ms 82.5366μs 12.1158 KOps/s 13.3411 KOps/s $\textbf{\color{#d91a1a}-9.18\%}$
test_split 0.4304ms 0.1728ms 5.7856 KOps/s 6.3444 KOps/s $\textbf{\color{#d91a1a}-8.81\%}$
test_permute 0.3004ms 0.2100ms 4.7616 KOps/s 5.6323 KOps/s $\textbf{\color{#d91a1a}-15.46\%}$
test_stack 1.2631ms 0.8428ms 1.1865 KOps/s 1.1680 KOps/s $\color{#35bf28}+1.58\%$
test_cat 1.2570ms 1.2312ms 812.1972 Ops/s 812.0139 Ops/s $\color{#35bf28}+0.02\%$
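
For readers checking the table by hand: the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, i.e. `(ops - ops_head) / ops_head`. The sketch below reproduces a few rows under that assumption. The helper names (`parse_ops`, `relative_change`, `classify`) are hypothetical and not part of the benchmark tooling. The 5% threshold is likewise an assumption, inferred from which rows are bolded and from the Improved/Worsened totals in the summary line.

```python
# Unit multipliers as they appear in the "Ops" columns of the report.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a cell such as '76.0947 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Relative change in percent of this PR's throughput versus repo HEAD."""
    new, old = parse_ops(ops), parse_ops(ops_head)
    return (new - old) / old * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not a documented value."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within noise"


# A few rows copied from the GPU table above: (Ops, Ops on Repo HEAD).
rows = {
    "test_plain_set_nested": ("76.0947 KOps/s", "68.7821 KOps/s"),
    "test_distributed": ("4.3097 KOps/s", "8.8004 KOps/s"),
    "test_func_call_cm_runtime[True-eager]": ("1.0125 KOps/s", "997.5445 Ops/s"),
}

for name, (ops, head) in rows.items():
    pct = relative_change(ops, head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because the report prints rounded values, recomputing from the displayed cells can differ from the reported Change in the last digit for some rows.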

[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@vmoens added the enhancement label Sep 10, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}18$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 67.8170μs 21.0939μs 47.4071 KOps/s 50.1567 KOps/s $\textbf{\color{#d91a1a}-5.48\%}$
test_plain_set_stack_nested 49.1510μs 20.4680μs 48.8569 KOps/s 48.9927 KOps/s $\color{#d91a1a}-0.28\%$
test_plain_set_nested_inplace 60.2320μs 22.5663μs 44.3138 KOps/s 46.0879 KOps/s $\color{#d91a1a}-3.85\%$
test_plain_set_stack_nested_inplace 63.2480μs 22.0821μs 45.2856 KOps/s 45.3428 KOps/s $\color{#d91a1a}-0.13\%$
test_items 28.1330μs 4.1247μs 242.4417 KOps/s 240.2206 KOps/s $\color{#35bf28}+0.92\%$
test_items_nested 0.4681ms 0.3289ms 3.0403 KOps/s 3.0219 KOps/s $\color{#35bf28}+0.61\%$
test_items_nested_locked 0.5310ms 0.3316ms 3.0159 KOps/s 3.0558 KOps/s $\color{#d91a1a}-1.31\%$
test_items_nested_leaf 0.1485ms 86.5267μs 11.5571 KOps/s 11.9023 KOps/s $\color{#d91a1a}-2.90\%$
test_items_stack_nested 0.5572ms 0.3329ms 3.0038 KOps/s 3.0162 KOps/s $\color{#d91a1a}-0.41\%$
test_items_stack_nested_leaf 0.1709ms 86.3346μs 11.5828 KOps/s 12.3199 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_items_stack_nested_locked 0.5186ms 0.3333ms 3.0005 KOps/s 3.0267 KOps/s $\color{#d91a1a}-0.86\%$
test_keys 30.3060μs 3.5278μs 283.4625 KOps/s 287.0964 KOps/s $\color{#d91a1a}-1.27\%$
test_keys_nested 0.2150ms 0.1005ms 9.9505 KOps/s 10.5405 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_keys_nested_locked 0.7768ms 0.1065ms 9.3910 KOps/s 9.9571 KOps/s $\textbf{\color{#d91a1a}-5.69\%}$
test_keys_nested_leaf 0.1586ms 83.5223μs 11.9728 KOps/s 12.6721 KOps/s $\textbf{\color{#d91a1a}-5.52\%}$
test_keys_stack_nested 0.1677ms 98.8172μs 10.1197 KOps/s 10.5267 KOps/s $\color{#d91a1a}-3.87\%$
test_keys_stack_nested_leaf 0.1532ms 80.3297μs 12.4487 KOps/s 12.6869 KOps/s $\color{#d91a1a}-1.88\%$
test_keys_stack_nested_locked 0.4355ms 0.1067ms 9.3681 KOps/s 10.0049 KOps/s $\textbf{\color{#d91a1a}-6.36\%}$
test_values 5.7628μs 1.1209μs 892.1650 KOps/s 932.0481 KOps/s $\color{#d91a1a}-4.28\%$
test_values_nested 0.1763ms 49.3216μs 20.2751 KOps/s 20.9006 KOps/s $\color{#d91a1a}-2.99\%$
test_values_nested_locked 0.1241ms 47.7393μs 20.9471 KOps/s 20.4947 KOps/s $\color{#35bf28}+2.21\%$
test_values_nested_leaf 86.8710μs 41.9689μs 23.8272 KOps/s 23.7303 KOps/s $\color{#35bf28}+0.41\%$
test_values_stack_nested 0.1072ms 48.7645μs 20.5067 KOps/s 21.2127 KOps/s $\color{#d91a1a}-3.33\%$
test_values_stack_nested_leaf 0.1016ms 42.8307μs 23.3477 KOps/s 24.5349 KOps/s $\color{#d91a1a}-4.84\%$
test_values_stack_nested_locked 0.3074ms 48.1198μs 20.7815 KOps/s 20.0757 KOps/s $\color{#35bf28}+3.52\%$
test_membership 9.3044μs 0.6888μs 1.4517 MOps/s 925.5687 KOps/s $\textbf{\color{#35bf28}+56.85\%}$
test_membership_nested 38.5020μs 2.5620μs 390.3181 KOps/s 389.0961 KOps/s $\color{#35bf28}+0.31\%$
test_membership_nested_leaf 21.9310μs 2.5840μs 387.0026 KOps/s 387.1552 KOps/s $\color{#d91a1a}-0.04\%$
test_membership_stacked_nested 29.6250μs 2.6137μs 382.5972 KOps/s 386.6299 KOps/s $\color{#d91a1a}-1.04\%$
test_membership_stacked_nested_leaf 49.8130μs 2.5816μs 387.3566 KOps/s 379.7010 KOps/s $\color{#35bf28}+2.02\%$
test_membership_nested_last 29.0840μs 3.7339μs 267.8153 KOps/s 267.5493 KOps/s $\color{#35bf28}+0.10\%$
test_membership_nested_leaf_last 41.7580μs 3.8042μs 262.8647 KOps/s 264.6689 KOps/s $\color{#d91a1a}-0.68\%$
test_membership_stacked_nested_last 30.0960μs 3.8053μs 262.7919 KOps/s 78.4623 KOps/s $\textbf{\color{#35bf28}+234.93\%}$
test_membership_stacked_nested_leaf_last 40.0350μs 3.7049μs 269.9140 KOps/s 78.9360 KOps/s $\textbf{\color{#35bf28}+241.94\%}$
test_nested_getleaf 48.1200μs 10.7665μs 92.8810 KOps/s 93.9579 KOps/s $\color{#d91a1a}-1.15\%$
test_nested_get 52.6280μs 10.0792μs 99.2140 KOps/s 99.1643 KOps/s $\color{#35bf28}+0.05\%$
test_stacked_getleaf 82.1470μs 10.5670μs 94.6339 KOps/s 94.2496 KOps/s $\color{#35bf28}+0.41\%$
test_stacked_get 29.4350μs 10.1192μs 98.8220 KOps/s 99.5663 KOps/s $\color{#d91a1a}-0.75\%$
test_nested_getitemleaf 29.5450μs 11.0535μs 90.4693 KOps/s 92.6838 KOps/s $\color{#d91a1a}-2.39\%$
test_nested_getitem 41.9280μs 10.2315μs 97.7371 KOps/s 94.7939 KOps/s $\color{#35bf28}+3.10\%$
test_stacked_getitemleaf 43.6610μs 11.0699μs 90.3347 KOps/s 87.7695 KOps/s $\color{#35bf28}+2.92\%$
test_stacked_getitem 44.9040μs 10.3451μs 96.6643 KOps/s 97.9537 KOps/s $\color{#d91a1a}-1.32\%$
test_lock_nested 99.7250ms 0.5914ms 1.6908 KOps/s 2.0726 KOps/s $\textbf{\color{#d91a1a}-18.42\%}$
test_lock_stack_nested 0.7023ms 0.4539ms 2.2030 KOps/s 2.3080 KOps/s $\color{#d91a1a}-4.55\%$
test_unlock_nested 0.1022s 0.5150ms 1.9417 KOps/s 2.4194 KOps/s $\textbf{\color{#d91a1a}-19.74\%}$
test_unlock_stack_nested 0.6547ms 0.3744ms 2.6713 KOps/s 2.8238 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_flatten_speed 0.2569ms 0.1071ms 9.3386 KOps/s 9.5526 KOps/s $\color{#d91a1a}-2.24\%$
test_unflatten_speed 1.0892ms 0.4671ms 2.1409 KOps/s 2.1861 KOps/s $\color{#d91a1a}-2.07\%$
test_common_ops 5.8084ms 1.1320ms 883.3895 Ops/s 890.8554 Ops/s $\color{#d91a1a}-0.84\%$
test_creation 42.1190μs 2.0904μs 478.3716 KOps/s 495.3382 KOps/s $\color{#d91a1a}-3.43\%$
test_creation_empty 42.6990μs 17.2402μs 58.0038 KOps/s 57.6296 KOps/s $\color{#35bf28}+0.65\%$
test_creation_nested_1 58.2090μs 20.7517μs 48.1888 KOps/s 48.3304 KOps/s $\color{#d91a1a}-0.29\%$
test_creation_nested_2 59.0900μs 24.7404μs 40.4197 KOps/s 40.1236 KOps/s $\color{#35bf28}+0.74\%$
test_clone 0.3319ms 16.7448μs 59.7201 KOps/s 60.4797 KOps/s $\color{#d91a1a}-1.26\%$
test_getitem[int] 1.2941ms 16.0994μs 62.1139 KOps/s 60.3589 KOps/s $\color{#35bf28}+2.91\%$
test_getitem[slice_int] 0.1630ms 29.7881μs 33.5705 KOps/s 33.3872 KOps/s $\color{#35bf28}+0.55\%$
test_getitem[range] 0.2601ms 59.2733μs 16.8710 KOps/s 17.7094 KOps/s $\color{#d91a1a}-4.73\%$
test_getitem[tuple] 0.1287ms 24.5036μs 40.8103 KOps/s 40.2944 KOps/s $\color{#35bf28}+1.28\%$
test_getitem[list] 0.1979ms 53.7483μs 18.6052 KOps/s 19.5272 KOps/s $\color{#d91a1a}-4.72\%$
test_setitem_dim[int] 0.1282ms 39.3041μs 25.4427 KOps/s 25.5480 KOps/s $\color{#d91a1a}-0.41\%$
test_setitem_dim[slice_int] 0.1442ms 68.1734μs 14.6685 KOps/s 14.8266 KOps/s $\color{#d91a1a}-1.07\%$
test_setitem_dim[range] 0.2124ms 92.0132μs 10.8680 KOps/s 11.0574 KOps/s $\color{#d91a1a}-1.71\%$
test_setitem_dim[tuple] 96.2900μs 55.4421μs 18.0368 KOps/s 18.2372 KOps/s $\color{#d91a1a}-1.10\%$
test_setitem 79.0680μs 28.9603μs 34.5301 KOps/s 33.9586 KOps/s $\color{#35bf28}+1.68\%$
test_set 0.2715ms 27.6200μs 36.2056 KOps/s 34.6510 KOps/s $\color{#35bf28}+4.49\%$
test_set_shared 3.7871ms 0.2151ms 4.6490 KOps/s 4.7025 KOps/s $\color{#d91a1a}-1.14\%$
test_update 0.2515ms 34.5088μs 28.9781 KOps/s 28.9171 KOps/s $\color{#35bf28}+0.21\%$
test_update_nested 0.1097ms 45.6727μs 21.8949 KOps/s 22.1652 KOps/s $\color{#d91a1a}-1.22\%$
test_update__nested 0.1605ms 33.6045μs 29.7579 KOps/s 29.6638 KOps/s $\color{#35bf28}+0.32\%$
test_set_nested 0.1813ms 30.1101μs 33.2115 KOps/s 32.3980 KOps/s $\color{#35bf28}+2.51\%$
test_set_nested_new 0.5272ms 36.5159μs 27.3853 KOps/s 27.6978 KOps/s $\color{#d91a1a}-1.13\%$
test_select 0.1825ms 52.8500μs 18.9215 KOps/s 18.8487 KOps/s $\color{#35bf28}+0.39\%$
test_select_nested 0.1171ms 59.1930μs 16.8939 KOps/s 16.9945 KOps/s $\color{#d91a1a}-0.59\%$
test_exclude_nested 0.1544ms 76.2030μs 13.1228 KOps/s 13.5686 KOps/s $\color{#d91a1a}-3.28\%$
test_empty[True] 1.0483ms 0.3183ms 3.1415 KOps/s 3.2707 KOps/s $\color{#d91a1a}-3.95\%$
test_empty[False] 31.8770μs 1.2278μs 814.4901 KOps/s 834.4569 KOps/s $\color{#d91a1a}-2.39\%$
test_unbind_speed 0.4543ms 0.2957ms 3.3813 KOps/s 3.4380 KOps/s $\color{#d91a1a}-1.65\%$
test_unbind_speed_stack0 0.4617ms 0.2963ms 3.3752 KOps/s 3.5879 KOps/s $\textbf{\color{#d91a1a}-5.93\%}$
test_unbind_speed_stack1 0.1015s 0.8073ms 1.2387 KOps/s 1.4008 KOps/s $\textbf{\color{#d91a1a}-11.58\%}$
test_split 93.1424ms 2.1666ms 461.5437 Ops/s 456.1330 Ops/s $\color{#35bf28}+1.19\%$
test_chunk 89.6827ms 2.1610ms 462.7454 Ops/s 458.9606 Ops/s $\color{#35bf28}+0.82\%$
test_creation[device0] 0.2358ms 0.1163ms 8.5967 KOps/s 8.5154 KOps/s $\color{#35bf28}+0.96\%$
test_creation_from_tensor 3.6653ms 0.1190ms 8.4062 KOps/s 8.5594 KOps/s $\color{#d91a1a}-1.79\%$
test_add_one[memmap_tensor0] 0.3565ms 7.3716μs 135.6557 KOps/s 137.6515 KOps/s $\color{#d91a1a}-1.45\%$
test_contiguous[memmap_tensor0] 36.6880μs 1.9416μs 515.0388 KOps/s 539.0390 KOps/s $\color{#d91a1a}-4.45\%$
test_stack[memmap_tensor0] 29.7050μs 5.7170μs 174.9159 KOps/s 176.4493 KOps/s $\color{#d91a1a}-0.87\%$
test_memmaptd_index 1.0114ms 0.3890ms 2.5707 KOps/s 2.5140 KOps/s $\color{#35bf28}+2.26\%$
test_memmaptd_index_astensor 0.9627ms 0.4727ms 2.1157 KOps/s 2.1002 KOps/s $\color{#35bf28}+0.74\%$
test_memmaptd_index_op 1.3927ms 1.0080ms 992.0717 Ops/s 987.9252 Ops/s $\color{#35bf28}+0.42\%$
test_serialize_model 0.1287s 0.1201s 8.3260 Ops/s 8.2908 Ops/s $\color{#35bf28}+0.43\%$
test_serialize_model_pickle 0.5070s 0.4056s 2.4653 Ops/s 2.5622 Ops/s $\color{#d91a1a}-3.78\%$
test_serialize_weights 0.1234s 0.1176s 8.5039 Ops/s 7.2714 Ops/s $\textbf{\color{#35bf28}+16.95\%}$
test_serialize_weights_returnearly 0.2520s 0.1708s 5.8557 Ops/s 6.2366 Ops/s $\textbf{\color{#d91a1a}-6.11\%}$
test_serialize_weights_pickle 1.0928s 0.7353s 1.3600 Ops/s 2.2384 Ops/s $\textbf{\color{#d91a1a}-39.24\%}$
test_serialize_weights_filesystem 0.1452s 0.1432s 6.9823 Ops/s 7.0211 Ops/s $\color{#d91a1a}-0.55\%$
test_serialize_model_filesystem 0.1551s 0.1467s 6.8165 Ops/s 5.9382 Ops/s $\textbf{\color{#35bf28}+14.79\%}$
test_reshape_pytree 88.8150μs 37.5281μs 26.6467 KOps/s 25.7770 KOps/s $\color{#35bf28}+3.37\%$
test_reshape_td 0.1263ms 44.8469μs 22.2981 KOps/s 22.0758 KOps/s $\color{#35bf28}+1.01\%$
test_view_pytree 76.4030μs 37.7088μs 26.5190 KOps/s 25.7789 KOps/s $\color{#35bf28}+2.87\%$
test_view_td 0.1242ms 51.5625μs 19.3939 KOps/s 18.8881 KOps/s $\color{#35bf28}+2.68\%$
test_unbind_pytree 79.9090μs 35.1589μs 28.4423 KOps/s 28.1036 KOps/s $\color{#35bf28}+1.21\%$
test_unbind_td 0.3164ms 43.9798μs 22.7377 KOps/s 22.9088 KOps/s $\color{#d91a1a}-0.75\%$
test_split_pytree 0.1153ms 37.3710μs 26.7587 KOps/s 26.4195 KOps/s $\color{#35bf28}+1.28\%$
test_split_td 0.2138ms 56.2919μs 17.7645 KOps/s 17.9007 KOps/s $\color{#d91a1a}-0.76\%$
test_add_pytree 99.3650μs 45.2046μs 22.1216 KOps/s 22.7092 KOps/s $\color{#d91a1a}-2.59\%$
test_add_td 0.1615ms 81.1504μs 12.3228 KOps/s 12.8697 KOps/s $\color{#d91a1a}-4.25\%$
test_compile_add_one_nested[tensordict-compile] 0.1307ms 55.8923μs 17.8916 KOps/s 18.1867 KOps/s $\color{#d91a1a}-1.62\%$
test_compile_add_one_nested[tensordict-eager] 0.3693ms 0.1894ms 5.2798 KOps/s 5.3014 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_add_one_nested[pytree-compile] 0.1291ms 56.7108μs 17.6333 KOps/s 18.1626 KOps/s $\color{#d91a1a}-2.91\%$
test_compile_add_one_nested[pytree-eager] 0.2333ms 0.1422ms 7.0330 KOps/s 7.0773 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_copy_nested[tensordict-compile] 60.5230μs 20.4349μs 48.9358 KOps/s 47.9827 KOps/s $\color{#35bf28}+1.99\%$
test_compile_copy_nested[tensordict-eager] 0.2798ms 70.9383μs 14.0968 KOps/s 15.3345 KOps/s $\textbf{\color{#d91a1a}-8.07\%}$
test_compile_copy_nested[pytree-compile] 0.3358ms 77.3896μs 12.9216 KOps/s 13.6554 KOps/s $\textbf{\color{#d91a1a}-5.37\%}$
test_compile_copy_nested[pytree-eager] 0.1510ms 68.8577μs 14.5227 KOps/s 14.6695 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_one_flat[tensordict-compile] 0.3804ms 0.1737ms 5.7555 KOps/s 5.8897 KOps/s $\color{#d91a1a}-2.28\%$
test_compile_add_one_flat[tensordict-eager] 0.6230ms 0.1922ms 5.2026 KOps/s 5.2922 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_add_one_flat[tensorclass-compile] 0.1121ms 46.0142μs 21.7324 KOps/s 21.6894 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_flat[tensorclass-eager] 0.5310ms 70.0620μs 14.2731 KOps/s 14.2390 KOps/s $\color{#35bf28}+0.24\%$
test_compile_add_one_flat[pytree-compile] 0.2821ms 0.1754ms 5.7006 KOps/s 5.8306 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_add_one_flat[pytree-eager] 0.5560ms 0.2943ms 3.3978 KOps/s 3.4150 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_self_flat[tensordict-eager] 0.3824ms 0.2028ms 4.9300 KOps/s 4.9261 KOps/s $\color{#35bf28}+0.08\%$
test_compile_add_self_flat[tensordict-compile] 1.1310ms 0.1771ms 5.6458 KOps/s 5.8557 KOps/s $\color{#d91a1a}-3.58\%$
test_compile_add_self_flat[tensorclass-eager] 0.1264ms 62.0104μs 16.1263 KOps/s 15.9041 KOps/s $\color{#35bf28}+1.40\%$
test_compile_add_self_flat[tensorclass-compile] 0.1196ms 47.3684μs 21.1111 KOps/s 20.6471 KOps/s $\color{#35bf28}+2.25\%$
test_compile_add_self_flat[pytree-eager] 0.5110ms 0.2336ms 4.2801 KOps/s 4.2632 KOps/s $\color{#35bf28}+0.40\%$
test_compile_add_self_flat[pytree-compile] 0.2895ms 0.1756ms 5.6936 KOps/s 5.8107 KOps/s $\color{#d91a1a}-2.02\%$
test_compile_copy_flat[tensordict-compile] 0.2291ms 0.1021ms 9.7988 KOps/s 9.8719 KOps/s $\color{#d91a1a}-0.74\%$
test_compile_copy_flat[tensordict-eager] 0.1423ms 58.5957μs 17.0661 KOps/s 17.5573 KOps/s $\color{#d91a1a}-2.80\%$
test_compile_copy_flat[pytree-compile] 0.1555ms 78.8317μs 12.6852 KOps/s 12.8090 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_copy_flat[pytree-eager] 0.1510ms 72.1034μs 13.8690 KOps/s 14.0940 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_assign_and_add[tensordict-compile] 0.2783ms 0.1962ms 5.0960 KOps/s 5.0911 KOps/s $\color{#35bf28}+0.10\%$
test_compile_assign_and_add[tensordict-eager] 2.9304ms 1.6680ms 599.5137 Ops/s 617.7817 Ops/s $\color{#d91a1a}-2.96\%$
test_compile_assign_and_add[pytree-compile] 0.3878ms 0.1943ms 5.1456 KOps/s 5.0619 KOps/s $\color{#35bf28}+1.65\%$
test_compile_assign_and_add[pytree-eager] 1.1969ms 1.0954ms 912.8891 Ops/s 911.8807 Ops/s $\color{#35bf28}+0.11\%$
test_compile_assign_and_add_stack[compile] 1.0715ms 0.4389ms 2.2783 KOps/s 2.4038 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_compile_assign_and_add_stack[eager] 5.6182ms 3.7932ms 263.6312 Ops/s 265.5177 Ops/s $\color{#d91a1a}-0.71\%$
test_compile_indexing[tensor-tensordict-compile] 81.0810μs 33.5364μs 29.8184 KOps/s 29.8547 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_indexing[tensor-tensordict-eager] 1.4268ms 47.1226μs 21.2212 KOps/s 21.7730 KOps/s $\color{#d91a1a}-2.53\%$
test_compile_indexing[tensor-tensorclass-compile] 90.4180μs 29.4856μs 33.9148 KOps/s 34.5492 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1020ms 27.7626μs 36.0197 KOps/s 35.6145 KOps/s $\color{#35bf28}+1.14\%$
test_compile_indexing[tensor-pytree-compile] 0.1948ms 29.4661μs 33.9373 KOps/s 34.9497 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_indexing[tensor-pytree-eager] 95.0570μs 28.0164μs 35.6933 KOps/s 35.6725 KOps/s $\color{#35bf28}+0.06\%$
test_compile_indexing[slice-tensordict-compile] 0.1532ms 72.4602μs 13.8007 KOps/s 13.8049 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_indexing[slice-tensordict-eager] 0.6305ms 27.0361μs 36.9876 KOps/s 37.0965 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[slice-tensorclass-compile] 0.2128ms 67.8134μs 14.7464 KOps/s 14.8966 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[slice-tensorclass-eager] 68.8780μs 22.4632μs 44.5172 KOps/s 43.4887 KOps/s $\color{#35bf28}+2.37\%$
test_compile_indexing[slice-pytree-compile] 0.1673ms 67.4537μs 14.8250 KOps/s 14.8398 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_indexing[slice-pytree-eager] 58.8300μs 22.4294μs 44.5844 KOps/s 43.0931 KOps/s $\color{#35bf28}+3.46\%$
test_compile_indexing[int-tensordict-compile] 0.1885ms 71.4932μs 13.9873 KOps/s 14.0678 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_indexing[int-tensordict-eager] 1.4181ms 26.7522μs 37.3801 KOps/s 37.0910 KOps/s $\color{#35bf28}+0.78\%$
test_compile_indexing[int-tensorclass-compile] 0.1316ms 67.0960μs 14.9040 KOps/s 14.8579 KOps/s $\color{#35bf28}+0.31\%$
test_compile_indexing[int-tensorclass-eager] 64.6200μs 22.5107μs 44.4234 KOps/s 43.5378 KOps/s $\color{#35bf28}+2.03\%$
test_compile_indexing[int-pytree-compile] 0.1390ms 66.1219μs 15.1236 KOps/s 14.7016 KOps/s $\color{#35bf28}+2.87\%$
test_compile_indexing[int-pytree-eager] 0.5375ms 22.5231μs 44.3988 KOps/s 43.5859 KOps/s $\color{#35bf28}+1.87\%$
test_mod_add[eager] 75.3200μs 23.8509μs 41.9271 KOps/s 42.6317 KOps/s $\color{#d91a1a}-1.65\%$
test_mod_add[compile] 81.8120μs 39.3813μs 25.3928 KOps/s 25.7963 KOps/s $\color{#d91a1a}-1.56\%$
test_mod_add[compile-overhead] 90.2380μs 38.4935μs 25.9784 KOps/s 26.4322 KOps/s $\color{#d91a1a}-1.72\%$
test_mod_wrap[eager] 0.3561ms 0.2091ms 4.7825 KOps/s 4.6977 KOps/s $\color{#35bf28}+1.81\%$
test_mod_wrap[compile] 0.4683ms 0.2343ms 4.2673 KOps/s 4.2924 KOps/s $\color{#d91a1a}-0.58\%$
test_mod_wrap[compile-overhead] 0.4539ms 0.2337ms 4.2782 KOps/s 4.3774 KOps/s $\color{#d91a1a}-2.27\%$
test_mod_wrap_and_backward[eager] 22.4828ms 12.4902ms 80.0629 Ops/s 89.3947 Ops/s $\textbf{\color{#d91a1a}-10.44\%}$
test_mod_wrap_and_backward[compile] 19.4181ms 11.8734ms 84.2220 Ops/s 89.8312 Ops/s $\textbf{\color{#d91a1a}-6.24\%}$
test_mod_wrap_and_backward[compile-overhead] 14.2390ms 12.6604ms 78.9865 Ops/s 82.6933 Ops/s $\color{#d91a1a}-4.48\%$
test_seq_add[eager] 0.1675ms 86.0374μs 11.6229 KOps/s 11.2970 KOps/s $\color{#35bf28}+2.88\%$
test_seq_add[compile] 0.1539ms 62.8098μs 15.9211 KOps/s 15.5778 KOps/s $\color{#35bf28}+2.20\%$
test_seq_add[compile-overhead] 0.1301ms 62.2824μs 16.0559 KOps/s 16.0196 KOps/s $\color{#35bf28}+0.23\%$
test_seq_wrap[eager] 0.5195ms 0.3810ms 2.6247 KOps/s 2.5967 KOps/s $\color{#35bf28}+1.08\%$
test_seq_wrap[compile] 0.4722ms 0.2663ms 3.7551 KOps/s 3.7248 KOps/s $\color{#35bf28}+0.81\%$
test_seq_wrap[compile-overhead] 0.5215ms 0.2720ms 3.6762 KOps/s 3.7208 KOps/s $\color{#d91a1a}-1.20\%$
test_func_call_runtime[False-eager] 0.9322ms 0.5428ms 1.8422 KOps/s 1.8794 KOps/s $\color{#d91a1a}-1.98\%$
test_func_call_runtime[False-compile] 0.6769ms 0.4989ms 2.0046 KOps/s 2.0044 KOps/s $+0.01\%$
test_func_call_runtime[False-compile-overhead] 0.6314ms 0.5025ms 1.9902 KOps/s 2.0087 KOps/s $\color{#d91a1a}-0.92\%$
test_func_call_runtime[True-eager] 0.8694ms 0.7445ms 1.3432 KOps/s 1.3448 KOps/s $\color{#d91a1a}-0.12\%$
test_func_call_runtime[True-compile] 0.6968ms 0.5141ms 1.9450 KOps/s 1.9677 KOps/s $\color{#d91a1a}-1.15\%$
test_func_call_runtime[True-compile-overhead] 0.9278ms 0.5138ms 1.9462 KOps/s 1.9650 KOps/s $\color{#d91a1a}-0.95\%$
test_func_call_cm_runtime[False-eager] 0.8860ms 0.5189ms 1.9272 KOps/s 1.9225 KOps/s $\color{#35bf28}+0.24\%$
test_func_call_cm_runtime[False-compile] 0.8934ms 0.5024ms 1.9903 KOps/s 2.0065 KOps/s $\color{#d91a1a}-0.81\%$
test_func_call_cm_runtime[False-compile-overhead] 1.0483ms 0.5058ms 1.9769 KOps/s 1.9640 KOps/s $\color{#35bf28}+0.65\%$
test_func_call_cm_runtime[True-eager] 1.4853ms 0.8703ms 1.1490 KOps/s 1.1630 KOps/s $\color{#d91a1a}-1.21\%$
test_func_call_cm_runtime[True-compile] 0.9023ms 0.7354ms 1.3597 KOps/s 1.3556 KOps/s $\color{#35bf28}+0.30\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0947ms 0.7410ms 1.3495 KOps/s 1.3603 KOps/s $\color{#d91a1a}-0.80\%$
test_vmap_func_call_cm_runtime[eager] 2.6092ms 1.8886ms 529.4951 Ops/s 531.0629 Ops/s $\color{#d91a1a}-0.30\%$
test_vmap_func_call_cm_runtime[compile] 3.0704ms 1.9364ms 516.4097 Ops/s 517.6453 Ops/s $\color{#d91a1a}-0.24\%$
test_vmap_func_call_cm_runtime[compile-overhead] 3.0028ms 1.9355ms 516.6672 Ops/s 519.7108 Ops/s $\color{#d91a1a}-0.59\%$
test_distributed 0.4526ms 0.1254ms 7.9755 KOps/s 7.8049 KOps/s $\color{#35bf28}+2.19\%$
test_tdmodule 88.1740μs 17.1242μs 58.3968 KOps/s 57.9145 KOps/s $\color{#35bf28}+0.83\%$
test_tdmodule_dispatch 70.5220μs 34.6839μs 28.8318 KOps/s 26.9323 KOps/s $\textbf{\color{#35bf28}+7.05\%}$
test_tdseq 38.3110μs 19.7748μs 50.5694 KOps/s 49.0335 KOps/s $\color{#35bf28}+3.13\%$
test_tdseq_dispatch 69.0590μs 39.3184μs 25.4334 KOps/s 24.9294 KOps/s $\color{#35bf28}+2.02\%$
test_instantiation_functorch 1.8570ms 1.5933ms 627.6354 Ops/s 638.7080 Ops/s $\color{#d91a1a}-1.73\%$
test_instantiation_td 1.9730ms 1.1770ms 849.6269 Ops/s 854.8208 Ops/s $\color{#d91a1a}-0.61\%$
test_exec_functorch 0.2762ms 0.1825ms 5.4791 KOps/s 5.3705 KOps/s $\color{#35bf28}+2.02\%$
test_exec_functional_call 0.3468ms 0.1720ms 5.8123 KOps/s 5.5673 KOps/s $\color{#35bf28}+4.40\%$
test_exec_td 0.2959ms 0.1657ms 6.0334 KOps/s 5.8745 KOps/s $\color{#35bf28}+2.71\%$
test_exec_td_decorator 0.4539ms 0.2211ms 4.5222 KOps/s 4.5185 KOps/s $\color{#35bf28}+0.08\%$
test_vmap_mlp_speed[True-True] 1.1805ms 0.6503ms 1.5378 KOps/s 1.5630 KOps/s $\color{#d91a1a}-1.61\%$
test_vmap_mlp_speed[True-False] 1.5428ms 0.6622ms 1.5102 KOps/s 1.5733 KOps/s $\color{#d91a1a}-4.01\%$
test_vmap_mlp_speed[False-True] 0.9178ms 0.5041ms 1.9836 KOps/s 2.0256 KOps/s $\color{#d91a1a}-2.08\%$
test_vmap_mlp_speed[False-False] 0.8544ms 0.5070ms 1.9723 KOps/s 2.0128 KOps/s $\color{#d91a1a}-2.01\%$
test_vmap_mlp_speed_decorator[True-True] 1.4553ms 0.6226ms 1.6061 KOps/s 1.6239 KOps/s $\color{#d91a1a}-1.10\%$
test_vmap_mlp_speed_decorator[True-False] 1.6415ms 0.6274ms 1.5938 KOps/s 1.5684 KOps/s $\color{#35bf28}+1.62\%$
test_vmap_mlp_speed_decorator[False-True] 0.9008ms 0.5117ms 1.9541 KOps/s 1.9498 KOps/s $\color{#35bf28}+0.22\%$
test_vmap_mlp_speed_decorator[False-False] 0.7689ms 0.5131ms 1.9489 KOps/s 1.9611 KOps/s $\color{#d91a1a}-0.62\%$
test_to_module_speed[True] 2.0700ms 1.2889ms 775.8329 Ops/s 781.5421 Ops/s $\color{#d91a1a}-0.73\%$
test_to_module_speed[False] 1.7583ms 1.2490ms 800.6244 Ops/s 807.9357 Ops/s $\color{#d91a1a}-0.90\%$
test_tc_init 94.3860μs 41.1408μs 24.3068 KOps/s 23.9411 KOps/s $\color{#35bf28}+1.53\%$
test_tc_init_nested 0.1588ms 80.5400μs 12.4162 KOps/s 11.6734 KOps/s $\textbf{\color{#35bf28}+6.36\%}$
test_tc_first_layer_tensor 19.2950μs 1.5337μs 652.0315 KOps/s 686.0180 KOps/s $\color{#d91a1a}-4.95\%$
test_tc_first_layer_nontensor 30.8970μs 4.7884μs 208.8384 KOps/s 217.3065 KOps/s $\color{#d91a1a}-3.90\%$
test_tc_second_layer_tensor 35.3760μs 2.8090μs 356.0041 KOps/s 362.3759 KOps/s $\color{#d91a1a}-1.76\%$
test_tc_second_layer_nontensor 31.5990μs 6.0488μs 165.3223 KOps/s 170.8914 KOps/s $\color{#d91a1a}-3.26\%$
test_unbind 0.5128s 13.2864ms 75.2651 Ops/s 74.1879 Ops/s $\color{#35bf28}+1.45\%$
test_full_like 9.3643ms 8.3125ms 120.3000 Ops/s 117.3161 Ops/s $\color{#35bf28}+2.54\%$
test_zeros_like 3.9617ms 3.3663ms 297.0636 Ops/s 289.6747 Ops/s $\color{#35bf28}+2.55\%$
test_ones_like 4.1037ms 3.5695ms 280.1490 Ops/s 156.7263 Ops/s $\textbf{\color{#35bf28}+78.75\%}$
test_clone 6.3342ms 5.8662ms 170.4668 Ops/s 116.4829 Ops/s $\textbf{\color{#35bf28}+46.34\%}$
test_squeeze 74.7590μs 12.7344μs 78.5274 KOps/s 80.7664 KOps/s $\color{#d91a1a}-2.77\%$
test_unsqueeze 0.1577ms 91.4478μs 10.9352 KOps/s 10.7267 KOps/s $\color{#35bf28}+1.94\%$
test_split 0.5571ms 0.1901ms 5.2606 KOps/s 5.1792 KOps/s $\color{#35bf28}+1.57\%$
test_permute 0.4037ms 0.2214ms 4.5163 KOps/s 4.3354 KOps/s $\color{#35bf28}+4.17\%$
test_stack 28.1663ms 25.6260ms 39.0229 Ops/s 36.0667 Ops/s $\textbf{\color{#35bf28}+8.20\%}$
test_cat 28.4527ms 25.3897ms 39.3860 Ops/s 36.4976 Ops/s $\textbf{\color{#35bf28}+7.91\%}$
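
For readers checking the table by hand: the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, expressed as a percentage of the HEAD value. The sketch below is not the benchmark bot's code. The helper name `relative_change` is hypothetical, and the formula is an assumption inferred from the rows above. It recomputes three rows taken verbatim from the table.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to the repo HEAD baseline.

    Hypothetical helper: the formula is inferred from the table rows,
    not taken from the benchmark tooling itself.
    """
    if ops_head <= 0:
        raise ValueError("baseline throughput must be positive")
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the table; units cancel out.
rows = {
    "test_unflatten_speed": (2.1409, 2.1861),
    "test_serialize_weights_pickle": (1.3600, 2.2384),
    "test_ones_like": (280.1490, 156.7263),
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

Because the Ops values in the table are themselves rounded, a recomputed figure can differ from the reported one in the last digit for other rows.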

@vmoens vmoens merged commit 951e4e9 into gh/vmoens/17/base Sep 10, 2024
44 of 48 checks passed
vmoens added a commit that referenced this pull request Sep 10, 2024
ghstack-source-id: 1000aeadc067a9c5f9f206f652abd74172012e29
Pull Request resolved: #983
@vmoens vmoens deleted the gh/vmoens/17/head branch September 10, 2024 16:43
Labels: CLA Signed, enhancement