[Feature] _foreach_copy_ for update_ #1032

Merged Oct 8, 2024 (1 commit)

vmoens (Contributor) commented Oct 7, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 7, 2024
ghstack-source-id: dc32497757d1fb19dc61dea9115810890d5a4acb
Pull Request resolved: #1032
@facebook-github-bot added the "CLA Signed" label Oct 7, 2024

github-actions bot commented Oct 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}19$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 64.8010μs 24.0141μs 41.6422 KOps/s 41.4010 KOps/s $\color{#35bf28}+0.58\%$
test_plain_set_stack_nested 86.0520μs 24.2061μs 41.3119 KOps/s 41.1943 KOps/s $\color{#35bf28}+0.29\%$
test_plain_set_nested_inplace 69.6600μs 26.6616μs 37.5071 KOps/s 37.8898 KOps/s $\color{#d91a1a}-1.01\%$
test_plain_set_stack_nested_inplace 69.2000μs 26.5311μs 37.6916 KOps/s 38.2458 KOps/s $\color{#d91a1a}-1.45\%$
test_items 36.6990μs 4.2327μs 236.2547 KOps/s 245.1108 KOps/s $\color{#d91a1a}-3.61\%$
test_items_nested 0.5870ms 0.3864ms 2.5877 KOps/s 2.6480 KOps/s $\color{#d91a1a}-2.28\%$
test_items_nested_locked 0.5795ms 0.3845ms 2.6006 KOps/s 2.6129 KOps/s $\color{#d91a1a}-0.47\%$
test_items_nested_leaf 0.1586ms 81.6589μs 12.2461 KOps/s 12.4558 KOps/s $\color{#d91a1a}-1.68\%$
test_items_stack_nested 0.8100ms 0.3920ms 2.5511 KOps/s 2.6110 KOps/s $\color{#d91a1a}-2.30\%$
test_items_stack_nested_leaf 0.1730ms 83.8658μs 11.9238 KOps/s 11.9340 KOps/s $\color{#d91a1a}-0.09\%$
test_items_stack_nested_locked 0.5512ms 0.3868ms 2.5851 KOps/s 2.5973 KOps/s $\color{#d91a1a}-0.47\%$
test_keys 25.2770μs 3.4763μs 287.6638 KOps/s 279.4155 KOps/s $\color{#35bf28}+2.95\%$
test_keys_nested 0.2423ms 0.1385ms 7.2189 KOps/s 7.4802 KOps/s $\color{#d91a1a}-3.49\%$
test_keys_nested_locked 0.7327ms 0.1445ms 6.9201 KOps/s 7.1056 KOps/s $\color{#d91a1a}-2.61\%$
test_keys_nested_leaf 0.2066ms 0.1205ms 8.2998 KOps/s 8.4928 KOps/s $\color{#d91a1a}-2.27\%$
test_keys_stack_nested 0.2511ms 0.1343ms 7.4441 KOps/s 7.5599 KOps/s $\color{#d91a1a}-1.53\%$
test_keys_stack_nested_leaf 0.2131ms 0.1158ms 8.6367 KOps/s 8.7217 KOps/s $\color{#d91a1a}-0.97\%$
test_keys_stack_nested_locked 0.2598ms 0.1404ms 7.1224 KOps/s 7.2320 KOps/s $\color{#d91a1a}-1.52\%$
test_values 6.5402μs 1.0484μs 953.8468 KOps/s 920.6233 KOps/s $\color{#35bf28}+3.61\%$
test_values_nested 0.1599ms 92.1800μs 10.8483 KOps/s 10.5146 KOps/s $\color{#35bf28}+3.17\%$
test_values_nested_locked 0.2068ms 92.3836μs 10.8244 KOps/s 10.5847 KOps/s $\color{#35bf28}+2.26\%$
test_values_nested_leaf 0.1326ms 80.7621μs 12.3820 KOps/s 12.3260 KOps/s $\color{#35bf28}+0.45\%$
test_values_stack_nested 0.1693ms 93.2887μs 10.7194 KOps/s 10.5191 KOps/s $\color{#35bf28}+1.90\%$
test_values_stack_nested_leaf 0.1350ms 79.0742μs 12.6464 KOps/s 12.3568 KOps/s $\color{#35bf28}+2.34\%$
test_values_stack_nested_locked 0.2074ms 93.6395μs 10.6792 KOps/s 10.5491 KOps/s $\color{#35bf28}+1.23\%$
test_membership 16.0086μs 0.7602μs 1.3155 MOps/s 900.6911 KOps/s $\textbf{\color{#35bf28}+46.06\%}$
test_membership_nested 0.1190ms 2.8216μs 354.4067 KOps/s 359.7426 KOps/s $\color{#d91a1a}-1.48\%$
test_membership_nested_leaf 29.3450μs 2.7670μs 361.4071 KOps/s 347.6074 KOps/s $\color{#35bf28}+3.97\%$
test_membership_stacked_nested 28.1420μs 2.7415μs 364.7603 KOps/s 359.7738 KOps/s $\color{#35bf28}+1.39\%$
test_membership_stacked_nested_leaf 23.7640μs 2.7400μs 364.9631 KOps/s 353.6571 KOps/s $\color{#35bf28}+3.20\%$
test_membership_nested_last 26.0990μs 4.2340μs 236.1852 KOps/s 238.9292 KOps/s $\color{#d91a1a}-1.15\%$
test_membership_nested_leaf_last 33.0620μs 4.1784μs 239.3288 KOps/s 237.6224 KOps/s $\color{#35bf28}+0.72\%$
test_membership_stacked_nested_last 45.9560μs 13.6206μs 73.4184 KOps/s 71.0024 KOps/s $\color{#35bf28}+3.40\%$
test_membership_stacked_nested_leaf_last 39.8550μs 13.5369μs 73.8719 KOps/s 70.5443 KOps/s $\color{#35bf28}+4.72\%$
test_nested_getleaf 0.1478ms 10.9505μs 91.3203 KOps/s 92.3273 KOps/s $\color{#d91a1a}-1.09\%$
test_nested_get 49.5930μs 10.2804μs 97.2728 KOps/s 91.8512 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_stacked_getleaf 29.2050μs 10.7441μs 93.0746 KOps/s 91.9895 KOps/s $\color{#35bf28}+1.18\%$
test_stacked_get 0.1029ms 10.2586μs 97.4795 KOps/s 97.2419 KOps/s $\color{#35bf28}+0.24\%$
test_nested_getitemleaf 31.7600μs 11.1743μs 89.4912 KOps/s 89.6610 KOps/s $\color{#d91a1a}-0.19\%$
test_nested_getitem 28.9050μs 10.5426μs 94.8532 KOps/s 94.4656 KOps/s $\color{#35bf28}+0.41\%$
test_stacked_getitemleaf 45.4560μs 11.1718μs 89.5112 KOps/s 87.8583 KOps/s $\color{#35bf28}+1.88\%$
test_stacked_getitem 34.6350μs 10.3859μs 96.2847 KOps/s 95.3498 KOps/s $\color{#35bf28}+0.98\%$
test_lock_nested 86.3091ms 0.5913ms 1.6911 KOps/s 1.9614 KOps/s $\textbf{\color{#d91a1a}-13.78\%}$
test_lock_stack_nested 0.7304ms 0.4583ms 2.1818 KOps/s 2.1914 KOps/s $\color{#d91a1a}-0.44\%$
test_unlock_nested 86.4482ms 0.5112ms 1.9561 KOps/s 2.3082 KOps/s $\textbf{\color{#d91a1a}-15.25\%}$
test_unlock_stack_nested 0.5547ms 0.3721ms 2.6874 KOps/s 2.6645 KOps/s $\color{#35bf28}+0.86\%$
test_flatten_speed 0.1904ms 0.1008ms 9.9238 KOps/s 9.8639 KOps/s $\color{#35bf28}+0.61\%$
test_unflatten_speed 1.0373ms 0.5172ms 1.9334 KOps/s 1.9067 KOps/s $\color{#35bf28}+1.40\%$
test_common_ops 4.0085ms 1.1418ms 875.7719 Ops/s 833.2848 Ops/s $\textbf{\color{#35bf28}+5.10\%}$
test_creation 23.3340μs 2.1202μs 471.6623 KOps/s 486.3882 KOps/s $\color{#d91a1a}-3.03\%$
test_creation_empty 73.3480μs 17.8867μs 55.9076 KOps/s 58.8117 KOps/s $\color{#d91a1a}-4.94\%$
test_creation_nested_1 55.8740μs 21.3792μs 46.7744 KOps/s 50.0850 KOps/s $\textbf{\color{#d91a1a}-6.61\%}$
test_creation_nested_2 57.4570μs 25.4260μs 39.3298 KOps/s 41.3657 KOps/s $\color{#d91a1a}-4.92\%$
test_clone 77.5650μs 16.9128μs 59.1269 KOps/s 56.7087 KOps/s $\color{#35bf28}+4.26\%$
test_getitem[int] 1.0970ms 17.0284μs 58.7255 KOps/s 58.4655 KOps/s $\color{#35bf28}+0.44\%$
test_getitem[slice_int] 0.1433ms 31.1787μs 32.0731 KOps/s 31.5664 KOps/s $\color{#35bf28}+1.61\%$
test_getitem[range] 0.1814ms 59.5864μs 16.7823 KOps/s 16.6463 KOps/s $\color{#35bf28}+0.82\%$
test_getitem[tuple] 0.1286ms 25.3944μs 39.3788 KOps/s 38.7449 KOps/s $\color{#35bf28}+1.64\%$
test_getitem[list] 0.2611ms 55.4562μs 18.0323 KOps/s 18.0521 KOps/s $\color{#d91a1a}-0.11\%$
test_setitem_dim[int] 67.3360μs 33.6503μs 29.7174 KOps/s 29.2610 KOps/s $\color{#35bf28}+1.56\%$
test_setitem_dim[slice_int] 99.1850μs 62.4894μs 16.0027 KOps/s 15.9376 KOps/s $\color{#35bf28}+0.41\%$
test_setitem_dim[range] 0.1538ms 85.5772μs 11.6854 KOps/s 11.3074 KOps/s $\color{#35bf28}+3.34\%$
test_setitem_dim[tuple] 0.1060ms 50.2240μs 19.9108 KOps/s 19.5747 KOps/s $\color{#35bf28}+1.72\%$
test_setitem 77.3150μs 30.3030μs 33.0001 KOps/s 32.5585 KOps/s $\color{#35bf28}+1.36\%$
test_set 76.6440μs 29.6884μs 33.6831 KOps/s 33.6239 KOps/s $\color{#35bf28}+0.18\%$
test_set_shared 2.1281ms 0.2220ms 4.5051 KOps/s 4.5083 KOps/s $\color{#d91a1a}-0.07\%$
test_update 0.1448ms 37.6572μs 26.5554 KOps/s 26.4567 KOps/s $\color{#35bf28}+0.37\%$
test_update_nested 0.1535ms 48.8521μs 20.4699 KOps/s 20.2523 KOps/s $\color{#35bf28}+1.07\%$
test_update__nested 0.1128ms 42.9490μs 23.2834 KOps/s 26.5252 KOps/s $\textbf{\color{#d91a1a}-12.22\%}$
test_set_nested 0.1261ms 32.3520μs 30.9100 KOps/s 30.3484 KOps/s $\color{#35bf28}+1.85\%$
test_set_nested_new 0.1115ms 38.0813μs 26.2596 KOps/s 26.5214 KOps/s $\color{#d91a1a}-0.99\%$
test_select 0.1415ms 56.1207μs 17.8187 KOps/s 18.2068 KOps/s $\color{#d91a1a}-2.13\%$
test_select_nested 0.1212ms 58.8184μs 17.0015 KOps/s 16.7907 KOps/s $\color{#35bf28}+1.26\%$
test_exclude_nested 0.1426ms 74.2447μs 13.4690 KOps/s 13.3682 KOps/s $\color{#35bf28}+0.75\%$
test_empty[True] 0.6330ms 0.3531ms 2.8322 KOps/s 2.8070 KOps/s $\color{#35bf28}+0.90\%$
test_empty[False] 9.8433μs 1.2853μs 778.0311 KOps/s 829.2650 KOps/s $\textbf{\color{#d91a1a}-6.18\%}$
test_unbind_speed 0.6464ms 0.3019ms 3.3125 KOps/s 3.1905 KOps/s $\color{#35bf28}+3.82\%$
test_unbind_speed_stack0 0.4936ms 0.2930ms 3.4132 KOps/s 3.4370 KOps/s $\color{#d91a1a}-0.69\%$
test_unbind_speed_stack1 84.8765ms 0.7792ms 1.2834 KOps/s 1.3980 KOps/s $\textbf{\color{#d91a1a}-8.20\%}$
test_split 85.1403ms 2.1471ms 465.7351 Ops/s 452.5656 Ops/s $\color{#35bf28}+2.91\%$
test_chunk 2.2374ms 1.9974ms 500.6396 Ops/s 447.4727 Ops/s $\textbf{\color{#35bf28}+11.88\%}$
test_creation[device0] 0.2079ms 0.1182ms 8.4632 KOps/s 8.5666 KOps/s $\color{#d91a1a}-1.21\%$
test_creation_from_tensor 3.7967ms 0.1197ms 8.3554 KOps/s 8.4984 KOps/s $\color{#d91a1a}-1.68\%$
test_add_one[memmap_tensor0] 0.2148ms 7.5537μs 132.3854 KOps/s 132.4761 KOps/s $\color{#d91a1a}-0.07\%$
test_contiguous[memmap_tensor0] 16.1000μs 1.9035μs 525.3345 KOps/s 525.2369 KOps/s $\color{#35bf28}+0.02\%$
test_stack[memmap_tensor0] 57.8380μs 5.6703μs 176.3576 KOps/s 180.6350 KOps/s $\color{#d91a1a}-2.37\%$
test_memmaptd_index 1.1048ms 0.4202ms 2.3797 KOps/s 2.4098 KOps/s $\color{#d91a1a}-1.25\%$
test_memmaptd_index_astensor 1.2127ms 0.5254ms 1.9032 KOps/s 1.9445 KOps/s $\color{#d91a1a}-2.12\%$
test_memmaptd_index_op 1.4537ms 1.0516ms 950.8874 Ops/s 971.5730 Ops/s $\color{#d91a1a}-2.13\%$
test_serialize_model 0.2028s 0.1274s 7.8475 Ops/s 8.4392 Ops/s $\textbf{\color{#d91a1a}-7.01\%}$
test_serialize_model_pickle 0.4637s 0.3910s 2.5576 Ops/s 2.5087 Ops/s $\color{#35bf28}+1.95\%$
test_serialize_weights 0.1210s 0.1148s 8.7088 Ops/s 7.7696 Ops/s $\textbf{\color{#35bf28}+12.09\%}$
test_serialize_weights_returnearly 0.2553s 0.1693s 5.9064 Ops/s 6.4020 Ops/s $\textbf{\color{#d91a1a}-7.74\%}$
test_serialize_weights_pickle 0.5466s 0.4266s 2.3440 Ops/s 2.3790 Ops/s $\color{#d91a1a}-1.47\%$
test_serialize_weights_filesystem 0.1514s 0.1425s 7.0197 Ops/s 6.9718 Ops/s $\color{#35bf28}+0.69\%$
test_serialize_model_filesystem 0.1552s 0.1462s 6.8410 Ops/s 6.1481 Ops/s $\textbf{\color{#35bf28}+11.27\%}$
test_reshape_pytree 0.1144ms 38.6991μs 25.8404 KOps/s 25.2415 KOps/s $\color{#35bf28}+2.37\%$
test_reshape_td 98.8950μs 45.5778μs 21.9405 KOps/s 20.2350 KOps/s $\textbf{\color{#35bf28}+8.43\%}$
test_view_pytree 0.1182ms 38.9018μs 25.7058 KOps/s 25.4236 KOps/s $\color{#35bf28}+1.11\%$
test_view_td 0.1357ms 51.3082μs 19.4901 KOps/s 17.7673 KOps/s $\textbf{\color{#35bf28}+9.70\%}$
test_unbind_pytree 96.8910μs 36.4161μs 27.4604 KOps/s 27.0109 KOps/s $\color{#35bf28}+1.66\%$
test_unbind_td 0.3125ms 45.6841μs 21.8894 KOps/s 21.6507 KOps/s $\color{#35bf28}+1.10\%$
test_split_pytree 0.1159ms 38.0462μs 26.2838 KOps/s 26.0376 KOps/s $\color{#35bf28}+0.95\%$
test_split_td 0.2141ms 58.3270μs 17.1447 KOps/s 16.7152 KOps/s $\color{#35bf28}+2.57\%$
test_add_pytree 0.1175ms 45.4882μs 21.9837 KOps/s 20.8843 KOps/s $\textbf{\color{#35bf28}+5.26\%}$
test_add_td 0.2285ms 87.5797μs 11.4182 KOps/s 11.3360 KOps/s $\color{#35bf28}+0.72\%$
test_compile_add_one_nested[tensordict-compile] 0.1704ms 61.2824μs 16.3179 KOps/s 16.8975 KOps/s $\color{#d91a1a}-3.43\%$
test_compile_add_one_nested[tensordict-eager] 0.3130ms 0.2006ms 4.9850 KOps/s 5.0989 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_add_one_nested[pytree-compile] 0.1208ms 57.5671μs 17.3710 KOps/s 17.1036 KOps/s $\color{#35bf28}+1.56\%$
test_compile_add_one_nested[pytree-eager] 0.2717ms 0.1416ms 7.0637 KOps/s 6.9481 KOps/s $\color{#35bf28}+1.66\%$
test_compile_copy_nested[tensordict-compile] 52.6290μs 23.7750μs 42.0609 KOps/s 41.9418 KOps/s $\color{#35bf28}+0.28\%$
test_compile_copy_nested[tensordict-eager] 0.2034ms 75.2142μs 13.2954 KOps/s 13.3025 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_nested[pytree-compile] 0.1285ms 75.1562μs 13.3056 KOps/s 13.1858 KOps/s $\color{#35bf28}+0.91\%$
test_compile_copy_nested[pytree-eager] 0.1442ms 69.1288μs 14.4658 KOps/s 14.7060 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_add_one_flat[tensordict-compile] 0.5272ms 0.1865ms 5.3619 KOps/s 5.4316 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_add_one_flat[tensordict-eager] 0.4200ms 0.2505ms 3.9917 KOps/s 4.2161 KOps/s $\textbf{\color{#d91a1a}-5.32\%}$
test_compile_add_one_flat[tensorclass-compile] 0.1037ms 48.9749μs 20.4186 KOps/s 20.5329 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_add_one_flat[tensorclass-eager] 0.1805ms 77.6162μs 12.8839 KOps/s 13.0675 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_add_one_flat[pytree-compile] 0.4163ms 0.1766ms 5.6626 KOps/s 5.7469 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_add_one_flat[pytree-eager] 0.5393ms 0.2891ms 3.4591 KOps/s 3.4175 KOps/s $\color{#35bf28}+1.22\%$
test_compile_add_self_flat[tensordict-eager] 0.6239ms 0.2855ms 3.5032 KOps/s 3.6655 KOps/s $\color{#d91a1a}-4.43\%$
test_compile_add_self_flat[tensordict-compile] 0.3388ms 0.1892ms 5.2854 KOps/s 5.4558 KOps/s $\color{#d91a1a}-3.12\%$
test_compile_add_self_flat[tensorclass-eager] 0.1634ms 74.7994μs 13.3691 KOps/s 14.0022 KOps/s $\color{#d91a1a}-4.52\%$
test_compile_add_self_flat[tensorclass-compile] 95.1080μs 48.8925μs 20.4530 KOps/s 20.8502 KOps/s $\color{#d91a1a}-1.90\%$
test_compile_add_self_flat[pytree-eager] 0.4933ms 0.2316ms 4.3187 KOps/s 4.2714 KOps/s $\color{#35bf28}+1.11\%$
test_compile_add_self_flat[pytree-compile] 0.3616ms 0.1762ms 5.6746 KOps/s 5.5778 KOps/s $\color{#35bf28}+1.74\%$
test_compile_copy_flat[tensordict-compile] 0.2427ms 0.1131ms 8.8434 KOps/s 8.9198 KOps/s $\color{#d91a1a}-0.86\%$
test_compile_copy_flat[tensordict-eager] 0.1797ms 79.3708μs 12.5991 KOps/s 12.8076 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_copy_flat[pytree-compile] 0.1293ms 77.2184μs 12.9503 KOps/s 12.6412 KOps/s $\color{#35bf28}+2.45\%$
test_compile_copy_flat[pytree-eager] 0.1446ms 68.8545μs 14.5234 KOps/s 14.4263 KOps/s $\color{#35bf28}+0.67\%$
test_compile_assign_and_add[tensordict-compile] 0.3614ms 0.1940ms 5.1555 KOps/s 5.1582 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_assign_and_add[tensordict-eager] 1.9867ms 1.7748ms 563.4515 Ops/s 576.7800 Ops/s $\color{#d91a1a}-2.31\%$
test_compile_assign_and_add[pytree-compile] 0.2831ms 0.1905ms 5.2488 KOps/s 5.1136 KOps/s $\color{#35bf28}+2.64\%$
test_compile_assign_and_add[pytree-eager] 1.3246ms 1.1009ms 908.3643 Ops/s 897.5386 Ops/s $\color{#35bf28}+1.21\%$
test_compile_assign_and_add_stack[compile] 0.6961ms 0.4159ms 2.4046 KOps/s 2.3628 KOps/s $\color{#35bf28}+1.77\%$
test_compile_assign_and_add_stack[eager] 4.3783ms 4.0770ms 245.2775 Ops/s 244.0461 Ops/s $\color{#35bf28}+0.50\%$
test_compile_indexing[tensor-tensordict-compile] 87.4240μs 34.6774μs 28.8372 KOps/s 28.5545 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[tensor-tensordict-eager] 1.0612ms 50.0508μs 19.9797 KOps/s 13.0342 KOps/s $\textbf{\color{#35bf28}+53.29\%}$
test_compile_indexing[tensor-tensorclass-compile] 73.6080μs 30.6740μs 32.6009 KOps/s 32.0815 KOps/s $\color{#35bf28}+1.62\%$
test_compile_indexing[tensor-tensorclass-eager] 77.5450μs 29.4581μs 33.9466 KOps/s 32.2257 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_compile_indexing[tensor-pytree-compile] 76.3030μs 30.8439μs 32.4213 KOps/s 32.2850 KOps/s $\color{#35bf28}+0.42\%$
test_compile_indexing[tensor-pytree-eager] 75.6020μs 28.8057μs 34.7154 KOps/s 33.6758 KOps/s $\color{#35bf28}+3.09\%$
test_compile_indexing[slice-tensordict-compile] 0.1800ms 75.0527μs 13.3240 KOps/s 13.1957 KOps/s $\color{#35bf28}+0.97\%$
test_compile_indexing[slice-tensordict-eager] 0.5159ms 28.4976μs 35.0906 KOps/s 33.9905 KOps/s $\color{#35bf28}+3.24\%$
test_compile_indexing[slice-tensorclass-compile] 0.1285ms 69.7341μs 14.3402 KOps/s 14.3352 KOps/s $\color{#35bf28}+0.04\%$
test_compile_indexing[slice-tensorclass-eager] 84.8280μs 23.7016μs 42.1912 KOps/s 41.8430 KOps/s $\color{#35bf28}+0.83\%$
test_compile_indexing[slice-pytree-compile] 0.1622ms 70.0913μs 14.2671 KOps/s 14.3708 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_indexing[slice-pytree-eager] 69.3000μs 23.6711μs 42.2456 KOps/s 42.3783 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_indexing[int-tensordict-compile] 0.1679ms 75.5571μs 13.2350 KOps/s 13.2502 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_indexing[int-tensordict-eager] 0.9145ms 27.9842μs 35.7344 KOps/s 34.7270 KOps/s $\color{#35bf28}+2.90\%$
test_compile_indexing[int-tensorclass-compile] 0.1331ms 69.6435μs 14.3588 KOps/s 14.5225 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_indexing[int-tensorclass-eager] 73.0970μs 23.4779μs 42.5933 KOps/s 42.5237 KOps/s $\color{#35bf28}+0.16\%$
test_compile_indexing[int-pytree-compile] 0.1470ms 69.9145μs 14.3032 KOps/s 14.5061 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_indexing[int-pytree-eager] 65.9930μs 23.1608μs 43.1765 KOps/s 42.5996 KOps/s $\color{#35bf28}+1.35\%$
test_mod_add[eager] 67.8470μs 25.2217μs 39.6485 KOps/s 39.7810 KOps/s $\color{#d91a1a}-0.33\%$
test_mod_add[compile] 0.1021ms 38.6064μs 25.9024 KOps/s 24.8600 KOps/s $\color{#35bf28}+4.19\%$
test_mod_add[compile-overhead] 0.1076ms 38.3425μs 26.0807 KOps/s 24.6406 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_mod_wrap[eager] 0.3817ms 0.2101ms 4.7600 KOps/s 4.6731 KOps/s $\color{#35bf28}+1.86\%$
test_mod_wrap[compile] 0.3444ms 0.2357ms 4.2419 KOps/s 4.2243 KOps/s $\color{#35bf28}+0.42\%$
test_mod_wrap[compile-overhead] 0.3634ms 0.2361ms 4.2356 KOps/s 4.2528 KOps/s $\color{#d91a1a}-0.40\%$
test_mod_wrap_and_backward[eager] 12.2604ms 10.8015ms 92.5795 Ops/s 92.7119 Ops/s $\color{#d91a1a}-0.14\%$
test_mod_wrap_and_backward[compile] 13.3483ms 10.9598ms 91.2429 Ops/s 92.1985 Ops/s $\color{#d91a1a}-1.04\%$
test_mod_wrap_and_backward[compile-overhead] 12.0849ms 10.9239ms 91.5422 Ops/s 90.5969 Ops/s $\color{#35bf28}+1.04\%$
test_seq_add[eager] 0.2251ms 92.7222μs 10.7849 KOps/s 10.7248 KOps/s $\color{#35bf28}+0.56\%$
test_seq_add[compile] 0.1300ms 64.2425μs 15.5660 KOps/s 15.0828 KOps/s $\color{#35bf28}+3.20\%$
test_seq_add[compile-overhead] 0.1386ms 61.6596μs 16.2181 KOps/s 15.5724 KOps/s $\color{#35bf28}+4.15\%$
test_seq_wrap[eager] 0.5623ms 0.3876ms 2.5797 KOps/s 2.5452 KOps/s $\color{#35bf28}+1.35\%$
test_seq_wrap[compile] 1.1146ms 0.2721ms 3.6750 KOps/s 3.6233 KOps/s $\color{#35bf28}+1.43\%$
test_seq_wrap[compile-overhead] 1.3154ms 0.2713ms 3.6859 KOps/s 3.6310 KOps/s $\color{#35bf28}+1.51\%$
test_func_call_runtime[False-eager] 0.6616ms 0.5198ms 1.9240 KOps/s 1.8656 KOps/s $\color{#35bf28}+3.13\%$
test_func_call_runtime[False-compile] 0.5957ms 0.5011ms 1.9955 KOps/s 1.9687 KOps/s $\color{#35bf28}+1.36\%$
test_func_call_runtime[False-compile-overhead] 1.0513ms 0.5007ms 1.9974 KOps/s 1.9606 KOps/s $\color{#35bf28}+1.87\%$
test_func_call_runtime[True-eager] 1.0105ms 0.7498ms 1.3337 KOps/s 1.3089 KOps/s $\color{#35bf28}+1.89\%$
test_func_call_runtime[True-compile] 0.6245ms 0.5204ms 1.9216 KOps/s 1.9121 KOps/s $\color{#35bf28}+0.49\%$
test_func_call_runtime[True-compile-overhead] 0.8607ms 0.5149ms 1.9420 KOps/s 1.9356 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_cm_runtime[False-eager] 0.8187ms 0.5175ms 1.9326 KOps/s 1.8602 KOps/s $\color{#35bf28}+3.89\%$
test_func_call_cm_runtime[False-compile] 1.1132ms 0.5005ms 1.9979 KOps/s 1.9588 KOps/s $\color{#35bf28}+1.99\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6246ms 0.4999ms 2.0004 KOps/s 1.9652 KOps/s $\color{#35bf28}+1.79\%$
test_func_call_cm_runtime[True-eager] 1.0719ms 0.8852ms 1.1297 KOps/s 1.0835 KOps/s $\color{#35bf28}+4.27\%$
test_func_call_cm_runtime[True-compile] 1.1586ms 0.7361ms 1.3584 KOps/s 1.3100 KOps/s $\color{#35bf28}+3.70\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9540ms 0.7353ms 1.3600 KOps/s 1.3078 KOps/s $\color{#35bf28}+3.99\%$
test_vmap_func_call_cm_runtime[eager] 2.6782ms 1.8951ms 527.6697 Ops/s 510.9624 Ops/s $\color{#35bf28}+3.27\%$
test_vmap_func_call_cm_runtime[compile] 2.4890ms 1.9383ms 515.9029 Ops/s 501.6018 Ops/s $\color{#35bf28}+2.85\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.5993ms 1.9551ms 511.4851 Ops/s 496.2218 Ops/s $\color{#35bf28}+3.08\%$
test_distributed 0.2247ms 0.1269ms 7.8818 KOps/s 7.7501 KOps/s $\color{#35bf28}+1.70\%$
test_tdmodule 40.0750μs 17.8151μs 56.1322 KOps/s 56.2238 KOps/s $\color{#d91a1a}-0.16\%$
test_tdmodule_dispatch 68.6180μs 35.6209μs 28.0734 KOps/s 28.6859 KOps/s $\color{#d91a1a}-2.14\%$
test_tdseq 45.4850μs 20.3813μs 49.0645 KOps/s 42.3135 KOps/s $\textbf{\color{#35bf28}+15.95\%}$
test_tdseq_dispatch 65.7930μs 40.8584μs 24.4748 KOps/s 22.2819 KOps/s $\textbf{\color{#35bf28}+9.84\%}$
test_instantiation_functorch 1.7738ms 1.5574ms 642.0942 Ops/s 624.2553 Ops/s $\color{#35bf28}+2.86\%$
test_instantiation_td 1.7690ms 1.1595ms 862.4566 Ops/s 831.3938 Ops/s $\color{#35bf28}+3.74\%$
test_exec_functorch 0.2634ms 0.1851ms 5.4034 KOps/s 5.2859 KOps/s $\color{#35bf28}+2.22\%$
test_exec_functional_call 0.2681ms 0.1743ms 5.7379 KOps/s 5.4277 KOps/s $\textbf{\color{#35bf28}+5.72\%}$
test_exec_td 0.3644ms 0.2050ms 4.8788 KOps/s 4.8449 KOps/s $\color{#35bf28}+0.70\%$
test_exec_td_decorator 0.9695ms 0.2337ms 4.2782 KOps/s 4.1524 KOps/s $\color{#35bf28}+3.03\%$
test_vmap_mlp_speed[True-True] 0.7842ms 0.6826ms 1.4650 KOps/s 1.4312 KOps/s $\color{#35bf28}+2.36\%$
test_vmap_mlp_speed[True-False] 0.7826ms 0.6783ms 1.4743 KOps/s 1.4277 KOps/s $\color{#35bf28}+3.26\%$
test_vmap_mlp_speed[False-True] 0.7103ms 0.5379ms 1.8591 KOps/s 1.8285 KOps/s $\color{#35bf28}+1.67\%$
test_vmap_mlp_speed[False-False] 0.8541ms 0.5363ms 1.8646 KOps/s 1.8200 KOps/s $\color{#35bf28}+2.45\%$
test_vmap_mlp_speed_decorator[True-True] 1.2993ms 0.6435ms 1.5541 KOps/s 1.5212 KOps/s $\color{#35bf28}+2.16\%$
test_vmap_mlp_speed_decorator[True-False] 0.9532ms 0.6439ms 1.5530 KOps/s 1.5011 KOps/s $\color{#35bf28}+3.46\%$
test_vmap_mlp_speed_decorator[False-True] 0.8189ms 0.5322ms 1.8788 KOps/s 1.8280 KOps/s $\color{#35bf28}+2.78\%$
test_vmap_mlp_speed_decorator[False-False] 0.6539ms 0.5289ms 1.8908 KOps/s 1.8193 KOps/s $\color{#35bf28}+3.93\%$
test_to_module_speed[True] 2.0388ms 1.4124ms 707.9905 Ops/s 695.2706 Ops/s $\color{#35bf28}+1.83\%$
test_to_module_speed[False] 1.8509ms 1.3787ms 725.3073 Ops/s 722.1913 Ops/s $\color{#35bf28}+0.43\%$
test_tc_init 0.1040ms 43.1519μs 23.1739 KOps/s 21.6386 KOps/s $\textbf{\color{#35bf28}+7.10\%}$
test_tc_init_nested 0.1505ms 86.0397μs 11.6225 KOps/s 11.0476 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_tc_first_layer_tensor 18.9450μs 1.5213μs 657.3260 KOps/s 666.7584 KOps/s $\color{#d91a1a}-1.41\%$
test_tc_first_layer_nontensor 19.1360μs 4.6175μs 216.5695 KOps/s 205.8299 KOps/s $\textbf{\color{#35bf28}+5.22\%}$
test_tc_second_layer_tensor 23.9650μs 2.7859μs 358.9515 KOps/s 346.4804 KOps/s $\color{#35bf28}+3.60\%$
test_tc_second_layer_nontensor 39.7440μs 5.8990μs 169.5196 KOps/s 164.3170 KOps/s $\color{#35bf28}+3.17\%$
test_unbind 0.4728s 13.1147ms 76.2505 Ops/s 76.7507 Ops/s $\color{#d91a1a}-0.65\%$
test_full_like 7.7997ms 7.0957ms 140.9299 Ops/s 142.1594 Ops/s $\color{#d91a1a}-0.86\%$
test_zeros_like 5.3568ms 2.8380ms 352.3569 Ops/s 360.4499 Ops/s $\color{#d91a1a}-2.25\%$
test_ones_like 3.4685ms 3.0574ms 327.0776 Ops/s 304.1609 Ops/s $\textbf{\color{#35bf28}+7.53\%}$
test_clone 5.3208ms 4.7771ms 209.3338 Ops/s 202.0753 Ops/s $\color{#35bf28}+3.59\%$
test_squeeze 64.0400μs 12.7988μs 78.1323 KOps/s 79.5954 KOps/s $\color{#d91a1a}-1.84\%$
test_unsqueeze 0.3343ms 93.8482μs 10.6555 KOps/s 10.5460 KOps/s $\color{#35bf28}+1.04\%$
test_split 0.3369ms 0.1955ms 5.1164 KOps/s 4.9313 KOps/s $\color{#35bf28}+3.75\%$
test_permute 0.3586ms 0.2172ms 4.6036 KOps/s 4.4080 KOps/s $\color{#35bf28}+4.44\%$
test_stack 29.7721ms 24.6588ms 40.5535 Ops/s 40.8199 Ops/s $\color{#d91a1a}-0.65\%$
test_cat 28.4933ms 24.3613ms 41.0488 Ops/s 41.3429 Ops/s $\color{#d91a1a}-0.71\%$
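
The Change column in the table above appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and Ops appears to be the reciprocal of Mean. The report does not state either formula; both are inferred from the listed numbers. The sketch below recomputes a few rows under that assumption. The helper names (`parse_ops`, `percent_change`, `ops_from_mean`) are hypothetical and are not part of tensordict or its CI tooling.

```python
# Hypothetical helpers for sanity-checking rows of the benchmark report.
# Assumption: Change = (Ops / Ops on Repo HEAD - 1) * 100 and Ops = 1 / Mean.

UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a throughput cell such as '41.6422 KOps/s' to ops per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (parse_ops(ops_pr) / parse_ops(ops_head) - 1.0) * 100.0


def ops_from_mean(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime, assuming Ops = 1 / Mean."""
    return 1.0 / mean_seconds


if __name__ == "__main__":
    # test_lock_nested (CPU): the report lists -13.78% for this row.
    print(f"{percent_change('1.6911 KOps/s', '1.9614 KOps/s'):+.2f}%")
    # test_chunk (CPU): the report lists +11.88% for this row.
    print(f"{percent_change('500.6396 Ops/s', '447.4727 Ops/s'):+.2f}%")
    # test_plain_set_nested (CPU): mean of 24.0141 microseconds,
    # listed in the report as 41.6422 KOps/s.
    print(f"{ops_from_mean(24.0141e-6) / 1e3:.4f} KOps/s")
```

The unit map lets rows that mix MOps/s and KOps/s (such as test_membership) be compared directly. Because the cells are rounded for display, a recomputed change can differ from the listed one in the last digit.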

github-actions bot commented Oct 7, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}22$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1380ms 16.8607μs 59.3094 KOps/s 62.8555 KOps/s $\textbf{\color{#d91a1a}-5.64\%}$
test_plain_set_stack_nested 0.1051ms 16.7985μs 59.5293 KOps/s 62.8198 KOps/s $\textbf{\color{#d91a1a}-5.24\%}$
test_plain_set_nested_inplace 46.2800μs 18.0193μs 55.4961 KOps/s 58.2314 KOps/s $\color{#d91a1a}-4.70\%$
test_plain_set_stack_nested_inplace 52.3110μs 17.9786μs 55.6218 KOps/s 58.5876 KOps/s $\textbf{\color{#d91a1a}-5.06\%}$
test_items 24.6300μs 2.8431μs 351.7293 KOps/s 345.1611 KOps/s $\color{#35bf28}+1.90\%$
test_items_nested 0.3803ms 0.3425ms 2.9201 KOps/s 2.9395 KOps/s $\color{#d91a1a}-0.66\%$
test_items_nested_locked 0.3711ms 0.3403ms 2.9383 KOps/s 2.9166 KOps/s $\color{#35bf28}+0.74\%$
test_items_nested_leaf 0.1008ms 62.6453μs 15.9629 KOps/s 15.8162 KOps/s $\color{#35bf28}+0.93\%$
test_items_stack_nested 0.3695ms 0.3381ms 2.9577 KOps/s 2.9204 KOps/s $\color{#35bf28}+1.28\%$
test_items_stack_nested_leaf 92.1120μs 62.8094μs 15.9212 KOps/s 15.5783 KOps/s $\color{#35bf28}+2.20\%$
test_items_stack_nested_locked 0.4305ms 0.3420ms 2.9242 KOps/s 2.8992 KOps/s $\color{#35bf28}+0.86\%$
test_keys 32.3600μs 3.4159μs 292.7525 KOps/s 291.9812 KOps/s $\color{#35bf28}+0.26\%$
test_keys_nested 0.1254ms 71.1575μs 14.0533 KOps/s 13.9612 KOps/s $\color{#35bf28}+0.66\%$
test_keys_nested_locked 2.7126ms 77.2891μs 12.9384 KOps/s 12.8048 KOps/s $\color{#35bf28}+1.04\%$
test_keys_nested_leaf 0.2393ms 61.9084μs 16.1529 KOps/s 15.9094 KOps/s $\color{#35bf28}+1.53\%$
test_keys_stack_nested 0.2627ms 71.1475μs 14.0553 KOps/s 14.0213 KOps/s $\color{#35bf28}+0.24\%$
test_keys_stack_nested_leaf 0.2384ms 61.2409μs 16.3290 KOps/s 15.7660 KOps/s $\color{#35bf28}+3.57\%$
test_keys_stack_nested_locked 0.1682ms 76.5259μs 13.0675 KOps/s 12.8742 KOps/s $\color{#35bf28}+1.50\%$
test_values 6.1152μs 0.8351μs 1.1974 MOps/s 1.1935 MOps/s $\color{#35bf28}+0.33\%$
test_values_nested 0.1487ms 48.4619μs 20.6347 KOps/s 20.4069 KOps/s $\color{#35bf28}+1.12\%$
test_values_nested_locked 0.1710ms 50.2338μs 19.9069 KOps/s 19.7220 KOps/s $\color{#35bf28}+0.94\%$
test_values_nested_leaf 71.5510μs 42.6252μs 23.4603 KOps/s 23.3839 KOps/s $\color{#35bf28}+0.33\%$
test_values_stack_nested 85.7520μs 49.0126μs 20.4029 KOps/s 19.9411 KOps/s $\color{#35bf28}+2.32\%$
test_values_stack_nested_leaf 79.7020μs 42.8081μs 23.3601 KOps/s 22.8918 KOps/s $\color{#35bf28}+2.05\%$
test_values_stack_nested_locked 81.3310μs 50.3905μs 19.8450 KOps/s 19.2550 KOps/s $\color{#35bf28}+3.06\%$
test_membership 2.2626μs 0.5042μs 1.9832 MOps/s 2.0020 MOps/s $\color{#d91a1a}-0.94\%$
test_membership_nested 19.2355μs 1.8890μs 529.3839 KOps/s 530.8155 KOps/s $\color{#d91a1a}-0.27\%$
test_membership_nested_leaf 19.8305μs 1.8827μs 531.1591 KOps/s 541.2089 KOps/s $\color{#d91a1a}-1.86\%$
test_membership_stacked_nested 29.0510μs 1.9604μs 510.1011 KOps/s 523.3775 KOps/s $\color{#d91a1a}-2.54\%$
test_membership_stacked_nested_leaf 25.6000μs 1.9575μs 510.8473 KOps/s 517.3982 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_nested_last 45.4510μs 2.9868μs 334.8101 KOps/s 337.4363 KOps/s $\color{#d91a1a}-0.78\%$
test_membership_nested_leaf_last 0.1035ms 2.9944μs 333.9550 KOps/s 334.1197 KOps/s $\color{#d91a1a}-0.05\%$
test_membership_stacked_nested_last 37.6600μs 2.9516μs 338.7993 KOps/s 120.6990 KOps/s $\textbf{\color{#35bf28}+180.70\%}$
test_membership_stacked_nested_leaf_last 38.3210μs 2.9642μs 337.3571 KOps/s 125.0318 KOps/s $\textbf{\color{#35bf28}+169.82\%}$
test_nested_getleaf 0.6715ms 6.0906μs 164.1867 KOps/s 163.8605 KOps/s $\color{#35bf28}+0.20\%$
test_nested_get 35.4510μs 5.6742μs 176.2366 KOps/s 172.9638 KOps/s $\color{#35bf28}+1.89\%$
test_stacked_getleaf 39.9110μs 6.1494μs 162.6166 KOps/s 165.6807 KOps/s $\color{#d91a1a}-1.85\%$
test_stacked_get 36.2700μs 5.7103μs 175.1235 KOps/s 177.8744 KOps/s $\color{#d91a1a}-1.55\%$
test_nested_getitemleaf 30.5600μs 6.1287μs 163.1656 KOps/s 163.9230 KOps/s $\color{#d91a1a}-0.46\%$
test_nested_getitem 33.5710μs 5.7955μs 172.5471 KOps/s 173.8654 KOps/s $\color{#d91a1a}-0.76\%$
test_stacked_getitemleaf 28.0100μs 6.1095μs 163.6801 KOps/s 162.7440 KOps/s $\color{#35bf28}+0.58\%$
test_stacked_getitem 27.1210μs 5.6900μs 175.7479 KOps/s 176.7920 KOps/s $\color{#d91a1a}-0.59\%$
test_lock_nested 6.9664ms 0.4318ms 2.3157 KOps/s 2.3001 KOps/s $\color{#35bf28}+0.68\%$
test_lock_stack_nested 0.5425ms 0.3936ms 2.5404 KOps/s 2.6166 KOps/s $\color{#d91a1a}-2.92\%$
test_unlock_nested 0.7647ms 0.3676ms 2.7201 KOps/s 2.7200 KOps/s $+0.01\%$
test_unlock_stack_nested 0.4655ms 0.3328ms 3.0046 KOps/s 3.1252 KOps/s $\color{#d91a1a}-3.86\%$
test_flatten_speed 0.1711ms 75.7155μs 13.2073 KOps/s 13.0303 KOps/s $\color{#35bf28}+1.36\%$
test_unflatten_speed 0.3666ms 0.3198ms 3.1265 KOps/s 3.0971 KOps/s $\color{#35bf28}+0.95\%$
test_common_ops 1.5606ms 1.2657ms 790.0478 Ops/s 805.6473 Ops/s $\color{#d91a1a}-1.94\%$
test_creation 0.1668ms 1.4921μs 670.2035 KOps/s 681.6119 KOps/s $\color{#d91a1a}-1.67\%$
test_creation_empty 0.1905ms 15.5889μs 64.1483 KOps/s 70.8885 KOps/s $\textbf{\color{#d91a1a}-9.51\%}$
test_creation_nested_1 0.1912ms 17.5571μs 56.9570 KOps/s 64.2117 KOps/s $\textbf{\color{#d91a1a}-11.30\%}$
test_creation_nested_2 0.1876ms 19.8967μs 50.2597 KOps/s 55.2850 KOps/s $\textbf{\color{#d91a1a}-9.09\%}$
test_clone 0.1863ms 29.3155μs 34.1117 KOps/s 35.1605 KOps/s $\color{#d91a1a}-2.98\%$
test_getitem[int] 1.2979ms 16.2441μs 61.5607 KOps/s 62.9552 KOps/s $\color{#d91a1a}-2.22\%$
test_getitem[slice_int] 0.2519ms 28.4083μs 35.2010 KOps/s 36.7439 KOps/s $\color{#d91a1a}-4.20\%$
test_getitem[range] 0.2361ms 0.1114ms 8.9756 KOps/s 8.8708 KOps/s $\color{#35bf28}+1.18\%$
test_getitem[tuple] 0.1205ms 23.7375μs 42.1275 KOps/s 43.1797 KOps/s $\color{#d91a1a}-2.44\%$
test_getitem[list] 0.3228ms 0.1034ms 9.6729 KOps/s 9.9333 KOps/s $\color{#d91a1a}-2.62\%$
test_setitem_dim[int] 0.2328ms 48.7993μs 20.4921 KOps/s 22.1245 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_setitem_dim[slice_int] 92.8420μs 67.8704μs 14.7340 KOps/s 14.6808 KOps/s $\color{#35bf28}+0.36\%$
test_setitem_dim[range] 0.3058ms 0.1302ms 7.6791 KOps/s 7.7262 KOps/s $\color{#d91a1a}-0.61\%$
test_setitem_dim[tuple] 0.2354ms 64.8314μs 15.4246 KOps/s 16.3759 KOps/s $\textbf{\color{#d91a1a}-5.81\%}$
test_setitem 0.1916ms 42.1623μs 23.7179 KOps/s 24.3541 KOps/s $\color{#d91a1a}-2.61\%$
test_set 0.2174ms 42.4308μs 23.5678 KOps/s 25.1582 KOps/s $\textbf{\color{#d91a1a}-6.32\%}$
test_set_shared 0.3582ms 56.9665μs 17.5542 KOps/s 18.6133 KOps/s $\textbf{\color{#d91a1a}-5.69\%}$
test_update 0.2177ms 51.8390μs 19.2905 KOps/s 20.5924 KOps/s $\textbf{\color{#d91a1a}-6.32\%}$
test_update_nested 0.2375ms 58.9109μs 16.9748 KOps/s 17.8525 KOps/s $\color{#d91a1a}-4.92\%$
test_update__nested 0.1596ms 60.4790μs 16.5347 KOps/s 16.6089 KOps/s $\color{#d91a1a}-0.45\%$
test_set_nested 0.1947ms 44.7508μs 22.3460 KOps/s 23.4992 KOps/s $\color{#d91a1a}-4.91\%$
test_set_nested_new 0.1932ms 47.5848μs 21.0151 KOps/s 21.7511 KOps/s $\color{#d91a1a}-3.38\%$
test_select 0.2062ms 60.2968μs 16.5846 KOps/s 16.7028 KOps/s $\color{#d91a1a}-0.71\%$
test_select_nested 79.8410μs 41.5298μs 24.0791 KOps/s 24.0603 KOps/s $\color{#35bf28}+0.08\%$
test_exclude_nested 0.1904ms 58.7596μs 17.0185 KOps/s 16.8838 KOps/s $\color{#35bf28}+0.80\%$
test_empty[True] 0.2998ms 0.2568ms 3.8935 KOps/s 3.8054 KOps/s $\color{#35bf28}+2.32\%$
test_empty[False] 2.7161μs 0.7518μs 1.3301 MOps/s 1.3508 MOps/s $\color{#d91a1a}-1.53\%$
test_to 66.0220μs 26.8030μs 37.3092 KOps/s 36.8120 KOps/s $\color{#35bf28}+1.35\%$
test_to_nonblocking 0.1104ms 26.1276μs 38.2737 KOps/s 39.1655 KOps/s $\color{#d91a1a}-2.28\%$
test_unbind_speed 1.3421ms 0.2823ms 3.5423 KOps/s 3.5741 KOps/s $\color{#d91a1a}-0.89\%$
test_unbind_speed_stack0 0.3523ms 0.2737ms 3.6533 KOps/s 3.6640 KOps/s $\color{#d91a1a}-0.29\%$
test_unbind_speed_stack1 95.6271ms 0.7096ms 1.4092 KOps/s 1.4342 KOps/s $\color{#d91a1a}-1.75\%$
test_split 99.2242ms 2.2240ms 449.6315 Ops/s 462.7561 Ops/s $\color{#d91a1a}-2.84\%$
test_chunk 98.0074ms 2.2467ms 445.0942 Ops/s 459.7034 Ops/s $\color{#d91a1a}-3.18\%$
test_creation[device0] 0.3406ms 0.1296ms 7.7179 KOps/s 7.8323 KOps/s $\color{#d91a1a}-1.46\%$
test_creation_from_tensor 0.3674ms 0.1316ms 7.6000 KOps/s 7.7118 KOps/s $\color{#d91a1a}-1.45\%$
test_add_one[memmap_tensor0] 0.2247ms 9.2188μs 108.4742 KOps/s 113.1946 KOps/s $\color{#d91a1a}-4.17\%$
test_contiguous[memmap_tensor0] 32.1810μs 2.2602μs 442.4367 KOps/s 453.1080 KOps/s $\color{#d91a1a}-2.36\%$
test_stack[memmap_tensor0] 37.2200μs 6.8260μs 146.4987 KOps/s 154.2896 KOps/s $\textbf{\color{#d91a1a}-5.05\%}$
test_memmaptd_index 1.2747ms 0.4299ms 2.3259 KOps/s 2.3424 KOps/s $\color{#d91a1a}-0.70\%$
test_memmaptd_index_astensor 0.7436ms 0.5029ms 1.9884 KOps/s 2.0170 KOps/s $\color{#d91a1a}-1.41\%$
test_memmaptd_index_op 1.4457ms 1.0404ms 961.2068 Ops/s 976.1727 Ops/s $\color{#d91a1a}-1.53\%$
test_serialize_model 0.1318s 0.1308s 7.6429 Ops/s 7.6096 Ops/s $\color{#35bf28}+0.44\%$
test_serialize_model_pickle 1.3804s 1.2178s 0.8211 Ops/s 0.8235 Ops/s $\color{#d91a1a}-0.29\%$
test_serialize_weights 0.1319s 0.1304s 7.6712 Ops/s 7.6385 Ops/s $\color{#35bf28}+0.43\%$
test_serialize_weights_returnearly 0.2212s 56.8085ms 17.6030 Ops/s 17.5145 Ops/s $\color{#35bf28}+0.51\%$
test_serialize_weights_pickle 1.4039s 1.2263s 0.8155 Ops/s 0.8219 Ops/s $\color{#d91a1a}-0.78\%$
test_reshape_pytree 0.1344ms 35.9757μs 27.7966 KOps/s 27.1398 KOps/s $\color{#35bf28}+2.42\%$
test_reshape_td 0.1326ms 42.7532μs 23.3901 KOps/s 24.4726 KOps/s $\color{#d91a1a}-4.42\%$
test_view_pytree 0.1730ms 35.8852μs 27.8667 KOps/s 28.7004 KOps/s $\color{#d91a1a}-2.90\%$
test_view_td 0.1828ms 46.5803μs 21.4683 KOps/s 21.5632 KOps/s $\color{#d91a1a}-0.44\%$
test_unbind_pytree 0.1630ms 34.5515μs 28.9423 KOps/s 29.9645 KOps/s $\color{#d91a1a}-3.41\%$
test_unbind_td 0.5469ms 42.7540μs 23.3896 KOps/s 23.5879 KOps/s $\color{#d91a1a}-0.84\%$
test_split_pytree 0.1449ms 45.4746μs 21.9903 KOps/s 21.4216 KOps/s $\color{#35bf28}+2.65\%$
test_split_td 97.6364ms 65.7410μs 15.2112 KOps/s 17.9208 KOps/s $\textbf{\color{#d91a1a}-15.12\%}$
test_add_pytree 0.2110ms 57.6553μs 17.3445 KOps/s 16.2812 KOps/s $\textbf{\color{#35bf28}+6.53\%}$
test_add_td 0.2387ms 92.2956μs 10.8347 KOps/s 10.7860 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_one_nested[tensordict-compile] 0.3055ms 0.1613ms 6.2003 KOps/s 6.0615 KOps/s $\color{#35bf28}+2.29\%$
test_compile_add_one_nested[tensordict-eager] 0.5730ms 0.1659ms 6.0260 KOps/s 6.0185 KOps/s $\color{#35bf28}+0.13\%$
test_compile_add_one_nested[pytree-compile] 0.2834ms 0.1444ms 6.9269 KOps/s 6.8508 KOps/s $\color{#35bf28}+1.11\%$
test_compile_add_one_nested[pytree-eager] 0.3478ms 0.1879ms 5.3213 KOps/s 5.4806 KOps/s $\color{#d91a1a}-2.91\%$
test_compile_copy_nested[tensordict-compile] 0.4002ms 21.7900μs 45.8927 KOps/s 47.0499 KOps/s $\color{#d91a1a}-2.46\%$
test_compile_copy_nested[tensordict-eager] 0.4330ms 49.0133μs 20.4026 KOps/s 20.6276 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_copy_nested[pytree-compile] 0.4458ms 65.2564μs 15.3242 KOps/s 15.5188 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_copy_nested[pytree-eager] 0.4309ms 49.9903μs 20.0039 KOps/s 20.2026 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_add_one_flat[tensordict-compile] 0.3844ms 0.3204ms 3.1210 KOps/s 3.0895 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_one_flat[tensordict-eager] 0.6253ms 0.2348ms 4.2592 KOps/s 4.1977 KOps/s $\color{#35bf28}+1.47\%$
test_compile_add_one_flat[tensorclass-compile] 0.6323ms 0.1279ms 7.8177 KOps/s 7.7325 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_one_flat[tensorclass-eager] 0.4560ms 66.9081μs 14.9459 KOps/s 15.1501 KOps/s $\color{#d91a1a}-1.35\%$
test_compile_add_one_flat[pytree-compile] 0.5242ms 0.3188ms 3.1371 KOps/s 3.1338 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_one_flat[pytree-eager] 0.8306ms 0.6444ms 1.5517 KOps/s 1.6176 KOps/s $\color{#d91a1a}-4.07\%$
test_compile_add_self_flat[tensordict-eager] 0.4805ms 0.2825ms 3.5394 KOps/s 3.4752 KOps/s $\color{#35bf28}+1.85\%$
test_compile_add_self_flat[tensordict-compile] 0.5163ms 0.3232ms 3.0945 KOps/s 3.0884 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_self_flat[tensorclass-eager] 0.4650ms 76.9646μs 12.9930 KOps/s 12.4066 KOps/s $\color{#35bf28}+4.73\%$
test_compile_add_self_flat[tensorclass-compile] 0.2925ms 0.1296ms 7.7151 KOps/s 7.6595 KOps/s $\color{#35bf28}+0.73\%$
test_compile_add_self_flat[pytree-eager] 0.9205ms 0.5314ms 1.8817 KOps/s 1.8892 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_add_self_flat[pytree-compile] 0.4512ms 0.3175ms 3.1498 KOps/s 3.1419 KOps/s $\color{#35bf28}+0.25\%$
test_compile_copy_flat[tensordict-compile] 0.3971ms 20.3226μs 49.2064 KOps/s 50.0233 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_copy_flat[tensordict-eager] 0.4161ms 38.3576μs 26.0704 KOps/s 26.0238 KOps/s $\color{#35bf28}+0.18\%$
test_compile_copy_flat[pytree-compile] 0.4501ms 69.8978μs 14.3066 KOps/s 14.2611 KOps/s $\color{#35bf28}+0.32\%$
test_compile_copy_flat[pytree-eager] 0.4432ms 51.1713μs 19.5422 KOps/s 19.3952 KOps/s $\color{#35bf28}+0.76\%$
test_compile_assign_and_add[tensordict-compile] 2.3603ms 0.7804ms 1.2814 KOps/s 1.1249 KOps/s $\textbf{\color{#35bf28}+13.91\%}$
test_compile_assign_and_add[tensordict-eager] 3.5713ms 3.3048ms 302.5868 Ops/s 305.3275 Ops/s $\color{#d91a1a}-0.90\%$
test_compile_assign_and_add[pytree-compile] 2.3075ms 0.8178ms 1.2227 KOps/s 1.1284 KOps/s $\textbf{\color{#35bf28}+8.36\%}$
test_compile_assign_and_add[pytree-eager] 3.6486ms 3.2851ms 304.4043 Ops/s 313.8505 Ops/s $\color{#d91a1a}-3.01\%$
test_compile_indexing[tensor-tensordict-compile] 0.2563ms 0.1097ms 9.1173 KOps/s 9.0707 KOps/s $\color{#35bf28}+0.51\%$
test_compile_indexing[tensor-tensordict-eager] 0.2445ms 63.2661μs 15.8063 KOps/s 15.5068 KOps/s $\color{#35bf28}+1.93\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2473ms 0.1028ms 9.7254 KOps/s 9.2071 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_compile_indexing[tensor-tensorclass-eager] 0.4304ms 45.3924μs 22.0301 KOps/s 21.0374 KOps/s $\color{#35bf28}+4.72\%$
test_compile_indexing[tensor-pytree-compile] 0.5020ms 0.1076ms 9.2973 KOps/s 9.1297 KOps/s $\color{#35bf28}+1.84\%$
test_compile_indexing[tensor-pytree-eager] 0.4349ms 44.1613μs 22.6443 KOps/s 20.7719 KOps/s $\textbf{\color{#35bf28}+9.01\%}$
test_compile_indexing[slice-tensordict-compile] 0.3099ms 0.1427ms 7.0065 KOps/s 6.8823 KOps/s $\color{#35bf28}+1.81\%$
test_compile_indexing[slice-tensordict-eager] 0.4328ms 27.1041μs 36.8947 KOps/s 39.2764 KOps/s $\textbf{\color{#d91a1a}-6.06\%}$
test_compile_indexing[slice-tensorclass-compile] 0.5221ms 0.1367ms 7.3129 KOps/s 7.2365 KOps/s $\color{#35bf28}+1.06\%$
test_compile_indexing[slice-tensorclass-eager] 0.1034ms 20.9977μs 47.6243 KOps/s 48.8875 KOps/s $\color{#d91a1a}-2.58\%$
test_compile_indexing[slice-pytree-compile] 0.5280ms 0.1324ms 7.5511 KOps/s 7.2235 KOps/s $\color{#35bf28}+4.54\%$
test_compile_indexing[slice-pytree-eager] 0.4032ms 20.8686μs 47.9189 KOps/s 50.0768 KOps/s $\color{#d91a1a}-4.31\%$
test_compile_indexing[int-tensordict-compile] 0.5440ms 0.1389ms 7.1991 KOps/s 6.7805 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_compile_indexing[int-tensordict-eager] 0.4822ms 25.0629μs 39.8996 KOps/s 38.8493 KOps/s $\color{#35bf28}+2.70\%$
test_compile_indexing[int-tensorclass-compile] 0.5376ms 0.1325ms 7.5455 KOps/s 7.4145 KOps/s $\color{#35bf28}+1.77\%$
test_compile_indexing[int-tensorclass-eager] 0.4043ms 20.6796μs 48.3568 KOps/s 49.3437 KOps/s $\color{#d91a1a}-2.00\%$
test_compile_indexing[int-pytree-compile] 0.5179ms 0.1330ms 7.5171 KOps/s 7.4378 KOps/s $\color{#35bf28}+1.07\%$
test_compile_indexing[int-pytree-eager] 0.2176ms 20.9901μs 47.6416 KOps/s 49.3632 KOps/s $\color{#d91a1a}-3.49\%$
test_mod_add[eager] 0.4388ms 35.0630μs 28.5201 KOps/s 31.4470 KOps/s $\textbf{\color{#d91a1a}-9.31\%}$
test_mod_add[compile] 0.2466ms 71.4545μs 13.9949 KOps/s 14.1342 KOps/s $\color{#d91a1a}-0.99\%$
test_mod_add[compile-overhead] 0.2521ms 0.1313ms 7.6137 KOps/s 6.3716 KOps/s $\textbf{\color{#35bf28}+19.49\%}$
test_mod_wrap[eager] 1.1751ms 0.7920ms 1.2627 KOps/s 1.2569 KOps/s $\color{#35bf28}+0.46\%$
test_mod_wrap[compile] 1.9849ms 0.8444ms 1.1843 KOps/s 1.1696 KOps/s $\color{#35bf28}+1.26\%$
test_mod_wrap[compile-overhead] 4.8244ms 3.0255ms 330.5258 Ops/s 328.2096 Ops/s $\color{#35bf28}+0.71\%$
test_mod_wrap_and_backward[eager] 4.2838ms 4.0949ms 244.2084 Ops/s 238.8469 Ops/s $\color{#35bf28}+2.24\%$
test_mod_wrap_and_backward[compile] 4.4697ms 4.0065ms 249.5938 Ops/s 244.8192 Ops/s $\color{#35bf28}+1.95\%$
test_mod_wrap_and_backward[compile-overhead] 1.3804ms 0.9201ms 1.0868 KOps/s 992.7535 Ops/s $\textbf{\color{#35bf28}+9.47\%}$
test_seq_add[eager] 0.3123ms 98.9533μs 10.1058 KOps/s 9.8600 KOps/s $\color{#35bf28}+2.49\%$
test_seq_add[compile] 0.2798ms 86.1218μs 11.6115 KOps/s 12.0770 KOps/s $\color{#d91a1a}-3.85\%$
test_seq_add[compile-overhead] 0.2640ms 0.1145ms 8.7308 KOps/s 8.3125 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_seq_wrap[eager] 1.0908ms 0.9450ms 1.0582 KOps/s 1.0625 KOps/s $\color{#d91a1a}-0.41\%$
test_seq_wrap[compile] 1.0330ms 0.8573ms 1.1664 KOps/s 1.1520 KOps/s $\color{#35bf28}+1.25\%$
test_seq_wrap[compile-overhead] 0.3467ms 0.2190ms 4.5656 KOps/s 4.3024 KOps/s $\textbf{\color{#35bf28}+6.12\%}$
test_func_call_runtime[False-eager] 2.5971ms 2.4209ms 413.0613 Ops/s 410.4836 Ops/s $\color{#35bf28}+0.63\%$
test_func_call_runtime[False-compile] 2.5734ms 2.4190ms 413.4008 Ops/s 409.7735 Ops/s $\color{#35bf28}+0.89\%$
test_func_call_runtime[False-compile-overhead] 0.5031ms 0.3608ms 2.7720 KOps/s 2.7222 KOps/s $\color{#35bf28}+1.83\%$
test_func_call_runtime[True-eager] 2.7663ms 2.5753ms 388.3069 Ops/s 386.0810 Ops/s $\color{#35bf28}+0.58\%$
test_func_call_runtime[True-compile] 2.6160ms 2.4576ms 406.9045 Ops/s 410.0992 Ops/s $\color{#d91a1a}-0.78\%$
test_func_call_runtime[True-compile-overhead] 0.5034ms 0.3812ms 2.6234 KOps/s 2.5886 KOps/s $\color{#35bf28}+1.34\%$
test_func_call_cm_runtime[False-eager] 2.6604ms 2.3979ms 417.0361 Ops/s 414.7182 Ops/s $\color{#35bf28}+0.56\%$
test_func_call_cm_runtime[False-compile] 2.6029ms 2.4275ms 411.9524 Ops/s 411.5932 Ops/s $\color{#35bf28}+0.09\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5102ms 0.3624ms 2.7595 KOps/s 2.7166 KOps/s $\color{#35bf28}+1.58\%$
test_func_call_cm_runtime[True-eager] 2.8528ms 2.6786ms 373.3292 Ops/s 371.7620 Ops/s $\color{#35bf28}+0.42\%$
test_func_call_cm_runtime[True-compile] 2.6980ms 2.4916ms 401.3526 Ops/s 403.7815 Ops/s $\color{#d91a1a}-0.60\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5615ms 0.4082ms 2.4496 KOps/s 2.4326 KOps/s $\color{#35bf28}+0.70\%$
test_vmap_func_call_cm_runtime[eager] 4.2890ms 3.8410ms 260.3511 Ops/s 262.0096 Ops/s $\color{#d91a1a}-0.63\%$
test_vmap_func_call_cm_runtime[compile] 2.6587ms 2.5122ms 398.0599 Ops/s 404.5932 Ops/s $\color{#d91a1a}-1.61\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5754ms 0.4074ms 2.4544 KOps/s 2.4245 KOps/s $\color{#35bf28}+1.23\%$
test_distributed 2.2372ms 0.1720ms 5.8146 KOps/s 8.6710 KOps/s $\textbf{\color{#d91a1a}-32.94\%}$
test_tdmodule 33.1700μs 14.5177μs 68.8815 KOps/s 71.0861 KOps/s $\color{#d91a1a}-3.10\%$
test_tdmodule_dispatch 49.2410μs 28.7060μs 34.8359 KOps/s 37.0522 KOps/s $\textbf{\color{#d91a1a}-5.98\%}$
test_tdseq 35.7710μs 15.6466μs 63.9117 KOps/s 67.5509 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_tdseq_dispatch 56.0510μs 32.4658μs 30.8016 KOps/s 33.9684 KOps/s $\textbf{\color{#d91a1a}-9.32\%}$
test_instantiation_functorch 2.1633ms 1.8707ms 534.5609 Ops/s 518.8702 Ops/s $\color{#35bf28}+3.02\%$
test_instantiation_td 1.8126ms 1.1979ms 834.8202 Ops/s 818.0575 Ops/s $\color{#35bf28}+2.05\%$
test_exec_functorch 1.1552ms 1.0178ms 982.5247 Ops/s 992.9311 Ops/s $\color{#d91a1a}-1.05\%$
test_exec_functional_call 1.3689ms 1.0140ms 986.1937 Ops/s 982.1234 Ops/s $\color{#35bf28}+0.41\%$
test_exec_td 1.2046ms 1.0378ms 963.5682 Ops/s 959.5270 Ops/s $\color{#35bf28}+0.42\%$
test_exec_td_decorator 1.1993ms 1.0766ms 928.8509 Ops/s 932.2729 Ops/s $\color{#d91a1a}-0.37\%$
test_vmap_mlp_speed[True-True] 1.4707ms 1.2803ms 781.0518 Ops/s 785.7149 Ops/s $\color{#d91a1a}-0.59\%$
test_vmap_mlp_speed[True-False] 1.4371ms 1.2755ms 783.9837 Ops/s 789.2893 Ops/s $\color{#d91a1a}-0.67\%$
test_vmap_mlp_speed[False-True] 1.3767ms 1.1721ms 853.1698 Ops/s 856.1587 Ops/s $\color{#d91a1a}-0.35\%$
test_vmap_mlp_speed[False-False] 1.3732ms 1.1722ms 853.0777 Ops/s 855.9759 Ops/s $\color{#d91a1a}-0.34\%$
test_vmap_mlp_speed_decorator[True-True] 1.8067ms 1.2564ms 795.9208 Ops/s 802.9177 Ops/s $\color{#d91a1a}-0.87\%$
test_vmap_mlp_speed_decorator[True-False] 1.4642ms 1.2555ms 796.4653 Ops/s 802.1324 Ops/s $\color{#d91a1a}-0.71\%$
test_vmap_mlp_speed_decorator[False-True] 1.3618ms 1.1710ms 853.9454 Ops/s 853.5334 Ops/s $\color{#35bf28}+0.05\%$
test_vmap_mlp_speed_decorator[False-False] 1.3189ms 1.1704ms 854.4341 Ops/s 856.9635 Ops/s $\color{#d91a1a}-0.30\%$
test_vmap_transformer_speed[True-True] 13.4793ms 13.1739ms 75.9075 Ops/s 75.3659 Ops/s $\color{#35bf28}+0.72\%$
test_vmap_transformer_speed[True-False] 13.3967ms 13.1490ms 76.0513 Ops/s 75.6843 Ops/s $\color{#35bf28}+0.48\%$
test_vmap_transformer_speed[False-True] 13.2303ms 13.0249ms 76.7759 Ops/s 76.4357 Ops/s $\color{#35bf28}+0.45\%$
test_vmap_transformer_speed[False-False] 13.2962ms 13.0044ms 76.8972 Ops/s 76.4627 Ops/s $\color{#35bf28}+0.57\%$
test_vmap_transformer_speed_decorator[True-True] 34.4341ms 34.1654ms 29.2694 Ops/s 29.3502 Ops/s $\color{#d91a1a}-0.28\%$
test_vmap_transformer_speed_decorator[True-False] 34.4239ms 34.0867ms 29.3369 Ops/s 29.2304 Ops/s $\color{#35bf28}+0.36\%$
test_vmap_transformer_speed_decorator[False-True] 34.7792ms 33.9491ms 29.4558 Ops/s 29.5210 Ops/s $\color{#d91a1a}-0.22\%$
test_vmap_transformer_speed_decorator[False-False] 34.3171ms 34.0173ms 29.3968 Ops/s 29.4775 Ops/s $\color{#d91a1a}-0.27\%$
test_to_module_speed[True] 2.0825ms 0.9988ms 1.0012 KOps/s 999.6883 Ops/s $\color{#35bf28}+0.15\%$
test_to_module_speed[False] 1.3562ms 0.9754ms 1.0253 KOps/s 1.0212 KOps/s $\color{#35bf28}+0.40\%$
test_tc_init 0.1492ms 35.8635μs 27.8835 KOps/s 32.2982 KOps/s $\textbf{\color{#d91a1a}-13.67\%}$
test_tc_init_nested 0.1078ms 72.8598μs 13.7250 KOps/s 15.7862 KOps/s $\textbf{\color{#d91a1a}-13.06\%}$
test_tc_first_layer_tensor 3.5643μs 0.6787μs 1.4733 MOps/s 1.4638 MOps/s $\color{#35bf28}+0.65\%$
test_tc_first_layer_nontensor 31.6410μs 2.2367μs 447.0913 KOps/s 452.4715 KOps/s $\color{#d91a1a}-1.19\%$
test_tc_second_layer_tensor 14.6080μs 1.3772μs 726.1369 KOps/s 737.4186 KOps/s $\color{#d91a1a}-1.53\%$
test_tc_second_layer_nontensor 24.5110μs 2.9704μs 336.6556 KOps/s 340.9172 KOps/s $\color{#d91a1a}-1.25\%$
test_unbind 0.1972s 12.2284ms 81.7770 Ops/s 92.6614 Ops/s $\textbf{\color{#d91a1a}-11.75\%}$
test_full_like 0.8031ms 0.5756ms 1.7373 KOps/s 1.7374 KOps/s $-0.01\%$
test_zeros_like 0.3553ms 0.1983ms 5.0431 KOps/s 5.0400 KOps/s $\color{#35bf28}+0.06\%$
test_ones_like 0.3724ms 0.1984ms 5.0401 KOps/s 5.0464 KOps/s $\color{#d91a1a}-0.13\%$
test_clone 0.5932ms 0.4146ms 2.4122 KOps/s 2.4131 KOps/s $\color{#d91a1a}-0.04\%$
test_squeeze 0.1276ms 9.8779μs 101.2359 KOps/s 101.5380 KOps/s $\color{#d91a1a}-0.30\%$
test_unsqueeze 0.2192ms 73.1197μs 13.6762 KOps/s 13.1540 KOps/s $\color{#35bf28}+3.97\%$
test_split 0.4098ms 0.1579ms 6.3321 KOps/s 6.4060 KOps/s $\color{#d91a1a}-1.15\%$
test_permute 0.3163ms 0.1799ms 5.5582 KOps/s 5.4742 KOps/s $\color{#35bf28}+1.53\%$
test_stack 1.3172ms 0.8188ms 1.2213 KOps/s 1.1593 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_cat 1.3901ms 1.2322ms 811.5433 Ops/s 811.5973 Ops/s $-0.01\%$

@vmoens vmoens merged commit b201188 into gh/vmoens/27/base Oct 8, 2024
36 of 55 checks passed
vmoens added a commit that referenced this pull request Oct 8, 2024
ghstack-source-id: dc32497757d1fb19dc61dea9115810890d5a4acb
Pull Request resolved: #1032
@vmoens vmoens deleted the gh/vmoens/27/head branch October 8, 2024 09:41
@vmoens vmoens restored the gh/vmoens/27/head branch October 8, 2024 09:59
2 participants