[Feature] Densify lazy tensordicts #955

Merged: 4 commits, Aug 9, 2024

@vmoens vmoens commented Aug 9, 2024

[ghstack-poisoned]
@facebook-github-bot added the CLA Signed label Aug 9, 2024
@vmoens added the enhancement label Aug 9, 2024

github-actions bot commented Aug 9, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 219. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}49$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 49.5830μs 22.1152μs 45.2177 KOps/s 46.3063 KOps/s $\color{#d91a1a}-2.35\%$
test_plain_set_stack_nested 57.2970μs 22.3371μs 44.7685 KOps/s 45.9930 KOps/s $\color{#d91a1a}-2.66\%$
test_plain_set_nested_inplace 57.6380μs 25.0612μs 39.9023 KOps/s 42.0509 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_plain_set_stack_nested_inplace 61.6250μs 24.3103μs 41.1348 KOps/s 42.4230 KOps/s $\color{#d91a1a}-3.04\%$
test_items 23.0740μs 2.7052μs 369.6620 KOps/s 352.0356 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_items_nested 1.4549ms 0.3704ms 2.7001 KOps/s 2.9797 KOps/s $\textbf{\color{#d91a1a}-9.38\%}$
test_items_nested_locked 0.5983ms 0.3380ms 2.9586 KOps/s 2.9583 KOps/s $+0.01\%$
test_items_nested_leaf 0.1627ms 87.5593μs 11.4208 KOps/s 11.9237 KOps/s $\color{#d91a1a}-4.22\%$
test_items_stack_nested 0.6323ms 0.3403ms 2.9387 KOps/s 2.9457 KOps/s $\color{#d91a1a}-0.24\%$
test_items_stack_nested_leaf 0.1760ms 84.3198μs 11.8596 KOps/s 12.0171 KOps/s $\color{#d91a1a}-1.31\%$
test_items_stack_nested_locked 0.6228ms 0.3391ms 2.9487 KOps/s 2.9423 KOps/s $\color{#35bf28}+0.22\%$
test_keys 31.7600μs 3.9907μs 250.5809 KOps/s 257.4958 KOps/s $\color{#d91a1a}-2.69\%$
test_keys_nested 0.2518ms 0.1454ms 6.8767 KOps/s 6.8638 KOps/s $\color{#35bf28}+0.19\%$
test_keys_nested_locked 0.6694ms 0.1508ms 6.6331 KOps/s 6.6195 KOps/s $\color{#35bf28}+0.21\%$
test_keys_nested_leaf 0.2136ms 0.1247ms 8.0210 KOps/s 8.1322 KOps/s $\color{#d91a1a}-1.37\%$
test_keys_stack_nested 0.2790ms 0.1463ms 6.8330 KOps/s 6.9731 KOps/s $\color{#d91a1a}-2.01\%$
test_keys_stack_nested_leaf 0.2383ms 0.1249ms 8.0033 KOps/s 8.1730 KOps/s $\color{#d91a1a}-2.08\%$
test_keys_stack_nested_locked 0.2486ms 0.1502ms 6.6572 KOps/s 6.7425 KOps/s $\color{#d91a1a}-1.27\%$
test_values 9.4803μs 1.4443μs 692.3908 KOps/s 856.5606 KOps/s $\textbf{\color{#d91a1a}-19.17\%}$
test_values_nested 0.1217ms 51.3316μs 19.4812 KOps/s 19.3442 KOps/s $\color{#35bf28}+0.71\%$
test_values_nested_locked 97.8930μs 50.5112μs 19.7976 KOps/s 19.9921 KOps/s $\color{#d91a1a}-0.97\%$
test_values_nested_leaf 82.9350μs 46.0704μs 21.7059 KOps/s 22.4109 KOps/s $\color{#d91a1a}-3.15\%$
test_values_stack_nested 96.8610μs 52.2080μs 19.1542 KOps/s 19.6169 KOps/s $\color{#d91a1a}-2.36\%$
test_values_stack_nested_leaf 0.1051ms 45.5290μs 21.9640 KOps/s 22.5449 KOps/s $\color{#d91a1a}-2.58\%$
test_values_stack_nested_locked 0.1010ms 52.0264μs 19.2210 KOps/s 19.7326 KOps/s $\color{#d91a1a}-2.59\%$
test_membership 2.8734μs 0.7495μs 1.3342 MOps/s 1.0913 MOps/s $\textbf{\color{#35bf28}+22.26\%}$
test_membership_nested 30.8980μs 2.6875μs 372.0961 KOps/s 367.0342 KOps/s $\color{#35bf28}+1.38\%$
test_membership_nested_leaf 30.8280μs 2.6859μs 372.3194 KOps/s 383.6181 KOps/s $\color{#d91a1a}-2.95\%$
test_membership_stacked_nested 38.5020μs 2.6692μs 374.6372 KOps/s 382.2072 KOps/s $\color{#d91a1a}-1.98\%$
test_membership_stacked_nested_leaf 38.3710μs 2.6542μs 376.7670 KOps/s 382.0374 KOps/s $\color{#d91a1a}-1.38\%$
test_membership_nested_last 31.2180μs 3.9769μs 251.4516 KOps/s 257.3222 KOps/s $\color{#d91a1a}-2.28\%$
test_membership_nested_leaf_last 29.4050μs 3.9623μs 252.3776 KOps/s 258.6231 KOps/s $\color{#d91a1a}-2.41\%$
test_membership_stacked_nested_last 57.0670μs 13.0736μs 76.4898 KOps/s 202.0362 KOps/s $\textbf{\color{#d91a1a}-62.14\%}$
test_membership_stacked_nested_leaf_last 61.4650μs 12.5924μs 79.4128 KOps/s 203.7740 KOps/s $\textbf{\color{#d91a1a}-61.03\%}$
test_nested_getleaf 40.3560μs 10.4307μs 95.8712 KOps/s 95.3393 KOps/s $\color{#35bf28}+0.56\%$
test_nested_get 43.2010μs 9.9345μs 100.6596 KOps/s 101.8667 KOps/s $\color{#d91a1a}-1.18\%$
test_stacked_getleaf 38.4720μs 10.4234μs 95.9377 KOps/s 95.8762 KOps/s $\color{#35bf28}+0.06\%$
test_stacked_get 48.5820μs 9.7560μs 102.5009 KOps/s 101.7012 KOps/s $\color{#35bf28}+0.79\%$
test_nested_getitemleaf 41.1770μs 10.8906μs 91.8224 KOps/s 91.8996 KOps/s $\color{#d91a1a}-0.08\%$
test_nested_getitem 42.8400μs 10.0550μs 99.4530 KOps/s 99.7175 KOps/s $\color{#d91a1a}-0.27\%$
test_stacked_getitemleaf 0.1888ms 10.9600μs 91.2406 KOps/s 93.0340 KOps/s $\color{#d91a1a}-1.93\%$
test_stacked_getitem 42.4100μs 10.0555μs 99.4480 KOps/s 102.8447 KOps/s $\color{#d91a1a}-3.30\%$
test_lock_nested 80.6786ms 0.5780ms 1.7301 KOps/s 1.9760 KOps/s $\textbf{\color{#d91a1a}-12.45\%}$
test_lock_stack_nested 0.7115ms 0.4467ms 2.2384 KOps/s 2.1635 KOps/s $\color{#35bf28}+3.46\%$
test_unlock_nested 88.5185ms 0.5178ms 1.9314 KOps/s 2.3908 KOps/s $\textbf{\color{#d91a1a}-19.21\%}$
test_unlock_stack_nested 0.6307ms 0.3634ms 2.7521 KOps/s 2.6603 KOps/s $\color{#35bf28}+3.45\%$
test_flatten_speed 0.5522ms 0.1073ms 9.3154 KOps/s 9.7910 KOps/s $\color{#d91a1a}-4.86\%$
test_unflatten_speed 0.5984ms 0.4648ms 2.1513 KOps/s 2.1953 KOps/s $\color{#d91a1a}-2.01\%$
test_common_ops 5.4312ms 1.1783ms 848.6853 Ops/s 916.8127 Ops/s $\textbf{\color{#d91a1a}-7.43\%}$
test_creation 0.1375ms 2.2168μs 451.1008 KOps/s 495.2284 KOps/s $\textbf{\color{#d91a1a}-8.91\%}$
test_creation_empty 53.5910μs 19.3333μs 51.7243 KOps/s 56.3804 KOps/s $\textbf{\color{#d91a1a}-8.26\%}$
test_creation_nested_1 60.3430μs 22.5377μs 44.3702 KOps/s 47.0005 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_creation_nested_2 1.3476ms 27.4419μs 36.4406 KOps/s 39.4596 KOps/s $\textbf{\color{#d91a1a}-7.65\%}$
test_clone 0.2651ms 17.9813μs 55.6132 KOps/s 62.0612 KOps/s $\textbf{\color{#d91a1a}-10.39\%}$
test_getitem[int] 0.8255ms 17.1901μs 58.1730 KOps/s 61.0094 KOps/s $\color{#d91a1a}-4.65\%$
test_getitem[slice_int] 0.1356ms 33.6375μs 29.7287 KOps/s 32.4183 KOps/s $\textbf{\color{#d91a1a}-8.30\%}$
test_getitem[range] 0.1623ms 58.1922μs 17.1844 KOps/s 17.9576 KOps/s $\color{#d91a1a}-4.31\%$
test_getitem[tuple] 0.1388ms 26.1217μs 38.2824 KOps/s 40.2078 KOps/s $\color{#d91a1a}-4.79\%$
test_getitem[list] 0.2893ms 52.9444μs 18.8877 KOps/s 19.5446 KOps/s $\color{#d91a1a}-3.36\%$
test_setitem_dim[int] 88.1450μs 43.3310μs 23.0782 KOps/s 24.0836 KOps/s $\color{#d91a1a}-4.18\%$
test_setitem_dim[slice_int] 0.1258ms 74.4821μs 13.4260 KOps/s 13.5735 KOps/s $\color{#d91a1a}-1.09\%$
test_setitem_dim[range] 0.2011ms 96.7834μs 10.3324 KOps/s 10.7608 KOps/s $\color{#d91a1a}-3.98\%$
test_setitem_dim[tuple] 0.1097ms 61.0217μs 16.3876 KOps/s 16.9919 KOps/s $\color{#d91a1a}-3.56\%$
test_setitem 0.1675ms 31.8820μs 31.3657 KOps/s 35.4565 KOps/s $\textbf{\color{#d91a1a}-11.54\%}$
test_set 0.3143ms 31.4596μs 31.7868 KOps/s 36.0598 KOps/s $\textbf{\color{#d91a1a}-11.85\%}$
test_set_shared 1.2976ms 0.2204ms 4.5382 KOps/s 4.6820 KOps/s $\color{#d91a1a}-3.07\%$
test_update 0.3441ms 38.8928μs 25.7117 KOps/s 28.5334 KOps/s $\textbf{\color{#d91a1a}-9.89\%}$
test_update_nested 0.2483ms 49.6538μs 20.1394 KOps/s 22.3366 KOps/s $\textbf{\color{#d91a1a}-9.84\%}$
test_update__nested 0.1282ms 35.3600μs 28.2806 KOps/s 30.4925 KOps/s $\textbf{\color{#d91a1a}-7.25\%}$
test_set_nested 0.1469ms 34.3228μs 29.1351 KOps/s 33.2697 KOps/s $\textbf{\color{#d91a1a}-12.43\%}$
test_set_nested_new 0.1339ms 38.5237μs 25.9580 KOps/s 28.8362 KOps/s $\textbf{\color{#d91a1a}-9.98\%}$
test_select 0.1679ms 56.2533μs 17.7767 KOps/s 19.0070 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_select_nested 0.1172ms 59.5073μs 16.8047 KOps/s 16.9734 KOps/s $\color{#d91a1a}-0.99\%$
test_exclude_nested 0.1627ms 78.0333μs 12.8150 KOps/s 13.1391 KOps/s $\color{#d91a1a}-2.47\%$
test_empty[True] 0.7217ms 0.3252ms 3.0748 KOps/s 3.0950 KOps/s $\color{#d91a1a}-0.65\%$
test_empty[False] 11.8497μs 1.1605μs 861.6754 KOps/s 862.6629 KOps/s $\color{#d91a1a}-0.11\%$
test_unbind_speed 0.5256ms 0.3110ms 3.2155 KOps/s 3.2230 KOps/s $\color{#d91a1a}-0.23\%$
test_unbind_speed_stack0 0.5736ms 0.2953ms 3.3865 KOps/s 3.3675 KOps/s $\color{#35bf28}+0.56\%$
test_unbind_speed_stack1 81.9724ms 0.7647ms 1.3077 KOps/s 1.3964 KOps/s $\textbf{\color{#d91a1a}-6.36\%}$
test_split 89.4074ms 2.1988ms 454.7959 Ops/s 469.0029 Ops/s $\color{#d91a1a}-3.03\%$
test_chunk 83.8390ms 2.1872ms 457.2155 Ops/s 467.7117 Ops/s $\color{#d91a1a}-2.24\%$
test_creation[device0] 0.2177ms 0.1216ms 8.2223 KOps/s 8.2205 KOps/s $\color{#35bf28}+0.02\%$
test_creation_from_tensor 4.1673ms 0.1232ms 8.1155 KOps/s 8.3875 KOps/s $\color{#d91a1a}-3.24\%$
test_add_one[memmap_tensor0] 0.2251ms 8.0027μs 124.9575 KOps/s 133.2127 KOps/s $\textbf{\color{#d91a1a}-6.20\%}$
test_contiguous[memmap_tensor0] 18.3140μs 2.0929μs 477.7989 KOps/s 500.9327 KOps/s $\color{#d91a1a}-4.62\%$
test_stack[memmap_tensor0] 50.8450μs 5.9258μs 168.7541 KOps/s 179.8554 KOps/s $\textbf{\color{#d91a1a}-6.17\%}$
test_memmaptd_index 1.2011ms 0.4244ms 2.3564 KOps/s 2.4935 KOps/s $\textbf{\color{#d91a1a}-5.50\%}$
test_memmaptd_index_astensor 1.0639ms 0.5008ms 1.9967 KOps/s 2.0933 KOps/s $\color{#d91a1a}-4.61\%$
test_memmaptd_index_op 1.8573ms 1.0897ms 917.7253 Ops/s 984.1313 Ops/s $\textbf{\color{#d91a1a}-6.75\%}$
test_serialize_model 0.1281s 0.1211s 8.2589 Ops/s 7.4242 Ops/s $\textbf{\color{#35bf28}+11.24\%}$
test_serialize_model_pickle 0.4711s 0.3881s 2.5769 Ops/s 2.5405 Ops/s $\color{#35bf28}+1.44\%$
test_serialize_weights 0.1313s 0.1213s 8.2471 Ops/s 8.7053 Ops/s $\textbf{\color{#d91a1a}-5.26\%}$
test_serialize_weights_returnearly 0.1774s 0.1607s 6.2244 Ops/s 6.2154 Ops/s $\color{#35bf28}+0.15\%$
test_serialize_weights_pickle 0.4415s 0.4019s 2.4884 Ops/s 2.4186 Ops/s $\color{#35bf28}+2.88\%$
test_serialize_weights_filesystem 0.1465s 0.1416s 7.0640 Ops/s 6.5801 Ops/s $\textbf{\color{#35bf28}+7.35\%}$
test_serialize_model_filesystem 0.2288s 0.1603s 6.2378 Ops/s 6.6309 Ops/s $\textbf{\color{#d91a1a}-5.93\%}$
test_reshape_pytree 0.1031ms 41.0704μs 24.3484 KOps/s 25.2112 KOps/s $\color{#d91a1a}-3.42\%$
test_reshape_td 96.0500μs 47.7815μs 20.9286 KOps/s 20.6710 KOps/s $\color{#35bf28}+1.25\%$
test_view_pytree 94.9480μs 39.7903μs 25.1317 KOps/s 25.0476 KOps/s $\color{#35bf28}+0.34\%$
test_view_td 0.1070ms 54.1267μs 18.4752 KOps/s 18.9574 KOps/s $\color{#d91a1a}-2.54\%$
test_unbind_pytree 78.5570μs 37.9453μs 26.3538 KOps/s 27.2752 KOps/s $\color{#d91a1a}-3.38\%$
test_unbind_td 0.3646ms 47.0479μs 21.2550 KOps/s 21.8943 KOps/s $\color{#d91a1a}-2.92\%$
test_split_pytree 80.5700μs 41.8738μs 23.8813 KOps/s 25.1560 KOps/s $\textbf{\color{#d91a1a}-5.07\%}$
test_split_td 0.5653ms 60.8105μs 16.4445 KOps/s 17.2782 KOps/s $\color{#d91a1a}-4.83\%$
test_add_pytree 0.1250ms 50.0850μs 19.9661 KOps/s 21.8675 KOps/s $\textbf{\color{#d91a1a}-8.70\%}$
test_add_td 0.1890ms 89.8703μs 11.1272 KOps/s 11.9204 KOps/s $\textbf{\color{#d91a1a}-6.65\%}$
test_compile_add_one_nested[tensordict-compile] 0.1217ms 55.3685μs 18.0608 KOps/s 18.4384 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_one_nested[tensordict-eager] 0.4200ms 0.1934ms 5.1713 KOps/s 5.2596 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_add_one_nested[pytree-compile] 0.2098ms 55.9354μs 17.8778 KOps/s 18.8698 KOps/s $\textbf{\color{#d91a1a}-5.26\%}$
test_compile_add_one_nested[pytree-eager] 0.2963ms 0.1487ms 6.7263 KOps/s 7.0572 KOps/s $\color{#d91a1a}-4.69\%$
test_compile_copy_nested[tensordict-compile] 66.2840μs 20.9232μs 47.7938 KOps/s 48.1187 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_copy_nested[tensordict-eager] 0.1893ms 65.2790μs 15.3189 KOps/s 15.7408 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_copy_nested[pytree-compile] 0.1803ms 79.7734μs 12.5355 KOps/s 12.7461 KOps/s $\color{#d91a1a}-1.65\%$
test_compile_copy_nested[pytree-eager] 0.1349ms 70.0495μs 14.2756 KOps/s 14.2735 KOps/s $\color{#35bf28}+0.01\%$
test_compile_add_one_flat[tensordict-compile] 0.2966ms 0.1800ms 5.5562 KOps/s 5.7898 KOps/s $\color{#d91a1a}-4.03\%$
test_compile_add_one_flat[tensordict-eager] 0.2966ms 0.1977ms 5.0582 KOps/s 5.1643 KOps/s $\color{#d91a1a}-2.06\%$
test_compile_add_one_flat[tensorclass-compile] 0.1101ms 40.2565μs 24.8407 KOps/s 26.2943 KOps/s $\textbf{\color{#d91a1a}-5.53\%}$
test_compile_add_one_flat[tensorclass-eager] 1.1929ms 72.3399μs 13.8236 KOps/s 14.2006 KOps/s $\color{#d91a1a}-2.65\%$
test_compile_add_one_flat[pytree-compile] 0.3034ms 0.1773ms 5.6393 KOps/s 5.8604 KOps/s $\color{#d91a1a}-3.77\%$
test_compile_add_one_flat[pytree-eager] 0.3842ms 0.3038ms 3.2912 KOps/s 3.4493 KOps/s $\color{#d91a1a}-4.58\%$
test_compile_add_self_flat[tensordict-eager] 0.2989ms 0.2111ms 4.7371 KOps/s 4.8729 KOps/s $\color{#d91a1a}-2.79\%$
test_compile_add_self_flat[tensordict-compile] 0.2833ms 0.1803ms 5.5471 KOps/s 5.7067 KOps/s $\color{#d91a1a}-2.80\%$
test_compile_add_self_flat[tensorclass-eager] 0.1932ms 63.7792μs 15.6791 KOps/s 15.8233 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_add_self_flat[tensorclass-compile] 79.1590μs 41.0246μs 24.3756 KOps/s 25.0798 KOps/s $\color{#d91a1a}-2.81\%$
test_compile_add_self_flat[pytree-eager] 0.3642ms 0.2513ms 3.9795 KOps/s 4.2061 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_compile_add_self_flat[pytree-compile] 0.3010ms 0.1780ms 5.6187 KOps/s 5.8169 KOps/s $\color{#d91a1a}-3.41\%$
test_compile_copy_flat[tensordict-compile] 0.2701ms 0.1101ms 9.0828 KOps/s 9.3687 KOps/s $\color{#d91a1a}-3.05\%$
test_compile_copy_flat[tensordict-eager] 0.1192ms 56.5721μs 17.6766 KOps/s 18.1291 KOps/s $\color{#d91a1a}-2.50\%$
test_compile_copy_flat[pytree-compile] 0.1531ms 79.2012μs 12.6261 KOps/s 11.9595 KOps/s $\textbf{\color{#35bf28}+5.57\%}$
test_compile_copy_flat[pytree-eager] 0.1380ms 70.2825μs 14.2283 KOps/s 14.0584 KOps/s $\color{#35bf28}+1.21\%$
test_compile_assign_and_add[tensordict-compile] 0.2950ms 0.1918ms 5.2150 KOps/s 5.2640 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_assign_and_add[tensordict-eager] 3.8921ms 1.6936ms 590.4551 Ops/s 613.2494 Ops/s $\color{#d91a1a}-3.72\%$
test_compile_assign_and_add[pytree-compile] 0.3763ms 0.1881ms 5.3155 KOps/s 5.3266 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_assign_and_add[pytree-eager] 1.4652ms 1.1297ms 885.1958 Ops/s 935.5714 Ops/s $\textbf{\color{#d91a1a}-5.38\%}$
test_compile_assign_and_add_stack[compile] 0.6248ms 0.4055ms 2.4662 KOps/s 2.4109 KOps/s $\color{#35bf28}+2.29\%$
test_compile_assign_and_add_stack[eager] 4.3552ms 4.0027ms 249.8329 Ops/s 260.9257 Ops/s $\color{#d91a1a}-4.25\%$
test_compile_indexing[tensor-tensordict-compile] 91.5020μs 33.2263μs 30.0966 KOps/s 30.6204 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_indexing[tensor-tensordict-eager] 1.4968ms 50.0832μs 19.9668 KOps/s 21.0963 KOps/s $\textbf{\color{#d91a1a}-5.35\%}$
test_compile_indexing[tensor-tensorclass-compile] 86.2520μs 29.6650μs 33.7098 KOps/s 36.9389 KOps/s $\textbf{\color{#d91a1a}-8.74\%}$
test_compile_indexing[tensor-tensorclass-eager] 92.1730μs 30.7161μs 32.5562 KOps/s 32.7913 KOps/s $\color{#d91a1a}-0.72\%$
test_compile_indexing[tensor-pytree-compile] 64.4310μs 29.8210μs 33.5334 KOps/s 36.7996 KOps/s $\textbf{\color{#d91a1a}-8.88\%}$
test_compile_indexing[tensor-pytree-eager] 83.2860μs 31.0768μs 32.1783 KOps/s 32.7270 KOps/s $\color{#d91a1a}-1.68\%$
test_compile_indexing[slice-tensordict-compile] 0.1963ms 73.2297μs 13.6557 KOps/s 13.7970 KOps/s $\color{#d91a1a}-1.02\%$
test_compile_indexing[slice-tensordict-eager] 0.7271ms 28.9713μs 34.5169 KOps/s 36.0616 KOps/s $\color{#d91a1a}-4.28\%$
test_compile_indexing[slice-tensorclass-compile] 0.1366ms 69.3495μs 14.4197 KOps/s 14.9320 KOps/s $\color{#d91a1a}-3.43\%$
test_compile_indexing[slice-tensorclass-eager] 67.2360μs 25.3465μs 39.4532 KOps/s 40.8123 KOps/s $\color{#d91a1a}-3.33\%$
test_compile_indexing[slice-pytree-compile] 0.1310ms 68.7588μs 14.5436 KOps/s 15.1598 KOps/s $\color{#d91a1a}-4.06\%$
test_compile_indexing[slice-pytree-eager] 72.6960μs 25.0241μs 39.9615 KOps/s 39.7813 KOps/s $\color{#35bf28}+0.45\%$
test_compile_indexing[int-tensordict-compile] 0.1385ms 73.6514μs 13.5775 KOps/s 13.9787 KOps/s $\color{#d91a1a}-2.87\%$
test_compile_indexing[int-tensordict-eager] 1.0381ms 29.0201μs 34.4589 KOps/s 36.4577 KOps/s $\textbf{\color{#d91a1a}-5.48\%}$
test_compile_indexing[int-tensorclass-compile] 0.1635ms 68.9071μs 14.5123 KOps/s 14.6545 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_indexing[int-tensorclass-eager] 0.1408ms 25.5335μs 39.1643 KOps/s 40.5168 KOps/s $\color{#d91a1a}-3.34\%$
test_compile_indexing[int-pytree-compile] 0.1701ms 69.3363μs 14.4225 KOps/s 14.9700 KOps/s $\color{#d91a1a}-3.66\%$
test_compile_indexing[int-pytree-eager] 73.6760μs 24.6729μs 40.5303 KOps/s 39.5369 KOps/s $\color{#35bf28}+2.51\%$
test_mod_add[eager] 76.8140μs 27.0835μs 36.9228 KOps/s 40.0176 KOps/s $\textbf{\color{#d91a1a}-7.73\%}$
test_mod_add[compile] 87.4340μs 38.0226μs 26.3002 KOps/s 27.6859 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_mod_add[compile-overhead] 81.1020μs 37.3596μs 26.7669 KOps/s 27.3214 KOps/s $\color{#d91a1a}-2.03\%$
test_mod_wrap[eager] 0.4138ms 0.2143ms 4.6659 KOps/s 4.4457 KOps/s $\color{#35bf28}+4.95\%$
test_mod_wrap[compile] 2.0368ms 0.2322ms 4.3073 KOps/s 4.2833 KOps/s $\color{#35bf28}+0.56\%$
test_mod_wrap[compile-overhead] 0.4546ms 0.2266ms 4.4136 KOps/s 4.2088 KOps/s $\color{#35bf28}+4.87\%$
test_mod_wrap_and_backward[eager] 13.8402ms 11.4367ms 87.4379 Ops/s 91.3787 Ops/s $\color{#d91a1a}-4.31\%$
test_mod_wrap_and_backward[compile] 13.6393ms 11.6260ms 86.0140 Ops/s 92.0014 Ops/s $\textbf{\color{#d91a1a}-6.51\%}$
test_mod_wrap_and_backward[compile-overhead] 14.9062ms 12.2259ms 81.7937 Ops/s 91.7629 Ops/s $\textbf{\color{#d91a1a}-10.86\%}$
test_seq_add[eager] 0.1702ms 92.2231μs 10.8433 KOps/s 11.5547 KOps/s $\textbf{\color{#d91a1a}-6.16\%}$
test_seq_add[compile] 0.1560ms 59.7094μs 16.7478 KOps/s 16.6438 KOps/s $\color{#35bf28}+0.62\%$
test_seq_add[compile-overhead] 0.1742ms 59.7875μs 16.7259 KOps/s 16.8632 KOps/s $\color{#d91a1a}-0.81\%$
test_seq_wrap[eager] 0.5076ms 0.3840ms 2.6044 KOps/s 2.6840 KOps/s $\color{#d91a1a}-2.97\%$
test_seq_wrap[compile] 0.5511ms 0.2626ms 3.8076 KOps/s 3.7932 KOps/s $\color{#35bf28}+0.38\%$
test_seq_wrap[compile-overhead] 0.5157ms 0.2630ms 3.8025 KOps/s 3.7904 KOps/s $\color{#35bf28}+0.32\%$
test_func_call_runtime[False-eager] 0.7349ms 0.5507ms 1.8159 KOps/s 1.9015 KOps/s $\color{#d91a1a}-4.50\%$
test_func_call_runtime[False-compile] 0.9221ms 0.4982ms 2.0074 KOps/s 2.0418 KOps/s $\color{#d91a1a}-1.69\%$
test_func_call_runtime[False-compile-overhead] 0.6113ms 0.4929ms 2.0288 KOps/s 2.0451 KOps/s $\color{#d91a1a}-0.80\%$
test_func_call_runtime[True-eager] 1.2247ms 0.7743ms 1.2914 KOps/s 1.3263 KOps/s $\color{#d91a1a}-2.63\%$
test_func_call_runtime[True-compile] 0.5936ms 0.5094ms 1.9631 KOps/s 1.9936 KOps/s $\color{#d91a1a}-1.53\%$
test_func_call_runtime[True-compile-overhead] 0.9091ms 0.5112ms 1.9562 KOps/s 1.9680 KOps/s $\color{#d91a1a}-0.60\%$
test_func_call_cm_runtime[False-eager] 1.5901ms 0.5640ms 1.7730 KOps/s 1.9219 KOps/s $\textbf{\color{#d91a1a}-7.75\%}$
test_func_call_cm_runtime[False-compile] 0.5994ms 0.4930ms 2.0282 KOps/s 2.0502 KOps/s $\color{#d91a1a}-1.07\%$
test_func_call_cm_runtime[False-compile-overhead] 1.0161ms 0.4957ms 2.0175 KOps/s 2.0359 KOps/s $\color{#d91a1a}-0.90\%$
test_func_call_cm_runtime[True-eager] 1.1310ms 0.8992ms 1.1121 KOps/s 1.1133 KOps/s $\color{#d91a1a}-0.11\%$
test_func_call_cm_runtime[True-compile] 1.1447ms 0.8482ms 1.1790 KOps/s 1.1823 KOps/s $\color{#d91a1a}-0.28\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0676ms 0.8460ms 1.1821 KOps/s 1.1892 KOps/s $\color{#d91a1a}-0.60\%$
test_distributed 0.2667ms 0.1294ms 7.7303 KOps/s 7.4154 KOps/s $\color{#35bf28}+4.25\%$
test_tdmodule 93.4250μs 18.2857μs 54.6875 KOps/s 57.9267 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_tdmodule_dispatch 62.8180μs 37.4287μs 26.7175 KOps/s 27.9936 KOps/s $\color{#d91a1a}-4.56\%$
test_tdseq 35.6060μs 20.0617μs 49.8462 KOps/s 52.3524 KOps/s $\color{#d91a1a}-4.79\%$
test_tdseq_dispatch 84.7080μs 41.8683μs 23.8844 KOps/s 24.9460 KOps/s $\color{#d91a1a}-4.26\%$
test_instantiation_functorch 1.9260ms 1.6675ms 599.6831 Ops/s 602.4616 Ops/s $\color{#d91a1a}-0.46\%$
test_instantiation_td 1.9740ms 1.2058ms 829.3365 Ops/s 827.4871 Ops/s $\color{#35bf28}+0.22\%$
test_exec_functorch 0.4057ms 0.1843ms 5.4260 KOps/s 5.5778 KOps/s $\color{#d91a1a}-2.72\%$
test_exec_functional_call 0.3280ms 0.1737ms 5.7556 KOps/s 5.8513 KOps/s $\color{#d91a1a}-1.64\%$
test_exec_td 0.2827ms 0.1745ms 5.7307 KOps/s 5.7649 KOps/s $\color{#d91a1a}-0.59\%$
test_exec_td_decorator 0.8630ms 0.2285ms 4.3754 KOps/s 4.4504 KOps/s $\color{#d91a1a}-1.68\%$
test_vmap_mlp_speed[True-True] 0.8393ms 0.5939ms 1.6839 KOps/s 1.7322 KOps/s $\color{#d91a1a}-2.79\%$
test_vmap_mlp_speed[True-False] 0.8583ms 0.5863ms 1.7056 KOps/s 1.7624 KOps/s $\color{#d91a1a}-3.22\%$
test_vmap_mlp_speed[False-True] 0.7927ms 0.4878ms 2.0499 KOps/s 2.1271 KOps/s $\color{#d91a1a}-3.63\%$
test_vmap_mlp_speed[False-False] 0.6706ms 0.4882ms 2.0484 KOps/s 2.1316 KOps/s $\color{#d91a1a}-3.90\%$
test_vmap_mlp_speed_decorator[True-True] 0.9491ms 0.6457ms 1.5488 KOps/s 1.5985 KOps/s $\color{#d91a1a}-3.11\%$
test_vmap_mlp_speed_decorator[True-False] 1.0580ms 0.6451ms 1.5501 KOps/s 1.6083 KOps/s $\color{#d91a1a}-3.62\%$
test_vmap_mlp_speed_decorator[False-True] 0.8574ms 0.5314ms 1.8819 KOps/s 1.9345 KOps/s $\color{#d91a1a}-2.72\%$
test_vmap_mlp_speed_decorator[False-False] 0.7380ms 0.5316ms 1.8811 KOps/s 1.9440 KOps/s $\color{#d91a1a}-3.24\%$
test_to_module_speed[True] 2.2819ms 1.3249ms 754.7476 Ops/s 754.8753 Ops/s $\color{#d91a1a}-0.02\%$
test_to_module_speed[False] 2.1091ms 1.2911ms 774.5321 Ops/s 770.4757 Ops/s $\color{#35bf28}+0.53\%$
test_tc_init 77.6560μs 46.1553μs 21.6660 KOps/s 22.0187 KOps/s $\color{#d91a1a}-1.60\%$
test_tc_init_nested 0.1738ms 94.8299μs 10.5452 KOps/s 10.9411 KOps/s $\color{#d91a1a}-3.62\%$
test_tc_first_layer_tensor 23.8140μs 1.4859μs 672.9845 KOps/s 690.6866 KOps/s $\color{#d91a1a}-2.56\%$
test_tc_first_layer_nontensor 28.7830μs 4.2970μs 232.7203 KOps/s 236.0247 KOps/s $\color{#d91a1a}-1.40\%$
test_tc_second_layer_tensor 39.8640μs 2.6909μs 371.6186 KOps/s 369.1007 KOps/s $\color{#35bf28}+0.68\%$
test_tc_second_layer_nontensor 29.0140μs 5.5165μs 181.2747 KOps/s 181.5853 KOps/s $\color{#d91a1a}-0.17\%$
test_unbind 0.4549s 17.2802ms 57.8698 Ops/s 67.5783 Ops/s $\textbf{\color{#d91a1a}-14.37\%}$
test_full_like 17.3406ms 11.3495ms 88.1096 Ops/s 141.6104 Ops/s $\textbf{\color{#d91a1a}-37.78\%}$
test_zeros_like 15.4088ms 7.1767ms 139.3399 Ops/s 131.9552 Ops/s $\textbf{\color{#35bf28}+5.60\%}$
test_ones_like 12.1370ms 7.4477ms 134.2704 Ops/s 128.3583 Ops/s $\color{#35bf28}+4.61\%$
test_clone 16.9052ms 8.8800ms 112.6130 Ops/s 102.5625 Ops/s $\textbf{\color{#35bf28}+9.80\%}$
test_squeeze 73.4470μs 13.4700μs 74.2388 KOps/s 76.6820 KOps/s $\color{#d91a1a}-3.19\%$
test_unsqueeze 0.3276ms 96.2925μs 10.3850 KOps/s 10.7905 KOps/s $\color{#d91a1a}-3.76\%$
test_split 0.3920ms 0.2039ms 4.9050 KOps/s 4.9867 KOps/s $\color{#d91a1a}-1.64\%$
test_permute 0.3700ms 0.2205ms 4.5355 KOps/s 4.5835 KOps/s $\color{#d91a1a}-1.05\%$
test_stack 29.5706ms 24.1686ms 41.3760 Ops/s 41.6282 Ops/s $\color{#d91a1a}-0.61\%$
test_cat 29.5340ms 24.2424ms 41.2500 Ops/s 41.5835 Ops/s $\color{#d91a1a}-0.80\%$
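
The Change column above appears to be the relative difference between the Ops value measured on this PR and the Ops value on repo HEAD, and the bold entries (which match the Improved/Worsened counts in the summary) appear to be those whose magnitude is at least 5%. The sketch below reproduces that arithmetic under those assumptions. The function names and the 5% threshold are illustrative, inferred from the table, and are not taken from the benchmark bot's source.

```python
# Minimal sketch: recompute the "Change" column from the two Ops columns.
# Assumption: change = (ops_pr - ops_head) / ops_head, expressed in percent.
# Assumption: a row counts as improved/worsened when |change| >= 5%.

def relative_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row using the (assumed) 5% significance threshold."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops on this PR, Ops on repo HEAD), both in KOps/s, copied from the CPU table.
rows = {
    "test_values": (692.3908, 856.5606),
    "test_membership": (1334.2, 1091.3),  # listed as 1.3342 / 1.0913 MOps/s
    "test_items_nested_locked": (2.9586, 2.9583),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

With the values copied from the table, this gives roughly -19.17% for test_values and +22.26% for test_membership, matching the reported Change entries. Both units must agree before dividing, which is why the MOps/s row is converted to KOps/s.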

github-actions bot commented Aug 9, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 225. Improved: $\large\color{#35bf28}34$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.2985ms 16.5736μs 60.3371 KOps/s 55.5278 KOps/s $\textbf{\color{#35bf28}+8.66\%}$
test_plain_set_stack_nested 34.9100μs 16.6564μs 60.0369 KOps/s 54.9337 KOps/s $\textbf{\color{#35bf28}+9.29\%}$
test_plain_set_nested_inplace 0.2094ms 17.8321μs 56.0788 KOps/s 52.2699 KOps/s $\textbf{\color{#35bf28}+7.29\%}$
test_plain_set_stack_nested_inplace 43.7110μs 17.7074μs 56.4734 KOps/s 50.8548 KOps/s $\textbf{\color{#35bf28}+11.05\%}$
test_items 0.1936ms 4.7056μs 212.5136 KOps/s 212.4430 KOps/s $\color{#35bf28}+0.03\%$
test_items_nested 0.5555ms 0.3620ms 2.7621 KOps/s 2.6930 KOps/s $\color{#35bf28}+2.56\%$
test_items_nested_locked 0.5455ms 0.3636ms 2.7501 KOps/s 2.6494 KOps/s $\color{#35bf28}+3.80\%$
test_items_nested_leaf 0.1123ms 84.2212μs 11.8735 KOps/s 11.6745 KOps/s $\color{#35bf28}+1.70\%$
test_items_stack_nested 0.5583ms 0.3606ms 2.7731 KOps/s 2.7039 KOps/s $\color{#35bf28}+2.56\%$
test_items_stack_nested_leaf 0.2812ms 85.7132μs 11.6668 KOps/s 11.5298 KOps/s $\color{#35bf28}+1.19\%$
test_items_stack_nested_locked 0.5491ms 0.3649ms 2.7402 KOps/s 2.7010 KOps/s $\color{#35bf28}+1.45\%$
test_keys 0.1883ms 4.3702μs 228.8200 KOps/s 227.7542 KOps/s $\color{#35bf28}+0.47\%$
test_keys_nested 0.2577ms 66.8962μs 14.9485 KOps/s 15.2018 KOps/s $\color{#d91a1a}-1.67\%$
test_keys_nested_locked 0.9129ms 72.3059μs 13.8301 KOps/s 13.8115 KOps/s $\color{#35bf28}+0.14\%$
test_keys_nested_leaf 74.9310μs 57.2151μs 17.4779 KOps/s 17.4365 KOps/s $\color{#35bf28}+0.24\%$
test_keys_stack_nested 0.2620ms 67.3925μs 14.8385 KOps/s 14.9438 KOps/s $\color{#d91a1a}-0.70\%$
test_keys_stack_nested_leaf 0.2537ms 57.6332μs 17.3511 KOps/s 17.1302 KOps/s $\color{#35bf28}+1.29\%$
test_keys_stack_nested_locked 0.2674ms 72.7481μs 13.7461 KOps/s 13.6921 KOps/s $\color{#35bf28}+0.39\%$
test_values 63.9513μs 1.7771μs 562.7289 KOps/s 561.0806 KOps/s $\color{#35bf28}+0.29\%$
test_values_nested 0.2169ms 33.7256μs 29.6510 KOps/s 29.4816 KOps/s $\color{#35bf28}+0.57\%$
test_values_nested_locked 0.2204ms 35.6307μs 28.0657 KOps/s 27.6198 KOps/s $\color{#35bf28}+1.61\%$
test_values_nested_leaf 52.7610μs 30.0000μs 33.3333 KOps/s 32.7862 KOps/s $\color{#35bf28}+1.67\%$
test_values_stack_nested 0.2274ms 34.4075μs 29.0635 KOps/s 29.1783 KOps/s $\color{#d91a1a}-0.39\%$
test_values_stack_nested_leaf 0.2188ms 30.3418μs 32.9579 KOps/s 32.5252 KOps/s $\color{#35bf28}+1.33\%$
test_values_stack_nested_locked 0.2231ms 36.2816μs 27.5622 KOps/s 27.6465 KOps/s $\color{#d91a1a}-0.30\%$
test_membership 1.2836μs 0.5389μs 1.8557 MOps/s 1.7751 MOps/s $\color{#35bf28}+4.54\%$
test_membership_nested 0.1039ms 1.9581μs 510.6983 KOps/s 511.4504 KOps/s $\color{#d91a1a}-0.15\%$
test_membership_nested_leaf 99.9565μs 1.9543μs 511.6888 KOps/s 504.1702 KOps/s $\color{#35bf28}+1.49\%$
test_membership_stacked_nested 26.9600μs 2.0045μs 498.8844 KOps/s 504.0078 KOps/s $\color{#d91a1a}-1.02\%$
test_membership_stacked_nested_leaf 17.6310μs 2.0085μs 497.8840 KOps/s 497.3662 KOps/s $\color{#35bf28}+0.10\%$
test_membership_nested_last 0.1941ms 2.9204μs 342.4173 KOps/s 340.4623 KOps/s $\color{#35bf28}+0.57\%$
test_membership_nested_leaf_last 16.1000μs 2.9445μs 339.6205 KOps/s 340.0258 KOps/s $\color{#d91a1a}-0.12\%$
test_membership_stacked_nested_last 27.6310μs 2.9637μs 337.4202 KOps/s 343.4002 KOps/s $\color{#d91a1a}-1.74\%$
test_membership_stacked_nested_leaf_last 0.2000ms 2.9322μs 341.0416 KOps/s 341.7601 KOps/s $\color{#d91a1a}-0.21\%$
test_nested_getleaf 0.1986ms 7.9156μs 126.3327 KOps/s 126.4926 KOps/s $\color{#d91a1a}-0.13\%$
test_nested_get 22.5800μs 7.4479μs 134.2667 KOps/s 133.5551 KOps/s $\color{#35bf28}+0.53\%$
test_stacked_getleaf 0.2035ms 7.9687μs 125.4914 KOps/s 125.6733 KOps/s $\color{#d91a1a}-0.14\%$
test_stacked_get 21.5700μs 7.4852μs 133.5974 KOps/s 135.2651 KOps/s $\color{#d91a1a}-1.23\%$
test_nested_getitemleaf 0.1976ms 8.1207μs 123.1421 KOps/s 121.8433 KOps/s $\color{#35bf28}+1.07\%$
test_nested_getitem 21.2100μs 7.7050μs 129.7862 KOps/s 130.1232 KOps/s $\color{#d91a1a}-0.26\%$
test_stacked_getitemleaf 30.7000μs 8.2027μs 121.9114 KOps/s 121.9459 KOps/s $\color{#d91a1a}-0.03\%$
test_stacked_getitem 0.2018ms 7.7070μs 129.7521 KOps/s 130.0029 KOps/s $\color{#d91a1a}-0.19\%$
test_lock_nested 1.2952ms 0.4824ms 2.0731 KOps/s 2.0534 KOps/s $\color{#35bf28}+0.96\%$
test_lock_stack_nested 0.5010ms 0.4440ms 2.2525 KOps/s 2.2398 KOps/s $\color{#35bf28}+0.57\%$
test_unlock_nested 0.9116ms 0.4025ms 2.4842 KOps/s 2.4594 KOps/s $\color{#35bf28}+1.01\%$
test_unlock_stack_nested 0.5014ms 0.3642ms 2.7458 KOps/s 2.7322 KOps/s $\color{#35bf28}+0.50\%$
test_flatten_speed 0.3118ms 0.1068ms 9.3661 KOps/s 9.5279 KOps/s $\color{#d91a1a}-1.70\%$
test_unflatten_speed 0.5123ms 0.3190ms 3.1352 KOps/s 3.1369 KOps/s $\color{#d91a1a}-0.06\%$
test_common_ops 1.7399ms 1.4389ms 694.9654 Ops/s 691.7279 Ops/s $\color{#35bf28}+0.47\%$
test_creation 19.7110μs 1.6578μs 603.2191 KOps/s 592.2792 KOps/s $\color{#35bf28}+1.85\%$
test_creation_empty 43.3610μs 16.3219μs 61.2673 KOps/s 51.9830 KOps/s $\textbf{\color{#35bf28}+17.86\%}$
test_creation_nested_1 35.7010μs 18.6457μs 53.6315 KOps/s 47.5933 KOps/s $\textbf{\color{#35bf28}+12.69\%}$
test_creation_nested_2 0.2233ms 21.5330μs 46.4403 KOps/s 40.2115 KOps/s $\textbf{\color{#35bf28}+15.49\%}$
test_clone 54.0510μs 32.8600μs 30.4322 KOps/s 30.1221 KOps/s $\color{#35bf28}+1.03\%$
test_getitem[int] 1.0586ms 18.2880μs 54.6807 KOps/s 53.0161 KOps/s $\color{#35bf28}+3.14\%$
test_getitem[slice_int] 0.2430ms 30.3968μs 32.8982 KOps/s 32.1034 KOps/s $\color{#35bf28}+2.48\%$
test_getitem[range] 0.2904ms 0.1184ms 8.4469 KOps/s 8.5596 KOps/s $\color{#d91a1a}-1.32\%$
test_getitem[tuple] 0.1444ms 26.6580μs 37.5122 KOps/s 36.2181 KOps/s $\color{#35bf28}+3.57\%$
test_getitem[list] 0.4035ms 0.1074ms 9.3119 KOps/s 8.7844 KOps/s $\textbf{\color{#35bf28}+6.01\%}$
test_setitem_dim[int] 78.2420μs 55.8493μs 17.9053 KOps/s 15.7706 KOps/s $\textbf{\color{#35bf28}+13.54\%}$
test_setitem_dim[slice_int] 0.1021ms 79.6506μs 12.5548 KOps/s 11.6263 KOps/s $\textbf{\color{#35bf28}+7.99\%}$
test_setitem_dim[range] 0.1671ms 0.1432ms 6.9857 KOps/s 6.4248 KOps/s $\textbf{\color{#35bf28}+8.73\%}$
test_setitem_dim[tuple] 0.2227ms 73.6526μs 13.5773 KOps/s 12.2162 KOps/s $\textbf{\color{#35bf28}+11.14\%}$
test_setitem 0.1914ms 45.4787μs 21.9883 KOps/s 20.5316 KOps/s $\textbf{\color{#35bf28}+7.10\%}$
test_set 0.2389ms 44.8307μs 22.3062 KOps/s 19.8443 KOps/s $\textbf{\color{#35bf28}+12.41\%}$
test_set_shared 0.3922ms 56.5816μs 17.6736 KOps/s 17.2435 KOps/s $\color{#35bf28}+2.49\%$
test_update 0.2032ms 53.3731μs 18.7360 KOps/s 17.4172 KOps/s $\textbf{\color{#35bf28}+7.57\%}$
test_update_nested 0.2590ms 61.3139μs 16.3095 KOps/s 14.4105 KOps/s $\textbf{\color{#35bf28}+13.18\%}$
test_update__nested 0.2675ms 65.6733μs 15.2269 KOps/s 14.2046 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_set_nested 0.2478ms 47.8081μs 20.9170 KOps/s 19.5774 KOps/s $\textbf{\color{#35bf28}+6.84\%}$
test_set_nested_new 0.2033ms 51.8052μs 19.3031 KOps/s 17.5298 KOps/s $\textbf{\color{#35bf28}+10.12\%}$
test_select 0.2682ms 66.2801μs 15.0875 KOps/s 13.7352 KOps/s $\textbf{\color{#35bf28}+9.85\%}$
test_select_nested 74.4010μs 51.1689μs 19.5431 KOps/s 19.2942 KOps/s $\color{#35bf28}+1.29\%$
test_exclude_nested 0.2569ms 69.9652μs 14.2928 KOps/s 14.3221 KOps/s $\color{#d91a1a}-0.20\%$
test_empty[True] 0.4803ms 0.2838ms 3.5238 KOps/s 3.4937 KOps/s $\color{#35bf28}+0.86\%$
test_empty[False] 19.2453μs 0.8658μs 1.1550 MOps/s 1.1052 MOps/s $\color{#35bf28}+4.51\%$
test_to 49.1610μs 28.4711μs 35.1233 KOps/s 35.4237 KOps/s $\color{#d91a1a}-0.85\%$
test_to_nonblocking 0.2264ms 28.3789μs 35.2375 KOps/s 37.3586 KOps/s $\textbf{\color{#d91a1a}-5.68\%}$
test_unbind_speed 1.1815ms 0.3165ms 3.1591 KOps/s 3.1614 KOps/s $\color{#d91a1a}-0.07\%$
test_unbind_speed_stack0 0.5105ms 0.3169ms 3.1553 KOps/s 3.1718 KOps/s $\color{#d91a1a}-0.52\%$
test_unbind_speed_stack1 89.6224ms 0.7871ms 1.2704 KOps/s 1.3758 KOps/s $\textbf{\color{#d91a1a}-7.66\%}$
test_split 92.4675ms 2.4759ms 403.9010 Ops/s 403.7185 Ops/s $\color{#35bf28}+0.05\%$
test_chunk 92.7088ms 2.4837ms 402.6261 Ops/s 404.6127 Ops/s $\color{#d91a1a}-0.49\%$
test_creation[device0] 0.2493ms 0.1073ms 9.3208 KOps/s 9.3143 KOps/s $\color{#35bf28}+0.07\%$
test_creation_from_tensor 0.3158ms 0.1047ms 9.5501 KOps/s 9.2820 KOps/s $\color{#35bf28}+2.89\%$
test_add_one[memmap_tensor0] 0.1288ms 10.0660μs 99.3444 KOps/s 99.1214 KOps/s $\color{#35bf28}+0.22\%$
test_contiguous[memmap_tensor0] 0.1904ms 2.3505μs 425.4451 KOps/s 421.8040 KOps/s $\color{#35bf28}+0.86\%$
test_stack[memmap_tensor0] 33.6510μs 7.2983μs 137.0184 KOps/s 130.6959 KOps/s $\color{#35bf28}+4.84\%$
test_memmaptd_index 1.2428ms 0.4770ms 2.0966 KOps/s 2.0861 KOps/s $\color{#35bf28}+0.51\%$
test_memmaptd_index_astensor 0.7904ms 0.5431ms 1.8412 KOps/s 1.8512 KOps/s $\color{#d91a1a}-0.54\%$
test_memmaptd_index_op 1.5526ms 1.1397ms 877.4285 Ops/s 839.8908 Ops/s $\color{#35bf28}+4.47\%$
test_serialize_model 93.7375ms 89.2760ms 11.2012 Ops/s 10.8196 Ops/s $\color{#35bf28}+3.53\%$
test_serialize_model_pickle 1.3491s 1.2362s 0.8089 Ops/s 0.8083 Ops/s $\color{#35bf28}+0.07\%$
test_serialize_weights 0.1826s 96.6039ms 10.3515 Ops/s 9.7019 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_serialize_weights_returnearly 0.2732s 67.8955ms 14.7285 Ops/s 14.9354 Ops/s $\color{#d91a1a}-1.39\%$
test_serialize_weights_pickle 1.3550s 1.2370s 0.8084 Ops/s 0.8029 Ops/s $\color{#35bf28}+0.69\%$
test_reshape_pytree 0.1197ms 39.8769μs 25.0772 KOps/s 24.4544 KOps/s $\color{#35bf28}+2.55\%$
test_reshape_td 0.1482ms 50.0326μs 19.9870 KOps/s 20.9994 KOps/s $\color{#d91a1a}-4.82\%$
test_view_pytree 85.0420μs 41.0625μs 24.3531 KOps/s 24.7627 KOps/s $\color{#d91a1a}-1.65\%$
test_view_td 0.1963ms 51.9561μs 19.2470 KOps/s 18.7668 KOps/s $\color{#35bf28}+2.56\%$
test_unbind_pytree 0.1782ms 39.4786μs 25.3302 KOps/s 25.0898 KOps/s $\color{#35bf28}+0.96\%$
test_unbind_td 0.3942ms 48.4505μs 20.6396 KOps/s 20.5387 KOps/s $\color{#35bf28}+0.49\%$
test_split_pytree 0.2301ms 55.1450μs 18.1340 KOps/s 18.6660 KOps/s $\color{#d91a1a}-2.85\%$
test_split_td 0.5602ms 64.4068μs 15.5263 KOps/s 15.7340 KOps/s $\color{#d91a1a}-1.32\%$
test_add_pytree 0.2356ms 66.7445μs 14.9825 KOps/s 15.9179 KOps/s $\textbf{\color{#d91a1a}-5.88\%}$
test_add_td 0.2743ms 98.7287μs 10.1288 KOps/s 9.3595 KOps/s $\textbf{\color{#35bf28}+8.22\%}$
test_compile_add_one_nested[tensordict-compile] 0.4234ms 0.2238ms 4.4681 KOps/s 4.4062 KOps/s $\color{#35bf28}+1.40\%$
test_compile_add_one_nested[tensordict-eager] 0.3220ms 0.1788ms 5.5918 KOps/s 5.5808 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_nested[pytree-compile] 0.3373ms 0.1554ms 6.4332 KOps/s 6.4438 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_add_one_nested[pytree-eager] 0.3841ms 0.2025ms 4.9371 KOps/s 4.7727 KOps/s $\color{#35bf28}+3.45\%$
test_compile_copy_nested[tensordict-compile] 0.2203ms 23.3914μs 42.7508 KOps/s 41.8699 KOps/s $\color{#35bf28}+2.10\%$
test_compile_copy_nested[tensordict-eager] 87.3710μs 50.0557μs 19.9778 KOps/s 19.8509 KOps/s $\color{#35bf28}+0.64\%$
test_compile_copy_nested[pytree-compile] 0.2791ms 74.5599μs 13.4120 KOps/s 13.4291 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_copy_nested[pytree-eager] 85.9410μs 59.8062μs 16.7207 KOps/s 16.7698 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_add_one_flat[tensordict-compile] 0.4744ms 0.3438ms 2.9091 KOps/s 2.8936 KOps/s $\color{#35bf28}+0.53\%$
test_compile_add_one_flat[tensordict-eager] 0.3701ms 0.2257ms 4.4316 KOps/s 4.4517 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_add_one_flat[tensorclass-compile] 0.2899ms 0.1386ms 7.2143 KOps/s 6.9034 KOps/s $\color{#35bf28}+4.50\%$
test_compile_add_one_flat[tensorclass-eager] 0.2058ms 64.7664μs 15.4401 KOps/s 15.2622 KOps/s $\color{#35bf28}+1.17\%$
test_compile_add_one_flat[pytree-compile] 0.4503ms 0.3428ms 2.9174 KOps/s 2.9022 KOps/s $\color{#35bf28}+0.52\%$
test_compile_add_one_flat[pytree-eager] 0.8195ms 0.6725ms 1.4869 KOps/s 1.4398 KOps/s $\color{#35bf28}+3.27\%$
test_compile_add_self_flat[tensordict-eager] 0.4203ms 0.2740ms 3.6492 KOps/s 3.6039 KOps/s $\color{#35bf28}+1.26\%$
test_compile_add_self_flat[tensordict-compile] 0.4694ms 0.3481ms 2.8724 KOps/s 2.8771 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_add_self_flat[tensorclass-eager] 0.2306ms 76.0074μs 13.1566 KOps/s 13.0137 KOps/s $\color{#35bf28}+1.10\%$
test_compile_add_self_flat[tensorclass-compile] 0.2878ms 0.1385ms 7.2203 KOps/s 6.8739 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_compile_add_self_flat[pytree-eager] 0.7200ms 0.5741ms 1.7418 KOps/s 1.6861 KOps/s $\color{#35bf28}+3.30\%$
test_compile_add_self_flat[pytree-compile] 0.4519ms 0.3432ms 2.9140 KOps/s 2.9016 KOps/s $\color{#35bf28}+0.43\%$
test_compile_copy_flat[tensordict-compile] 0.1613ms 20.3449μs 49.1523 KOps/s 48.7364 KOps/s $\color{#35bf28}+0.85\%$
test_compile_copy_flat[tensordict-eager] 64.4810μs 32.5812μs 30.6925 KOps/s 30.4628 KOps/s $\color{#35bf28}+0.75\%$
test_compile_copy_flat[pytree-compile] 0.1842ms 77.0855μs 12.9726 KOps/s 12.8907 KOps/s $\color{#35bf28}+0.64\%$
test_compile_copy_flat[pytree-eager] 93.1320μs 60.0547μs 16.6515 KOps/s 16.4629 KOps/s $\color{#35bf28}+1.15\%$
test_compile_assign_and_add[tensordict-compile] 2.5072ms 0.8737ms 1.1446 KOps/s 1.0632 KOps/s $\textbf{\color{#35bf28}+7.66\%}$
test_compile_assign_and_add[tensordict-eager] 3.7284ms 3.5239ms 283.7751 Ops/s 281.1193 Ops/s $\color{#35bf28}+0.94\%$
test_compile_assign_and_add[pytree-compile] 2.5108ms 0.8612ms 1.1612 KOps/s 1.0682 KOps/s $\textbf{\color{#35bf28}+8.71\%}$
test_compile_assign_and_add[pytree-eager] 3.9175ms 3.6066ms 277.2728 Ops/s 276.1444 Ops/s $\color{#35bf28}+0.41\%$
test_compile_indexing[tensor-tensordict-compile] 0.2632ms 0.1195ms 8.3662 KOps/s 8.0101 KOps/s $\color{#35bf28}+4.45\%$
test_compile_indexing[tensor-tensordict-eager] 0.3107ms 66.8198μs 14.9656 KOps/s 14.3023 KOps/s $\color{#35bf28}+4.64\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2723ms 0.1159ms 8.6295 KOps/s 8.6640 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2352ms 51.4709μs 19.4285 KOps/s 20.5001 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_compile_indexing[tensor-pytree-compile] 0.2801ms 0.1155ms 8.6605 KOps/s 8.8064 KOps/s $\color{#d91a1a}-1.66\%$
test_compile_indexing[tensor-pytree-eager] 0.2409ms 51.2303μs 19.5197 KOps/s 20.2853 KOps/s $\color{#d91a1a}-3.77\%$
test_compile_indexing[slice-tensordict-compile] 0.3695ms 0.1524ms 6.5599 KOps/s 6.5622 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_indexing[slice-tensordict-eager] 0.2396ms 29.3606μs 34.0592 KOps/s 35.0555 KOps/s $\color{#d91a1a}-2.84\%$
test_compile_indexing[slice-tensorclass-compile] 0.2891ms 0.1407ms 7.1095 KOps/s 6.8271 KOps/s $\color{#35bf28}+4.14\%$
test_compile_indexing[slice-tensorclass-eager] 0.1586ms 23.6932μs 42.2062 KOps/s 40.7193 KOps/s $\color{#35bf28}+3.65\%$
test_compile_indexing[slice-pytree-compile] 0.3015ms 0.1405ms 7.1167 KOps/s 6.8680 KOps/s $\color{#35bf28}+3.62\%$
test_compile_indexing[slice-pytree-eager] 0.2254ms 23.8495μs 41.9296 KOps/s 41.5179 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[int-tensordict-compile] 0.4086ms 0.1514ms 6.6032 KOps/s 6.4556 KOps/s $\color{#35bf28}+2.29\%$
test_compile_indexing[int-tensordict-eager] 0.5075ms 27.5010μs 36.3623 KOps/s 35.4150 KOps/s $\color{#35bf28}+2.67\%$
test_compile_indexing[int-tensorclass-compile] 0.2957ms 0.1407ms 7.1069 KOps/s 7.0473 KOps/s $\color{#35bf28}+0.85\%$
test_compile_indexing[int-tensorclass-eager] 0.1350ms 25.1764μs 39.7197 KOps/s 40.8442 KOps/s $\color{#d91a1a}-2.75\%$
test_compile_indexing[int-pytree-compile] 0.3200ms 0.1465ms 6.8245 KOps/s 7.1011 KOps/s $\color{#d91a1a}-3.89\%$
test_compile_indexing[int-pytree-eager] 53.8500μs 25.3930μs 39.3809 KOps/s 41.6349 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_mod_add[eager] 0.2093ms 35.4128μs 28.2384 KOps/s 27.8625 KOps/s $\color{#35bf28}+1.35\%$
test_mod_add[compile] 0.2072ms 74.9678μs 13.3391 KOps/s 13.1028 KOps/s $\color{#35bf28}+1.80\%$
test_mod_add[compile-overhead] 0.2707ms 0.1425ms 7.0183 KOps/s 6.4285 KOps/s $\textbf{\color{#35bf28}+9.17\%}$
test_mod_wrap[eager] 0.4012ms 0.2489ms 4.0180 KOps/s 3.8881 KOps/s $\color{#35bf28}+3.34\%$
test_mod_wrap[compile] 1.2326ms 0.3070ms 3.2578 KOps/s 3.1664 KOps/s $\color{#35bf28}+2.89\%$
test_mod_wrap[compile-overhead] 8.4132ms 4.3819ms 228.2117 Ops/s 231.6199 Ops/s $\color{#d91a1a}-1.47\%$
test_mod_wrap_and_backward[eager] 1.5631ms 1.3902ms 719.3127 Ops/s 704.6603 Ops/s $\color{#35bf28}+2.08\%$
test_mod_wrap_and_backward[compile] 1.5439ms 1.3892ms 719.8564 Ops/s 662.5210 Ops/s $\textbf{\color{#35bf28}+8.65\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3375ms 0.9270ms 1.0787 KOps/s 966.9465 Ops/s $\textbf{\color{#35bf28}+11.56\%}$
test_seq_add[eager] 0.2529ms 0.1043ms 9.5914 KOps/s 9.2046 KOps/s $\color{#35bf28}+4.20\%$
test_seq_add[compile] 0.2347ms 87.3096μs 11.4535 KOps/s 11.0183 KOps/s $\color{#35bf28}+3.95\%$
test_seq_add[compile-overhead] 0.2659ms 0.1246ms 8.0261 KOps/s 7.9537 KOps/s $\color{#35bf28}+0.91\%$
test_seq_wrap[eager] 0.5615ms 0.4048ms 2.4705 KOps/s 2.4129 KOps/s $\color{#35bf28}+2.39\%$
test_seq_wrap[compile] 0.5138ms 0.3349ms 2.9860 KOps/s 2.9863 KOps/s $-0.01\%$
test_seq_wrap[compile-overhead] 0.3822ms 0.2422ms 4.1280 KOps/s 4.1108 KOps/s $\color{#35bf28}+0.42\%$
test_func_call_runtime[False-eager] 0.9049ms 0.7605ms 1.3149 KOps/s 1.2982 KOps/s $\color{#35bf28}+1.29\%$
test_func_call_runtime[False-compile] 1.0351ms 0.8683ms 1.1516 KOps/s 1.1880 KOps/s $\color{#d91a1a}-3.06\%$
test_func_call_runtime[False-compile-overhead] 0.4722ms 0.3850ms 2.5972 KOps/s 2.5764 KOps/s $\color{#35bf28}+0.81\%$
test_func_call_runtime[True-eager] 1.1765ms 0.9584ms 1.0434 KOps/s 1.0407 KOps/s $\color{#35bf28}+0.25\%$
test_func_call_runtime[True-compile] 1.0188ms 0.8782ms 1.1387 KOps/s 1.1280 KOps/s $\color{#35bf28}+0.95\%$
test_func_call_runtime[True-compile-overhead] 0.6080ms 0.4315ms 2.3176 KOps/s 2.3132 KOps/s $\color{#35bf28}+0.19\%$
test_func_call_cm_runtime[False-eager] 0.9634ms 0.7907ms 1.2647 KOps/s 1.3024 KOps/s $\color{#d91a1a}-2.89\%$
test_func_call_cm_runtime[False-compile] 1.0574ms 0.8470ms 1.1806 KOps/s 1.1804 KOps/s $\color{#35bf28}+0.01\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5384ms 0.3879ms 2.5779 KOps/s 2.5459 KOps/s $\color{#35bf28}+1.26\%$
test_func_call_cm_runtime[True-eager] 1.2330ms 1.0636ms 940.2349 Ops/s 927.3585 Ops/s $\color{#35bf28}+1.39\%$
test_func_call_cm_runtime[True-compile] 1.1852ms 1.0387ms 962.7831 Ops/s 947.7602 Ops/s $\color{#35bf28}+1.59\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1877ms 1.0315ms 969.4998 Ops/s 946.4511 Ops/s $\color{#35bf28}+2.44\%$
test_distributed 1.2916ms 71.7629μs 13.9348 KOps/s 14.6363 KOps/s $\color{#d91a1a}-4.79\%$
test_tdmodule 69.8810μs 16.0064μs 62.4751 KOps/s 55.8470 KOps/s $\textbf{\color{#35bf28}+11.87\%}$
test_tdmodule_dispatch 53.6600μs 31.8687μs 31.3788 KOps/s 27.9016 KOps/s $\textbf{\color{#35bf28}+12.46\%}$
test_tdseq 33.3310μs 16.5918μs 60.2708 KOps/s 54.0815 KOps/s $\textbf{\color{#35bf28}+11.44\%}$
test_tdseq_dispatch 50.7410μs 33.1169μs 30.1961 KOps/s 26.5467 KOps/s $\textbf{\color{#35bf28}+13.75\%}$
test_instantiation_functorch 2.2553ms 2.0908ms 478.2768 Ops/s 478.6613 Ops/s $\color{#d91a1a}-0.08\%$
test_instantiation_td 2.0537ms 1.3550ms 737.9967 Ops/s 737.9661 Ops/s $+0.00\%$
test_exec_functorch 0.3749ms 0.2355ms 4.2455 KOps/s 4.2760 KOps/s $\color{#d91a1a}-0.71\%$
test_exec_functional_call 0.4148ms 0.2393ms 4.1784 KOps/s 4.4427 KOps/s $\textbf{\color{#d91a1a}-5.95\%}$
test_exec_td 0.4373ms 0.2437ms 4.1032 KOps/s 4.2906 KOps/s $\color{#d91a1a}-4.37\%$
test_exec_td_decorator 0.6028ms 0.2874ms 3.4790 KOps/s 3.4770 KOps/s $\color{#35bf28}+0.06\%$
test_vmap_mlp_speed[True-True] 0.8229ms 0.6610ms 1.5129 KOps/s 1.4897 KOps/s $\color{#35bf28}+1.55\%$
test_vmap_mlp_speed[True-False] 0.8804ms 0.6877ms 1.4542 KOps/s 1.5008 KOps/s $\color{#d91a1a}-3.10\%$
test_vmap_mlp_speed[False-True] 0.7819ms 0.6091ms 1.6417 KOps/s 1.7231 KOps/s $\color{#d91a1a}-4.72\%$
test_vmap_mlp_speed[False-False] 0.7695ms 0.6105ms 1.6379 KOps/s 1.7231 KOps/s $\color{#d91a1a}-4.95\%$
test_vmap_mlp_speed_decorator[True-True] 1.3559ms 0.7125ms 1.4034 KOps/s 1.3838 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_mlp_speed_decorator[True-False] 0.9632ms 0.7391ms 1.3530 KOps/s 1.3835 KOps/s $\color{#d91a1a}-2.20\%$
test_vmap_mlp_speed_decorator[False-True] 0.8326ms 0.6491ms 1.5407 KOps/s 1.6051 KOps/s $\color{#d91a1a}-4.01\%$
test_vmap_mlp_speed_decorator[False-False] 0.7998ms 0.6310ms 1.5847 KOps/s 1.5997 KOps/s $\color{#d91a1a}-0.94\%$
test_vmap_transformer_speed[True-True] 9.3381ms 8.8883ms 112.5076 Ops/s 111.7936 Ops/s $\color{#35bf28}+0.64\%$
test_vmap_transformer_speed[True-False] 9.2134ms 8.8588ms 112.8820 Ops/s 112.5895 Ops/s $\color{#35bf28}+0.26\%$
test_vmap_transformer_speed[False-True] 9.1413ms 8.7763ms 113.9428 Ops/s 113.3041 Ops/s $\color{#35bf28}+0.56\%$
test_vmap_transformer_speed[False-False] 9.2636ms 8.7850ms 113.8299 Ops/s 113.7433 Ops/s $\color{#35bf28}+0.08\%$
test_vmap_transformer_speed_decorator[True-True] 22.0769ms 21.0437ms 47.5202 Ops/s 47.8268 Ops/s $\color{#d91a1a}-0.64\%$
test_vmap_transformer_speed_decorator[True-False] 0.1984s 24.7288ms 40.4387 Ops/s 47.6432 Ops/s $\textbf{\color{#d91a1a}-15.12\%}$
test_vmap_transformer_speed_decorator[False-True] 21.8742ms 20.8787ms 47.8958 Ops/s 48.0046 Ops/s $\color{#d91a1a}-0.23\%$
test_vmap_transformer_speed_decorator[False-False] 21.7712ms 20.8365ms 47.9927 Ops/s 47.9871 Ops/s $\color{#35bf28}+0.01\%$
test_to_module_speed[True] 1.6815ms 1.1513ms 868.5457 Ops/s 866.5182 Ops/s $\color{#35bf28}+0.23\%$
test_to_module_speed[False] 1.6016ms 1.1216ms 891.5949 Ops/s 889.7193 Ops/s $\color{#35bf28}+0.21\%$
test_tc_init 0.1351ms 37.7610μs 26.4823 KOps/s 23.7695 KOps/s $\textbf{\color{#35bf28}+11.41\%}$
test_tc_init_nested 0.2044ms 78.6685μs 12.7116 KOps/s 11.8059 KOps/s $\textbf{\color{#35bf28}+7.67\%}$
test_tc_first_layer_tensor 3.5502μs 0.7794μs 1.2831 MOps/s 1.2756 MOps/s $\color{#35bf28}+0.58\%$
test_tc_first_layer_nontensor 16.8100μs 2.5558μs 391.2723 KOps/s 397.0708 KOps/s $\color{#d91a1a}-1.46\%$
test_tc_second_layer_tensor 25.9900μs 1.7052μs 586.4248 KOps/s 618.7240 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_tc_second_layer_nontensor 27.6210μs 3.4540μs 289.5202 KOps/s 296.8910 KOps/s $\color{#d91a1a}-2.48\%$
test_unbind 0.1819s 12.9144ms 77.4328 Ops/s 84.2086 Ops/s $\textbf{\color{#d91a1a}-8.05\%}$
test_full_like 0.7558ms 0.5782ms 1.7296 KOps/s 1.7346 KOps/s $\color{#d91a1a}-0.29\%$
test_zeros_like 0.3406ms 0.1979ms 5.0525 KOps/s 5.0591 KOps/s $\color{#d91a1a}-0.13\%$
test_ones_like 0.3506ms 0.1979ms 5.0524 KOps/s 5.0608 KOps/s $\color{#d91a1a}-0.17\%$
test_clone 0.5628ms 0.4153ms 2.4080 KOps/s 2.4138 KOps/s $\color{#d91a1a}-0.24\%$
test_squeeze 0.1457ms 11.2988μs 88.5047 KOps/s 89.2232 KOps/s $\color{#d91a1a}-0.81\%$
test_unsqueeze 0.2814ms 82.9407μs 12.0568 KOps/s 12.0587 KOps/s $\color{#d91a1a}-0.02\%$
test_split 0.2948ms 0.1805ms 5.5404 KOps/s 5.5363 KOps/s $\color{#35bf28}+0.07\%$
test_permute 0.3108ms 0.1972ms 5.0704 KOps/s 5.1269 KOps/s $\color{#d91a1a}-1.10\%$
test_stack 1.3791ms 0.9141ms 1.0940 KOps/s 1.0905 KOps/s $\color{#35bf28}+0.32\%$
test_cat 1.2743ms 1.2314ms 812.0684 Ops/s 811.6571 Ops/s $\color{#35bf28}+0.05\%$

[ghstack-poisoned]
@vmoens vmoens merged commit 05406d4 into gh/vmoens/10/base Aug 9, 2024
10 of 28 checks passed
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 5bf45ecbe91cc5172d67f33761a3cb6e4a0e5fb2
Pull Request resolved: #955
@vmoens vmoens deleted the gh/vmoens/10/head branch August 9, 2024 23:27