[Doc] Fix symbolic trace reference in doc #918

Merged: vmoens merged 3 commits into main from be-fix-fx on Jul 25, 2024

vmoens (Contributor) commented Jul 25, 2024

No description provided.

facebook-github-bot added the CLA Signed label Jul 25, 2024
vmoens added the documentation label Jul 25, 2024
vmoens merged commit 6276742 into main Jul 25, 2024
31 of 34 checks passed
vmoens deleted the be-fix-fx branch July 25, 2024 16:20
$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 213. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}12$.

Detailed results
Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 52.9700μs 22.1158μs 45.2165 KOps/s 47.2252 KOps/s $\color{#d91a1a}-4.25\%$
test_plain_set_stack_nested 52.2370μs 22.0031μs 45.4482 KOps/s 46.1818 KOps/s $\color{#d91a1a}-1.59\%$
test_plain_set_nested_inplace 59.4720μs 24.0309μs 41.6131 KOps/s 42.6284 KOps/s $\color{#d91a1a}-2.38\%$
test_plain_set_stack_nested_inplace 55.9350μs 24.0007μs 41.6655 KOps/s 42.9632 KOps/s $\color{#d91a1a}-3.02\%$
test_items 24.5060μs 2.6302μs 380.2051 KOps/s 351.5449 KOps/s $\textbf{\color{#35bf28}+8.15\%}$
test_items_nested 0.3944ms 0.3358ms 2.9779 KOps/s 2.9330 KOps/s $\color{#35bf28}+1.53\%$
test_items_nested_locked 0.6271ms 0.3375ms 2.9626 KOps/s 2.9358 KOps/s $\color{#35bf28}+0.91\%$
test_items_nested_leaf 0.1469ms 87.5532μs 11.4216 KOps/s 11.8274 KOps/s $\color{#d91a1a}-3.43\%$
test_items_stack_nested 2.2591ms 0.3387ms 2.9528 KOps/s 2.9478 KOps/s $\color{#35bf28}+0.17\%$
test_items_stack_nested_leaf 0.1952ms 88.6204μs 11.2841 KOps/s 12.0055 KOps/s $\textbf{\color{#d91a1a}-6.01\%}$
test_items_stack_nested_locked 0.4251ms 0.3405ms 2.9370 KOps/s 2.9313 KOps/s $\color{#35bf28}+0.19\%$
test_keys 24.1850μs 3.8846μs 257.4242 KOps/s 246.4953 KOps/s $\color{#35bf28}+4.43\%$
test_keys_nested 0.2535ms 0.1442ms 6.9333 KOps/s 6.8009 KOps/s $\color{#35bf28}+1.95\%$
test_keys_nested_locked 0.7407ms 0.1498ms 6.6742 KOps/s 6.5520 KOps/s $\color{#35bf28}+1.86\%$
test_keys_nested_leaf 0.2109ms 0.1248ms 8.0145 KOps/s 7.8943 KOps/s $\color{#35bf28}+1.52\%$
test_keys_stack_nested 0.2425ms 0.1437ms 6.9597 KOps/s 6.9470 KOps/s $\color{#35bf28}+0.18\%$
test_keys_stack_nested_leaf 0.2118ms 0.1221ms 8.1911 KOps/s 8.0394 KOps/s $\color{#35bf28}+1.89\%$
test_keys_stack_nested_locked 0.3974ms 0.1486ms 6.7294 KOps/s 6.6983 KOps/s $\color{#35bf28}+0.46\%$
test_values 7.9800μs 1.0575μs 945.6168 KOps/s 848.0727 KOps/s $\textbf{\color{#35bf28}+11.50\%}$
test_values_nested 95.8100μs 50.4398μs 19.8256 KOps/s 19.6433 KOps/s $\color{#35bf28}+0.93\%$
test_values_nested_locked 0.1001ms 49.8534μs 20.0588 KOps/s 19.5741 KOps/s $\color{#35bf28}+2.48\%$
test_values_nested_leaf 82.9560μs 45.0591μs 22.1931 KOps/s 21.8215 KOps/s $\color{#35bf28}+1.70\%$
test_values_stack_nested 0.1791ms 51.4819μs 19.4243 KOps/s 18.9848 KOps/s $\color{#35bf28}+2.31\%$
test_values_stack_nested_leaf 89.2770μs 44.9128μs 22.2654 KOps/s 22.2126 KOps/s $\color{#35bf28}+0.24\%$
test_values_stack_nested_locked 0.1001ms 51.1286μs 19.5585 KOps/s 19.3600 KOps/s $\color{#35bf28}+1.03\%$
test_membership 2.8734μs 0.7346μs 1.3613 MOps/s 1.0795 MOps/s $\textbf{\color{#35bf28}+26.10\%}$
test_membership_nested 25.0670μs 2.5883μs 386.3576 KOps/s 376.0086 KOps/s $\color{#35bf28}+2.75\%$
test_membership_nested_leaf 19.2460μs 2.5929μs 385.6724 KOps/s 370.0933 KOps/s $\color{#35bf28}+4.21\%$
test_membership_stacked_nested 30.1760μs 2.6464μs 377.8670 KOps/s 376.3419 KOps/s $\color{#35bf28}+0.41\%$
test_membership_stacked_nested_leaf 29.1640μs 2.6361μs 379.3480 KOps/s 379.4488 KOps/s $\color{#d91a1a}-0.03\%$
test_membership_nested_last 27.6320μs 3.9674μs 252.0566 KOps/s 250.3802 KOps/s $\color{#35bf28}+0.67\%$
test_membership_nested_leaf_last 39.4540μs 4.0017μs 249.8911 KOps/s 251.1824 KOps/s $\color{#d91a1a}-0.51\%$
test_membership_stacked_nested_last 53.5600μs 9.3908μs 106.4870 KOps/s 78.4726 KOps/s $\textbf{\color{#35bf28}+35.70\%}$
test_membership_stacked_nested_leaf_last 41.5280μs 9.3893μs 106.5038 KOps/s 78.8362 KOps/s $\textbf{\color{#35bf28}+35.10\%}$
test_nested_getleaf 39.7340μs 10.6912μs 93.5350 KOps/s 92.3845 KOps/s $\color{#35bf28}+1.25\%$
test_nested_get 35.4770μs 10.1605μs 98.4203 KOps/s 99.7919 KOps/s $\color{#d91a1a}-1.37\%$
test_stacked_getleaf 37.5200μs 10.3813μs 96.3268 KOps/s 95.9154 KOps/s $\color{#35bf28}+0.43\%$
test_stacked_get 30.1470μs 9.9412μs 100.5920 KOps/s 102.1613 KOps/s $\color{#d91a1a}-1.54\%$
test_nested_getitemleaf 35.5160μs 11.0956μs 90.1257 KOps/s 89.9219 KOps/s $\color{#35bf28}+0.23\%$
test_nested_getitem 40.9570μs 10.1977μs 98.0618 KOps/s 98.6190 KOps/s $\color{#d91a1a}-0.56\%$
test_stacked_getitemleaf 40.5160μs 11.0911μs 90.1622 KOps/s 91.0164 KOps/s $\color{#d91a1a}-0.94\%$
test_stacked_getitem 47.7600μs 10.2872μs 97.2079 KOps/s 99.1878 KOps/s $\color{#d91a1a}-2.00\%$
test_lock_nested 6.9129ms 0.5005ms 1.9979 KOps/s 1.9710 KOps/s $\color{#35bf28}+1.37\%$
test_lock_stack_nested 0.6915ms 0.4519ms 2.2128 KOps/s 2.2487 KOps/s $\color{#d91a1a}-1.59\%$
test_unlock_nested 90.7510ms 0.5054ms 1.9785 KOps/s 2.3734 KOps/s $\textbf{\color{#d91a1a}-16.64\%}$
test_unlock_stack_nested 0.4581ms 0.3662ms 2.7304 KOps/s 2.7773 KOps/s $\color{#d91a1a}-1.69\%$
test_flatten_speed 0.6252ms 0.1068ms 9.3650 KOps/s 9.6922 KOps/s $\color{#d91a1a}-3.38\%$
test_unflatten_speed 0.9773ms 0.4318ms 2.3157 KOps/s 2.3215 KOps/s $\color{#d91a1a}-0.25\%$
test_common_ops 4.2508ms 1.0793ms 926.5096 Ops/s 934.7374 Ops/s $\color{#d91a1a}-0.88\%$
test_creation 14.3370μs 2.0328μs 491.9365 KOps/s 495.3525 KOps/s $\color{#d91a1a}-0.69\%$
test_creation_empty 48.9010μs 18.0492μs 55.4041 KOps/s 57.7402 KOps/s $\color{#d91a1a}-4.05\%$
test_creation_nested_1 60.7740μs 21.3067μs 46.9336 KOps/s 49.1052 KOps/s $\color{#d91a1a}-4.42\%$
test_creation_nested_2 85.9110μs 24.6867μs 40.5077 KOps/s 42.1069 KOps/s $\color{#d91a1a}-3.80\%$
test_clone 0.1125ms 17.4473μs 57.3154 KOps/s 59.5533 KOps/s $\color{#d91a1a}-3.76\%$
test_getitem[int] 1.1822ms 16.8432μs 59.3712 KOps/s 61.5940 KOps/s $\color{#d91a1a}-3.61\%$
test_getitem[slice_int] 0.1273ms 31.5690μs 31.6766 KOps/s 33.2282 KOps/s $\color{#d91a1a}-4.67\%$
test_getitem[range] 0.3301ms 58.2921μs 17.1550 KOps/s 17.7949 KOps/s $\color{#d91a1a}-3.60\%$
test_getitem[tuple] 0.1498ms 26.9893μs 37.0518 KOps/s 40.4465 KOps/s $\textbf{\color{#d91a1a}-8.39\%}$
test_getitem[list] 0.2422ms 53.0066μs 18.8656 KOps/s 19.0406 KOps/s $\color{#d91a1a}-0.92\%$
test_setitem_dim[int] 74.5300μs 39.4771μs 25.3311 KOps/s 25.4885 KOps/s $\color{#d91a1a}-0.62\%$
test_setitem_dim[slice_int] 0.1312ms 70.1076μs 14.2638 KOps/s 14.5780 KOps/s $\color{#d91a1a}-2.16\%$
test_setitem_dim[range] 0.1324ms 93.3085μs 10.7171 KOps/s 10.7995 KOps/s $\color{#d91a1a}-0.76\%$
test_setitem_dim[tuple] 99.7570μs 57.0539μs 17.5273 KOps/s 17.7904 KOps/s $\color{#d91a1a}-1.48\%$
test_setitem 0.1022ms 29.3254μs 34.1002 KOps/s 35.6094 KOps/s $\color{#d91a1a}-4.24\%$
test_set 0.1087ms 28.4998μs 35.0880 KOps/s 36.5836 KOps/s $\color{#d91a1a}-4.09\%$
test_set_shared 3.9517ms 0.2162ms 4.6259 KOps/s 4.6743 KOps/s $\color{#d91a1a}-1.04\%$
test_update 0.1810ms 35.0716μs 28.5131 KOps/s 29.4769 KOps/s $\color{#d91a1a}-3.27\%$
test_update_nested 0.1462ms 44.8943μs 22.2745 KOps/s 22.6820 KOps/s $\color{#d91a1a}-1.80\%$
test_update__nested 0.1358ms 34.7162μs 28.8050 KOps/s 29.2684 KOps/s $\color{#d91a1a}-1.58\%$
test_set_nested 0.1166ms 30.9885μs 32.2700 KOps/s 33.6298 KOps/s $\color{#d91a1a}-4.04\%$
test_set_nested_new 0.1576ms 35.7146μs 27.9998 KOps/s 29.0568 KOps/s $\color{#d91a1a}-3.64\%$
test_select 0.2074ms 52.8401μs 18.9250 KOps/s 19.6765 KOps/s $\color{#d91a1a}-3.82\%$
test_select_nested 0.1130ms 59.0501μs 16.9348 KOps/s 16.7791 KOps/s $\color{#35bf28}+0.93\%$
test_exclude_nested 0.1451ms 76.7117μs 13.0358 KOps/s 12.6885 KOps/s $\color{#35bf28}+2.74\%$
test_empty[True] 0.5547ms 0.3199ms 3.1255 KOps/s 3.1130 KOps/s $\color{#35bf28}+0.40\%$
test_empty[False] 9.4196μs 1.1454μs 873.0875 KOps/s 844.7930 KOps/s $\color{#35bf28}+3.35\%$
test_unbind_speed 0.4993ms 0.2993ms 3.3411 KOps/s 3.2071 KOps/s $\color{#35bf28}+4.18\%$
test_unbind_speed_stack0 0.4263ms 0.2937ms 3.4053 KOps/s 3.4411 KOps/s $\color{#d91a1a}-1.04\%$
test_unbind_speed_stack1 0.1044s 0.7895ms 1.2667 KOps/s 1.4069 KOps/s $\textbf{\color{#d91a1a}-9.96\%}$
test_split 0.1049s 2.2331ms 447.8059 Ops/s 471.7569 Ops/s $\textbf{\color{#d91a1a}-5.08\%}$
test_chunk 0.1014s 2.2241ms 449.6191 Ops/s 469.4144 Ops/s $\color{#d91a1a}-4.22\%$
test_creation[device0] 0.2211ms 0.1186ms 8.4290 KOps/s 8.4419 KOps/s $\color{#d91a1a}-0.15\%$
test_creation_from_tensor 4.9286ms 0.1210ms 8.2622 KOps/s 8.2306 KOps/s $\color{#35bf28}+0.38\%$
test_add_one[memmap_tensor0] 0.2570ms 7.7462μs 129.0959 KOps/s 128.5209 KOps/s $\color{#35bf28}+0.45\%$
test_contiguous[memmap_tensor0] 43.9830μs 2.0226μs 494.4063 KOps/s 491.6332 KOps/s $\color{#35bf28}+0.56\%$
test_stack[memmap_tensor0] 70.3320μs 5.7930μs 172.6228 KOps/s 175.4322 KOps/s $\color{#d91a1a}-1.60\%$
test_memmaptd_index 1.0691ms 0.4090ms 2.4451 KOps/s 2.4588 KOps/s $\color{#d91a1a}-0.55\%$
test_memmaptd_index_astensor 1.1342ms 0.4878ms 2.0498 KOps/s 2.0639 KOps/s $\color{#d91a1a}-0.68\%$
test_memmaptd_index_op 1.4231ms 1.0345ms 966.6693 Ops/s 976.7053 Ops/s $\color{#d91a1a}-1.03\%$
test_serialize_model 0.1330s 0.1279s 7.8168 Ops/s 6.8433 Ops/s $\textbf{\color{#35bf28}+14.23\%}$
test_serialize_model_pickle 0.5019s 0.4056s 2.4658 Ops/s 2.4424 Ops/s $\color{#35bf28}+0.96\%$
test_serialize_weights 0.2150s 0.1397s 7.1583 Ops/s 7.8904 Ops/s $\textbf{\color{#d91a1a}-9.28\%}$
test_serialize_weights_returnearly 0.1865s 0.1705s 5.8650 Ops/s 5.9274 Ops/s $\color{#d91a1a}-1.05\%$
test_serialize_weights_pickle 0.4900s 0.3895s 2.5677 Ops/s 2.5389 Ops/s $\color{#35bf28}+1.13\%$
test_serialize_weights_filesystem 0.1513s 0.1429s 6.9982 Ops/s 6.5161 Ops/s $\textbf{\color{#35bf28}+7.40\%}$
test_serialize_model_filesystem 0.2293s 0.1648s 6.0686 Ops/s 6.4338 Ops/s $\textbf{\color{#d91a1a}-5.68\%}$
test_reshape_pytree 89.2880μs 39.5368μs 25.2929 KOps/s 25.4949 KOps/s $\color{#d91a1a}-0.79\%$
test_reshape_td 0.1052ms 46.1955μs 21.6471 KOps/s 21.8086 KOps/s $\color{#d91a1a}-0.74\%$
test_view_pytree 0.1025ms 39.7239μs 25.1738 KOps/s 25.4968 KOps/s $\color{#d91a1a}-1.27\%$
test_view_td 0.1138ms 52.5226μs 19.0394 KOps/s 19.0752 KOps/s $\color{#d91a1a}-0.19\%$
test_unbind_pytree 85.5610μs 37.3614μs 26.7656 KOps/s 27.3514 KOps/s $\color{#d91a1a}-2.14\%$
test_unbind_td 0.4251ms 45.3209μs 22.0649 KOps/s 21.9813 KOps/s $\color{#35bf28}+0.38\%$
test_split_pytree 0.1007ms 39.7820μs 25.1370 KOps/s 25.5432 KOps/s $\color{#d91a1a}-1.59\%$
test_split_td 0.5572ms 58.5116μs 17.0906 KOps/s 17.6592 KOps/s $\color{#d91a1a}-3.22\%$
test_add_pytree 98.3650μs 45.9463μs 21.7645 KOps/s 21.5666 KOps/s $\color{#35bf28}+0.92\%$
test_add_td 0.2216ms 80.8699μs 12.3655 KOps/s 12.6337 KOps/s $\color{#d91a1a}-2.12\%$
test_compile_add_one_nested[tensordict-compile] 0.1030ms 53.6979μs 18.6227 KOps/s 18.0114 KOps/s $\color{#35bf28}+3.39\%$
test_compile_add_one_nested[tensordict-eager] 0.4021ms 0.1862ms 5.3704 KOps/s 5.3822 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_add_one_nested[pytree-compile] 0.2012ms 54.3950μs 18.3840 KOps/s 18.2948 KOps/s $\color{#35bf28}+0.49\%$
test_compile_add_one_nested[pytree-eager] 0.3016ms 0.1459ms 6.8540 KOps/s 6.8541 KOps/s $-0.00\%$
test_compile_copy_nested[tensordict-compile] 50.5440μs 20.4045μs 49.0088 KOps/s 46.5352 KOps/s $\textbf{\color{#35bf28}+5.32\%}$
test_compile_copy_nested[tensordict-eager] 0.1701ms 62.9611μs 15.8828 KOps/s 15.6524 KOps/s $\color{#35bf28}+1.47\%$
test_compile_copy_nested[pytree-compile] 0.1711ms 79.0419μs 12.6515 KOps/s 12.6631 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_copy_nested[pytree-eager] 0.1328ms 71.7303μs 13.9411 KOps/s 13.8659 KOps/s $\color{#35bf28}+0.54\%$
test_compile_add_one_flat[tensordict-compile] 0.2874ms 0.1733ms 5.7706 KOps/s 5.7752 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_add_one_flat[tensordict-eager] 0.4044ms 0.1954ms 5.1182 KOps/s 5.1377 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_add_one_flat[tensorclass-compile] 0.1215ms 39.2218μs 25.4961 KOps/s 24.5270 KOps/s $\color{#35bf28}+3.95\%$
test_compile_add_one_flat[tensorclass-eager] 1.4921ms 68.1786μs 14.6674 KOps/s 14.6508 KOps/s $\color{#35bf28}+0.11\%$
test_compile_add_one_flat[pytree-compile] 0.2794ms 0.1721ms 5.8096 KOps/s 5.7512 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_one_flat[pytree-eager] 0.5083ms 0.3023ms 3.3081 KOps/s 3.3419 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_add_self_flat[tensordict-eager] 0.4279ms 0.2088ms 4.7896 KOps/s 4.7224 KOps/s $\color{#35bf28}+1.42\%$
test_compile_add_self_flat[tensordict-compile] 0.3302ms 0.1830ms 5.4635 KOps/s 5.6755 KOps/s $\color{#d91a1a}-3.74\%$
test_compile_add_self_flat[tensorclass-eager] 0.1727ms 62.5863μs 15.9779 KOps/s 16.1636 KOps/s $\color{#d91a1a}-1.15\%$
test_compile_add_self_flat[tensorclass-compile] 82.1640μs 39.7098μs 25.1827 KOps/s 24.9593 KOps/s $\color{#35bf28}+0.89\%$
test_compile_add_self_flat[pytree-eager] 0.5319ms 0.2469ms 4.0494 KOps/s 4.1269 KOps/s $\color{#d91a1a}-1.88\%$
test_compile_add_self_flat[pytree-compile] 0.3432ms 0.1739ms 5.7496 KOps/s 5.8187 KOps/s $\color{#d91a1a}-1.19\%$
test_compile_copy_flat[tensordict-compile] 0.1965ms 0.1076ms 9.2899 KOps/s 9.3818 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_copy_flat[tensordict-eager] 0.1238ms 55.8418μs 17.9077 KOps/s 17.3460 KOps/s $\color{#35bf28}+3.24\%$
test_compile_copy_flat[pytree-compile] 0.1683ms 82.2672μs 12.1555 KOps/s 12.7137 KOps/s $\color{#d91a1a}-4.39\%$
test_compile_copy_flat[pytree-eager] 0.1500ms 71.9285μs 13.9027 KOps/s 13.9399 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_assign_and_add[tensordict-compile] 0.2850ms 0.1902ms 5.2572 KOps/s 5.3490 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_assign_and_add[tensordict-eager] 2.8751ms 1.8654ms 536.0653 Ops/s 606.5114 Ops/s $\textbf{\color{#d91a1a}-11.61\%}$
test_compile_assign_and_add[pytree-compile] 0.4238ms 0.1910ms 5.2367 KOps/s 5.2480 KOps/s $\color{#d91a1a}-0.22\%$
test_compile_assign_and_add[pytree-eager] 1.3831ms 1.1002ms 908.9523 Ops/s 906.6653 Ops/s $\color{#35bf28}+0.25\%$
test_compile_assign_and_add_stack[compile] 0.7418ms 0.4250ms 2.3528 KOps/s 2.3461 KOps/s $\color{#35bf28}+0.29\%$
test_compile_assign_and_add_stack[eager] 4.0987ms 3.8131ms 262.2530 Ops/s 269.2811 Ops/s $\color{#d91a1a}-2.61\%$
test_compile_indexing[tensor-tensordict-compile] 87.8940μs 33.5780μs 29.7814 KOps/s 30.6657 KOps/s $\color{#d91a1a}-2.88\%$
test_compile_indexing[tensor-tensordict-eager] 0.7794ms 48.9824μs 20.4155 KOps/s 20.7107 KOps/s $\color{#d91a1a}-1.43\%$
test_compile_indexing[tensor-tensorclass-compile] 67.9370μs 28.8982μs 34.6042 KOps/s 36.1692 KOps/s $\color{#d91a1a}-4.33\%$
test_compile_indexing[tensor-tensorclass-eager] 83.5260μs 29.7261μs 33.6405 KOps/s 34.1237 KOps/s $\color{#d91a1a}-1.42\%$
test_compile_indexing[tensor-pytree-compile] 0.1169ms 28.8823μs 34.6233 KOps/s 35.7654 KOps/s $\color{#d91a1a}-3.19\%$
test_compile_indexing[tensor-pytree-eager] 72.2850μs 30.0855μs 33.2386 KOps/s 33.6801 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_indexing[slice-tensordict-compile] 0.1600ms 71.0117μs 14.0822 KOps/s 14.0836 KOps/s $\color{#d91a1a}-0.01\%$
test_compile_indexing[slice-tensordict-eager] 0.3856ms 27.6191μs 36.2068 KOps/s 37.5238 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_indexing[slice-tensorclass-compile] 0.1246ms 66.5966μs 15.0158 KOps/s 14.7046 KOps/s $\color{#35bf28}+2.12\%$
test_compile_indexing[slice-tensorclass-eager] 75.8530μs 25.6367μs 39.0065 KOps/s 42.0845 KOps/s $\textbf{\color{#d91a1a}-7.31\%}$
test_compile_indexing[slice-pytree-compile] 0.1268ms 67.1880μs 14.8836 KOps/s 14.4907 KOps/s $\color{#35bf28}+2.71\%$
test_compile_indexing[slice-pytree-eager] 62.8180μs 25.0780μs 39.8755 KOps/s 42.0978 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_compile_indexing[int-tensordict-compile] 0.1574ms 71.1025μs 14.0642 KOps/s 14.0073 KOps/s $\color{#35bf28}+0.41\%$
test_compile_indexing[int-tensordict-eager] 0.7618ms 27.6139μs 36.2137 KOps/s 37.4683 KOps/s $\color{#d91a1a}-3.35\%$
test_compile_indexing[int-tensorclass-compile] 0.2610ms 68.9139μs 14.5109 KOps/s 14.7203 KOps/s $\color{#d91a1a}-1.42\%$
test_compile_indexing[int-tensorclass-eager] 79.3890μs 24.3626μs 41.0465 KOps/s 42.1443 KOps/s $\color{#d91a1a}-2.60\%$
test_compile_indexing[int-pytree-compile] 0.1357ms 66.8305μs 14.9632 KOps/s 14.8066 KOps/s $\color{#35bf28}+1.06\%$
test_compile_indexing[int-pytree-eager] 70.0710μs 24.2757μs 41.1934 KOps/s 43.0249 KOps/s $\color{#d91a1a}-4.26\%$
test_mod_add[eager] 95.9290μs 24.0197μs 41.6325 KOps/s 42.1721 KOps/s $\color{#d91a1a}-1.28\%$
test_mod_add[compile] 0.1353ms 37.3798μs 26.7524 KOps/s 27.3139 KOps/s $\color{#d91a1a}-2.06\%$
test_mod_add[compile-overhead] 0.1356ms 36.9146μs 27.0895 KOps/s 26.8843 KOps/s $\color{#35bf28}+0.76\%$
test_mod_wrap[eager] 0.3634ms 0.2077ms 4.8154 KOps/s 4.6586 KOps/s $\color{#35bf28}+3.37\%$
test_mod_wrap[compile] 1.7596ms 0.2259ms 4.4263 KOps/s 4.3326 KOps/s $\color{#35bf28}+2.16\%$
test_mod_wrap[compile-overhead] 0.4481ms 0.2238ms 4.4679 KOps/s 4.3733 KOps/s $\color{#35bf28}+2.16\%$
test_mod_wrap_and_backward[eager] 12.8100ms 11.1978ms 89.3031 Ops/s 89.9365 Ops/s $\color{#d91a1a}-0.70\%$
test_mod_wrap_and_backward[compile] 16.9214ms 11.8675ms 84.2638 Ops/s 86.4230 Ops/s $\color{#d91a1a}-2.50\%$
test_mod_wrap_and_backward[compile-overhead] 14.2113ms 11.7661ms 84.9900 Ops/s 89.7048 Ops/s $\textbf{\color{#d91a1a}-5.26\%}$
test_seq_add[eager] 0.2128ms 85.5268μs 11.6922 KOps/s 11.5782 KOps/s $\color{#35bf28}+0.98\%$
test_seq_add[compile] 0.1512ms 59.0128μs 16.9455 KOps/s 16.4274 KOps/s $\color{#35bf28}+3.15\%$
test_seq_add[compile-overhead] 0.1440ms 58.6545μs 17.0490 KOps/s 16.7485 KOps/s $\color{#35bf28}+1.79\%$
test_seq_wrap[eager] 0.5629ms 0.3719ms 2.6891 KOps/s 2.6439 KOps/s $\color{#35bf28}+1.71\%$
test_seq_wrap[compile] 0.4492ms 0.2589ms 3.8622 KOps/s 3.6970 KOps/s $\color{#35bf28}+4.47\%$
test_seq_wrap[compile-overhead] 0.4059ms 0.2574ms 3.8857 KOps/s 3.6618 KOps/s $\textbf{\color{#35bf28}+6.11\%}$
test_func_call_runtime[False-eager] 0.8773ms 0.5225ms 1.9138 KOps/s 1.8025 KOps/s $\textbf{\color{#35bf28}+6.18\%}$
test_func_call_runtime[False-compile] 0.6122ms 0.4941ms 2.0241 KOps/s 1.9622 KOps/s $\color{#35bf28}+3.15\%$
test_func_call_runtime[False-compile-overhead] 0.6086ms 0.4928ms 2.0290 KOps/s 1.9881 KOps/s $\color{#35bf28}+2.06\%$
test_func_call_runtime[True-eager] 0.9843ms 0.8273ms 1.2088 KOps/s 1.1600 KOps/s $\color{#35bf28}+4.21\%$
test_func_call_runtime[True-compile] 0.8614ms 0.5130ms 1.9492 KOps/s 1.8984 KOps/s $\color{#35bf28}+2.68\%$
test_func_call_runtime[True-compile-overhead] 1.0233ms 0.5146ms 1.9434 KOps/s 1.8944 KOps/s $\color{#35bf28}+2.58\%$
test_distributed 0.2264ms 0.1326ms 7.5421 KOps/s 7.2918 KOps/s $\color{#35bf28}+3.43\%$
test_tdmodule 42.4100μs 16.9791μs 58.8960 KOps/s 58.5985 KOps/s $\color{#35bf28}+0.51\%$
test_tdmodule_dispatch 56.3560μs 35.6022μs 28.0882 KOps/s 28.4818 KOps/s $\color{#d91a1a}-1.38\%$
test_tdseq 41.1770μs 18.8530μs 53.0421 KOps/s 53.2613 KOps/s $\color{#d91a1a}-0.41\%$
test_tdseq_dispatch 70.0710μs 39.4690μs 25.3364 KOps/s 25.8517 KOps/s $\color{#d91a1a}-1.99\%$
test_instantiation_functorch 1.8525ms 1.6454ms 607.7680 Ops/s 608.3978 Ops/s $\color{#d91a1a}-0.10\%$
test_instantiation_td 2.1611ms 1.1945ms 837.1354 Ops/s 843.7049 Ops/s $\color{#d91a1a}-0.78\%$
test_exec_functorch 0.3202ms 0.1783ms 5.6088 KOps/s 5.5133 KOps/s $\color{#35bf28}+1.73\%$
test_exec_functional_call 0.3202ms 0.1660ms 6.0258 KOps/s 5.6237 KOps/s $\textbf{\color{#35bf28}+7.15\%}$
test_exec_td 0.2818ms 0.1680ms 5.9541 KOps/s 5.6082 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_exec_td_decorator 0.7327ms 0.2497ms 4.0044 KOps/s 3.8397 KOps/s $\color{#35bf28}+4.29\%$
test_vmap_mlp_speed[True-True] 0.8348ms 0.5991ms 1.6692 KOps/s 1.6684 KOps/s $\color{#35bf28}+0.05\%$
test_vmap_mlp_speed[True-False] 1.0067ms 0.5966ms 1.6762 KOps/s 1.6745 KOps/s $\color{#35bf28}+0.10\%$
test_vmap_mlp_speed[False-True] 0.8839ms 0.4976ms 2.0095 KOps/s 2.0285 KOps/s $\color{#d91a1a}-0.94\%$
test_vmap_mlp_speed[False-False] 0.7869ms 0.5005ms 1.9979 KOps/s 2.0013 KOps/s $\color{#d91a1a}-0.17\%$
test_vmap_mlp_speed_decorator[True-True] 1.7719ms 0.7280ms 1.3736 KOps/s 1.4454 KOps/s $\color{#d91a1a}-4.97\%$
test_vmap_mlp_speed_decorator[True-False] 1.0276ms 0.6939ms 1.4411 KOps/s 1.4475 KOps/s $\color{#d91a1a}-0.44\%$
test_vmap_mlp_speed_decorator[False-True] 0.8795ms 0.5736ms 1.7434 KOps/s 1.7433 KOps/s $+0.00\%$
test_vmap_mlp_speed_decorator[False-False] 0.9994ms 0.5776ms 1.7314 KOps/s 1.7176 KOps/s $\color{#35bf28}+0.80\%$
test_to_module_speed[True] 2.0803ms 1.8014ms 555.1143 Ops/s 554.7924 Ops/s $\color{#35bf28}+0.06\%$
test_to_module_speed[False] 2.8209ms 1.7785ms 562.2565 Ops/s 567.6108 Ops/s $\color{#d91a1a}-0.94\%$
test_tc_init 83.5860μs 45.2805μs 22.0846 KOps/s 22.1062 KOps/s $\color{#d91a1a}-0.10\%$
test_tc_init_nested 0.1571ms 91.2672μs 10.9568 KOps/s 10.8568 KOps/s $\color{#35bf28}+0.92\%$
test_tc_first_layer_tensor 13.1750μs 1.4864μs 672.7507 KOps/s 680.9036 KOps/s $\color{#d91a1a}-1.20\%$
test_tc_first_layer_nontensor 17.8530μs 4.3051μs 232.2851 KOps/s 236.6953 KOps/s $\color{#d91a1a}-1.86\%$
test_tc_second_layer_tensor 37.2990μs 2.8792μs 347.3144 KOps/s 367.5702 KOps/s $\textbf{\color{#d91a1a}-5.51\%}$
test_tc_second_layer_nontensor 35.0360μs 5.7075μs 175.2072 KOps/s 180.1781 KOps/s $\color{#d91a1a}-2.76\%$
test_unbind 0.4567s 14.3209ms 69.8282 Ops/s 68.5868 Ops/s $\color{#35bf28}+1.81\%$
test_full_like 19.7278ms 12.4876ms 80.0796 Ops/s 74.7684 Ops/s $\textbf{\color{#35bf28}+7.10\%}$
test_zeros_like 15.2120ms 7.6970ms 129.9211 Ops/s 131.6680 Ops/s $\color{#d91a1a}-1.33\%$
test_ones_like 14.0503ms 7.5477ms 132.4905 Ops/s 126.4428 Ops/s $\color{#35bf28}+4.78\%$
test_clone 14.9428ms 9.0678ms 110.2803 Ops/s 106.4526 Ops/s $\color{#35bf28}+3.60\%$
test_squeeze 84.4880μs 12.9782μs 77.0524 KOps/s 76.9814 KOps/s $\color{#35bf28}+0.09\%$
test_unsqueeze 0.1674ms 91.5177μs 10.9268 KOps/s 10.5888 KOps/s $\color{#35bf28}+3.19\%$
test_split 0.4673ms 0.1997ms 5.0072 KOps/s 5.0264 KOps/s $\color{#d91a1a}-0.38\%$
test_permute 0.4417ms 0.2167ms 4.6154 KOps/s 4.5501 KOps/s $\color{#35bf28}+1.43\%$
test_stack 32.4954ms 25.3688ms 39.4185 Ops/s 38.7712 Ops/s $\color{#35bf28}+1.67\%$
test_cat 30.7810ms 25.1222ms 39.8055 Ops/s 39.0688 Ops/s $\color{#35bf28}+1.89\%$
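
The Change column in the table above appears to be the relative difference between the Ops value for this PR and the Ops value on repo HEAD. The sketch below reproduces a few rows under that assumption. The function names `relative_change` and `classify` are hypothetical, and the 5% threshold is an inference from the rows shown in bold and from the Improved/Worsened counts, not something the benchmark bot states.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change using an assumed +/- 5% significance threshold."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table, in KOps/s.
rows = {
    "test_plain_set_nested": (45.2165, 47.2252),
    "test_items": (380.2051, 351.5449),
    "test_unlock_nested": (1.9785, 2.3734),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table values are already rounded, recomputed percentages may differ from the reported ones in the last digit for other rows.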

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 219. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}33$.

Detailed results
Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 0.1533ms 17.4666μs 57.2520 KOps/s 58.7191 KOps/s $\color{#d91a1a}-2.50\%$
test_plain_set_stack_nested 37.5410μs 17.8379μs 56.0603 KOps/s 58.4477 KOps/s $\color{#d91a1a}-4.08\%$
test_plain_set_nested_inplace 0.2011ms 18.7289μs 53.3934 KOps/s 54.8116 KOps/s $\color{#d91a1a}-2.59\%$
test_plain_set_stack_nested_inplace 41.7600μs 18.9113μs 52.8785 KOps/s 54.7737 KOps/s $\color{#d91a1a}-3.46\%$
test_items 22.3010μs 4.7489μs 210.5754 KOps/s 211.3709 KOps/s $\color{#d91a1a}-0.38\%$
test_items_nested 0.4384ms 0.3602ms 2.7761 KOps/s 2.7041 KOps/s $\color{#35bf28}+2.66\%$
test_items_nested_locked 0.4167ms 0.3663ms 2.7303 KOps/s 2.6952 KOps/s $\color{#35bf28}+1.30\%$
test_items_nested_leaf 0.1552ms 84.4093μs 11.8470 KOps/s 11.8102 KOps/s $\color{#35bf28}+0.31\%$
test_items_stack_nested 0.4568ms 0.3667ms 2.7268 KOps/s 2.7288 KOps/s $\color{#d91a1a}-0.07\%$
test_items_stack_nested_leaf 0.1142ms 84.4103μs 11.8469 KOps/s 11.8024 KOps/s $\color{#35bf28}+0.38\%$
test_items_stack_nested_locked 0.4383ms 0.3670ms 2.7249 KOps/s 2.6991 KOps/s $\color{#35bf28}+0.96\%$
test_keys 23.0400μs 4.4016μs 227.1895 KOps/s 227.9489 KOps/s $\color{#d91a1a}-0.33\%$
test_keys_nested 94.6710μs 65.3105μs 15.3115 KOps/s 15.1745 KOps/s $\color{#35bf28}+0.90\%$
test_keys_nested_locked 0.6598ms 71.4294μs 13.9998 KOps/s 13.6987 KOps/s $\color{#35bf28}+2.20\%$
test_keys_nested_leaf 0.1321ms 55.5583μs 17.9991 KOps/s 17.1668 KOps/s $\color{#35bf28}+4.85\%$
test_keys_stack_nested 94.3310μs 66.6308μs 15.0081 KOps/s 15.1841 KOps/s $\color{#d91a1a}-1.16\%$
test_keys_stack_nested_leaf 83.9210μs 57.5908μs 17.3639 KOps/s 17.3881 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_stack_nested_locked 98.0910μs 71.6795μs 13.9510 KOps/s 13.7669 KOps/s $\color{#35bf28}+1.34\%$
test_values 7.0970μs 1.7571μs 569.1118 KOps/s 564.0378 KOps/s $\color{#35bf28}+0.90\%$
test_values_nested 51.7620μs 33.8433μs 29.5480 KOps/s 29.7443 KOps/s $\color{#d91a1a}-0.66\%$
test_values_nested_locked 0.1464ms 35.7390μs 27.9807 KOps/s 28.0975 KOps/s $\color{#d91a1a}-0.42\%$
test_values_nested_leaf 0.1933ms 30.4629μs 32.8268 KOps/s 33.2105 KOps/s $\color{#d91a1a}-1.16\%$
test_values_stack_nested 59.6520μs 34.6415μs 28.8671 KOps/s 28.7594 KOps/s $\color{#35bf28}+0.37\%$
test_values_stack_nested_leaf 59.3110μs 31.1342μs 32.1191 KOps/s 32.1861 KOps/s $\color{#d91a1a}-0.21\%$
test_values_stack_nested_locked 64.6320μs 36.6089μs 27.3158 KOps/s 27.3340 KOps/s $\color{#d91a1a}-0.07\%$
test_membership 1.5180μs 0.5512μs 1.8141 MOps/s 1.8274 MOps/s $\color{#d91a1a}-0.73\%$
test_membership_nested 12.8750μs 1.9235μs 519.8862 KOps/s 511.3732 KOps/s $\color{#35bf28}+1.66\%$
test_membership_nested_leaf 10.4705μs 1.9300μs 518.1371 KOps/s 515.0401 KOps/s $\color{#35bf28}+0.60\%$
test_membership_stacked_nested 23.2190μs 2.0221μs 494.5367 KOps/s 493.3609 KOps/s $\color{#35bf28}+0.24\%$
test_membership_stacked_nested_leaf 16.6790μs 2.0000μs 500.0077 KOps/s 493.0384 KOps/s $\color{#35bf28}+1.41\%$
test_membership_nested_last 51.7210μs 2.9175μs 342.7607 KOps/s 338.6853 KOps/s $\color{#35bf28}+1.20\%$
test_membership_nested_leaf_last 0.1750ms 2.9328μs 340.9655 KOps/s 338.2057 KOps/s $\color{#35bf28}+0.82\%$
test_membership_stacked_nested_last 0.1953ms 4.3334μs 230.7647 KOps/s 248.3980 KOps/s $\textbf{\color{#d91a1a}-7.10\%}$
test_membership_stacked_nested_leaf_last 0.2025ms 4.3164μs 231.6754 KOps/s 248.6319 KOps/s $\textbf{\color{#d91a1a}-6.82\%}$
test_nested_getleaf 0.1917ms 7.9751μs 125.3903 KOps/s 125.9944 KOps/s $\color{#d91a1a}-0.48\%$
test_nested_get 22.0610μs 7.4475μs 134.2726 KOps/s 134.1858 KOps/s $\color{#35bf28}+0.06\%$
test_stacked_getleaf 0.1968ms 8.0595μs 124.0774 KOps/s 124.5474 KOps/s $\color{#d91a1a}-0.38\%$
test_stacked_get 0.2086ms 7.4811μs 133.6708 KOps/s 133.4762 KOps/s $\color{#35bf28}+0.15\%$
test_nested_getitemleaf 0.1847ms 8.1027μs 123.4149 KOps/s 123.5252 KOps/s $\color{#d91a1a}-0.09\%$
test_nested_getitem 30.3410μs 7.6812μs 130.1886 KOps/s 131.3264 KOps/s $\color{#d91a1a}-0.87\%$
test_stacked_getitemleaf 60.8900μs 8.1171μs 123.1964 KOps/s 123.1673 KOps/s $\color{#35bf28}+0.02\%$
test_stacked_getitem 22.1610μs 7.6504μs 130.7124 KOps/s 131.0770 KOps/s $\color{#d91a1a}-0.28\%$
test_lock_nested 10.2961ms 0.4874ms 2.0517 KOps/s 2.1209 KOps/s $\color{#d91a1a}-3.26\%$
test_lock_stack_nested 0.4868ms 0.4375ms 2.2856 KOps/s 2.3142 KOps/s $\color{#d91a1a}-1.23\%$
test_unlock_nested 0.9117ms 0.3957ms 2.5272 KOps/s 2.5390 KOps/s $\color{#d91a1a}-0.46\%$
test_unlock_stack_nested 0.4346ms 0.3552ms 2.8152 KOps/s 2.8328 KOps/s $\color{#d91a1a}-0.62\%$
test_flatten_speed 0.4734ms 0.1055ms 9.4803 KOps/s 9.5181 KOps/s $\color{#d91a1a}-0.40\%$
test_unflatten_speed 0.4034ms 0.2898ms 3.4506 KOps/s 3.4576 KOps/s $\color{#d91a1a}-0.20\%$
test_common_ops 1.6216ms 1.3844ms 722.3286 Ops/s 735.1604 Ops/s $\color{#d91a1a}-1.75\%$
test_creation 16.3600μs 1.6890μs 592.0826 KOps/s 601.3341 KOps/s $\color{#d91a1a}-1.54\%$
test_creation_empty 45.8000μs 18.5577μs 53.8860 KOps/s 57.2367 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_creation_nested_1 40.1610μs 20.4378μs 48.9289 KOps/s 51.5544 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_creation_nested_2 54.6720μs 23.1407μs 43.2138 KOps/s 46.2714 KOps/s $\textbf{\color{#d91a1a}-6.61\%}$
test_clone 0.1834ms 30.5466μs 32.7369 KOps/s 32.6494 KOps/s $\color{#35bf28}+0.27\%$
test_getitem[int] 1.2506ms 17.6580μs 56.6315 KOps/s 58.3835 KOps/s $\color{#d91a1a}-3.00\%$
test_getitem[slice_int] 0.1627ms 30.7200μs 32.5521 KOps/s 33.8352 KOps/s $\color{#d91a1a}-3.79\%$
test_getitem[range] 0.2637ms 0.1155ms 8.6553 KOps/s 8.6306 KOps/s $\color{#35bf28}+0.29\%$
test_getitem[tuple] 91.3795ms 32.4966μs 30.7725 KOps/s 39.4181 KOps/s $\textbf{\color{#d91a1a}-21.93\%}$
test_getitem[list] 0.2814ms 0.1073ms 9.3231 KOps/s 9.0074 KOps/s $\color{#35bf28}+3.50\%$
test_setitem_dim[int] 0.2539ms 61.1609μs 16.3503 KOps/s 18.3913 KOps/s $\textbf{\color{#d91a1a}-11.10\%}$
test_setitem_dim[slice_int] 0.1082ms 86.4198μs 11.5714 KOps/s 12.5850 KOps/s $\textbf{\color{#d91a1a}-8.05\%}$
test_setitem_dim[range] 0.3005ms 0.1519ms 6.5812 KOps/s 6.8317 KOps/s $\color{#d91a1a}-3.67\%$
test_setitem_dim[tuple] 0.2271ms 78.5217μs 12.7353 KOps/s 13.7238 KOps/s $\textbf{\color{#d91a1a}-7.20\%}$
test_setitem 0.2259ms 47.5742μs 21.0198 KOps/s 22.7922 KOps/s $\textbf{\color{#d91a1a}-7.78\%}$
test_set 0.2373ms 47.2933μs 21.1447 KOps/s 21.7576 KOps/s $\color{#d91a1a}-2.82\%$
test_set_shared 0.3875ms 54.4025μs 18.3815 KOps/s 18.6835 KOps/s $\color{#d91a1a}-1.62\%$
test_update 0.2367ms 56.2920μs 17.7645 KOps/s 19.2048 KOps/s $\textbf{\color{#d91a1a}-7.50\%}$
test_update_nested 0.2454ms 64.2482μs 15.5646 KOps/s 15.8850 KOps/s $\color{#d91a1a}-2.02\%$
test_update__nested 0.2397ms 62.2359μs 16.0679 KOps/s 15.0909 KOps/s $\textbf{\color{#35bf28}+6.47\%}$
test_set_nested 0.2136ms 47.2220μs 21.1765 KOps/s 21.7655 KOps/s $\color{#d91a1a}-2.71\%$
test_set_nested_new 0.2241ms 52.6854μs 18.9806 KOps/s 20.2901 KOps/s $\textbf{\color{#d91a1a}-6.45\%}$
test_select 0.2518ms 67.2362μs 14.8729 KOps/s 15.7432 KOps/s $\textbf{\color{#d91a1a}-5.53\%}$
test_select_nested 0.3544ms 51.3903μs 19.4589 KOps/s 19.2198 KOps/s $\color{#35bf28}+1.24\%$
test_exclude_nested 99.2930μs 69.3189μs 14.4261 KOps/s 14.0593 KOps/s $\color{#35bf28}+2.61\%$
test_empty[True] 0.3764ms 0.2813ms 3.5551 KOps/s 3.5227 KOps/s $\color{#35bf28}+0.92\%$
test_empty[False] 2.5310μs 0.8738μs 1.1444 MOps/s 1.1115 MOps/s $\color{#35bf28}+2.96\%$
test_to 0.1461ms 39.0608μs 25.6011 KOps/s 26.6857 KOps/s $\color{#d91a1a}-4.06\%$
test_to_nonblocking 0.2110ms 24.0198μs 41.6323 KOps/s 42.5500 KOps/s $\color{#d91a1a}-2.16\%$
test_unbind_speed 0.3471ms 0.3068ms 3.2597 KOps/s 3.2911 KOps/s $\color{#d91a1a}-0.96\%$
test_unbind_speed_stack0 0.4925ms 0.3021ms 3.3099 KOps/s 3.2893 KOps/s $\color{#35bf28}+0.63\%$
test_unbind_speed_stack1 89.9226ms 0.7718ms 1.2957 KOps/s 1.3035 KOps/s $\color{#d91a1a}-0.60\%$
test_split 92.4666ms 2.3587ms 423.9645 Ops/s 433.9268 Ops/s $\color{#d91a1a}-2.30\%$
test_chunk 91.9040ms 2.3651ms 422.8216 Ops/s 431.4869 Ops/s $\color{#d91a1a}-2.01\%$
test_creation[device0] 0.2196ms 0.1053ms 9.5007 KOps/s 9.2367 KOps/s $\color{#35bf28}+2.86\%$
test_creation_from_tensor 0.3050ms 0.1025ms 9.7546 KOps/s 9.8833 KOps/s $\color{#d91a1a}-1.30\%$
test_add_one[memmap_tensor0] 55.5810μs 9.3443μs 107.0176 KOps/s 105.4340 KOps/s $\color{#35bf28}+1.50\%$
test_contiguous[memmap_tensor0] 14.4410μs 2.2232μs 449.8105 KOps/s 454.4823 KOps/s $\color{#d91a1a}-1.03\%$
test_stack[memmap_tensor0] 24.0010μs 6.8548μs 145.8821 KOps/s 145.7256 KOps/s $\color{#35bf28}+0.11\%$
test_memmaptd_index 1.1939ms 0.4534ms 2.2055 KOps/s 2.2789 KOps/s $\color{#d91a1a}-3.22\%$
test_memmaptd_index_astensor 0.7977ms 0.5231ms 1.9116 KOps/s 1.9905 KOps/s $\color{#d91a1a}-3.97\%$
test_memmaptd_index_op 1.5511ms 1.1206ms 892.4125 Ops/s 913.8752 Ops/s $\color{#d91a1a}-2.35\%$
test_serialize_model 0.1020s 96.4903ms 10.3637 Ops/s 10.0750 Ops/s $\color{#35bf28}+2.87\%$
test_serialize_model_pickle 1.3506s 1.2364s 0.8088 Ops/s 0.8078 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_weights 0.1905s 0.1031s 9.6982 Ops/s 10.2481 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_serialize_weights_returnearly 82.8818ms 72.3194ms 13.8275 Ops/s 11.2965 Ops/s $\textbf{\color{#35bf28}+22.41\%}$
test_serialize_weights_pickle 1.3473s 1.2360s 0.8091 Ops/s 0.8031 Ops/s $\color{#35bf28}+0.74\%$
test_reshape_pytree 0.1598ms 39.3238μs 25.4299 KOps/s 25.9697 KOps/s $\color{#d91a1a}-2.08\%$
test_reshape_td 0.1252ms 44.6637μs 22.3895 KOps/s 22.9992 KOps/s $\color{#d91a1a}-2.65\%$
test_view_pytree 0.1673ms 38.3999μs 26.0417 KOps/s 26.3963 KOps/s $\color{#d91a1a}-1.34\%$
test_view_td 0.1432ms 51.1968μs 19.5325 KOps/s 19.6550 KOps/s $\color{#d91a1a}-0.62\%$
test_unbind_pytree 0.1587ms 37.5314μs 26.6443 KOps/s 26.8177 KOps/s $\color{#d91a1a}-0.65\%$
test_unbind_td 0.3990ms 46.3213μs 21.5883 KOps/s 21.9268 KOps/s $\color{#d91a1a}-1.54\%$
test_split_pytree 0.1815ms 51.9565μs 19.2469 KOps/s 19.4131 KOps/s $\color{#d91a1a}-0.86\%$
test_split_td 0.4613ms 67.6265μs 14.7871 KOps/s 16.5373 KOps/s $\textbf{\color{#d91a1a}-10.58\%}$
test_add_pytree 0.2518ms 65.6385μs 15.2350 KOps/s 16.2881 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_add_td 0.2831ms 0.1054ms 9.4845 KOps/s 10.4696 KOps/s $\textbf{\color{#d91a1a}-9.41\%}$
test_compile_add_one_nested[tensordict-compile] 0.4119ms 0.2071ms 4.8274 KOps/s 4.7525 KOps/s $\color{#35bf28}+1.58\%$
test_compile_add_one_nested[tensordict-eager] 0.3254ms 0.1739ms 5.7506 KOps/s 5.8247 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_add_one_nested[pytree-compile] 0.2968ms 0.1462ms 6.8405 KOps/s 6.7943 KOps/s $\color{#35bf28}+0.68\%$
test_compile_add_one_nested[pytree-eager] 0.3854ms 0.2136ms 4.6810 KOps/s 5.0690 KOps/s $\textbf{\color{#d91a1a}-7.65\%}$
test_compile_copy_nested[tensordict-compile] 0.1012ms 22.1966μs 45.0519 KOps/s 45.0828 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_copy_nested[tensordict-eager] 86.7710μs 49.1122μs 20.3615 KOps/s 20.6296 KOps/s $\color{#d91a1a}-1.30\%$
test_compile_copy_nested[pytree-compile] 0.1665ms 73.3163μs 13.6395 KOps/s 13.7356 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_copy_nested[pytree-eager] 83.2410μs 59.8690μs 16.7031 KOps/s 16.8437 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_add_one_flat[tensordict-compile] 0.4589ms 0.3287ms 3.0423 KOps/s 3.0576 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_one_flat[tensordict-eager] 0.3648ms 0.2215ms 4.5146 KOps/s 4.5388 KOps/s $\color{#d91a1a}-0.53\%$
test_compile_add_one_flat[tensorclass-compile] 0.2803ms 0.1304ms 7.6672 KOps/s 7.6697 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_add_one_flat[tensorclass-eager] 0.2455ms 62.9788μs 15.8783 KOps/s 16.2107 KOps/s $\color{#d91a1a}-2.05\%$
test_compile_add_one_flat[pytree-compile] 0.4479ms 0.3263ms 3.0650 KOps/s 3.0474 KOps/s $\color{#35bf28}+0.58\%$
test_compile_add_one_flat[pytree-eager] 0.8965ms 0.6957ms 1.4373 KOps/s 1.5603 KOps/s $\textbf{\color{#d91a1a}-7.88\%}$
test_compile_add_self_flat[tensordict-eager] 0.4130ms 0.2722ms 3.6744 KOps/s 3.6844 KOps/s $\color{#d91a1a}-0.27\%$
test_compile_add_self_flat[tensordict-compile] 0.4772ms 0.3290ms 3.0394 KOps/s 3.0155 KOps/s $\color{#35bf28}+0.79\%$
test_compile_add_self_flat[tensorclass-eager] 0.2516ms 79.4513μs 12.5863 KOps/s 13.3017 KOps/s $\textbf{\color{#d91a1a}-5.38\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2956ms 0.1377ms 7.2619 KOps/s 7.6248 KOps/s $\color{#d91a1a}-4.76\%$
test_compile_add_self_flat[pytree-eager] 0.7549ms 0.5868ms 1.7043 KOps/s 1.8341 KOps/s $\textbf{\color{#d91a1a}-7.08\%}$
test_compile_add_self_flat[pytree-compile] 0.4746ms 0.3274ms 3.0540 KOps/s 3.0635 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_copy_flat[tensordict-compile] 0.2086ms 20.3928μs 49.0369 KOps/s 52.2699 KOps/s $\textbf{\color{#d91a1a}-6.19\%}$
test_compile_copy_flat[tensordict-eager] 0.2195ms 34.2245μs 29.2189 KOps/s 29.6463 KOps/s $\color{#d91a1a}-1.44\%$
test_compile_copy_flat[pytree-compile] 0.2755ms 77.1040μs 12.9695 KOps/s 13.0502 KOps/s $\color{#d91a1a}-0.62\%$
test_compile_copy_flat[pytree-eager] 89.5320μs 60.7158μs 16.4702 KOps/s 16.5482 KOps/s $\color{#d91a1a}-0.47\%$
test_compile_assign_and_add[tensordict-compile] 2.5563ms 0.9331ms 1.0717 KOps/s 1.0711 KOps/s $\color{#35bf28}+0.06\%$
test_compile_assign_and_add[tensordict-eager] 3.5655ms 3.3466ms 298.8066 Ops/s 296.5985 Ops/s $\color{#35bf28}+0.74\%$
test_compile_assign_and_add[pytree-compile] 2.5279ms 0.9239ms 1.0824 KOps/s 1.0874 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_assign_and_add[pytree-eager] 3.6793ms 3.3840ms 295.5045 Ops/s 298.5145 Ops/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[tensor-tensordict-compile] 0.2419ms 0.1104ms 9.0584 KOps/s 9.0582 KOps/s $+0.00\%$
test_compile_indexing[tensor-tensordict-eager] 0.2641ms 67.8937μs 14.7289 KOps/s 15.2562 KOps/s $\color{#d91a1a}-3.46\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2549ms 0.1031ms 9.7009 KOps/s 9.6520 KOps/s $\color{#35bf28}+0.51\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2205ms 49.1099μs 20.3625 KOps/s 20.5067 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_indexing[tensor-pytree-compile] 0.2804ms 0.1085ms 9.2189 KOps/s 9.2593 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_indexing[tensor-pytree-eager] 0.2246ms 49.5299μs 20.1898 KOps/s 20.5334 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_indexing[slice-tensordict-compile] 0.2754ms 0.1398ms 7.1547 KOps/s 6.9950 KOps/s $\color{#35bf28}+2.28\%$
test_compile_indexing[slice-tensordict-eager] 0.3037ms 33.4289μs 29.9142 KOps/s 37.5567 KOps/s $\textbf{\color{#d91a1a}-20.35\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2951ms 0.1314ms 7.6108 KOps/s 7.6630 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_indexing[slice-tensorclass-eager] 0.1912ms 24.6241μs 40.6105 KOps/s 43.4758 KOps/s $\textbf{\color{#d91a1a}-6.59\%}$
test_compile_indexing[slice-pytree-compile] 0.3207ms 0.1372ms 7.2896 KOps/s 7.6468 KOps/s $\color{#d91a1a}-4.67\%$
test_compile_indexing[slice-pytree-eager] 0.1238ms 25.4162μs 39.3450 KOps/s 44.1052 KOps/s $\textbf{\color{#d91a1a}-10.79\%}$
test_compile_indexing[int-tensordict-compile] 0.3399ms 0.1452ms 6.8869 KOps/s 7.2338 KOps/s $\color{#d91a1a}-4.79\%$
test_compile_indexing[int-tensordict-eager] 0.4928ms 28.9402μs 34.5540 KOps/s 39.3673 KOps/s $\textbf{\color{#d91a1a}-12.23\%}$
test_compile_indexing[int-tensorclass-compile] 0.3230ms 0.1369ms 7.3037 KOps/s 7.5048 KOps/s $\color{#d91a1a}-2.68\%$
test_compile_indexing[int-tensorclass-eager] 87.6820μs 25.8916μs 38.6226 KOps/s 44.1339 KOps/s $\textbf{\color{#d91a1a}-12.49\%}$
test_compile_indexing[int-pytree-compile] 0.3237ms 0.1371ms 7.2962 KOps/s 7.4418 KOps/s $\color{#d91a1a}-1.96\%$
test_compile_indexing[int-pytree-eager] 0.1854ms 25.6936μs 38.9202 KOps/s 43.8622 KOps/s $\textbf{\color{#d91a1a}-11.27\%}$
test_mod_add[eager] 0.1890ms 40.3760μs 24.7672 KOps/s 26.0630 KOps/s $\color{#d91a1a}-4.97\%$
test_mod_add[compile] 0.2357ms 71.0253μs 14.0795 KOps/s 13.6058 KOps/s $\color{#35bf28}+3.48\%$
test_mod_add[compile-overhead] 0.2613ms 0.1472ms 6.7935 KOps/s 6.5906 KOps/s $\color{#35bf28}+3.08\%$
test_mod_wrap[eager] 0.4494ms 0.2575ms 3.8837 KOps/s 3.6620 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_mod_wrap[compile] 0.4520ms 0.2967ms 3.3707 KOps/s 3.2210 KOps/s $\color{#35bf28}+4.65\%$
test_mod_wrap[compile-overhead] 8.1783ms 4.3328ms 230.7996 Ops/s 238.1981 Ops/s $\color{#d91a1a}-3.11\%$
test_mod_wrap_and_backward[eager] 1.6902ms 1.4674ms 681.4798 Ops/s 683.9009 Ops/s $\color{#d91a1a}-0.35\%$
test_mod_wrap_and_backward[compile] 2.0412ms 1.4705ms 680.0398 Ops/s 690.1132 Ops/s $\color{#d91a1a}-1.46\%$
test_mod_wrap_and_backward[compile-overhead] 1.5129ms 1.0478ms 954.3974 Ops/s 998.4733 Ops/s $\color{#d91a1a}-4.41\%$
test_seq_add[eager] 0.2634ms 0.1165ms 8.5815 KOps/s 8.9314 KOps/s $\color{#d91a1a}-3.92\%$
test_seq_add[compile] 0.2366ms 87.4619μs 11.4336 KOps/s 11.6069 KOps/s $\color{#d91a1a}-1.49\%$
test_seq_add[compile-overhead] 0.2695ms 0.1233ms 8.1124 KOps/s 8.1763 KOps/s $\color{#d91a1a}-0.78\%$
test_seq_wrap[eager] 0.5982ms 0.4348ms 2.2998 KOps/s 2.3369 KOps/s $\color{#d91a1a}-1.59\%$
test_seq_wrap[compile] 0.5258ms 0.3418ms 2.9260 KOps/s 3.0692 KOps/s $\color{#d91a1a}-4.67\%$
test_seq_wrap[compile-overhead] 0.3064s 0.1466s 6.8226 Ops/s 6.7415 Ops/s $\color{#35bf28}+1.20\%$
test_func_call_runtime[False-eager] 0.9151ms 0.7604ms 1.3152 KOps/s 1.2417 KOps/s $\textbf{\color{#35bf28}+5.92\%}$
test_func_call_runtime[False-compile] 1.0340ms 0.8306ms 1.2039 KOps/s 1.2410 KOps/s $\color{#d91a1a}-2.99\%$
test_func_call_runtime[False-compile-overhead] 0.5229ms 0.3696ms 2.7057 KOps/s 2.7207 KOps/s $\color{#d91a1a}-0.55\%$
test_func_call_runtime[True-eager] 1.1847ms 1.0128ms 987.3654 Ops/s 998.7201 Ops/s $\color{#d91a1a}-1.14\%$
test_func_call_runtime[True-compile] 1.0663ms 0.8936ms 1.1191 KOps/s 1.1519 KOps/s $\color{#d91a1a}-2.85\%$
test_func_call_runtime[True-compile-overhead] 0.6171ms 0.4126ms 2.4238 KOps/s 2.4643 KOps/s $\color{#d91a1a}-1.64\%$
test_distributed 0.3072ms 72.8436μs 13.7280 KOps/s 11.2698 KOps/s $\textbf{\color{#35bf28}+21.81\%}$
test_tdmodule 93.7430μs 17.8358μs 56.0669 KOps/s 61.9535 KOps/s $\textbf{\color{#d91a1a}-9.50\%}$
test_tdmodule_dispatch 56.4920μs 34.9412μs 28.6195 KOps/s 30.9324 KOps/s $\textbf{\color{#d91a1a}-7.48\%}$
test_tdseq 33.9210μs 18.1192μs 55.1901 KOps/s 58.7369 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_tdseq_dispatch 54.8610μs 37.2123μs 26.8728 KOps/s 28.6896 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_instantiation_functorch 2.2186ms 2.0219ms 494.5821 Ops/s 496.2358 Ops/s $\color{#d91a1a}-0.33\%$
test_instantiation_td 2.1187ms 1.3053ms 766.0937 Ops/s 766.7728 Ops/s $\color{#d91a1a}-0.09\%$
test_exec_functorch 0.3823ms 0.2314ms 4.3215 KOps/s 4.3784 KOps/s $\color{#d91a1a}-1.30\%$
test_exec_functional_call 0.3924ms 0.2235ms 4.4733 KOps/s 4.5256 KOps/s $\color{#d91a1a}-1.16\%$
test_exec_td 0.3395ms 0.2227ms 4.4910 KOps/s 4.5500 KOps/s $\color{#d91a1a}-1.30\%$
test_exec_td_decorator 0.4832ms 0.2928ms 3.4158 KOps/s 3.4112 KOps/s $\color{#35bf28}+0.13\%$
test_vmap_mlp_speed[True-True] 0.8484ms 0.6763ms 1.4786 KOps/s 1.4854 KOps/s $\color{#d91a1a}-0.46\%$
test_vmap_mlp_speed[True-False] 0.8245ms 0.6725ms 1.4869 KOps/s 1.4851 KOps/s $\color{#35bf28}+0.12\%$
test_vmap_mlp_speed[False-True] 0.7950ms 0.6104ms 1.6384 KOps/s 1.6941 KOps/s $\color{#d91a1a}-3.29\%$
test_vmap_mlp_speed[False-False] 0.7319ms 0.5862ms 1.7059 KOps/s 1.6905 KOps/s $\color{#35bf28}+0.91\%$
test_vmap_mlp_speed_decorator[True-True] 1.4455ms 0.7546ms 1.3252 KOps/s 1.3217 KOps/s $\color{#35bf28}+0.26\%$
test_vmap_mlp_speed_decorator[True-False] 0.9662ms 0.7565ms 1.3219 KOps/s 1.3360 KOps/s $\color{#d91a1a}-1.06\%$
test_vmap_mlp_speed_decorator[False-True] 0.8635ms 0.6549ms 1.5270 KOps/s 1.5329 KOps/s $\color{#d91a1a}-0.39\%$
test_vmap_mlp_speed_decorator[False-False] 0.8753ms 0.6714ms 1.4894 KOps/s 1.5283 KOps/s $\color{#d91a1a}-2.54\%$
test_vmap_transformer_speed[True-True] 9.0161ms 8.8441ms 113.0692 Ops/s 112.8424 Ops/s $\color{#35bf28}+0.20\%$
test_vmap_transformer_speed[True-False] 9.0276ms 8.8478ms 113.0228 Ops/s 113.2265 Ops/s $\color{#d91a1a}-0.18\%$
test_vmap_transformer_speed[False-True] 9.0096ms 8.7447ms 114.3554 Ops/s 113.9865 Ops/s $\color{#35bf28}+0.32\%$
test_vmap_transformer_speed[False-False] 8.9234ms 8.7492ms 114.2957 Ops/s 113.9762 Ops/s $\color{#35bf28}+0.28\%$
test_vmap_transformer_speed_decorator[True-True] 21.3060ms 21.1540ms 47.2723 Ops/s 47.3550 Ops/s $\color{#d91a1a}-0.17\%$
test_vmap_transformer_speed_decorator[True-False] 21.7883ms 21.0110ms 47.5941 Ops/s 47.4003 Ops/s $\color{#35bf28}+0.41\%$
test_vmap_transformer_speed_decorator[False-True] 21.7800ms 20.9186ms 47.8043 Ops/s 48.0636 Ops/s $\color{#d91a1a}-0.54\%$
test_vmap_transformer_speed_decorator[False-False] 21.0937ms 20.8958ms 47.8565 Ops/s 47.9366 Ops/s $\color{#d91a1a}-0.17\%$
test_to_module_speed[True] 1.5934ms 1.4747ms 678.1198 Ops/s 684.1132 Ops/s $\color{#d91a1a}-0.88\%$
test_to_module_speed[False] 1.5751ms 1.4660ms 682.1376 Ops/s 709.2438 Ops/s $\color{#d91a1a}-3.82\%$
test_tc_init 60.7920μs 41.7024μs 23.9795 KOps/s 25.0292 KOps/s $\color{#d91a1a}-4.19\%$
test_tc_init_nested 0.1126ms 84.1799μs 11.8793 KOps/s 13.0490 KOps/s $\textbf{\color{#d91a1a}-8.96\%}$
test_tc_first_layer_tensor 12.5737μs 0.7941μs 1.2593 MOps/s 1.2982 MOps/s $\color{#d91a1a}-3.00\%$
test_tc_first_layer_nontensor 18.1400μs 2.5551μs 391.3724 KOps/s 394.0249 KOps/s $\color{#d91a1a}-0.67\%$
test_tc_second_layer_tensor 7.1800μs 1.5923μs 628.0135 KOps/s 614.2557 KOps/s $\color{#35bf28}+2.24\%$
test_tc_second_layer_nontensor 18.3610μs 3.3610μs 297.5308 KOps/s 298.5549 KOps/s $\color{#d91a1a}-0.34\%$
test_unbind 0.3197s 13.0300ms 76.7457 Ops/s 80.6878 Ops/s $\color{#d91a1a}-4.89\%$
test_full_like 0.7500ms 0.5791ms 1.7269 KOps/s 1.7241 KOps/s $\color{#35bf28}+0.16\%$
test_zeros_like 0.3442ms 0.1979ms 5.0523 KOps/s 5.0558 KOps/s $\color{#d91a1a}-0.07\%$
test_ones_like 0.3707ms 0.1978ms 5.0557 KOps/s 5.0604 KOps/s $\color{#d91a1a}-0.09\%$
test_clone 0.6021ms 0.4142ms 2.4143 KOps/s 2.4084 KOps/s $\color{#35bf28}+0.25\%$
test_squeeze 40.4710μs 11.0903μs 90.1685 KOps/s 91.9713 KOps/s $\color{#d91a1a}-1.96\%$
test_unsqueeze 0.2475ms 82.7559μs 12.0837 KOps/s 12.6613 KOps/s $\color{#d91a1a}-4.56\%$
test_split 0.4482ms 0.1771ms 5.6478 KOps/s 5.7525 KOps/s $\color{#d91a1a}-1.82\%$
test_permute 0.3401ms 0.1894ms 5.2792 KOps/s 5.2417 KOps/s $\color{#35bf28}+0.72\%$
test_stack 1.3217ms 0.8971ms 1.1147 KOps/s 1.1016 KOps/s $\color{#35bf28}+1.19\%$
test_cat 1.3725ms 1.2317ms 811.8639 Ops/s 811.6479 Ops/s $\color{#35bf28}+0.03\%$
