[Feature] Non-blocking for consolidated TD #1020

Merged (4 commits) on Oct 3, 2024

Conversation

vmoens (Contributor) commented Oct 2, 2024

[ghstack-poisoned]
facebook-github-bot added the CLA Signed label Oct 2, 2024

github-actions bot commented Oct 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}29$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 52.6380μs 26.0287μs 38.4192 KOps/s 40.6917 KOps/s $\textbf{\color{#d91a1a}-5.58\%}$
test_plain_set_stack_nested 57.8880μs 26.2237μs 38.1335 KOps/s 40.4269 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_plain_set_nested_inplace 62.8380μs 28.5825μs 34.9865 KOps/s 37.0205 KOps/s $\textbf{\color{#d91a1a}-5.49\%}$
test_plain_set_stack_nested_inplace 67.0860μs 28.2036μs 35.4564 KOps/s 36.5039 KOps/s $\color{#d91a1a}-2.87\%$
test_items 28.7540μs 4.2781μs 233.7487 KOps/s 229.8629 KOps/s $\color{#35bf28}+1.69\%$
test_items_nested 0.5258ms 0.3806ms 2.6274 KOps/s 2.5650 KOps/s $\color{#35bf28}+2.43\%$
test_items_nested_locked 0.7185ms 0.3853ms 2.5951 KOps/s 2.5656 KOps/s $\color{#35bf28}+1.15\%$
test_items_nested_leaf 0.1471ms 80.9546μs 12.3526 KOps/s 12.1977 KOps/s $\color{#35bf28}+1.27\%$
test_items_stack_nested 0.8131ms 0.3890ms 2.5709 KOps/s 2.5545 KOps/s $\color{#35bf28}+0.64\%$
test_items_stack_nested_leaf 0.1494ms 84.8068μs 11.7915 KOps/s 11.7265 KOps/s $\color{#35bf28}+0.55\%$
test_items_stack_nested_locked 0.7096ms 0.3906ms 2.5604 KOps/s 2.5619 KOps/s $\color{#d91a1a}-0.06\%$
test_keys 28.6330μs 3.5929μs 278.3267 KOps/s 281.1507 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_nested 0.2293ms 0.1368ms 7.3077 KOps/s 7.2484 KOps/s $\color{#35bf28}+0.82\%$
test_keys_nested_locked 0.7032ms 0.1429ms 6.9983 KOps/s 6.9209 KOps/s $\color{#35bf28}+1.12\%$
test_keys_nested_leaf 0.1957ms 0.1197ms 8.3519 KOps/s 8.2171 KOps/s $\color{#35bf28}+1.64\%$
test_keys_stack_nested 0.2760ms 0.1373ms 7.2838 KOps/s 7.2415 KOps/s $\color{#35bf28}+0.58\%$
test_keys_stack_nested_leaf 0.2079ms 0.1200ms 8.3351 KOps/s 8.3472 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_stack_nested_locked 0.3405ms 0.1439ms 6.9472 KOps/s 6.9497 KOps/s $\color{#d91a1a}-0.03\%$
test_values 5.9972μs 1.0577μs 945.4583 KOps/s 919.1975 KOps/s $\color{#35bf28}+2.86\%$
test_values_nested 0.1773ms 94.0094μs 10.6372 KOps/s 10.5050 KOps/s $\color{#35bf28}+1.26\%$
test_values_nested_locked 0.1580ms 92.7870μs 10.7774 KOps/s 10.4981 KOps/s $\color{#35bf28}+2.66\%$
test_values_nested_leaf 0.1757ms 79.2858μs 12.6126 KOps/s 12.3476 KOps/s $\color{#35bf28}+2.15\%$
test_values_stack_nested 0.1449ms 94.0777μs 10.6295 KOps/s 10.4799 KOps/s $\color{#35bf28}+1.43\%$
test_values_stack_nested_leaf 0.1554ms 79.5996μs 12.5629 KOps/s 12.5748 KOps/s $\color{#d91a1a}-0.10\%$
test_values_stack_nested_locked 0.1576ms 92.7963μs 10.7763 KOps/s 10.5382 KOps/s $\color{#35bf28}+2.26\%$
test_membership 6.1111μs 0.7381μs 1.3548 MOps/s 1.3632 MOps/s $\color{#d91a1a}-0.62\%$
test_membership_nested 18.9050μs 2.7518μs 363.4006 KOps/s 363.8772 KOps/s $\color{#d91a1a}-0.13\%$
test_membership_nested_leaf 30.6170μs 2.7850μs 359.0662 KOps/s 360.5264 KOps/s $\color{#d91a1a}-0.41\%$
test_membership_stacked_nested 26.4690μs 2.7764μs 360.1734 KOps/s 360.4654 KOps/s $\color{#d91a1a}-0.08\%$
test_membership_stacked_nested_leaf 23.4940μs 2.8086μs 356.0519 KOps/s 345.5350 KOps/s $\color{#35bf28}+3.04\%$
test_membership_nested_last 45.8680μs 4.2825μs 233.5079 KOps/s 237.9729 KOps/s $\color{#d91a1a}-1.88\%$
test_membership_nested_leaf_last 98.3230μs 4.3631μs 229.1928 KOps/s 238.5868 KOps/s $\color{#d91a1a}-3.94\%$
test_membership_stacked_nested_last 33.5720μs 4.2633μs 234.5589 KOps/s 169.5345 KOps/s $\textbf{\color{#35bf28}+38.35\%}$
test_membership_stacked_nested_leaf_last 32.9610μs 4.2871μs 233.2566 KOps/s 169.9251 KOps/s $\textbf{\color{#35bf28}+37.27\%}$
test_nested_getleaf 38.1020μs 10.8945μs 91.7892 KOps/s 92.8809 KOps/s $\color{#d91a1a}-1.18\%$
test_nested_get 36.4580μs 10.5377μs 94.8972 KOps/s 98.2193 KOps/s $\color{#d91a1a}-3.38\%$
test_stacked_getleaf 45.4150μs 10.9887μs 91.0025 KOps/s 94.3366 KOps/s $\color{#d91a1a}-3.53\%$
test_stacked_get 36.9490μs 10.4774μs 95.4433 KOps/s 98.3367 KOps/s $\color{#d91a1a}-2.94\%$
test_nested_getitemleaf 37.5610μs 11.4004μs 87.7163 KOps/s 89.3315 KOps/s $\color{#d91a1a}-1.81\%$
test_nested_getitem 40.7460μs 10.3505μs 96.6136 KOps/s 96.0038 KOps/s $\color{#35bf28}+0.64\%$
test_stacked_getitemleaf 35.8670μs 11.4276μs 87.5075 KOps/s 88.8264 KOps/s $\color{#d91a1a}-1.48\%$
test_stacked_getitem 36.0570μs 10.2864μs 97.2157 KOps/s 95.8853 KOps/s $\color{#35bf28}+1.39\%$
test_lock_nested 85.5422ms 0.5979ms 1.6725 KOps/s 1.9676 KOps/s $\textbf{\color{#d91a1a}-15.00\%}$
test_lock_stack_nested 0.7483ms 0.4849ms 2.0622 KOps/s 2.1296 KOps/s $\color{#d91a1a}-3.16\%$
test_unlock_nested 85.8721ms 0.5128ms 1.9501 KOps/s 2.3770 KOps/s $\textbf{\color{#d91a1a}-17.96\%}$
test_unlock_stack_nested 0.7134ms 0.3960ms 2.5250 KOps/s 2.6108 KOps/s $\color{#d91a1a}-3.28\%$
test_flatten_speed 0.2063ms 0.1018ms 9.8190 KOps/s 9.8980 KOps/s $\color{#d91a1a}-0.80\%$
test_unflatten_speed 0.7012ms 0.5306ms 1.8846 KOps/s 1.9174 KOps/s $\color{#d91a1a}-1.71\%$
test_common_ops 6.3955ms 1.2181ms 820.9513 Ops/s 862.7404 Ops/s $\color{#d91a1a}-4.84\%$
test_creation 36.0880μs 2.0935μs 477.6626 KOps/s 469.3700 KOps/s $\color{#35bf28}+1.77\%$
test_creation_empty 51.3560μs 20.3174μs 49.2190 KOps/s 54.8212 KOps/s $\textbf{\color{#d91a1a}-10.22\%}$
test_creation_nested_1 74.6900μs 24.1624μs 41.3867 KOps/s 45.5840 KOps/s $\textbf{\color{#d91a1a}-9.21\%}$
test_creation_nested_2 71.1840μs 28.0966μs 35.5915 KOps/s 37.1035 KOps/s $\color{#d91a1a}-4.08\%$
test_clone 0.1461ms 17.5250μs 57.0614 KOps/s 58.0671 KOps/s $\color{#d91a1a}-1.73\%$
test_getitem[int] 1.1730ms 17.5475μs 56.9881 KOps/s 59.7277 KOps/s $\color{#d91a1a}-4.59\%$
test_getitem[slice_int] 0.1571ms 33.0245μs 30.2805 KOps/s 32.1597 KOps/s $\textbf{\color{#d91a1a}-5.84\%}$
test_getitem[range] 0.1877ms 59.0228μs 16.9426 KOps/s 16.8939 KOps/s $\color{#35bf28}+0.29\%$
test_getitem[tuple] 0.1450ms 26.0494μs 38.3885 KOps/s 39.3414 KOps/s $\color{#d91a1a}-2.42\%$
test_getitem[list] 0.1645ms 54.8637μs 18.2270 KOps/s 18.2725 KOps/s $\color{#d91a1a}-0.25\%$
test_setitem_dim[int] 66.8450μs 34.0141μs 29.3995 KOps/s 29.0570 KOps/s $\color{#35bf28}+1.18\%$
test_setitem_dim[slice_int] 0.1133ms 62.6627μs 15.9584 KOps/s 15.8745 KOps/s $\color{#35bf28}+0.53\%$
test_setitem_dim[range] 0.1446ms 85.6468μs 11.6759 KOps/s 11.4693 KOps/s $\color{#35bf28}+1.80\%$
test_setitem_dim[tuple] 87.4130μs 50.1390μs 19.9446 KOps/s 19.8189 KOps/s $\color{#35bf28}+0.63\%$
test_setitem 92.3130μs 32.4241μs 30.8412 KOps/s 33.1869 KOps/s $\textbf{\color{#d91a1a}-7.07\%}$
test_set 76.3020μs 31.2186μs 32.0321 KOps/s 33.9443 KOps/s $\textbf{\color{#d91a1a}-5.63\%}$
test_set_shared 3.8809ms 0.2220ms 4.5042 KOps/s 4.5006 KOps/s $\color{#35bf28}+0.08\%$
test_update 0.1403ms 40.8770μs 24.4636 KOps/s 26.5258 KOps/s $\textbf{\color{#d91a1a}-7.77\%}$
test_update_nested 0.1943ms 51.9247μs 19.2587 KOps/s 20.3695 KOps/s $\textbf{\color{#d91a1a}-5.45\%}$
test_update__nested 1.0118ms 38.1497μs 26.2125 KOps/s 26.6678 KOps/s $\color{#d91a1a}-1.71\%$
test_set_nested 83.0040μs 35.2756μs 28.3482 KOps/s 30.3337 KOps/s $\textbf{\color{#d91a1a}-6.55\%}$
test_set_nested_new 96.5300μs 40.1146μs 24.9286 KOps/s 26.2681 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_select 0.2446ms 57.4027μs 17.4208 KOps/s 17.9137 KOps/s $\color{#d91a1a}-2.75\%$
test_select_nested 0.1214ms 60.3048μs 16.5824 KOps/s 16.7976 KOps/s $\color{#d91a1a}-1.28\%$
test_exclude_nested 0.1673ms 75.9072μs 13.1740 KOps/s 13.2284 KOps/s $\color{#d91a1a}-0.41\%$
test_empty[True] 0.6688ms 0.3620ms 2.7622 KOps/s 2.7594 KOps/s $\color{#35bf28}+0.10\%$
test_empty[False] 9.6102μs 1.2536μs 797.7187 KOps/s 723.7633 KOps/s $\textbf{\color{#35bf28}+10.22\%}$
test_unbind_speed 0.5191ms 0.3080ms 3.2472 KOps/s 3.2847 KOps/s $\color{#d91a1a}-1.14\%$
test_unbind_speed_stack0 0.9010ms 0.3087ms 3.2399 KOps/s 3.4468 KOps/s $\textbf{\color{#d91a1a}-6.00\%}$
test_unbind_speed_stack1 90.1346ms 0.8208ms 1.2183 KOps/s 1.3794 KOps/s $\textbf{\color{#d91a1a}-11.68\%}$
test_split 3.2966ms 2.0163ms 495.9555 Ops/s 455.6154 Ops/s $\textbf{\color{#35bf28}+8.85\%}$
test_chunk 94.9570ms 2.2056ms 453.3907 Ops/s 451.2668 Ops/s $\color{#35bf28}+0.47\%$
test_creation[device0] 0.2748ms 0.1192ms 8.3878 KOps/s 8.2238 KOps/s $\color{#35bf28}+2.00\%$
test_creation_from_tensor 4.0891ms 0.1217ms 8.2147 KOps/s 8.4289 KOps/s $\color{#d91a1a}-2.54\%$
test_add_one[memmap_tensor0] 0.6369ms 7.1150μs 140.5490 KOps/s 132.4672 KOps/s $\textbf{\color{#35bf28}+6.10\%}$
test_contiguous[memmap_tensor0] 20.3380μs 1.8728μs 533.9637 KOps/s 517.3099 KOps/s $\color{#35bf28}+3.22\%$
test_stack[memmap_tensor0] 29.3550μs 5.6265μs 177.7311 KOps/s 174.0301 KOps/s $\color{#35bf28}+2.13\%$
test_memmaptd_index 1.2025ms 0.4118ms 2.4284 KOps/s 2.3429 KOps/s $\color{#35bf28}+3.65\%$
test_memmaptd_index_astensor 1.0297ms 0.5124ms 1.9517 KOps/s 1.8891 KOps/s $\color{#35bf28}+3.31\%$
test_memmaptd_index_op 2.2664ms 1.1027ms 906.8953 Ops/s 942.4719 Ops/s $\color{#d91a1a}-3.77\%$
test_serialize_model 0.1233s 0.1179s 8.4826 Ops/s 8.5199 Ops/s $\color{#d91a1a}-0.44\%$
test_serialize_model_pickle 0.4269s 0.4006s 2.4962 Ops/s 2.5256 Ops/s $\color{#d91a1a}-1.16\%$
test_serialize_weights 0.1311s 0.1170s 8.5464 Ops/s 8.7400 Ops/s $\color{#d91a1a}-2.22\%$
test_serialize_weights_returnearly 0.2814s 0.1722s 5.8061 Ops/s 5.5995 Ops/s $\color{#35bf28}+3.69\%$
test_serialize_weights_pickle 1.0524s 0.6660s 1.5015 Ops/s 2.3733 Ops/s $\textbf{\color{#d91a1a}-36.73\%}$
test_serialize_weights_filesystem 0.1500s 0.1407s 7.1092 Ops/s 7.1684 Ops/s $\color{#d91a1a}-0.83\%$
test_serialize_model_filesystem 0.1513s 0.1411s 7.0847 Ops/s 6.7164 Ops/s $\textbf{\color{#35bf28}+5.48\%}$
test_reshape_pytree 0.1168ms 40.8407μs 24.4854 KOps/s 25.4148 KOps/s $\color{#d91a1a}-3.66\%$
test_reshape_td 0.1084ms 46.6729μs 21.4257 KOps/s 20.9636 KOps/s $\color{#35bf28}+2.20\%$
test_view_pytree 0.1035ms 40.3886μs 24.7595 KOps/s 25.3585 KOps/s $\color{#d91a1a}-2.36\%$
test_view_td 98.0240μs 52.3517μs 19.1016 KOps/s 18.7751 KOps/s $\color{#35bf28}+1.74\%$
test_unbind_pytree 70.8730μs 37.4719μs 26.6867 KOps/s 27.6753 KOps/s $\color{#d91a1a}-3.57\%$
test_unbind_td 0.3159ms 46.6561μs 21.4334 KOps/s 22.0447 KOps/s $\color{#d91a1a}-2.77\%$
test_split_pytree 84.0770μs 39.7916μs 25.1310 KOps/s 26.1449 KOps/s $\color{#d91a1a}-3.88\%$
test_split_td 84.2913ms 69.4311μs 14.4028 KOps/s 17.0024 KOps/s $\textbf{\color{#d91a1a}-15.29\%}$
test_add_pytree 0.1085ms 45.4481μs 22.0031 KOps/s 21.4160 KOps/s $\color{#35bf28}+2.74\%$
test_add_td 0.1844ms 93.5935μs 10.6845 KOps/s 11.4399 KOps/s $\textbf{\color{#d91a1a}-6.60\%}$
test_compile_add_one_nested[tensordict-compile] 0.1046ms 57.1702μs 17.4916 KOps/s 16.9021 KOps/s $\color{#35bf28}+3.49\%$
test_compile_add_one_nested[tensordict-eager] 0.3861ms 0.1999ms 5.0021 KOps/s 4.8893 KOps/s $\color{#35bf28}+2.31\%$
test_compile_add_one_nested[pytree-compile] 0.1203ms 55.8066μs 17.9190 KOps/s 17.3490 KOps/s $\color{#35bf28}+3.29\%$
test_compile_add_one_nested[pytree-eager] 0.3248ms 0.1420ms 7.0402 KOps/s 6.9650 KOps/s $\color{#35bf28}+1.08\%$
test_compile_copy_nested[tensordict-compile] 76.4230μs 23.0956μs 43.2983 KOps/s 43.3169 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_copy_nested[tensordict-eager] 0.1467ms 74.3684μs 13.4466 KOps/s 12.9084 KOps/s $\color{#35bf28}+4.17\%$
test_compile_copy_nested[pytree-compile] 0.1528ms 76.7800μs 13.0242 KOps/s 13.1338 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_copy_nested[pytree-eager] 0.1358ms 68.3473μs 14.6312 KOps/s 14.4388 KOps/s $\color{#35bf28}+1.33\%$
test_compile_add_one_flat[tensordict-compile] 0.3989ms 0.1806ms 5.5370 KOps/s 5.4905 KOps/s $\color{#35bf28}+0.85\%$
test_compile_add_one_flat[tensordict-eager] 0.5059ms 0.2391ms 4.1831 KOps/s 4.0344 KOps/s $\color{#35bf28}+3.68\%$
test_compile_add_one_flat[tensorclass-compile] 0.1143ms 47.8461μs 20.9003 KOps/s 20.2525 KOps/s $\color{#35bf28}+3.20\%$
test_compile_add_one_flat[tensorclass-eager] 0.1777ms 77.9574μs 12.8275 KOps/s 12.7105 KOps/s $\color{#35bf28}+0.92\%$
test_compile_add_one_flat[pytree-compile] 0.3649ms 0.1753ms 5.7048 KOps/s 5.6905 KOps/s $\color{#35bf28}+0.25\%$
test_compile_add_one_flat[pytree-eager] 0.6371ms 0.2887ms 3.4633 KOps/s 3.3414 KOps/s $\color{#35bf28}+3.65\%$
test_compile_add_self_flat[tensordict-eager] 0.5340ms 0.2729ms 3.6642 KOps/s 3.5408 KOps/s $\color{#35bf28}+3.49\%$
test_compile_add_self_flat[tensordict-compile] 0.5386ms 0.1831ms 5.4624 KOps/s 5.4063 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_self_flat[tensorclass-eager] 0.1594ms 74.7201μs 13.3833 KOps/s 13.2014 KOps/s $\color{#35bf28}+1.38\%$
test_compile_add_self_flat[tensorclass-compile] 0.1161ms 48.8297μs 20.4793 KOps/s 20.0060 KOps/s $\color{#35bf28}+2.37\%$
test_compile_add_self_flat[pytree-eager] 0.4472ms 0.2330ms 4.2918 KOps/s 4.1718 KOps/s $\color{#35bf28}+2.88\%$
test_compile_add_self_flat[pytree-compile] 0.2868ms 0.1760ms 5.6830 KOps/s 5.7491 KOps/s $\color{#d91a1a}-1.15\%$
test_compile_copy_flat[tensordict-compile] 0.2427ms 0.1112ms 8.9962 KOps/s 9.0368 KOps/s $\color{#d91a1a}-0.45\%$
test_compile_copy_flat[tensordict-eager] 0.1567ms 77.2072μs 12.9522 KOps/s 12.5735 KOps/s $\color{#35bf28}+3.01\%$
test_compile_copy_flat[pytree-compile] 0.1440ms 77.3859μs 12.9222 KOps/s 12.7017 KOps/s $\color{#35bf28}+1.74\%$
test_compile_copy_flat[pytree-eager] 0.1502ms 69.5137μs 14.3856 KOps/s 14.1095 KOps/s $\color{#35bf28}+1.96\%$
test_compile_assign_and_add[tensordict-compile] 0.3360ms 0.1931ms 5.1780 KOps/s 5.1823 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_assign_and_add[tensordict-eager] 2.9584ms 1.7360ms 576.0494 Ops/s 564.3420 Ops/s $\color{#35bf28}+2.07\%$
test_compile_assign_and_add[pytree-compile] 0.3808ms 0.1925ms 5.1954 KOps/s 5.1951 KOps/s $+0.01\%$
test_compile_assign_and_add[pytree-eager] 1.3292ms 1.0904ms 917.0850 Ops/s 878.7278 Ops/s $\color{#35bf28}+4.37\%$
test_compile_assign_and_add_stack[compile] 0.6738ms 0.4134ms 2.4190 KOps/s 2.3897 KOps/s $\color{#35bf28}+1.23\%$
test_compile_assign_and_add_stack[eager] 4.7154ms 4.2808ms 233.5998 Ops/s 240.0512 Ops/s $\color{#d91a1a}-2.69\%$
test_compile_indexing[tensor-tensordict-compile] 81.9030μs 33.7374μs 29.6407 KOps/s 28.3522 KOps/s $\color{#35bf28}+4.54\%$
test_compile_indexing[tensor-tensordict-eager] 0.9220ms 50.5278μs 19.7911 KOps/s 20.2785 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_indexing[tensor-tensorclass-compile] 73.6570μs 30.4504μs 32.8403 KOps/s 33.2482 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[tensor-tensorclass-eager] 68.4170μs 29.5584μs 33.8313 KOps/s 34.0388 KOps/s $\color{#d91a1a}-0.61\%$
test_compile_indexing[tensor-pytree-compile] 76.5030μs 30.1312μs 33.1882 KOps/s 33.0106 KOps/s $\color{#35bf28}+0.54\%$
test_compile_indexing[tensor-pytree-eager] 95.6990μs 29.5206μs 33.8747 KOps/s 34.1011 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_indexing[slice-tensordict-compile] 0.1811ms 73.7724μs 13.5552 KOps/s 13.3520 KOps/s $\color{#35bf28}+1.52\%$
test_compile_indexing[slice-tensordict-eager] 0.6452ms 28.7052μs 34.8369 KOps/s 34.2855 KOps/s $\color{#35bf28}+1.61\%$
test_compile_indexing[slice-tensorclass-compile] 0.1214ms 68.9938μs 14.4941 KOps/s 14.6381 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_indexing[slice-tensorclass-eager] 63.2780μs 23.7056μs 42.1842 KOps/s 41.9194 KOps/s $\color{#35bf28}+0.63\%$
test_compile_indexing[slice-pytree-compile] 0.1253ms 68.4042μs 14.6190 KOps/s 14.5806 KOps/s $\color{#35bf28}+0.26\%$
test_compile_indexing[slice-pytree-eager] 63.9790μs 23.6004μs 42.3722 KOps/s 42.8127 KOps/s $\color{#d91a1a}-1.03\%$
test_compile_indexing[int-tensordict-compile] 0.1376ms 73.4189μs 13.6205 KOps/s 13.3620 KOps/s $\color{#35bf28}+1.93\%$
test_compile_indexing[int-tensordict-eager] 1.2070ms 28.7648μs 34.7647 KOps/s 34.1354 KOps/s $\color{#35bf28}+1.84\%$
test_compile_indexing[int-tensorclass-compile] 0.1607ms 69.0793μs 14.4761 KOps/s 14.5963 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_indexing[int-tensorclass-eager] 64.7000μs 23.3906μs 42.7523 KOps/s 41.7736 KOps/s $\color{#35bf28}+2.34\%$
test_compile_indexing[int-pytree-compile] 0.1330ms 67.3923μs 14.8385 KOps/s 14.7102 KOps/s $\color{#35bf28}+0.87\%$
test_compile_indexing[int-pytree-eager] 69.2900μs 23.3515μs 42.8237 KOps/s 42.6733 KOps/s $\color{#35bf28}+0.35\%$
test_mod_add[eager] 98.2540μs 28.0360μs 35.6684 KOps/s 38.6727 KOps/s $\textbf{\color{#d91a1a}-7.77\%}$
test_mod_add[compile] 0.1109ms 38.3592μs 26.0694 KOps/s 26.0929 KOps/s $\color{#d91a1a}-0.09\%$
test_mod_add[compile-overhead] 87.9840μs 38.5922μs 25.9120 KOps/s 26.0052 KOps/s $\color{#d91a1a}-0.36\%$
test_mod_wrap[eager] 0.4041ms 0.2089ms 4.7859 KOps/s 4.5880 KOps/s $\color{#35bf28}+4.31\%$
test_mod_wrap[compile] 0.3089ms 0.2288ms 4.3710 KOps/s 4.3391 KOps/s $\color{#35bf28}+0.74\%$
test_mod_wrap[compile-overhead] 0.3091ms 0.2260ms 4.4241 KOps/s 4.4013 KOps/s $\color{#35bf28}+0.52\%$
test_mod_wrap_and_backward[eager] 12.3277ms 10.7536ms 92.9917 Ops/s 93.5066 Ops/s $\color{#d91a1a}-0.55\%$
test_mod_wrap_and_backward[compile] 12.6167ms 11.0992ms 90.0964 Ops/s 82.4422 Ops/s $\textbf{\color{#35bf28}+9.28\%}$
test_mod_wrap_and_backward[compile-overhead] 13.7725ms 11.3706ms 87.9462 Ops/s 82.6547 Ops/s $\textbf{\color{#35bf28}+6.40\%}$
test_seq_add[eager] 0.1644ms 94.3135μs 10.6029 KOps/s 10.3084 KOps/s $\color{#35bf28}+2.86\%$
test_seq_add[compile] 0.1248ms 64.1098μs 15.5982 KOps/s 14.8683 KOps/s $\color{#35bf28}+4.91\%$
test_seq_add[compile-overhead] 0.1306ms 63.2830μs 15.8020 KOps/s 15.6746 KOps/s $\color{#35bf28}+0.81\%$
test_seq_wrap[eager] 0.5971ms 0.3950ms 2.5317 KOps/s 2.5327 KOps/s $\color{#d91a1a}-0.04\%$
test_seq_wrap[compile] 1.1867ms 0.2676ms 3.7374 KOps/s 3.7205 KOps/s $\color{#35bf28}+0.46\%$
test_seq_wrap[compile-overhead] 1.2183ms 0.2665ms 3.7528 KOps/s 3.5977 KOps/s $\color{#35bf28}+4.31\%$
test_func_call_runtime[False-eager] 0.9956ms 0.5370ms 1.8624 KOps/s 1.8859 KOps/s $\color{#d91a1a}-1.25\%$
test_func_call_runtime[False-compile] 0.8780ms 0.5005ms 1.9979 KOps/s 1.9830 KOps/s $\color{#35bf28}+0.75\%$
test_func_call_runtime[False-compile-overhead] 0.6374ms 0.5022ms 1.9914 KOps/s 1.9977 KOps/s $\color{#d91a1a}-0.31\%$
test_func_call_runtime[True-eager] 1.2901ms 0.7575ms 1.3201 KOps/s 1.3320 KOps/s $\color{#d91a1a}-0.89\%$
test_func_call_runtime[True-compile] 0.6240ms 0.5138ms 1.9462 KOps/s 1.9292 KOps/s $\color{#35bf28}+0.88\%$
test_func_call_runtime[True-compile-overhead] 0.6030ms 0.5109ms 1.9575 KOps/s 1.9486 KOps/s $\color{#35bf28}+0.46\%$
test_func_call_cm_runtime[False-eager] 0.9037ms 0.5271ms 1.8972 KOps/s 1.8843 KOps/s $\color{#35bf28}+0.68\%$
test_func_call_cm_runtime[False-compile] 0.6731ms 0.4951ms 2.0198 KOps/s 1.9724 KOps/s $\color{#35bf28}+2.41\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9270ms 0.4967ms 2.0134 KOps/s 1.9774 KOps/s $\color{#35bf28}+1.82\%$
test_func_call_cm_runtime[True-eager] 1.0609ms 0.8973ms 1.1144 KOps/s 1.1056 KOps/s $\color{#35bf28}+0.79\%$
test_func_call_cm_runtime[True-compile] 1.2185ms 0.7468ms 1.3390 KOps/s 1.3517 KOps/s $\color{#d91a1a}-0.93\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9294ms 0.7451ms 1.3421 KOps/s 1.3416 KOps/s $\color{#35bf28}+0.04\%$
test_vmap_func_call_cm_runtime[eager] 2.4289ms 1.9167ms 521.7280 Ops/s 524.8040 Ops/s $\color{#d91a1a}-0.59\%$
test_vmap_func_call_cm_runtime[compile] 2.6320ms 1.9762ms 506.0166 Ops/s 512.2920 Ops/s $\color{#d91a1a}-1.22\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.6886ms 1.9797ms 505.1160 Ops/s 503.3427 Ops/s $\color{#35bf28}+0.35\%$
test_distributed 0.2300ms 0.1259ms 7.9451 KOps/s 7.7393 KOps/s $\color{#35bf28}+2.66\%$
test_tdmodule 42.8700μs 19.7457μs 50.6440 KOps/s 55.3814 KOps/s $\textbf{\color{#d91a1a}-8.55\%}$
test_tdmodule_dispatch 70.3420μs 39.2082μs 25.5049 KOps/s 27.7483 KOps/s $\textbf{\color{#d91a1a}-8.08\%}$
test_tdseq 67.8270μs 22.2603μs 44.9230 KOps/s 47.3509 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_tdseq_dispatch 79.0170μs 44.8318μs 22.3056 KOps/s 22.2833 KOps/s $\color{#35bf28}+0.10\%$
test_instantiation_functorch 2.5040ms 1.6129ms 620.0087 Ops/s 614.5818 Ops/s $\color{#35bf28}+0.88\%$
test_instantiation_td 2.0066ms 1.1951ms 836.7588 Ops/s 820.6131 Ops/s $\color{#35bf28}+1.97\%$
test_exec_functorch 0.4155ms 0.1899ms 5.2648 KOps/s 5.3141 KOps/s $\color{#d91a1a}-0.93\%$
test_exec_functional_call 0.3088ms 0.1714ms 5.8359 KOps/s 5.5830 KOps/s $\color{#35bf28}+4.53\%$
test_exec_td 0.3644ms 0.2003ms 4.9917 KOps/s 4.9298 KOps/s $\color{#35bf28}+1.26\%$
test_exec_td_decorator 0.7814ms 0.2337ms 4.2785 KOps/s 4.1876 KOps/s $\color{#35bf28}+2.17\%$
test_vmap_mlp_speed[True-True] 1.0157ms 0.6921ms 1.4449 KOps/s 1.4585 KOps/s $\color{#d91a1a}-0.93\%$
test_vmap_mlp_speed[True-False] 1.0081ms 0.6879ms 1.4536 KOps/s 1.4693 KOps/s $\color{#d91a1a}-1.07\%$
test_vmap_mlp_speed[False-True] 0.6686ms 0.5379ms 1.8590 KOps/s 1.8668 KOps/s $\color{#d91a1a}-0.42\%$
test_vmap_mlp_speed[False-False] 0.8124ms 0.5417ms 1.8461 KOps/s 1.8598 KOps/s $\color{#d91a1a}-0.74\%$
test_vmap_mlp_speed_decorator[True-True] 0.8833ms 0.6512ms 1.5357 KOps/s 1.5536 KOps/s $\color{#d91a1a}-1.15\%$
test_vmap_mlp_speed_decorator[True-False] 1.0798ms 0.6546ms 1.5277 KOps/s 1.5521 KOps/s $\color{#d91a1a}-1.57\%$
test_vmap_mlp_speed_decorator[False-True] 0.9341ms 0.5363ms 1.8647 KOps/s 1.8832 KOps/s $\color{#d91a1a}-0.98\%$
test_vmap_mlp_speed_decorator[False-False] 0.8477ms 0.5367ms 1.8634 KOps/s 1.8930 KOps/s $\color{#d91a1a}-1.56\%$
test_to_module_speed[True] 1.6080ms 1.3888ms 720.0479 Ops/s 688.4168 Ops/s $\color{#35bf28}+4.59\%$
test_to_module_speed[False] 2.1808ms 1.3767ms 726.3889 Ops/s 723.1174 Ops/s $\color{#35bf28}+0.45\%$
test_tc_init 94.7570μs 50.1545μs 19.9384 KOps/s 21.0269 KOps/s $\textbf{\color{#d91a1a}-5.18\%}$
test_tc_init_nested 0.1749ms 99.6678μs 10.0333 KOps/s 10.4540 KOps/s $\color{#d91a1a}-4.02\%$
test_tc_first_layer_tensor 22.9830μs 1.5531μs 643.8852 KOps/s 632.4159 KOps/s $\color{#35bf28}+1.81\%$
test_tc_first_layer_nontensor 26.9800μs 4.7718μs 209.5652 KOps/s 204.8666 KOps/s $\color{#35bf28}+2.29\%$
test_tc_second_layer_tensor 24.2150μs 2.8075μs 356.1893 KOps/s 345.1799 KOps/s $\color{#35bf28}+3.19\%$
test_tc_second_layer_nontensor 22.3520μs 6.0015μs 166.6257 KOps/s 159.6663 KOps/s $\color{#35bf28}+4.36\%$
test_unbind 0.4604s 13.0939ms 76.3716 Ops/s 78.2471 Ops/s $\color{#d91a1a}-2.40\%$
test_full_like 15.0701ms 11.5648ms 86.4694 Ops/s 133.5770 Ops/s $\textbf{\color{#d91a1a}-35.27\%}$
test_zeros_like 12.6066ms 6.8810ms 145.3284 Ops/s 350.8841 Ops/s $\textbf{\color{#d91a1a}-58.58\%}$
test_ones_like 14.7679ms 7.4271ms 134.6426 Ops/s 308.0310 Ops/s $\textbf{\color{#d91a1a}-56.29\%}$
test_clone 14.7695ms 9.1160ms 109.6972 Ops/s 196.5174 Ops/s $\textbf{\color{#d91a1a}-44.18\%}$
test_squeeze 70.2210μs 13.7338μs 72.8130 KOps/s 78.5917 KOps/s $\textbf{\color{#d91a1a}-7.35\%}$
test_unsqueeze 0.1919ms 97.1448μs 10.2939 KOps/s 10.8015 KOps/s $\color{#d91a1a}-4.70\%$
test_split 0.4971ms 0.1975ms 5.0642 KOps/s 5.0553 KOps/s $\color{#35bf28}+0.18\%$
test_permute 0.3020ms 0.2207ms 4.5314 KOps/s 4.4768 KOps/s $\color{#35bf28}+1.22\%$
test_stack 30.1211ms 24.1658ms 41.3808 Ops/s 38.2408 Ops/s $\textbf{\color{#35bf28}+8.21\%}$
test_cat 25.8665ms 23.9055ms 41.8314 Ops/s 38.0515 Ops/s $\textbf{\color{#35bf28}+9.93\%}$
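
The Change column above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code: the function name `relative_change` and the formula are assumptions, checked against three rows copied from the table.

```python
def relative_change(ops: float, ops_on_head: float) -> float:
    """Percent change in throughput relative to the repo HEAD baseline.

    Assumed formula (not taken from the benchmark tooling itself):
    (ops / ops_on_head - 1) * 100.
    """
    if ops_on_head <= 0:
        raise ValueError("baseline throughput must be positive")
    return (ops / ops_on_head - 1.0) * 100.0


# (name, Ops, Ops on Repo HEAD) copied from the CPU table; each pair
# shares the same unit (KOps/s or Ops/s), so the ratio is unitless.
rows = [
    ("test_plain_set_nested", 38.4192, 40.6917),
    ("test_membership_stacked_nested_last", 234.5589, 169.5345),
    ("test_zeros_like", 145.3284, 350.8841),
]

for name, ops, head in rows:
    print(f"{name}: {relative_change(ops, head):+.2f}%")
```

Under this assumption, the three rows reproduce the reported values of -5.58%, +38.35% and -58.58%.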


github-actions bot commented Oct 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}17$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1257ms 16.6185μs 60.1738 KOps/s 56.8502 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_plain_set_stack_nested 87.8350μs 16.7516μs 59.6957 KOps/s 57.1452 KOps/s $\color{#35bf28}+4.46\%$
test_plain_set_nested_inplace 67.3140μs 18.0026μs 55.5477 KOps/s 53.3854 KOps/s $\color{#35bf28}+4.05\%$
test_plain_set_stack_nested_inplace 57.9440μs 17.9768μs 55.6272 KOps/s 52.9722 KOps/s $\textbf{\color{#35bf28}+5.01\%}$
test_items 28.8820μs 2.8403μs 352.0770 KOps/s 348.8082 KOps/s $\color{#35bf28}+0.94\%$
test_items_nested 0.4479ms 0.3443ms 2.9045 KOps/s 2.8985 KOps/s $\color{#35bf28}+0.21\%$
test_items_nested_locked 0.4985ms 0.3480ms 2.8736 KOps/s 2.8970 KOps/s $\color{#d91a1a}-0.81\%$
test_items_nested_leaf 91.5660μs 62.8693μs 15.9060 KOps/s 16.0177 KOps/s $\color{#d91a1a}-0.70\%$
test_items_stack_nested 0.4860ms 0.3529ms 2.8335 KOps/s 2.9021 KOps/s $\color{#d91a1a}-2.36\%$
test_items_stack_nested_leaf 0.1056ms 63.6398μs 15.7134 KOps/s 15.9581 KOps/s $\color{#d91a1a}-1.53\%$
test_items_stack_nested_locked 0.5219ms 0.3487ms 2.8674 KOps/s 2.8821 KOps/s $\color{#d91a1a}-0.51\%$
test_keys 38.9230μs 3.4306μs 291.4944 KOps/s 293.8702 KOps/s $\color{#d91a1a}-0.81\%$
test_keys_nested 0.1109ms 71.5381μs 13.9786 KOps/s 14.0279 KOps/s $\color{#d91a1a}-0.35\%$
test_keys_nested_locked 2.8001ms 77.2597μs 12.9434 KOps/s 13.0082 KOps/s $\color{#d91a1a}-0.50\%$
test_keys_nested_leaf 99.7060μs 62.0571μs 16.1142 KOps/s 16.2676 KOps/s $\color{#d91a1a}-0.94\%$
test_keys_stack_nested 0.1118ms 71.7270μs 13.9417 KOps/s 14.1843 KOps/s $\color{#d91a1a}-1.71\%$
test_keys_stack_nested_leaf 0.1239ms 62.7191μs 15.9441 KOps/s 16.2861 KOps/s $\color{#d91a1a}-2.10\%$
test_keys_stack_nested_locked 0.1915ms 76.9617μs 12.9935 KOps/s 12.8707 KOps/s $\color{#35bf28}+0.95\%$
test_values 4.4403μs 0.8602μs 1.1625 MOps/s 1.1399 MOps/s $\color{#35bf28}+1.98\%$
test_values_nested 0.4248ms 48.8463μs 20.4724 KOps/s 20.5658 KOps/s $\color{#d91a1a}-0.45\%$
test_values_nested_locked 0.4402ms 50.3905μs 19.8450 KOps/s 19.8867 KOps/s $\color{#d91a1a}-0.21\%$
test_values_nested_leaf 69.6840μs 43.0022μs 23.2546 KOps/s 23.5702 KOps/s $\color{#d91a1a}-1.34\%$
test_values_stack_nested 0.4433ms 49.7886μs 20.0849 KOps/s 20.2743 KOps/s $\color{#d91a1a}-0.93\%$
test_values_stack_nested_leaf 0.4384ms 43.6414μs 22.9140 KOps/s 23.3047 KOps/s $\color{#d91a1a}-1.68\%$
test_values_stack_nested_locked 0.4298ms 51.2322μs 19.5190 KOps/s 19.8732 KOps/s $\color{#d91a1a}-1.78\%$
test_membership 19.7723μs 0.5017μs 1.9933 MOps/s 1.9875 MOps/s $\color{#35bf28}+0.29\%$
test_membership_nested 0.1908ms 1.8720μs 534.2009 KOps/s 530.6618 KOps/s $\color{#35bf28}+0.67\%$
test_membership_nested_leaf 0.1285ms 1.8365μs 544.5264 KOps/s 528.9284 KOps/s $\color{#35bf28}+2.95\%$
test_membership_stacked_nested 29.1320μs 1.9189μs 521.1316 KOps/s 519.0438 KOps/s $\color{#35bf28}+0.40\%$
test_membership_stacked_nested_leaf 47.1330μs 1.9474μs 513.5049 KOps/s 521.6067 KOps/s $\color{#d91a1a}-1.55\%$
test_membership_nested_last 0.3869ms 2.9561μs 338.2851 KOps/s 334.2080 KOps/s $\color{#35bf28}+1.22\%$
test_membership_nested_leaf_last 27.0420μs 2.9594μs 337.9106 KOps/s 335.8815 KOps/s $\color{#35bf28}+0.60\%$
test_membership_stacked_nested_last 20.6110μs 2.9436μs 339.7179 KOps/s 331.2112 KOps/s $\color{#35bf28}+2.57\%$
test_membership_stacked_nested_leaf_last 27.1220μs 2.9429μs 339.7972 KOps/s 332.8099 KOps/s $\color{#35bf28}+2.10\%$
test_nested_getleaf 35.7020μs 6.1079μs 163.7212 KOps/s 164.3559 KOps/s $\color{#d91a1a}-0.39\%$
test_nested_get 0.3849ms 5.8009μs 172.3882 KOps/s 172.9740 KOps/s $\color{#d91a1a}-0.34\%$
test_stacked_getleaf 0.3895ms 6.0551μs 165.1506 KOps/s 166.1092 KOps/s $\color{#d91a1a}-0.58\%$
test_stacked_get 36.7420μs 5.7518μs 173.8599 KOps/s 175.8366 KOps/s $\color{#d91a1a}-1.12\%$
test_nested_getitemleaf 0.3872ms 6.2130μs 160.9522 KOps/s 162.6437 KOps/s $\color{#d91a1a}-1.04\%$
test_nested_getitem 31.6020μs 5.8720μs 170.2991 KOps/s 172.8287 KOps/s $\color{#d91a1a}-1.46\%$
test_stacked_getitemleaf 0.3924ms 6.1441μs 162.7578 KOps/s 164.7630 KOps/s $\color{#d91a1a}-1.22\%$
test_stacked_getitem 29.1220μs 5.7504μs 173.9009 KOps/s 173.1550 KOps/s $\color{#35bf28}+0.43\%$
test_lock_nested 6.8677ms 0.4365ms 2.2910 KOps/s 2.3210 KOps/s $\color{#d91a1a}-1.29\%$
test_lock_stack_nested 0.4371ms 0.3968ms 2.5202 KOps/s 2.5599 KOps/s $\color{#d91a1a}-1.55\%$
test_unlock_nested 0.7721ms 0.3691ms 2.7097 KOps/s 2.7148 KOps/s $\color{#d91a1a}-0.19\%$
test_unlock_stack_nested 0.3665ms 0.3338ms 2.9962 KOps/s 3.0241 KOps/s $\color{#d91a1a}-0.92\%$
test_flatten_speed 0.4628ms 76.2808μs 13.1095 KOps/s 13.1791 KOps/s $\color{#d91a1a}-0.53\%$
test_unflatten_speed 0.3927ms 0.3238ms 3.0887 KOps/s 3.0488 KOps/s $\color{#35bf28}+1.31\%$
test_common_ops 1.6582ms 1.2866ms 777.2187 Ops/s 774.0121 Ops/s $\color{#35bf28}+0.41\%$
test_creation 0.3763ms 1.4758μs 677.5786 KOps/s 671.0766 KOps/s $\color{#35bf28}+0.97\%$
test_creation_empty 45.9130μs 15.4134μs 64.8788 KOps/s 58.2219 KOps/s $\textbf{\color{#35bf28}+11.43\%}$
test_creation_nested_1 0.4146ms 17.0596μs 58.6181 KOps/s 52.1152 KOps/s $\textbf{\color{#35bf28}+12.48\%}$
test_creation_nested_2 61.4940μs 19.6898μs 50.7878 KOps/s 47.1154 KOps/s $\textbf{\color{#35bf28}+7.79\%}$
test_clone 71.5850μs 29.3244μs 34.1013 KOps/s 35.1792 KOps/s $\color{#d91a1a}-3.06\%$
test_getitem[int] 1.2098ms 16.5152μs 60.5504 KOps/s 63.2557 KOps/s $\color{#d91a1a}-4.28\%$
test_getitem[slice_int] 0.1192ms 28.2324μs 35.4203 KOps/s 35.7426 KOps/s $\color{#d91a1a}-0.90\%$
test_getitem[range] 0.1791ms 0.1092ms 9.1551 KOps/s 9.1721 KOps/s $\color{#d91a1a}-0.19\%$
test_getitem[tuple] 0.1178ms 24.1052μs 41.4848 KOps/s 41.9750 KOps/s $\color{#d91a1a}-1.17\%$
test_getitem[list] 0.5003ms 99.0685μs 10.0940 KOps/s 10.0334 KOps/s $\color{#35bf28}+0.60\%$
test_setitem_dim[int] 65.8650μs 44.6869μs 22.3779 KOps/s 22.1536 KOps/s $\color{#35bf28}+1.01\%$
test_setitem_dim[slice_int] 99.9370μs 67.5130μs 14.8120 KOps/s 14.7270 KOps/s $\color{#35bf28}+0.58\%$
test_setitem_dim[range] 0.1553ms 0.1290ms 7.7532 KOps/s 7.8445 KOps/s $\color{#d91a1a}-1.16\%$
test_setitem_dim[tuple] 0.4667ms 61.4001μs 16.2866 KOps/s 16.3782 KOps/s $\color{#d91a1a}-0.56\%$
test_setitem 88.6450μs 42.7115μs 23.4129 KOps/s 23.6923 KOps/s $\color{#d91a1a}-1.18\%$
test_set 0.4457ms 41.7099μs 23.9751 KOps/s 23.9910 KOps/s $\color{#d91a1a}-0.07\%$
test_set_shared 0.3776ms 54.6473μs 18.2992 KOps/s 18.4706 KOps/s $\color{#d91a1a}-0.93\%$
test_update 0.4325ms 51.4649μs 19.4307 KOps/s 18.5020 KOps/s $\textbf{\color{#35bf28}+5.02\%}$
test_update_nested 0.4586ms 59.5362μs 16.7965 KOps/s 16.0491 KOps/s $\color{#35bf28}+4.66\%$
test_update__nested 0.1086ms 62.6071μs 15.9726 KOps/s 15.7888 KOps/s $\color{#35bf28}+1.16\%$
test_set_nested 0.4355ms 44.8759μs 22.2837 KOps/s 21.2395 KOps/s $\color{#35bf28}+4.92\%$
test_set_nested_new 98.2060μs 48.6411μs 20.5587 KOps/s 19.8255 KOps/s $\color{#35bf28}+3.70\%$
test_select 0.1116ms 61.7999μs 16.1813 KOps/s 15.5360 KOps/s $\color{#35bf28}+4.15\%$
test_select_nested 0.4283ms 41.7290μs 23.9641 KOps/s 21.9884 KOps/s $\textbf{\color{#35bf28}+8.99\%}$
test_exclude_nested 0.4432ms 58.4259μs 17.1157 KOps/s 16.3774 KOps/s $\color{#35bf28}+4.51\%$
test_empty[True] 0.6552ms 0.2608ms 3.8342 KOps/s 3.8159 KOps/s $\color{#35bf28}+0.48\%$
test_empty[False] 4.7683μs 0.7365μs 1.3578 MOps/s 1.3538 MOps/s $\color{#35bf28}+0.29\%$
test_to 0.4078ms 26.3262μs 37.9850 KOps/s 38.6573 KOps/s $\color{#d91a1a}-1.74\%$
test_to_nonblocking 59.8740μs 25.5647μs 39.1164 KOps/s 39.8495 KOps/s $\color{#d91a1a}-1.84\%$
test_unbind_speed 1.5111ms 0.2848ms 3.5108 KOps/s 3.4946 KOps/s $\color{#35bf28}+0.46\%$
test_unbind_speed_stack0 0.6862ms 0.2802ms 3.5686 KOps/s 3.5685 KOps/s $+0.00\%$
test_unbind_speed_stack1 91.1781ms 0.7122ms 1.4041 KOps/s 1.3959 KOps/s $\color{#35bf28}+0.58\%$
test_split 93.1798ms 2.2470ms 445.0385 Ops/s 456.1572 Ops/s $\color{#d91a1a}-2.44\%$
test_chunk 93.2378ms 2.2527ms 443.9182 Ops/s 458.1654 Ops/s $\color{#d91a1a}-3.11\%$
test_creation[device0] 0.5048ms 0.1290ms 7.7529 KOps/s 7.8983 KOps/s $\color{#d91a1a}-1.84\%$
test_creation_from_tensor 0.3783ms 0.1305ms 7.6649 KOps/s 7.6484 KOps/s $\color{#35bf28}+0.22\%$
test_add_one[memmap_tensor0] 0.2663ms 8.5477μs 116.9909 KOps/s 113.6537 KOps/s $\color{#35bf28}+2.94\%$
test_contiguous[memmap_tensor0] 17.7010μs 2.1938μs 455.8384 KOps/s 458.0518 KOps/s $\color{#d91a1a}-0.48\%$
test_stack[memmap_tensor0] 36.4330μs 6.8708μs 145.5439 KOps/s 151.2467 KOps/s $\color{#d91a1a}-3.77\%$
test_memmaptd_index 1.3094ms 0.4455ms 2.2448 KOps/s 2.3501 KOps/s $\color{#d91a1a}-4.48\%$
test_memmaptd_index_astensor 1.0112ms 0.5137ms 1.9466 KOps/s 2.0175 KOps/s $\color{#d91a1a}-3.52\%$
test_memmaptd_index_op 1.4210ms 1.0393ms 962.1907 Ops/s 931.3130 Ops/s $\color{#35bf28}+3.32\%$
test_serialize_model 0.1309s 0.1294s 7.7268 Ops/s 7.6712 Ops/s $\color{#35bf28}+0.73\%$
test_serialize_model_pickle 1.3473s 1.2125s 0.8247 Ops/s 0.8246 Ops/s $\color{#35bf28}+0.02\%$
test_serialize_weights 0.2132s 0.1414s 7.0706 Ops/s 6.9608 Ops/s $\color{#35bf28}+1.58\%$
test_serialize_weights_returnearly 0.2148s 55.9685ms 17.8672 Ops/s 18.0293 Ops/s $\color{#d91a1a}-0.90\%$
test_serialize_weights_pickle 1.3825s 1.2183s 0.8208 Ops/s 0.8217 Ops/s $\color{#d91a1a}-0.11\%$
test_reshape_pytree 66.7340μs 35.8876μs 27.8647 KOps/s 27.8510 KOps/s $\color{#35bf28}+0.05\%$
test_reshape_td 71.5240μs 41.6169μs 24.0287 KOps/s 23.6934 KOps/s $\color{#35bf28}+1.42\%$
test_view_pytree 75.1350μs 36.4172μs 27.4596 KOps/s 28.6591 KOps/s $\color{#d91a1a}-4.19\%$
test_view_td 78.3750μs 46.5388μs 21.4874 KOps/s 22.1342 KOps/s $\color{#d91a1a}-2.92\%$
test_unbind_pytree 68.1140μs 34.8244μs 28.7155 KOps/s 29.4502 KOps/s $\color{#d91a1a}-2.49\%$
test_unbind_td 0.3878ms 43.6703μs 22.8989 KOps/s 23.4293 KOps/s $\color{#d91a1a}-2.26\%$
test_split_pytree 86.4950μs 47.0056μs 21.2741 KOps/s 21.4403 KOps/s $\color{#d91a1a}-0.78\%$
test_split_td 0.7066ms 58.3102μs 17.1497 KOps/s 17.7862 KOps/s $\color{#d91a1a}-3.58\%$
test_add_pytree 97.5660μs 58.6546μs 17.0489 KOps/s 17.8383 KOps/s $\color{#d91a1a}-4.42\%$
test_add_td 0.1365ms 98.3858μs 10.1641 KOps/s 10.3049 KOps/s $\color{#d91a1a}-1.37\%$
test_compile_add_one_nested[tensordict-compile] 0.2287ms 0.1618ms 6.1795 KOps/s 5.9889 KOps/s $\color{#35bf28}+3.18\%$
test_compile_add_one_nested[tensordict-eager] 0.2521ms 0.1667ms 5.9986 KOps/s 6.2638 KOps/s $\color{#d91a1a}-4.23\%$
test_compile_add_one_nested[pytree-compile] 0.1778ms 0.1450ms 6.8963 KOps/s 6.9080 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_add_one_nested[pytree-eager] 0.2618ms 0.1885ms 5.3060 KOps/s 5.4912 KOps/s $\color{#d91a1a}-3.37\%$
test_compile_copy_nested[tensordict-compile] 64.5140μs 22.2656μs 44.9124 KOps/s 47.3740 KOps/s $\textbf{\color{#d91a1a}-5.20\%}$
test_compile_copy_nested[tensordict-eager] 83.1750μs 49.1678μs 20.3385 KOps/s 20.1494 KOps/s $\color{#35bf28}+0.94\%$
test_compile_copy_nested[pytree-compile] 0.2310ms 64.9618μs 15.3937 KOps/s 15.4547 KOps/s $\color{#d91a1a}-0.40\%$
test_compile_copy_nested[pytree-eager] 84.8550μs 50.2655μs 19.8944 KOps/s 19.9871 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_add_one_flat[tensordict-compile] 0.3568ms 0.3215ms 3.1104 KOps/s 3.1223 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_add_one_flat[tensordict-eager] 0.3431ms 0.2418ms 4.1356 KOps/s 4.2935 KOps/s $\color{#d91a1a}-3.68\%$
test_compile_add_one_flat[tensorclass-compile] 0.1721ms 0.1274ms 7.8470 KOps/s 7.8338 KOps/s $\color{#35bf28}+0.17\%$
test_compile_add_one_flat[tensorclass-eager] 0.1756ms 67.4633μs 14.8229 KOps/s 15.3934 KOps/s $\color{#d91a1a}-3.71\%$
test_compile_add_one_flat[pytree-compile] 0.3629ms 0.3200ms 3.1255 KOps/s 3.1534 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_add_one_flat[pytree-eager] 0.7671ms 0.6348ms 1.5752 KOps/s 1.6386 KOps/s $\color{#d91a1a}-3.87\%$
test_compile_add_self_flat[tensordict-eager] 0.3453ms 0.2886ms 3.4656 KOps/s 3.5167 KOps/s $\color{#d91a1a}-1.45\%$
test_compile_add_self_flat[tensordict-compile] 0.3740ms 0.3210ms 3.1153 KOps/s 3.1116 KOps/s $\color{#35bf28}+0.12\%$
test_compile_add_self_flat[tensorclass-eager] 0.1696ms 78.8933μs 12.6754 KOps/s 13.1454 KOps/s $\color{#d91a1a}-3.58\%$
test_compile_add_self_flat[tensorclass-compile] 0.1866ms 0.1306ms 7.6596 KOps/s 7.6020 KOps/s $\color{#35bf28}+0.76\%$
test_compile_add_self_flat[pytree-eager] 0.6238ms 0.5477ms 1.8258 KOps/s 1.9392 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_compile_add_self_flat[pytree-compile] 0.3601ms 0.3193ms 3.1323 KOps/s 3.1431 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_copy_flat[tensordict-compile] 70.8240μs 19.5859μs 51.0572 KOps/s 51.2055 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_copy_flat[tensordict-eager] 69.1740μs 38.1989μs 26.1788 KOps/s 24.3866 KOps/s $\textbf{\color{#35bf28}+7.35\%}$
test_compile_copy_flat[pytree-compile] 0.1059ms 69.4339μs 14.4022 KOps/s 14.3951 KOps/s $\color{#35bf28}+0.05\%$
test_compile_copy_flat[pytree-eager] 87.7660μs 51.2345μs 19.5181 KOps/s 19.3914 KOps/s $\color{#35bf28}+0.65\%$
test_compile_assign_and_add[tensordict-compile] 2.3699ms 0.8405ms 1.1898 KOps/s 1.1231 KOps/s $\textbf{\color{#35bf28}+5.94\%}$
test_compile_assign_and_add[tensordict-eager] 3.4529ms 3.3019ms 302.8534 Ops/s 314.8395 Ops/s $\color{#d91a1a}-3.81\%$
test_compile_assign_and_add[pytree-compile] 2.3501ms 0.8253ms 1.2117 KOps/s 1.1301 KOps/s $\textbf{\color{#35bf28}+7.23\%}$
test_compile_assign_and_add[pytree-eager] 3.4166ms 3.2671ms 306.0800 Ops/s 318.5538 Ops/s $\color{#d91a1a}-3.92\%$
test_compile_indexing[tensor-tensordict-compile] 0.1425ms 0.1086ms 9.2059 KOps/s 9.1195 KOps/s $\color{#35bf28}+0.95\%$
test_compile_indexing[tensor-tensordict-eager] 0.1900ms 60.8971μs 16.4212 KOps/s 15.9640 KOps/s $\color{#35bf28}+2.86\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1550ms 0.1031ms 9.7022 KOps/s 9.6877 KOps/s $\color{#35bf28}+0.15\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1467ms 44.2565μs 22.5955 KOps/s 23.2013 KOps/s $\color{#d91a1a}-2.61\%$
test_compile_indexing[tensor-pytree-compile] 0.1815ms 0.1068ms 9.3620 KOps/s 9.6021 KOps/s $\color{#d91a1a}-2.50\%$
test_compile_indexing[tensor-pytree-eager] 85.0960μs 47.5803μs 21.0171 KOps/s 23.1239 KOps/s $\textbf{\color{#d91a1a}-9.11\%}$
test_compile_indexing[slice-tensordict-compile] 0.1742ms 0.1377ms 7.2606 KOps/s 7.2862 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_indexing[slice-tensordict-eager] 0.1619ms 25.9298μs 38.5657 KOps/s 39.8013 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_indexing[slice-tensorclass-compile] 0.1779ms 0.1312ms 7.6227 KOps/s 7.6073 KOps/s $\color{#35bf28}+0.20\%$
test_compile_indexing[slice-tensorclass-eager] 59.5440μs 21.5170μs 46.4750 KOps/s 48.3435 KOps/s $\color{#d91a1a}-3.87\%$
test_compile_indexing[slice-pytree-compile] 0.1912ms 0.1354ms 7.3874 KOps/s 7.5869 KOps/s $\color{#d91a1a}-2.63\%$
test_compile_indexing[slice-pytree-eager] 89.0860μs 20.9572μs 47.7163 KOps/s 47.8067 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_indexing[int-tensordict-compile] 0.1880ms 0.1385ms 7.2182 KOps/s 7.2191 KOps/s $\color{#d91a1a}-0.01\%$
test_compile_indexing[int-tensordict-eager] 0.4850ms 25.2703μs 39.5722 KOps/s 39.4738 KOps/s $\color{#35bf28}+0.25\%$
test_compile_indexing[int-tensorclass-compile] 0.5816ms 0.1330ms 7.5214 KOps/s 7.5698 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_indexing[int-tensorclass-eager] 51.7630μs 21.3027μs 46.9425 KOps/s 48.0774 KOps/s $\color{#d91a1a}-2.36\%$
test_compile_indexing[int-pytree-compile] 0.1925ms 0.1326ms 7.5431 KOps/s 7.5610 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_indexing[int-pytree-eager] 50.1830μs 20.9972μs 47.6254 KOps/s 39.4005 KOps/s $\textbf{\color{#35bf28}+20.88\%}$
test_mod_add[eager] 64.7240μs 33.0189μs 30.2857 KOps/s 29.9693 KOps/s $\color{#35bf28}+1.06\%$
test_mod_add[compile] 0.1093ms 71.0353μs 14.0775 KOps/s 14.2314 KOps/s $\color{#d91a1a}-1.08\%$
test_mod_add[compile-overhead] 0.2590ms 0.1350ms 7.4098 KOps/s 6.8049 KOps/s $\textbf{\color{#35bf28}+8.89\%}$
test_mod_wrap[eager] 0.8540ms 0.7934ms 1.2604 KOps/s 1.2574 KOps/s $\color{#35bf28}+0.24\%$
test_mod_wrap[compile] 2.0035ms 0.8477ms 1.1797 KOps/s 1.1806 KOps/s $\color{#d91a1a}-0.08\%$
test_mod_wrap[compile-overhead] 4.8545ms 3.0389ms 329.0658 Ops/s 320.9817 Ops/s $\color{#35bf28}+2.52\%$
test_mod_wrap_and_backward[eager] 4.5993ms 4.0799ms 245.1057 Ops/s 237.9322 Ops/s $\color{#35bf28}+3.01\%$
test_mod_wrap_and_backward[compile] 5.0422ms 4.1094ms 243.3458 Ops/s 239.5729 Ops/s $\color{#35bf28}+1.57\%$
test_mod_wrap_and_backward[compile-overhead] 1.3510ms 0.9051ms 1.1048 KOps/s 991.4427 Ops/s $\textbf{\color{#35bf28}+11.44\%}$
test_seq_add[eager] 0.2000ms 0.1035ms 9.6583 KOps/s 9.8867 KOps/s $\color{#d91a1a}-2.31\%$
test_seq_add[compile] 0.1750ms 81.5427μs 12.2635 KOps/s 12.3823 KOps/s $\color{#d91a1a}-0.96\%$
test_seq_add[compile-overhead] 0.1508ms 0.1142ms 8.7531 KOps/s 8.7423 KOps/s $\color{#35bf28}+0.12\%$
test_seq_wrap[eager] 1.0160ms 0.9418ms 1.0618 KOps/s 1.0435 KOps/s $\color{#35bf28}+1.76\%$
test_seq_wrap[compile] 0.9551ms 0.8643ms 1.1570 KOps/s 1.1571 KOps/s $\color{#d91a1a}-0.01\%$
test_seq_wrap[compile-overhead] 0.2970ms 0.2193ms 4.5591 KOps/s 4.5373 KOps/s $\color{#35bf28}+0.48\%$
test_func_call_runtime[False-eager] 2.6259ms 2.4073ms 415.4015 Ops/s 414.3886 Ops/s $\color{#35bf28}+0.24\%$
test_func_call_runtime[False-compile] 2.5334ms 2.4476ms 408.5715 Ops/s 411.6690 Ops/s $\color{#d91a1a}-0.75\%$
test_func_call_runtime[False-compile-overhead] 0.4294ms 0.3609ms 2.7710 KOps/s 2.7852 KOps/s $\color{#d91a1a}-0.51\%$
test_func_call_runtime[True-eager] 2.6733ms 2.5623ms 390.2690 Ops/s 388.9693 Ops/s $\color{#35bf28}+0.33\%$
test_func_call_runtime[True-compile] 2.5482ms 2.4720ms 404.5255 Ops/s 407.4661 Ops/s $\color{#d91a1a}-0.72\%$
test_func_call_runtime[True-compile-overhead] 0.4324ms 0.3845ms 2.6007 KOps/s 2.6404 KOps/s $\color{#d91a1a}-1.50\%$
test_func_call_cm_runtime[False-eager] 2.4797ms 2.4048ms 415.8415 Ops/s 414.7612 Ops/s $\color{#35bf28}+0.26\%$
test_func_call_cm_runtime[False-compile] 2.5299ms 2.4457ms 408.8758 Ops/s 414.1822 Ops/s $\color{#d91a1a}-1.28\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4173ms 0.3653ms 2.7376 KOps/s 2.7646 KOps/s $\color{#d91a1a}-0.98\%$
test_func_call_cm_runtime[True-eager] 2.8352ms 2.6804ms 373.0727 Ops/s 373.0262 Ops/s $\color{#35bf28}+0.01\%$
test_func_call_cm_runtime[True-compile] 2.6041ms 2.5063ms 398.9988 Ops/s 402.8327 Ops/s $\color{#d91a1a}-0.95\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4973ms 0.4088ms 2.4465 KOps/s 2.4571 KOps/s $\color{#d91a1a}-0.43\%$
test_vmap_func_call_cm_runtime[eager] 4.2696ms 3.8435ms 260.1819 Ops/s 263.5596 Ops/s $\color{#d91a1a}-1.28\%$
test_vmap_func_call_cm_runtime[compile] 2.5813ms 2.5072ms 398.8504 Ops/s 403.4707 Ops/s $\color{#d91a1a}-1.15\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4597ms 0.4118ms 2.4287 KOps/s 2.4517 KOps/s $\color{#d91a1a}-0.94\%$
test_distributed 2.5392ms 0.1757ms 5.6927 KOps/s 8.8475 KOps/s $\textbf{\color{#d91a1a}-35.66\%}$
test_tdmodule 62.3740μs 15.2659μs 65.5056 KOps/s 61.6484 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_tdmodule_dispatch 59.5040μs 29.5661μs 33.8225 KOps/s 30.6073 KOps/s $\textbf{\color{#35bf28}+10.50\%}$
test_tdseq 36.1520μs 16.2337μs 61.6003 KOps/s 57.2296 KOps/s $\textbf{\color{#35bf28}+7.64\%}$
test_tdseq_dispatch 53.4230μs 32.3248μs 30.9360 KOps/s 28.3367 KOps/s $\textbf{\color{#35bf28}+9.17\%}$
test_instantiation_functorch 2.0974ms 1.8776ms 532.6061 Ops/s 545.2851 Ops/s $\color{#d91a1a}-2.33\%$
test_instantiation_td 1.8255ms 1.2147ms 823.2339 Ops/s 831.8947 Ops/s $\color{#d91a1a}-1.04\%$
test_exec_functorch 1.1319ms 1.0177ms 982.6443 Ops/s 987.9135 Ops/s $\color{#d91a1a}-0.53\%$
test_exec_functional_call 1.0736ms 1.0275ms 973.2424 Ops/s 977.7719 Ops/s $\color{#d91a1a}-0.46\%$
test_exec_td 1.2473ms 1.0608ms 942.7256 Ops/s 976.8026 Ops/s $\color{#d91a1a}-3.49\%$
test_exec_td_decorator 1.9103ms 1.0819ms 924.3331 Ops/s 941.5361 Ops/s $\color{#d91a1a}-1.83\%$
test_vmap_mlp_speed[True-True] 1.3521ms 1.2810ms 780.6133 Ops/s 781.0284 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_mlp_speed[True-False] 1.3571ms 1.2770ms 783.0696 Ops/s 787.1952 Ops/s $\color{#d91a1a}-0.52\%$
test_vmap_mlp_speed[False-True] 1.2654ms 1.1690ms 855.4472 Ops/s 863.3416 Ops/s $\color{#d91a1a}-0.91\%$
test_vmap_mlp_speed[False-False] 1.2257ms 1.1745ms 851.4392 Ops/s 860.4453 Ops/s $\color{#d91a1a}-1.05\%$
test_vmap_mlp_speed_decorator[True-True] 1.7547ms 1.2533ms 797.9184 Ops/s 799.8749 Ops/s $\color{#d91a1a}-0.24\%$
test_vmap_mlp_speed_decorator[True-False] 1.3428ms 1.2526ms 798.3587 Ops/s 800.1403 Ops/s $\color{#d91a1a}-0.22\%$
test_vmap_mlp_speed_decorator[False-True] 1.3211ms 1.1723ms 853.0006 Ops/s 858.0404 Ops/s $\color{#d91a1a}-0.59\%$
test_vmap_mlp_speed_decorator[False-False] 1.2794ms 1.1705ms 854.3565 Ops/s 859.9525 Ops/s $\color{#d91a1a}-0.65\%$
test_vmap_transformer_speed[True-True] 13.2957ms 13.1967ms 75.7764 Ops/s 75.6901 Ops/s $\color{#35bf28}+0.11\%$
test_vmap_transformer_speed[True-False] 13.3208ms 13.1628ms 75.9717 Ops/s 75.8283 Ops/s $\color{#35bf28}+0.19\%$
test_vmap_transformer_speed[False-True] 13.0310ms 12.9498ms 77.2212 Ops/s 76.8370 Ops/s $\color{#35bf28}+0.50\%$
test_vmap_transformer_speed[False-False] 13.0104ms 12.9358ms 77.3049 Ops/s 76.9785 Ops/s $\color{#35bf28}+0.42\%$
test_vmap_transformer_speed_decorator[True-True] 33.9718ms 33.8839ms 29.5126 Ops/s 29.6791 Ops/s $\color{#d91a1a}-0.56\%$
test_vmap_transformer_speed_decorator[True-False] 34.1805ms 34.0526ms 29.3664 Ops/s 29.6879 Ops/s $\color{#d91a1a}-1.08\%$
test_vmap_transformer_speed_decorator[False-True] 34.3522ms 34.0215ms 29.3931 Ops/s 29.5662 Ops/s $\color{#d91a1a}-0.59\%$
test_vmap_transformer_speed_decorator[False-False] 34.1088ms 33.9989ms 29.4127 Ops/s 29.5955 Ops/s $\color{#d91a1a}-0.62\%$
test_to_module_speed[True] 2.0741ms 0.9998ms 1.0002 KOps/s 1.0084 KOps/s $\color{#d91a1a}-0.81\%$
test_to_module_speed[False] 1.0761ms 0.9806ms 1.0198 KOps/s 1.0366 KOps/s $\color{#d91a1a}-1.62\%$
test_tc_init 63.5840μs 35.8911μs 27.8620 KOps/s 27.5524 KOps/s $\color{#35bf28}+1.12\%$
test_tc_init_nested 0.1062ms 74.4452μs 13.4327 KOps/s 13.9670 KOps/s $\color{#d91a1a}-3.83\%$
test_tc_first_layer_tensor 4.6120μs 0.6534μs 1.5305 MOps/s 1.4879 MOps/s $\color{#35bf28}+2.86\%$
test_tc_first_layer_nontensor 23.2120μs 2.2066μs 453.1885 KOps/s 443.6885 KOps/s $\color{#35bf28}+2.14\%$
test_tc_second_layer_tensor 10.6180μs 1.3601μs 735.2245 KOps/s 726.7870 KOps/s $\color{#35bf28}+1.16\%$
test_tc_second_layer_nontensor 24.0220μs 2.9678μs 336.9510 KOps/s 336.9989 KOps/s $\color{#d91a1a}-0.01\%$
test_unbind 0.1927s 12.2283ms 81.7777 Ops/s 93.8199 Ops/s $\textbf{\color{#d91a1a}-12.84\%}$
test_full_like 0.6561ms 0.5740ms 1.7423 KOps/s 1.7408 KOps/s $\color{#35bf28}+0.09\%$
test_zeros_like 0.2619ms 0.1980ms 5.0513 KOps/s 5.0501 KOps/s $\color{#35bf28}+0.02\%$
test_ones_like 0.2342ms 0.1978ms 5.0549 KOps/s 5.0555 KOps/s $\color{#d91a1a}-0.01\%$
test_clone 0.4612ms 0.4137ms 2.4170 KOps/s 2.4165 KOps/s $\color{#35bf28}+0.02\%$
test_squeeze 43.1630μs 10.0592μs 99.4111 KOps/s 104.0571 KOps/s $\color{#d91a1a}-4.46\%$
test_unsqueeze 0.2194ms 76.0830μs 13.1435 KOps/s 13.4978 KOps/s $\color{#d91a1a}-2.62\%$
test_split 0.4295ms 0.1616ms 6.1869 KOps/s 6.4290 KOps/s $\color{#d91a1a}-3.77\%$
test_permute 0.2696ms 0.1772ms 5.6433 KOps/s 5.7301 KOps/s $\color{#d91a1a}-1.51\%$
test_stack 1.2454ms 0.8657ms 1.1551 KOps/s 1.1516 KOps/s $\color{#35bf28}+0.30\%$
test_cat 1.2520ms 1.2315ms 812.0051 Ops/s 811.9832 Ops/s $+0.00\%$
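
For anyone reading the table: the Change column appears to be the relative difference between Ops and Ops on Repo HEAD. The sketch below is not the benchmark bot's code. The helper names `parse_ops` and `relative_change` are made up for illustration, and the formula is an assumption inferred from the rows above. Because it works from the rounded values shown in the table, results can differ from the reported figure in the last digit.

```python
# Hypothetical helpers (not part of the benchmark tooling) that recompute
# the "Change" column from the two throughput columns of a row.

def parse_ops(text: str) -> float:
    """Convert a string such as '5.6927 KOps/s' to operations per second."""
    value, unit = text.split()
    scale = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}[unit]
    return float(value) * scale


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    new = parse_ops(ops)
    base = parse_ops(ops_head)
    return (new - base) / base * 100.0


# test_distributed row: 5.6927 KOps/s on this PR vs 8.8475 KOps/s on HEAD
print(f"{relative_change('5.6927 KOps/s', '8.8475 KOps/s'):+.2f}%")
```

Rows that mix units, such as `test_mod_wrap_and_backward[compile-overhead]` (KOps/s vs Ops/s), are why the unit is converted before dividing.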

[ghstack-poisoned]
@vmoens vmoens added the enhancement New feature or request label Oct 3, 2024
@vmoens vmoens merged commit 9798b2d into gh/vmoens/21/base Oct 3, 2024
50 of 55 checks passed
vmoens added a commit that referenced this pull request Oct 3, 2024
ghstack-source-id: 92120f9043653078ed1eaa693a48c3f7e1ce3412
Pull Request resolved: #1020
@vmoens vmoens deleted the gh/vmoens/21/head branch October 3, 2024 20:40