[CI] Add regular benchmarks to CI in PRs without upload #561

Merged
merged 1 commit into from
Nov 20, 2023

@vmoens (Contributor) commented Nov 20, 2023

No description provided.

@facebook-github-bot added the CLA Signed label Nov 20, 2023
@vmoens added the CI label Nov 20, 2023
@vmoens marked this pull request as ready for review November 20, 2023 15:50

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 105. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}3$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 31.0880μs 14.9259μs 66.9975 KOps/s 66.3825 KOps/s $\color{#35bf28}+0.93\%$
test_plain_set_stack_nested 0.1939ms 0.1374ms 7.2768 KOps/s 7.1652 KOps/s $\color{#35bf28}+1.56\%$
test_plain_set_nested_inplace 39.2130μs 17.8066μs 56.1591 KOps/s 55.4817 KOps/s $\color{#35bf28}+1.22\%$
test_plain_set_stack_nested_inplace 0.3419ms 0.1689ms 5.9203 KOps/s 5.9196 KOps/s $\color{#35bf28}+0.01\%$
test_items 35.1350μs 2.6451μs 378.0619 KOps/s 367.3160 KOps/s $\color{#35bf28}+2.93\%$
test_items_nested 0.5546ms 0.2665ms 3.7526 KOps/s 3.7572 KOps/s $\color{#d91a1a}-0.12\%$
test_items_nested_locked 0.9516ms 0.2690ms 3.7181 KOps/s 3.7237 KOps/s $\color{#d91a1a}-0.15\%$
test_items_nested_leaf 0.2207ms 0.1643ms 6.0864 KOps/s 6.0760 KOps/s $\color{#35bf28}+0.17\%$
test_items_stack_nested 1.5745ms 1.3683ms 730.8287 Ops/s 729.4420 Ops/s $\color{#35bf28}+0.19\%$
test_items_stack_nested_leaf 1.4323ms 1.2401ms 806.4057 Ops/s 801.7863 Ops/s $\color{#35bf28}+0.58\%$
test_items_stack_nested_locked 1.9552ms 0.7364ms 1.3580 KOps/s 1.3544 KOps/s $\color{#35bf28}+0.26\%$
test_keys 17.3830μs 4.2777μs 233.7731 KOps/s 230.4820 KOps/s $\color{#35bf28}+1.43\%$
test_keys_nested 0.5191ms 0.1361ms 7.3463 KOps/s 6.6999 KOps/s $\textbf{\color{#35bf28}+9.65\%}$
test_keys_nested_locked 0.2015ms 0.1367ms 7.3134 KOps/s 7.2166 KOps/s $\color{#35bf28}+1.34\%$
test_keys_nested_leaf 0.2340ms 0.1342ms 7.4532 KOps/s 7.3084 KOps/s $\color{#35bf28}+1.98\%$
test_keys_stack_nested 2.1173ms 1.2710ms 786.7659 Ops/s 777.4680 Ops/s $\color{#35bf28}+1.20\%$
test_keys_stack_nested_leaf 1.8747ms 1.2617ms 792.5612 Ops/s 778.5878 Ops/s $\color{#35bf28}+1.79\%$
test_keys_stack_nested_locked 1.0474ms 0.6087ms 1.6430 KOps/s 1.5930 KOps/s $\color{#35bf28}+3.14\%$
test_values 22.3156μs 1.1402μs 877.0152 KOps/s 883.4583 KOps/s $\color{#d91a1a}-0.73\%$
test_values_nested 80.0990μs 48.0012μs 20.8328 KOps/s 20.7153 KOps/s $\color{#35bf28}+0.57\%$
test_values_nested_locked 94.9680μs 48.4587μs 20.6361 KOps/s 20.5181 KOps/s $\color{#35bf28}+0.58\%$
test_values_nested_leaf 79.8780μs 43.2977μs 23.0959 KOps/s 23.1685 KOps/s $\color{#d91a1a}-0.31\%$
test_values_stack_nested 1.7247ms 1.0970ms 911.5651 Ops/s 900.5881 Ops/s $\color{#35bf28}+1.22\%$
test_values_stack_nested_leaf 1.8531ms 1.0878ms 919.3000 Ops/s 911.8909 Ops/s $\color{#35bf28}+0.81\%$
test_values_stack_nested_locked 0.9627ms 0.4800ms 2.0835 KOps/s 2.0482 KOps/s $\color{#35bf28}+1.72\%$
test_membership 12.3930μs 1.3518μs 739.7641 KOps/s 738.7440 KOps/s $\color{#35bf28}+0.14\%$
test_membership_nested 16.1500μs 2.8639μs 349.1787 KOps/s 350.0720 KOps/s $\color{#d91a1a}-0.26\%$
test_membership_nested_leaf 25.3970μs 2.9125μs 343.3484 KOps/s 340.4383 KOps/s $\color{#35bf28}+0.85\%$
test_membership_stacked_nested 41.2770μs 11.3938μs 87.7668 KOps/s 85.1004 KOps/s $\color{#35bf28}+3.13\%$
test_membership_stacked_nested_leaf 0.1994ms 11.5035μs 86.9301 KOps/s 87.2288 KOps/s $\color{#d91a1a}-0.34\%$
test_membership_nested_last 25.8080μs 6.0609μs 164.9932 KOps/s 170.4341 KOps/s $\color{#d91a1a}-3.19\%$
test_membership_nested_leaf_last 28.1020μs 5.9544μs 167.9424 KOps/s 170.1718 KOps/s $\color{#d91a1a}-1.31\%$
test_membership_stacked_nested_last 0.3302ms 0.1782ms 5.6129 KOps/s 5.6023 KOps/s $\color{#35bf28}+0.19\%$
test_membership_stacked_nested_leaf_last 47.5080μs 13.3953μs 74.6530 KOps/s 73.8102 KOps/s $\color{#35bf28}+1.14\%$
test_nested_getleaf 60.4430μs 12.2686μs 81.5092 KOps/s 82.0379 KOps/s $\color{#d91a1a}-0.64\%$
test_nested_get 44.8730μs 11.5705μs 86.4266 KOps/s 86.2096 KOps/s $\color{#35bf28}+0.25\%$
test_stacked_getleaf 3.8734ms 0.5707ms 1.7523 KOps/s 1.7271 KOps/s $\color{#35bf28}+1.46\%$
test_stacked_get 0.6833ms 0.5433ms 1.8405 KOps/s 1.8155 KOps/s $\color{#35bf28}+1.37\%$
test_nested_getitemleaf 51.2450μs 12.2140μs 81.8732 KOps/s 82.5655 KOps/s $\color{#d91a1a}-0.84\%$
test_nested_getitem 33.2220μs 11.6065μs 86.1590 KOps/s 87.1597 KOps/s $\color{#d91a1a}-1.15\%$
test_stacked_getitemleaf 0.6701ms 0.5706ms 1.7524 KOps/s 1.7191 KOps/s $\color{#35bf28}+1.94\%$
test_stacked_getitem 0.9479ms 0.5477ms 1.8257 KOps/s 1.8250 KOps/s $\color{#35bf28}+0.04\%$
test_lock_nested 55.7303ms 0.9406ms 1.0632 KOps/s 1.1361 KOps/s $\textbf{\color{#d91a1a}-6.42\%}$
test_lock_stack_nested 80.4594ms 13.1474ms 76.0607 Ops/s 73.4599 Ops/s $\color{#35bf28}+3.54\%$
test_unlock_nested 54.7156ms 0.9421ms 1.0615 KOps/s 1.0625 KOps/s $\color{#d91a1a}-0.10\%$
test_unlock_stack_nested 76.3571ms 13.5021ms 74.0623 Ops/s 71.7490 Ops/s $\color{#35bf28}+3.22\%$
test_flatten_speed 0.7786ms 0.6625ms 1.5094 KOps/s 1.5030 KOps/s $\color{#35bf28}+0.43\%$
test_unflatten_speed 1.3489ms 1.1555ms 865.4416 Ops/s 868.9491 Ops/s $\color{#d91a1a}-0.40\%$
test_common_ops 4.9994ms 0.6192ms 1.6150 KOps/s 1.5830 KOps/s $\color{#35bf28}+2.02\%$
test_creation 23.8740μs 2.2099μs 452.5180 KOps/s 456.3596 KOps/s $\color{#d91a1a}-0.84\%$
test_creation_empty 26.5890μs 7.2580μs 137.7798 KOps/s 138.8925 KOps/s $\color{#d91a1a}-0.80\%$
test_creation_nested_1 60.9130μs 11.0190μs 90.7521 KOps/s 90.8386 KOps/s $\color{#d91a1a}-0.10\%$
test_creation_nested_2 33.3620μs 13.5855μs 73.6080 KOps/s 73.9091 KOps/s $\color{#d91a1a}-0.41\%$
test_clone 79.1270μs 10.6074μs 94.2737 KOps/s 93.7472 KOps/s $\color{#35bf28}+0.56\%$
test_getitem[int] 35.2160μs 12.7884μs 78.1956 KOps/s 77.8155 KOps/s $\color{#35bf28}+0.49\%$
test_getitem[slice_int] 80.9910μs 28.5481μs 35.0286 KOps/s 33.6920 KOps/s $\color{#35bf28}+3.97\%$
test_getitem[range] 0.1047ms 53.2838μs 18.7674 KOps/s 17.8482 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_getitem[tuple] 53.6100μs 23.1877μs 43.1264 KOps/s 43.5283 KOps/s $\color{#d91a1a}-0.92\%$
test_getitem[list] 0.3184ms 48.0795μs 20.7989 KOps/s 20.2059 KOps/s $\color{#35bf28}+2.93\%$
test_setitem_dim[int] 47.2480μs 25.1127μs 39.8205 KOps/s 38.3905 KOps/s $\color{#35bf28}+3.73\%$
test_setitem_dim[slice_int] 76.2210μs 47.7813μs 20.9287 KOps/s 20.2173 KOps/s $\color{#35bf28}+3.52\%$
test_setitem_dim[range] 0.1056ms 68.0154μs 14.7025 KOps/s 14.1067 KOps/s $\color{#35bf28}+4.22\%$
test_setitem_dim[tuple] 69.4990μs 37.3771μs 26.7544 KOps/s 26.2132 KOps/s $\color{#35bf28}+2.06\%$
test_setitem 1.0520ms 15.0694μs 66.3596 KOps/s 69.7409 KOps/s $\color{#d91a1a}-4.85\%$
test_set 86.4000μs 14.4315μs 69.2926 KOps/s 73.0360 KOps/s $\textbf{\color{#d91a1a}-5.13\%}$
test_set_shared 5.0656ms 0.1544ms 6.4779 KOps/s 5.9741 KOps/s $\textbf{\color{#35bf28}+8.43\%}$
test_update 0.1399ms 18.6104μs 53.7333 KOps/s 52.5453 KOps/s $\color{#35bf28}+2.26\%$
test_update_nested 0.1647ms 27.0236μs 37.0047 KOps/s 36.8483 KOps/s $\color{#35bf28}+0.42\%$
test_set_nested 0.1077ms 16.0434μs 62.3311 KOps/s 64.2154 KOps/s $\color{#d91a1a}-2.93\%$
test_set_nested_new 0.1420ms 21.9203μs 45.6198 KOps/s 45.8390 KOps/s $\color{#d91a1a}-0.48\%$
test_select 0.1280ms 45.1620μs 22.1425 KOps/s 21.8442 KOps/s $\color{#35bf28}+1.37\%$
test_unbind_speed 0.5273ms 0.2816ms 3.5514 KOps/s 3.4926 KOps/s $\color{#35bf28}+1.68\%$
test_unbind_speed_stack0 62.4943ms 4.5175ms 221.3619 Ops/s 223.7759 Ops/s $\color{#d91a1a}-1.08\%$
test_unbind_speed_stack1 2.2932μs 0.5962μs 1.6774 MOps/s 1.6776 MOps/s $\color{#d91a1a}-0.01\%$
test_creation[device0] 3.5079ms 0.2943ms 3.3979 KOps/s 3.3036 KOps/s $\color{#35bf28}+2.85\%$
test_creation_from_tensor 0.8475ms 0.3222ms 3.1033 KOps/s 2.7223 KOps/s $\textbf{\color{#35bf28}+14.00\%}$
test_add_one[memmap_tensor0] 0.3962ms 24.6955μs 40.4933 KOps/s 39.3871 KOps/s $\color{#35bf28}+2.81\%$
test_contiguous[memmap_tensor0] 28.8340μs 5.7668μs 173.4072 KOps/s 176.4231 KOps/s $\color{#d91a1a}-1.71\%$
test_stack[memmap_tensor0] 55.9240μs 18.9889μs 52.6624 KOps/s 52.1609 KOps/s $\color{#35bf28}+0.96\%$
test_memmaptd_index 0.4851ms 0.1800ms 5.5564 KOps/s 5.6051 KOps/s $\color{#d91a1a}-0.87\%$
test_memmaptd_index_astensor 0.3330ms 0.2391ms 4.1819 KOps/s 4.1600 KOps/s $\color{#35bf28}+0.53\%$
test_memmaptd_index_op 0.8003ms 0.4774ms 2.0945 KOps/s 2.1931 KOps/s $\color{#d91a1a}-4.50\%$
test_reshape_pytree 79.6090μs 23.1028μs 43.2847 KOps/s 43.4108 KOps/s $\color{#d91a1a}-0.29\%$
test_reshape_td 59.2000μs 20.7973μs 48.0831 KOps/s 48.6535 KOps/s $\color{#d91a1a}-1.17\%$
test_view_pytree 54.9220μs 22.9543μs 43.5648 KOps/s 43.8214 KOps/s $\color{#d91a1a}-0.59\%$
test_view_td 22.8830μs 4.3089μs 232.0794 KOps/s 216.7219 KOps/s $\textbf{\color{#35bf28}+7.09\%}$
test_unbind_pytree 69.6400μs 26.6041μs 37.5882 KOps/s 37.7564 KOps/s $\color{#d91a1a}-0.45\%$
test_unbind_td 87.6630μs 39.2382μs 25.4854 KOps/s 25.3050 KOps/s $\color{#35bf28}+0.71\%$
test_split_pytree 66.6140μs 26.2201μs 38.1387 KOps/s 38.2370 KOps/s $\color{#d91a1a}-0.26\%$
test_split_td 0.1428ms 73.0961μs 13.6806 KOps/s 13.4566 KOps/s $\color{#35bf28}+1.66\%$
test_add_pytree 80.7100μs 31.7164μs 31.5295 KOps/s 31.6321 KOps/s $\color{#d91a1a}-0.32\%$
test_add_td 99.4650μs 41.3300μs 24.1955 KOps/s 24.1757 KOps/s $\color{#35bf28}+0.08\%$
test_distributed 21.9200μs 5.9508μs 168.0443 KOps/s 166.9208 KOps/s $\color{#35bf28}+0.67\%$
test_tdmodule 0.1628ms 20.8839μs 47.8837 KOps/s 48.5833 KOps/s $\color{#d91a1a}-1.44\%$
test_tdmodule_dispatch 0.1813ms 36.9549μs 27.0600 KOps/s 26.6713 KOps/s $\color{#35bf28}+1.46\%$
test_tdseq 0.1113ms 23.0279μs 43.4256 KOps/s 40.9307 KOps/s $\textbf{\color{#35bf28}+6.10\%}$
test_tdseq_dispatch 0.6305ms 40.5512μs 24.6602 KOps/s 23.6944 KOps/s $\color{#35bf28}+4.08\%$
test_instantiation_functorch 1.5420ms 1.2864ms 777.3700 Ops/s 768.6810 Ops/s $\color{#35bf28}+1.13\%$
test_instantiation_td 67.4629ms 1.1228ms 890.6354 Ops/s 959.9089 Ops/s $\textbf{\color{#d91a1a}-7.22\%}$
test_exec_functorch 0.3334ms 0.1481ms 6.7505 KOps/s 6.7293 KOps/s $\color{#35bf28}+0.31\%$
test_exec_td 0.2190ms 0.1433ms 6.9762 KOps/s 7.0883 KOps/s $\color{#d91a1a}-1.58\%$
test_vmap_mlp_speed[True-True] 1.2204ms 0.8464ms 1.1815 KOps/s 1.1799 KOps/s $\color{#35bf28}+0.13\%$
test_vmap_mlp_speed[True-False] 0.7270ms 0.4611ms 2.1690 KOps/s 2.1645 KOps/s $\color{#35bf28}+0.21\%$
test_vmap_mlp_speed[False-True] 1.1242ms 0.7423ms 1.3472 KOps/s 1.3309 KOps/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed[False-False] 0.6245ms 0.3822ms 2.6162 KOps/s 2.6293 KOps/s $\color{#d91a1a}-0.50\%$
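
The Change column in the table above appears consistent with the relative difference between Ops (this PR) and Ops on Repo HEAD. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code. The function names are hypothetical, and the 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows are shown in bold.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of PR throughput relative to repo HEAD throughput.

    Both values must be expressed in the same unit (e.g. both in KOps/s).
    """
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a documented value."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD), copied from the CPU table; all in KOps/s.
rows = [
    ("test_plain_set_nested", 66.9975, 66.3825),
    ("test_keys_nested", 7.3463, 6.6999),
    ("test_setitem", 66.3596, 69.7409),
    ("test_lock_nested", 1.0632, 1.1361),
]

for name, ops_pr, ops_head in rows:
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under that assumed threshold, the six bold green rows and three bold red rows in the CPU table match the reported "Improved: 6" and "Worsened: 3" counts.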

@vmoens merged commit 9bd09ae into main Nov 20, 2023
44 of 45 checks passed
@vmoens deleted the fix-gpu-bench-main branch November 20, 2023 15:56
@vmoens changed the title from "[CI] Add regular enchmarks to CI in PRs without upload" to "[CI] Add regular benchmarks to CI in PRs without upload" Nov 20, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 115. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 65.3910μs 12.2956μs 81.3296 KOps/s 81.0736 KOps/s $\color{#35bf28}+0.32\%$
test_plain_set_stack_nested 0.1376ms 0.1161ms 8.6119 KOps/s 8.5890 KOps/s $\color{#35bf28}+0.27\%$
test_plain_set_nested_inplace 27.8000μs 14.6831μs 68.1056 KOps/s 68.0125 KOps/s $\color{#35bf28}+0.14\%$
test_plain_set_stack_nested_inplace 0.1699ms 0.1415ms 7.0651 KOps/s 7.0869 KOps/s $\color{#d91a1a}-0.31\%$
test_items 28.2610μs 4.7885μs 208.8335 KOps/s 211.0785 KOps/s $\color{#d91a1a}-1.06\%$
test_items_nested 0.3735ms 0.3399ms 2.9423 KOps/s 2.9616 KOps/s $\color{#d91a1a}-0.65\%$
test_items_nested_locked 0.3589ms 0.3379ms 2.9596 KOps/s 2.9621 KOps/s $\color{#d91a1a}-0.08\%$
test_items_nested_leaf 0.2194ms 0.1993ms 5.0171 KOps/s 5.0480 KOps/s $\color{#d91a1a}-0.61\%$
test_items_stack_nested 1.4684ms 1.4122ms 708.1243 Ops/s 700.1690 Ops/s $\color{#35bf28}+1.14\%$
test_items_stack_nested_leaf 1.3102ms 1.2366ms 808.6906 Ops/s 792.9701 Ops/s $\color{#35bf28}+1.98\%$
test_items_stack_nested_locked 0.8333ms 0.7992ms 1.2512 KOps/s 1.2402 KOps/s $\color{#35bf28}+0.89\%$
test_keys 24.9200μs 4.5677μs 218.9306 KOps/s 217.0303 KOps/s $\color{#35bf28}+0.88\%$
test_keys_nested 0.4946ms 90.9191μs 10.9988 KOps/s 10.4951 KOps/s $\color{#35bf28}+4.80\%$
test_keys_nested_locked 0.1151ms 90.2505μs 11.0803 KOps/s 11.0369 KOps/s $\color{#35bf28}+0.39\%$
test_keys_nested_leaf 0.1679ms 82.8525μs 12.0696 KOps/s 12.1883 KOps/s $\color{#d91a1a}-0.97\%$
test_keys_stack_nested 1.2992ms 1.2258ms 815.8077 Ops/s 814.5359 Ops/s $\color{#35bf28}+0.16\%$
test_keys_stack_nested_leaf 1.3688ms 1.2322ms 811.5307 Ops/s 817.8607 Ops/s $\color{#d91a1a}-0.77\%$
test_keys_stack_nested_locked 0.6437ms 0.5915ms 1.6907 KOps/s 1.7126 KOps/s $\color{#d91a1a}-1.28\%$
test_values 9.7000μs 1.8977μs 526.9544 KOps/s 527.3737 KOps/s $\color{#d91a1a}-0.08\%$
test_values_nested 65.7810μs 43.4507μs 23.0146 KOps/s 23.0490 KOps/s $\color{#d91a1a}-0.15\%$
test_values_nested_locked 65.9310μs 43.5115μs 22.9824 KOps/s 22.8805 KOps/s $\color{#35bf28}+0.45\%$
test_values_nested_leaf 55.3210μs 38.2988μs 26.1105 KOps/s 26.3704 KOps/s $\color{#d91a1a}-0.99\%$
test_values_stack_nested 1.1617ms 1.0953ms 913.0307 Ops/s 924.6192 Ops/s $\color{#d91a1a}-1.25\%$
test_values_stack_nested_leaf 1.1629ms 1.0738ms 931.2849 Ops/s 933.0879 Ops/s $\color{#d91a1a}-0.19\%$
test_values_stack_nested_locked 0.5872ms 0.4823ms 2.0736 KOps/s 2.0650 KOps/s $\color{#35bf28}+0.41\%$
test_membership 4.9333μs 0.9213μs 1.0854 MOps/s 1.0738 MOps/s $\color{#35bf28}+1.08\%$
test_membership_nested 18.0310μs 2.2335μs 447.7348 KOps/s 456.3075 KOps/s $\color{#d91a1a}-1.88\%$
test_membership_nested_leaf 29.2610μs 2.2091μs 452.6636 KOps/s 453.5883 KOps/s $\color{#d91a1a}-0.20\%$
test_membership_stacked_nested 61.8210μs 10.9596μs 91.2442 KOps/s 92.7748 KOps/s $\color{#d91a1a}-1.65\%$
test_membership_stacked_nested_leaf 42.6910μs 10.8349μs 92.2946 KOps/s 92.8859 KOps/s $\color{#d91a1a}-0.64\%$
test_membership_nested_last 19.3910μs 4.6023μs 217.2818 KOps/s 218.6473 KOps/s $\color{#d91a1a}-0.62\%$
test_membership_nested_leaf_last 21.0910μs 4.6036μs 217.2192 KOps/s 217.6629 KOps/s $\color{#d91a1a}-0.20\%$
test_membership_stacked_nested_last 0.1795ms 0.1440ms 6.9466 KOps/s 6.9587 KOps/s $\color{#d91a1a}-0.17\%$
test_membership_stacked_nested_leaf_last 39.2110μs 12.5893μs 79.4326 KOps/s 79.7379 KOps/s $\color{#d91a1a}-0.38\%$
test_nested_getleaf 38.6310μs 9.3750μs 106.6670 KOps/s 107.0910 KOps/s $\color{#d91a1a}-0.40\%$
test_nested_get 41.6710μs 8.8516μs 112.9734 KOps/s 112.9773 KOps/s $-0.00\%$
test_stacked_getleaf 0.5659ms 0.5260ms 1.9012 KOps/s 1.9216 KOps/s $\color{#d91a1a}-1.07\%$
test_stacked_get 0.5378ms 0.4952ms 2.0193 KOps/s 2.0665 KOps/s $\color{#d91a1a}-2.28\%$
test_nested_getitemleaf 22.6300μs 9.4015μs 106.3666 KOps/s 106.4980 KOps/s $\color{#d91a1a}-0.12\%$
test_nested_getitem 32.0510μs 8.8648μs 112.8054 KOps/s 112.4017 KOps/s $\color{#35bf28}+0.36\%$
test_stacked_getitemleaf 0.5544ms 0.5151ms 1.9412 KOps/s 1.8974 KOps/s $\color{#35bf28}+2.31\%$
test_stacked_getitem 0.5272ms 0.4958ms 2.0171 KOps/s 2.0201 KOps/s $\color{#d91a1a}-0.15\%$
test_lock_nested 46.6560ms 0.9357ms 1.0688 KOps/s 1.1201 KOps/s $\color{#d91a1a}-4.59\%$
test_lock_stack_nested 57.0932ms 12.0273ms 83.1440 Ops/s 81.3153 Ops/s $\color{#35bf28}+2.25\%$
test_unlock_nested 45.2928ms 0.9613ms 1.0402 KOps/s 1.0239 KOps/s $\color{#35bf28}+1.59\%$
test_unlock_stack_nested 57.2930ms 12.8563ms 77.7829 Ops/s 76.1280 Ops/s $\color{#35bf28}+2.17\%$
test_flatten_speed 0.6070ms 0.5536ms 1.8063 KOps/s 1.8202 KOps/s $\color{#d91a1a}-0.76\%$
test_unflatten_speed 1.0222ms 0.9977ms 1.0023 KOps/s 1.0055 KOps/s $\color{#d91a1a}-0.32\%$
test_common_ops 0.6466ms 0.5600ms 1.7858 KOps/s 1.7222 KOps/s $\color{#35bf28}+3.69\%$
test_creation 24.1500μs 1.8089μs 552.8155 KOps/s 572.1075 KOps/s $\color{#d91a1a}-3.37\%$
test_creation_empty 22.0700μs 5.9289μs 168.6667 KOps/s 168.6617 KOps/s $+0.00\%$
test_creation_nested_1 27.1020μs 8.8680μs 112.7653 KOps/s 111.8013 KOps/s $\color{#35bf28}+0.86\%$
test_creation_nested_2 68.4700μs 10.9132μs 91.6321 KOps/s 92.3921 KOps/s $\color{#d91a1a}-0.82\%$
test_clone 50.4100μs 11.7188μs 85.3330 KOps/s 85.0797 KOps/s $\color{#35bf28}+0.30\%$
test_getitem[int] 26.6910μs 12.2679μs 81.5137 KOps/s 77.1713 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_getitem[slice_int] 54.3810μs 28.0748μs 35.6191 KOps/s 33.0011 KOps/s $\textbf{\color{#35bf28}+7.93\%}$
test_getitem[range] 75.4710μs 48.3130μs 20.6984 KOps/s 20.4823 KOps/s $\color{#35bf28}+1.05\%$
test_getitem[tuple] 42.2500μs 23.9588μs 41.7383 KOps/s 39.8513 KOps/s $\color{#35bf28}+4.73\%$
test_getitem[list] 0.1778ms 45.4290μs 22.0124 KOps/s 22.2337 KOps/s $\color{#d91a1a}-1.00\%$
test_setitem_dim[int] 42.3500μs 25.6597μs 38.9715 KOps/s 37.1339 KOps/s $\color{#35bf28}+4.95\%$
test_setitem_dim[slice_int] 64.2010μs 44.6639μs 22.3894 KOps/s 21.3017 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_setitem_dim[range] 87.2910μs 62.1334μs 16.0944 KOps/s 15.8757 KOps/s $\color{#35bf28}+1.38\%$
test_setitem_dim[tuple] 57.9410μs 38.9489μs 25.6747 KOps/s 24.5227 KOps/s $\color{#35bf28}+4.70\%$
test_setitem 60.2110μs 14.8633μs 67.2798 KOps/s 64.8663 KOps/s $\color{#35bf28}+3.72\%$
test_set 67.5910μs 14.3718μs 69.5808 KOps/s 67.7465 KOps/s $\color{#35bf28}+2.71\%$
test_set_shared 0.1909ms 0.1120ms 8.9252 KOps/s 8.6845 KOps/s $\color{#35bf28}+2.77\%$
test_update 84.2910μs 17.7554μs 56.3210 KOps/s 54.9274 KOps/s $\color{#35bf28}+2.54\%$
test_update_nested 0.1501ms 24.6553μs 40.5593 KOps/s 39.2830 KOps/s $\color{#35bf28}+3.25\%$
test_set_nested 62.6410μs 15.6680μs 63.8245 KOps/s 62.4115 KOps/s $\color{#35bf28}+2.26\%$
test_set_nested_new 66.5410μs 20.8510μs 47.9594 KOps/s 46.8204 KOps/s $\color{#35bf28}+2.43\%$
test_select 88.8910μs 43.0613μs 23.2227 KOps/s 22.8147 KOps/s $\color{#35bf28}+1.79\%$
test_to 72.6110μs 52.7094μs 18.9720 KOps/s 18.5170 KOps/s $\color{#35bf28}+2.46\%$
test_to_nonblocking 60.2910μs 34.4522μs 29.0257 KOps/s 27.2637 KOps/s $\textbf{\color{#35bf28}+6.46\%}$
test_unbind_speed 0.3885ms 0.2825ms 3.5401 KOps/s 3.4687 KOps/s $\color{#35bf28}+2.06\%$
test_unbind_speed_stack0 51.4765ms 3.7183ms 268.9392 Ops/s 249.0503 Ops/s $\textbf{\color{#35bf28}+7.99\%}$
test_unbind_speed_stack1 1.9041μs 0.4921μs 2.0323 MOps/s 2.0138 MOps/s $\color{#35bf28}+0.92\%$
test_creation[device0] 0.7415ms 0.3087ms 3.2399 KOps/s 3.2100 KOps/s $\color{#35bf28}+0.93\%$
test_creation[device1] 0.6830ms 0.3119ms 3.2058 KOps/s 3.1313 KOps/s $\color{#35bf28}+2.38\%$
test_creation_from_tensor 0.6832ms 0.3381ms 2.9578 KOps/s 2.8705 KOps/s $\color{#35bf28}+3.04\%$
test_add_one[memmap_tensor0] 54.2317ms 32.0830μs 31.1692 KOps/s 38.0639 KOps/s $\textbf{\color{#d91a1a}-18.11\%}$
test_add_one[memmap_tensor1] 0.1906ms 75.1431μs 13.3079 KOps/s 13.3301 KOps/s $\color{#d91a1a}-0.17\%$
test_contiguous[memmap_tensor0] 27.0810μs 5.8308μs 171.5016 KOps/s 162.2747 KOps/s $\textbf{\color{#35bf28}+5.69\%}$
test_contiguous[memmap_tensor1] 0.2248ms 22.2438μs 44.9563 KOps/s 43.9505 KOps/s $\color{#35bf28}+2.29\%$
test_stack[memmap_tensor0] 45.2410μs 19.5860μs 51.0568 KOps/s 47.3788 KOps/s $\textbf{\color{#35bf28}+7.76\%}$
test_stack[memmap_tensor1] 0.1588ms 74.3386μs 13.4520 KOps/s 13.1863 KOps/s $\color{#35bf28}+2.01\%$
test_memmaptd_index 0.2489ms 0.2168ms 4.6118 KOps/s 4.2930 KOps/s $\textbf{\color{#35bf28}+7.43\%}$
test_memmaptd_index_astensor 0.3410ms 0.2763ms 3.6189 KOps/s 3.5038 KOps/s $\color{#35bf28}+3.29\%$
test_memmaptd_index_op 0.5834ms 0.5246ms 1.9063 KOps/s 1.7921 KOps/s $\textbf{\color{#35bf28}+6.37\%}$
test_reshape_pytree 50.1910μs 21.2161μs 47.1341 KOps/s 46.8572 KOps/s $\color{#35bf28}+0.59\%$
test_reshape_td 39.7300μs 21.3154μs 46.9145 KOps/s 45.8380 KOps/s $\color{#35bf28}+2.35\%$
test_view_pytree 54.1910μs 20.9896μs 47.6427 KOps/s 46.9401 KOps/s $\color{#35bf28}+1.50\%$
test_view_td 19.0020μs 3.3768μs 296.1397 KOps/s 300.1787 KOps/s $\color{#d91a1a}-1.35\%$
test_unbind_pytree 50.1610μs 26.1431μs 38.2510 KOps/s 36.9766 KOps/s $\color{#35bf28}+3.45\%$
test_unbind_td 62.1600μs 40.4411μs 24.7273 KOps/s 23.5124 KOps/s $\textbf{\color{#35bf28}+5.17\%}$
test_split_pytree 47.0310μs 23.8776μs 41.8803 KOps/s 40.4728 KOps/s $\color{#35bf28}+3.48\%$
test_split_td 0.1401ms 72.0755μs 13.8743 KOps/s 13.4419 KOps/s $\color{#35bf28}+3.22\%$
test_add_pytree 60.1100μs 32.9007μs 30.3945 KOps/s 29.5501 KOps/s $\color{#35bf28}+2.86\%$
test_add_td 68.1310μs 43.4180μs 23.0319 KOps/s 21.7735 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_distributed 21.4200μs 5.5713μs 179.4907 KOps/s 178.4085 KOps/s $\color{#35bf28}+0.61\%$
test_tdmodule 39.9900μs 17.1303μs 58.3762 KOps/s 59.8777 KOps/s $\color{#d91a1a}-2.51\%$
test_tdmodule_dispatch 0.2231ms 31.8906μs 31.3572 KOps/s 32.0671 KOps/s $\color{#d91a1a}-2.21\%$
test_tdseq 39.0100μs 19.6405μs 50.9153 KOps/s 49.8640 KOps/s $\color{#35bf28}+2.11\%$
test_tdseq_dispatch 50.3500μs 34.4426μs 29.0338 KOps/s 28.7503 KOps/s $\color{#35bf28}+0.99\%$
test_instantiation_functorch 1.7501ms 1.6738ms 597.4489 Ops/s 582.9304 Ops/s $\color{#35bf28}+2.49\%$
test_instantiation_td 1.7298ms 1.2056ms 829.4932 Ops/s 819.8146 Ops/s $\color{#35bf28}+1.18\%$
test_exec_functorch 0.2204ms 0.1582ms 6.3193 KOps/s 6.1579 KOps/s $\color{#35bf28}+2.62\%$
test_exec_td 0.1876ms 0.1489ms 6.7148 KOps/s 6.4647 KOps/s $\color{#35bf28}+3.87\%$
test_vmap_mlp_speed[True-True] 1.1489ms 1.0607ms 942.7349 Ops/s 942.4423 Ops/s $\color{#35bf28}+0.03\%$
test_vmap_mlp_speed[True-False] 0.6715ms 0.6232ms 1.6046 KOps/s 1.5932 KOps/s $\color{#35bf28}+0.72\%$
test_vmap_mlp_speed[False-True] 1.0760ms 1.0100ms 990.0884 Ops/s 1.0235 KOps/s $\color{#d91a1a}-3.27\%$
test_vmap_mlp_speed[False-False] 0.6463ms 0.5824ms 1.7170 KOps/s 1.7761 KOps/s $\color{#d91a1a}-3.33\%$
test_vmap_transformer_speed[True-True] 13.1967ms 12.6463ms 79.0747 Ops/s 79.1753 Ops/s $\color{#d91a1a}-0.13\%$
test_vmap_transformer_speed[True-False] 8.5091ms 8.3919ms 119.1619 Ops/s 118.1824 Ops/s $\color{#35bf28}+0.83\%$
test_vmap_transformer_speed[False-True] 13.6355ms 12.5679ms 79.5681 Ops/s 79.6068 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed[False-False] 8.4797ms 8.3251ms 120.1191 Ops/s 119.3198 Ops/s $\color{#35bf28}+0.67\%$
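
In both tables, the Ops column looks like the reciprocal of the Mean column, scaled to Ops/s, KOps/s, or MOps/s. The sketch below illustrates that relation under this assumption. The helper names are hypothetical, and small mismatches against the table are expected because the printed Mean values are rounded.

```python
def parse_time(text: str) -> float:
    """Convert a duration such as '14.9259μs' or '0.1374ms' to seconds."""
    # Longer suffixes are checked first so that 'ms' is not read as plain 's'.
    # Both the Greek mu and the micro sign are accepted.
    for suffix, scale in (("μs", 1e-6), ("µs", 1e-6), ("ms", 1e-3), ("s", 1.0)):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * scale
    raise ValueError(f"unrecognised time unit in {text!r}")


def ops_per_second(mean_text: str) -> float:
    """Throughput implied by a mean run time: one operation per mean duration."""
    return 1.0 / parse_time(mean_text)


def format_ops(ops: float) -> str:
    """Render throughput with the unit prefixes used in the tables."""
    if ops >= 1e6:
        return f"{ops / 1e6:.4f} MOps/s"
    if ops >= 1e3:
        return f"{ops / 1e3:.4f} KOps/s"
    return f"{ops:.4f} Ops/s"


# Mean values copied from the tables above.
for mean in ("14.9259μs", "0.4921μs", "1.0100ms"):
    print(mean, "->", format_ops(ops_per_second(mean)))
```

Comparing the printed values against the corresponding Ops entries (66.9975 KOps/s, 2.0323 MOps/s, and 990.0884 Ops/s) gives agreement to roughly three or four significant figures.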
