[BugFix] Fix unbind #471

Merged (6 commits) on Jul 3, 2023

@vmoens (Contributor) commented Jul 3, 2023

@facebook-github-bot added the "CLA Signed" label Jul 3, 2023
@vmoens added the "bug" label Jul 3, 2023
@vmoens linked an issue Jul 3, 2023 that may be closed by this pull request

@matteobettini (Contributor) left a comment

LGTM

@matteobettini (Contributor) commented

Fixes #469

@github-actions bot commented Jul 3, 2023

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 103. Improved: 5. Worsened: 1.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 0.1945ms | 22.4311μs | 44.5810 KOps/s | 44.5149 KOps/s | +0.15% |
| test_plain_set_stack_nested | 0.3401ms | 0.2727ms | 3.6664 KOps/s | 3.6439 KOps/s | +0.62% |
| test_plain_set_nested_inplace | 52.7000μs | 25.7829μs | 38.7854 KOps/s | 38.8550 KOps/s | -0.18% |
| test_plain_set_stack_nested_inplace | 0.7535ms | 0.4583ms | 2.1820 KOps/s | 2.1414 KOps/s | +1.90% |
| test_items | 51.9990μs | 3.2396μs | 308.6823 KOps/s | 322.3116 KOps/s | -4.23% |
| test_items_nested | 2.6063ms | 0.3578ms | 2.7952 KOps/s | 2.8293 KOps/s | -1.20% |
| test_items_nested_locked | 0.4125ms | 0.3512ms | 2.8475 KOps/s | 2.8392 KOps/s | +0.29% |
| test_items_nested_leaf | 0.2888ms | 0.2131ms | 4.6929 KOps/s | 4.6932 KOps/s | -0.01% |
| test_items_stack_nested | 2.2910ms | 2.0600ms | 485.4405 Ops/s | 481.8085 Ops/s | +0.75% |
| test_items_stack_nested_leaf | 2.0496ms | 1.8970ms | 527.1527 Ops/s | 523.1054 Ops/s | +0.77% |
| test_items_stack_nested_locked | 1.0747ms | 0.9417ms | 1.0619 KOps/s | 1.0345 KOps/s | +2.64% |
| test_keys | 60.8000μs | 4.4818μs | 223.1246 KOps/s | 223.4729 KOps/s | -0.16% |
| test_keys_nested | 1.4981ms | 0.1700ms | 5.8825 KOps/s | 5.7926 KOps/s | +1.55% |
| test_keys_nested_locked | 0.2272ms | 0.1689ms | 5.9205 KOps/s | 5.8640 KOps/s | +0.96% |
| test_keys_nested_leaf | 0.3073ms | 0.1718ms | 5.8219 KOps/s | 5.4269 KOps/s | **+7.28%** |
| test_keys_stack_nested | 1.9961ms | 1.8372ms | 544.2920 Ops/s | 539.5147 Ops/s | +0.89% |
| test_keys_stack_nested_leaf | 1.9368ms | 1.8341ms | 545.2312 Ops/s | 535.1989 Ops/s | +1.87% |
| test_keys_stack_nested_locked | 0.8655ms | 0.7733ms | 1.2932 KOps/s | 1.2695 KOps/s | +1.87% |
| test_values | 23.9000μs | 1.3269μs | 753.6135 KOps/s | 840.9268 KOps/s | **-10.38%** |
| test_values_nested | 0.1217ms | 65.7141μs | 15.2174 KOps/s | 15.0729 KOps/s | +0.96% |
| test_values_nested_locked | 0.1187ms | 65.7385μs | 15.2118 KOps/s | 15.0321 KOps/s | +1.20% |
| test_values_nested_leaf | 90.9000μs | 57.2368μs | 17.4713 KOps/s | 17.1679 KOps/s | +1.77% |
| test_values_stack_nested | 1.7761ms | 1.7251ms | 579.6930 Ops/s | 571.8518 Ops/s | +1.37% |
| test_values_stack_nested_leaf | 1.8720ms | 1.7122ms | 584.0358 Ops/s | 578.6508 Ops/s | +0.93% |
| test_values_stack_nested_locked | 0.7319ms | 0.6437ms | 1.5535 KOps/s | 1.5149 KOps/s | +2.55% |
| test_membership | 15.5000μs | 1.8124μs | 551.7590 KOps/s | 561.2155 KOps/s | -1.69% |
| test_membership_nested | 16.9000μs | 3.6907μs | 270.9533 KOps/s | 278.1488 KOps/s | -2.59% |
| test_membership_nested_leaf | 56.9000μs | 3.6874μs | 271.1920 KOps/s | 277.4325 KOps/s | -2.25% |
| test_membership_stacked_nested | 31.2000μs | 6.5146μs | 153.5022 KOps/s | 157.0549 KOps/s | -2.26% |
| test_membership_stacked_nested_leaf | 29.2000μs | 6.5602μs | 152.4340 KOps/s | 151.7591 KOps/s | +0.44% |
| test_membership_nested_last | 59.3990μs | 7.3490μs | 136.0722 KOps/s | 138.8834 KOps/s | -2.02% |
| test_membership_nested_leaf_last | 30.9000μs | 7.4029μs | 135.0820 KOps/s | 138.7461 KOps/s | -2.64% |
| test_membership_stacked_nested_last | 0.2474ms | 0.2028ms | 4.9314 KOps/s | 4.9421 KOps/s | -0.22% |
| test_membership_stacked_nested_leaf_last | 0.2669ms | 0.2042ms | 4.8977 KOps/s | 4.9576 KOps/s | -1.21% |
| test_stacked_getleaf | 1.3295ms | 1.2321ms | 811.6236 Ops/s | 803.9286 Ops/s | +0.96% |
| test_stacked_get | 1.2589ms | 1.1902ms | 840.2109 Ops/s | 838.8096 Ops/s | +0.17% |
| test_lock_nested | 1.0151ms | 0.9460ms | 1.0571 KOps/s | 1.0497 KOps/s | +0.70% |
| test_lock_stack_nested | 72.5387ms | 12.9868ms | 77.0015 Ops/s | 76.1642 Ops/s | +1.10% |
| test_unlock_nested | 1.7513ms | 0.9652ms | 1.0360 KOps/s | 1.0395 KOps/s | -0.33% |
| test_unlock_stack_nested | 72.2449ms | 13.4595ms | 74.2968 Ops/s | 73.5720 Ops/s | +0.99% |
| test_flatten_speed | 1.0993ms | 0.9557ms | 1.0463 KOps/s | 1.0903 KOps/s | -4.03% |
| test_unflatten_speed | 1.7245ms | 1.6332ms | 612.3102 Ops/s | 615.6415 Ops/s | -0.54% |
| test_common_ops | 1.3664ms | 1.0329ms | 968.1094 Ops/s | 966.9230 Ops/s | +0.12% |
| test_creation | 39.1000μs | 6.4408μs | 155.2593 KOps/s | 156.7658 KOps/s | -0.96% |
| test_creation_empty | 41.1000μs | 13.2704μs | 75.3555 KOps/s | 78.4214 KOps/s | -3.91% |
| test_creation_nested_1 | 80.1990μs | 24.1525μs | 41.4036 KOps/s | 42.3889 KOps/s | -2.32% |
| test_creation_nested_2 | 53.7990μs | 25.3773μs | 39.4053 KOps/s | 39.9764 KOps/s | -1.43% |
| test_clone | 0.2350ms | 24.5888μs | 40.6690 KOps/s | 39.6042 KOps/s | +2.69% |
| test_getitem[int] | 90.5990μs | 29.6988μs | 33.6714 KOps/s | 31.9371 KOps/s | **+5.43%** |
| test_getitem[slice_int] | 92.9000μs | 65.6365μs | 15.2354 KOps/s | 15.6260 KOps/s | -2.50% |
| test_getitem[range] | 88.2000μs | 63.8449μs | 15.6630 KOps/s | 15.4891 KOps/s | +1.12% |
| test_getitem[tuple] | 0.1386ms | 59.2440μs | 16.8794 KOps/s | 16.8179 KOps/s | +0.37% |
| test_getitem[list] | 82.0990μs | 54.5737μs | 18.3239 KOps/s | 18.4798 KOps/s | -0.84% |
| test_setitem_dim[int] | 76.3000μs | 42.4078μs | 23.5806 KOps/s | 23.4111 KOps/s | +0.72% |
| test_setitem_dim[slice_int] | 0.1153ms | 80.4401μs | 12.4316 KOps/s | 12.4183 KOps/s | +0.11% |
| test_setitem_dim[range] | 0.1004ms | 73.3023μs | 13.6421 KOps/s | 13.6868 KOps/s | -0.33% |
| test_setitem_dim[tuple] | 89.2990μs | 73.5168μs | 13.6023 KOps/s | 13.5984 KOps/s | +0.03% |
| test_setitem | 0.2254ms | 29.6283μs | 33.7515 KOps/s | 33.2791 KOps/s | +1.42% |
| test_set | 0.2240ms | 29.2203μs | 34.2227 KOps/s | 33.7123 KOps/s | +1.51% |
| test_set_shared | 0.3914ms | 0.1503ms | 6.6512 KOps/s | 6.6412 KOps/s | +0.15% |
| test_update | 0.2585ms | 30.7445μs | 32.5262 KOps/s | 32.0600 KOps/s | +1.45% |
| test_update_nested | 0.2539ms | 47.9941μs | 20.8359 KOps/s | 20.5977 KOps/s | +1.16% |
| test_set_nested | 0.2530ms | 40.8383μs | 24.4868 KOps/s | 24.3574 KOps/s | +0.53% |
| test_set_nested_new | 0.1189ms | 60.7248μs | 16.4677 KOps/s | 16.4442 KOps/s | +0.14% |
| test_select | 0.1560ms | 0.1038ms | 9.6296 KOps/s | 9.5242 KOps/s | +1.11% |
| test_unbind_speed | 0.6994ms | 0.6392ms | 1.5644 KOps/s | 1.1276 KOps/s | **+38.74%** |
| test_unbind_speed_stack0 | 3.5760ms | 3.0240ms | 330.6911 Ops/s | 138.8365 Ops/s | **+138.19%** |
| test_unbind_speed_stack1 | 2.4950μs | 0.4474μs | 2.2352 MOps/s | 2.1472 MOps/s | +4.10% |
| test_creation[device0] | 0.7728ms | 0.3273ms | 3.0552 KOps/s | 3.0522 KOps/s | +0.10% |
| test_creation_from_tensor | 0.7939ms | 0.3668ms | 2.7265 KOps/s | 2.7045 KOps/s | +0.81% |
| test_add_one[memmap_tensor0] | 0.9604ms | 30.2258μs | 33.0844 KOps/s | 32.4961 KOps/s | +1.81% |
| test_contiguous[memmap_tensor0] | 68.2990μs | 8.2751μs | 120.8440 KOps/s | 117.4186 KOps/s | +2.92% |
| test_stack[memmap_tensor0] | 0.1288ms | 24.9719μs | 40.0451 KOps/s | 39.0742 KOps/s | +2.48% |
| test_memmaptd_index | 0.3402ms | 0.2725ms | 3.6701 KOps/s | 3.6822 KOps/s | -0.33% |
| test_memmaptd_index_astensor | 1.1757ms | 0.9810ms | 1.0194 KOps/s | 995.1240 Ops/s | +2.44% |
| test_memmaptd_index_op | 2.3393ms | 2.0086ms | 497.8492 Ops/s | 488.7578 Ops/s | +1.86% |
| test_reshape_pytree | 0.1202ms | 36.6679μs | 27.2718 KOps/s | 27.6756 KOps/s | -1.46% |
| test_reshape_td | 0.1079ms | 45.1521μs | 22.1474 KOps/s | 22.0648 KOps/s | +0.37% |
| test_view_pytree | 81.8980μs | 33.8437μs | 29.5476 KOps/s | 30.0164 KOps/s | -1.56% |
| test_view_td | 62.4990μs | 8.6830μs | 115.1681 KOps/s | 113.9284 KOps/s | +1.09% |
| test_unbind_pytree | 73.6990μs | 38.7668μs | 25.7953 KOps/s | 26.4407 KOps/s | -2.44% |
| test_unbind_td | 0.2161ms | 96.0348μs | 10.4129 KOps/s | 7.2559 KOps/s | **+43.51%** |
| test_split_pytree | 76.5990μs | 42.1155μs | 23.7442 KOps/s | 23.5235 KOps/s | +0.94% |
| test_split_td | 0.7876ms | 0.1140ms | 8.7699 KOps/s | 8.6855 KOps/s | +0.97% |
| test_add_pytree | 82.2980μs | 45.0528μs | 22.1962 KOps/s | 21.8352 KOps/s | +1.65% |
| test_add_td | 0.1017ms | 56.7828μs | 17.6110 KOps/s | 17.6986 KOps/s | -0.49% |
| test_distributed | 68.6990μs | 8.3389μs | 119.9201 KOps/s | 118.4837 KOps/s | +1.21% |
| test_tdmodule | 0.1297ms | 22.2962μs | 44.8506 KOps/s | 43.4720 KOps/s | +3.17% |
| test_tdmodule_dispatch | 0.2900ms | 49.7941μs | 20.0827 KOps/s | 20.1821 KOps/s | -0.49% |
| test_tdseq | 0.1379ms | 24.7214μs | 40.4508 KOps/s | 40.4175 KOps/s | +0.08% |
| test_tdseq_dispatch | 0.1861ms | 53.2595μs | 18.7760 KOps/s | 18.4976 KOps/s | +1.51% |
| test_instantiation_functorch | 1.6579ms | 1.5224ms | 656.8713 Ops/s | 651.4850 Ops/s | +0.83% |
| test_instantiation_td | 1.9381ms | 1.2578ms | 795.0489 Ops/s | 793.6341 Ops/s | +0.18% |
| test_exec_functorch | 0.2636ms | 0.1754ms | 5.7024 KOps/s | 5.6409 KOps/s | +1.09% |
| test_exec_td | 0.1975ms | 0.1637ms | 6.1089 KOps/s | 6.1351 KOps/s | -0.43% |
| test_vmap_mlp_speed[True-True] | 2.3210ms | 1.2845ms | 778.5291 Ops/s | 790.5055 Ops/s | -1.52% |
| test_vmap_mlp_speed[True-False] | 5.8376ms | 0.5058ms | 1.9769 KOps/s | 2.0243 KOps/s | -2.34% |
| test_vmap_mlp_speed[False-True] | 1.3815ms | 1.0521ms | 950.4395 Ops/s | 941.3249 Ops/s | +0.97% |
| test_vmap_mlp_speed[False-False] | 1.7675ms | 0.3945ms | 2.5346 KOps/s | 2.4855 KOps/s | +1.98% |
| test_vmap_transformer_speed[True-True] | 16.8433ms | 16.4239ms | 60.8869 Ops/s | 61.8070 Ops/s | -1.49% |
| test_vmap_transformer_speed[True-False] | 11.2644ms | 7.7691ms | 128.7153 Ops/s | 130.8829 Ops/s | -1.66% |
| test_vmap_transformer_speed[False-True] | 16.2424ms | 15.6658ms | 63.8334 Ops/s | 64.0889 Ops/s | -0.40% |
| test_vmap_transformer_speed[False-False] | 7.8829ms | 7.5162ms | 133.0468 Ops/s | 134.4283 Ops/s | -1.03% |
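
For readers checking the numbers: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD. The sketch below recomputes it for the unbind rows and for test_values under that assumption. The helper name `percent_change` is hypothetical and is not taken from the benchmark bot.

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent.

    Assumption: this is how the Change column is derived; the bot's own
    formula is not shown in the report.
    """
    return (ops_pr / ops_head - 1.0) * 100.0


# (Ops on this PR, Ops on repo HEAD), copied from the table.
# Both values in a row share the same unit, so the ratio is unit-free.
rows = {
    "test_unbind_speed": (1.5644, 1.1276),             # KOps/s
    "test_unbind_speed_stack0": (330.6911, 138.8365),  # Ops/s
    "test_unbind_td": (10.4129, 7.2559),               # KOps/s
    "test_values": (753.6135, 840.9268),               # KOps/s
}

changes = {
    name: round(percent_change(pr, head), 2)
    for name, (pr, head) in rows.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Under this assumption the recomputed values agree with the reported Change column to two decimal places for these four rows.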

@vmoens merged commit 3bb784d into main Jul 3, 2023
@vmoens deleted the fix_unbind branch July 3, 2023 12:43
Labels
bug, CLA Signed
Development

Successfully merging this pull request may close these issues.

[BUG] Unbind resets nested batch sizes
3 participants