[BugFix] Support empty tuple in lazy stack indexing #696

Merged: 2 commits, Mar 4, 2024

Conversation

@vmoens (Contributor) commented Mar 4, 2024

No description provided.
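
Since the PR carries no description, here is a minimal sketch of what the title refers to, under stated assumptions. In NumPy and PyTorch, indexing with an empty tuple (`x[()]`) selects nothing and returns the whole array or tensor. The `ToyLazyStack` class below is a hypothetical pure-Python stand-in for a lazily stacked container. It is not tensordict's implementation and does not reproduce the actual change in this PR; it only shows the empty-tuple case being handled before the index is unpacked.

```python
from typing import Any, List, Tuple, Union

Index = Union[int, slice, Tuple[Any, ...]]


class ToyLazyStack:
    """Hypothetical stand-in for a lazily stacked container (illustration only)."""

    def __init__(self, items: List[Any]) -> None:
        self.items = list(items)

    def __len__(self) -> int:
        return len(self.items)

    def __getitem__(self, index: Index) -> Any:
        if isinstance(index, tuple):
            if len(index) == 0:
                # Empty tuple: nothing is selected, so return the whole stack.
                return self
            if len(index) > 1:
                raise IndexError("this toy only has a single (stack) dimension")
            # One-element tuple: unwrap and fall through to the usual handling.
            index = index[0]
        if isinstance(index, slice):
            return ToyLazyStack(self.items[index])
        return self.items[index]


stack = ToyLazyStack(["a", "b", "c"])
whole = stack[()]   # empty tuple: the stack itself
first = stack[0]    # integer index: one stacked element
tail = stack[1:]    # slice: a smaller stack
```

Without the explicit `len(index) == 0` branch, code that unpacks `index[0]` would raise on an empty tuple, which is the kind of failure a fix like this one typically guards against.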

@facebook-github-bot added the CLA Signed label Mar 4, 2024
@vmoens marked this pull request as ready for review March 4, 2024 12:34
@vmoens added the bug label Mar 4, 2024
github-actions bot commented Mar 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 126. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}2$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 44.9240μs | 17.2803μs | 57.8693 KOps/s | 55.6877 KOps/s | $\color{#35bf28}+3.92\%$ |
| test_plain_set_stack_nested | 55.4240μs | 17.5959μs | 56.8315 KOps/s | 55.1297 KOps/s | $\color{#35bf28}+3.09\%$ |
| test_plain_set_nested_inplace | 60.9540μs | 19.6609μs | 50.8625 KOps/s | 49.6580 KOps/s | $\color{#35bf28}+2.43\%$ |
| test_plain_set_stack_nested_inplace | 48.8610μs | 19.6046μs | 51.0084 KOps/s | 49.8873 KOps/s | $\color{#35bf28}+2.25\%$ |
| test_items | 19.5770μs | 2.3995μs | 416.7586 KOps/s | 416.1104 KOps/s | $\color{#35bf28}+0.16\%$ |
| test_items_nested | 0.3274ms | 0.2669ms | 3.7464 KOps/s | 3.6392 KOps/s | $\color{#35bf28}+2.95\%$ |
| test_items_nested_locked | 1.0780ms | 0.2678ms | 3.7336 KOps/s | 3.6542 KOps/s | $\color{#35bf28}+2.17\%$ |
| test_items_nested_leaf | 0.2179ms | 0.1655ms | 6.0424 KOps/s | 5.8660 KOps/s | $\color{#35bf28}+3.01\%$ |
| test_items_stack_nested | 0.3911ms | 0.2704ms | 3.6986 KOps/s | 3.5868 KOps/s | $\color{#35bf28}+3.12\%$ |
| test_items_stack_nested_leaf | 0.6082ms | 0.1664ms | 6.0102 KOps/s | 5.9614 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_items_stack_nested_locked | 0.3999ms | 0.2711ms | 3.6886 KOps/s | 3.5894 KOps/s | $\color{#35bf28}+2.76\%$ |
| test_keys | 27.7720μs | 3.8177μs | 261.9345 KOps/s | 250.7571 KOps/s | $\color{#35bf28}+4.46\%$ |
| test_keys_nested | 1.9794ms | 0.1523ms | 6.5680 KOps/s | 6.6749 KOps/s | $\color{#d91a1a}-1.60\%$ |
| test_keys_nested_locked | 0.3159ms | 0.1521ms | 6.5751 KOps/s | 6.5098 KOps/s | $\color{#35bf28}+1.00\%$ |
| test_keys_nested_leaf | 34.7881ms | 0.1354ms | 7.3880 KOps/s | 7.6417 KOps/s | $\color{#d91a1a}-3.32\%$ |
| test_keys_stack_nested | 0.2625ms | 0.1539ms | 6.4970 KOps/s | 6.5457 KOps/s | $\color{#d91a1a}-0.74\%$ |
| test_keys_stack_nested_leaf | 0.2304ms | 0.1328ms | 7.5279 KOps/s | 7.4663 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_keys_stack_nested_locked | 0.2993ms | 0.1607ms | 6.2217 KOps/s | 6.3632 KOps/s | $\color{#d91a1a}-2.22\%$ |
| test_values | 4.8690μs | 1.1897μs | 840.5215 KOps/s | 849.2419 KOps/s | $\color{#d91a1a}-1.03\%$ |
| test_values_nested | 0.1026ms | 51.8233μs | 19.2963 KOps/s | 19.0060 KOps/s | $\color{#35bf28}+1.53\%$ |
| test_values_nested_locked | 0.1457ms | 51.3729μs | 19.4655 KOps/s | 18.9323 KOps/s | $\color{#35bf28}+2.82\%$ |
| test_values_nested_leaf | 89.8290μs | 45.7269μs | 21.8690 KOps/s | 21.2054 KOps/s | $\color{#35bf28}+3.13\%$ |
| test_values_stack_nested | 0.1195ms | 52.2800μs | 19.1278 KOps/s | 18.5346 KOps/s | $\color{#35bf28}+3.20\%$ |
| test_values_stack_nested_leaf | 88.9270μs | 46.0913μs | 21.6961 KOps/s | 21.2882 KOps/s | $\color{#35bf28}+1.92\%$ |
| test_values_stack_nested_locked | 0.1047ms | 51.9859μs | 19.2360 KOps/s | 18.7923 KOps/s | $\color{#35bf28}+2.36\%$ |
| test_membership | 21.0890μs | 1.3388μs | 746.9391 KOps/s | 739.6683 KOps/s | $\color{#35bf28}+0.98\%$ |
| test_membership_nested | 46.8880μs | 3.5150μs | 284.4965 KOps/s | 290.5404 KOps/s | $\color{#d91a1a}-2.08\%$ |
| test_membership_nested_leaf | 34.2850μs | 3.4866μs | 286.8114 KOps/s | 290.0186 KOps/s | $\color{#d91a1a}-1.11\%$ |
| test_membership_stacked_nested | 20.9000μs | 3.4411μs | 290.6086 KOps/s | 292.5549 KOps/s | $\color{#d91a1a}-0.67\%$ |
| test_membership_stacked_nested_leaf | 32.6320μs | 3.5025μs | 285.5086 KOps/s | 257.1414 KOps/s | $\textbf{\color{#35bf28}+11.03\%}$ |
| test_membership_nested_last | 47.1580μs | 4.3107μs | 231.9830 KOps/s | 233.3943 KOps/s | $\color{#d91a1a}-0.60\%$ |
| test_membership_nested_leaf_last | 31.7590μs | 4.3718μs | 228.7378 KOps/s | 232.6066 KOps/s | $\color{#d91a1a}-1.66\%$ |
| test_membership_stacked_nested_last | 27.6920μs | 5.5052μs | 181.6473 KOps/s | 102.8258 KOps/s | $\textbf{\color{#35bf28}+76.66\%}$ |
| test_membership_stacked_nested_leaf_last | 24.0450μs | 5.5567μs | 179.9627 KOps/s | 102.3026 KOps/s | $\textbf{\color{#35bf28}+75.91\%}$ |
| test_nested_getleaf | 35.9370μs | 10.5554μs | 94.7380 KOps/s | 95.8392 KOps/s | $\color{#d91a1a}-1.15\%$ |
| test_nested_get | 48.2110μs | 10.1973μs | 98.0656 KOps/s | 100.3289 KOps/s | $\color{#d91a1a}-2.26\%$ |
| test_stacked_getleaf | 32.9920μs | 10.9812μs | 91.0645 KOps/s | 95.4730 KOps/s | $\color{#d91a1a}-4.62\%$ |
| test_stacked_get | 27.7830μs | 9.8909μs | 101.1033 KOps/s | 101.4098 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_nested_getitemleaf | 48.9120μs | 11.1373μs | 89.7886 KOps/s | 90.9963 KOps/s | $\color{#d91a1a}-1.33\%$ |
| test_nested_getitem | 28.0520μs | 10.5040μs | 95.2018 KOps/s | 97.1714 KOps/s | $\color{#d91a1a}-2.03\%$ |
| test_stacked_getitemleaf | 37.6610μs | 10.9176μs | 91.5955 KOps/s | 92.7195 KOps/s | $\color{#d91a1a}-1.21\%$ |
| test_stacked_getitem | 41.9290μs | 10.5270μs | 94.9934 KOps/s | 97.4616 KOps/s | $\color{#d91a1a}-2.53\%$ |
| test_lock_nested | 0.7866ms | 0.3404ms | 2.9381 KOps/s | 2.9456 KOps/s | $\color{#d91a1a}-0.26\%$ |
| test_lock_stack_nested | 0.4155ms | 0.2986ms | 3.3489 KOps/s | 3.4367 KOps/s | $\color{#d91a1a}-2.55\%$ |
| test_unlock_nested | 97.2247ms | 0.4409ms | 2.2679 KOps/s | 2.1817 KOps/s | $\color{#35bf28}+3.95\%$ |
| test_unlock_stack_nested | 0.3965ms | 0.3060ms | 3.2677 KOps/s | 3.3310 KOps/s | $\color{#d91a1a}-1.90\%$ |
| test_flatten_speed | 0.6350ms | 0.2840ms | 3.5208 KOps/s | 3.4994 KOps/s | $\color{#35bf28}+0.61\%$ |
| test_unflatten_speed | 0.8060ms | 0.4129ms | 2.4221 KOps/s | 2.3952 KOps/s | $\color{#35bf28}+1.13\%$ |
| test_common_ops | 4.2848ms | 0.7193ms | 1.3902 KOps/s | 1.3248 KOps/s | $\color{#35bf28}+4.93\%$ |
| test_creation | 17.4230μs | 1.8249μs | 547.9717 KOps/s | 550.7654 KOps/s | $\color{#d91a1a}-0.51\%$ |
| test_creation_empty | 33.9340μs | 10.6854μs | 93.5857 KOps/s | 87.5624 KOps/s | $\textbf{\color{#35bf28}+6.88\%}$ |
| test_creation_nested_1 | 42.3300μs | 13.3031μs | 75.1704 KOps/s | 71.7653 KOps/s | $\color{#35bf28}+4.74\%$ |
| test_creation_nested_2 | 38.2620μs | 16.6309μs | 60.1289 KOps/s | 57.6553 KOps/s | $\color{#35bf28}+4.29\%$ |
| test_clone | 0.1404ms | 13.5522μs | 73.7888 KOps/s | 74.6718 KOps/s | $\color{#d91a1a}-1.18\%$ |
| test_getitem[int] | 31.3290μs | 11.3495μs | 88.1093 KOps/s | 87.9478 KOps/s | $\color{#35bf28}+0.18\%$ |
| test_getitem[slice_int] | 61.8650μs | 22.6587μs | 44.1331 KOps/s | 42.6027 KOps/s | $\color{#35bf28}+3.59\%$ |
| test_getitem[range] | 0.1062ms | 42.3851μs | 23.5932 KOps/s | 23.7682 KOps/s | $\color{#d91a1a}-0.74\%$ |
| test_getitem[tuple] | 77.8660μs | 18.7105μs | 53.4459 KOps/s | 53.4405 KOps/s | $\color{#35bf28}+0.01\%$ |
| test_getitem[list] | 0.3827ms | 37.6271μs | 26.5766 KOps/s | 27.0571 KOps/s | $\color{#d91a1a}-1.78\%$ |
| test_setitem_dim[int] | 67.3360μs | 34.2735μs | 29.1771 KOps/s | 29.2976 KOps/s | $\color{#d91a1a}-0.41\%$ |
| test_setitem_dim[slice_int] | 0.1071ms | 61.6586μs | 16.2183 KOps/s | 16.2725 KOps/s | $\color{#d91a1a}-0.33\%$ |
| test_setitem_dim[range] | 0.1515ms | 80.4339μs | 12.4326 KOps/s | 12.4473 KOps/s | $\color{#d91a1a}-0.12\%$ |
| test_setitem_dim[tuple] | 97.3020μs | 49.6980μs | 20.1215 KOps/s | 20.2439 KOps/s | $\color{#d91a1a}-0.60\%$ |
| test_setitem | 0.1777ms | 20.4533μs | 48.8918 KOps/s | 48.3680 KOps/s | $\color{#35bf28}+1.08\%$ |
| test_set | 0.1695ms | 19.8297μs | 50.4294 KOps/s | 49.7259 KOps/s | $\color{#35bf28}+1.41\%$ |
| test_set_shared | 1.8828ms | 0.1460ms | 6.8514 KOps/s | 6.8748 KOps/s | $\color{#d91a1a}-0.34\%$ |
| test_update | 0.1797ms | 23.1089μs | 43.2734 KOps/s | 41.9956 KOps/s | $\color{#35bf28}+3.04\%$ |
| test_update_nested | 0.1524ms | 31.6331μs | 31.6125 KOps/s | 31.4615 KOps/s | $\color{#35bf28}+0.48\%$ |
| test_set_nested | 0.1487ms | 21.7916μs | 45.8892 KOps/s | 45.3840 KOps/s | $\color{#35bf28}+1.11\%$ |
| test_set_nested_new | 93.0550μs | 25.8267μs | 38.7197 KOps/s | 38.5559 KOps/s | $\color{#35bf28}+0.42\%$ |
| test_select | 0.1115ms | 39.3633μs | 25.4044 KOps/s | 24.6859 KOps/s | $\color{#35bf28}+2.91\%$ |
| test_select_nested | 0.8339ms | 57.4479μs | 17.4071 KOps/s | 17.3644 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_exclude_nested | 0.2261ms | 0.1184ms | 8.4437 KOps/s | 8.4396 KOps/s | $\color{#35bf28}+0.05\%$ |
| test_empty[True] | 0.5753ms | 0.4149ms | 2.4105 KOps/s | 2.3764 KOps/s | $\color{#35bf28}+1.43\%$ |
| test_empty[False] | 5.4522μs | 1.0494μs | 952.9006 KOps/s | 969.4714 KOps/s | $\color{#d91a1a}-1.71\%$ |
| test_unbind_speed | 0.2917ms | 0.2448ms | 4.0846 KOps/s | 4.0677 KOps/s | $\color{#35bf28}+0.41\%$ |
| test_unbind_speed_stack0 | 0.3946ms | 0.2386ms | 4.1912 KOps/s | 4.2990 KOps/s | $\color{#d91a1a}-2.51\%$ |
| test_unbind_speed_stack1 | 0.1253s | 0.6675ms | 1.4982 KOps/s | 1.5496 KOps/s | $\color{#d91a1a}-3.32\%$ |
| test_split | 0.1131s | 1.6575ms | 603.3263 Ops/s | 603.5547 Ops/s | $\color{#d91a1a}-0.04\%$ |
| test_chunk | 1.7593ms | 1.4825ms | 674.5559 Ops/s | 685.0971 Ops/s | $\color{#d91a1a}-1.54\%$ |
| test_creation[device0] | 0.2374ms | 0.1044ms | 9.5753 KOps/s | 9.6390 KOps/s | $\color{#d91a1a}-0.66\%$ |
| test_creation_from_tensor | 3.9858ms | 83.2736μs | 12.0086 KOps/s | 12.0781 KOps/s | $\color{#d91a1a}-0.58\%$ |
| test_add_one[memmap_tensor0] | 94.0960μs | 5.5450μs | 180.3440 KOps/s | 184.6990 KOps/s | $\color{#d91a1a}-2.36\%$ |
| test_contiguous[memmap_tensor0] | 21.1290μs | 0.6471μs | 1.5453 MOps/s | 1.5658 MOps/s | $\color{#d91a1a}-1.31\%$ |
| test_stack[memmap_tensor0] | 23.0230μs | 3.7428μs | 267.1807 KOps/s | 283.5899 KOps/s | $\textbf{\color{#d91a1a}-5.79\%}$ |
| test_memmaptd_index | 0.9118ms | 0.2408ms | 4.1536 KOps/s | 4.0825 KOps/s | $\color{#35bf28}+1.74\%$ |
| test_memmaptd_index_astensor | 0.7257ms | 0.3042ms | 3.2870 KOps/s | 3.3061 KOps/s | $\color{#d91a1a}-0.58\%$ |
| test_memmaptd_index_op | 0.8817ms | 0.6087ms | 1.6428 KOps/s | 1.5992 KOps/s | $\color{#35bf28}+2.73\%$ |
| test_serialize_model | 0.2183s | 0.1140s | 8.7685 Ops/s | 8.0820 Ops/s | $\textbf{\color{#35bf28}+8.50\%}$ |
| test_serialize_model_pickle | 0.4643s | 0.3758s | 2.6611 Ops/s | 2.6127 Ops/s | $\color{#35bf28}+1.85\%$ |
| test_serialize_weights | 0.1065s | 97.9828ms | 10.2059 Ops/s | 10.0338 Ops/s | $\color{#35bf28}+1.71\%$ |
| test_serialize_weights_returnearly | 0.2502s | 0.1355s | 7.3781 Ops/s | 6.8660 Ops/s | $\textbf{\color{#35bf28}+7.46\%}$ |
| test_serialize_weights_pickle | 0.7994s | 0.4859s | 2.0581 Ops/s | 1.5587 Ops/s | $\textbf{\color{#35bf28}+32.04\%}$ |
| test_serialize_weights_filesystem | 0.1044s | 94.8566ms | 10.5422 Ops/s | 10.6217 Ops/s | $\color{#d91a1a}-0.75\%$ |
| test_serialize_model_filesystem | 0.1013s | 94.3676ms | 10.5969 Ops/s | 10.6203 Ops/s | $\color{#d91a1a}-0.22\%$ |
| test_reshape_pytree | 67.1760μs | 21.3950μs | 46.7399 KOps/s | 47.9650 KOps/s | $\color{#d91a1a}-2.55\%$ |
| test_reshape_td | 86.4220μs | 32.2384μs | 31.0189 KOps/s | 32.6175 KOps/s | $\color{#d91a1a}-4.90\%$ |
| test_view_pytree | 70.5220μs | 21.1048μs | 47.3825 KOps/s | 48.8749 KOps/s | $\color{#d91a1a}-3.05\%$ |
| test_view_td | 0.1429s | 66.6139μs | 15.0119 KOps/s | 16.2114 KOps/s | $\textbf{\color{#d91a1a}-7.40\%}$ |
| test_unbind_pytree | 66.8750μs | 24.4665μs | 40.8723 KOps/s | 41.3392 KOps/s | $\color{#d91a1a}-1.13\%$ |
| test_unbind_td | 0.1264ms | 35.8123μs | 27.9234 KOps/s | 28.2857 KOps/s | $\color{#d91a1a}-1.28\%$ |
| test_split_pytree | 50.8060μs | 24.1309μs | 41.4406 KOps/s | 41.8878 KOps/s | $\color{#d91a1a}-1.07\%$ |
| test_split_td | 0.1192ms | 40.6232μs | 24.6165 KOps/s | 25.3520 KOps/s | $\color{#d91a1a}-2.90\%$ |
| test_add_pytree | 79.0480μs | 29.8643μs | 33.4849 KOps/s | 33.6085 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_add_td | 0.1103ms | 52.8270μs | 18.9297 KOps/s | 18.1832 KOps/s | $\color{#35bf28}+4.11\%$ |
| test_distributed | 0.1827ms | 0.1005ms | 9.9472 KOps/s | 10.0679 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_tdmodule | 68.3380μs | 17.5511μs | 56.9765 KOps/s | 55.6944 KOps/s | $\color{#35bf28}+2.30\%$ |
| test_tdmodule_dispatch | 54.6620μs | 33.6195μs | 29.7447 KOps/s | 29.3674 KOps/s | $\color{#35bf28}+1.28\%$ |
| test_tdseq | 34.8950μs | 20.2412μs | 49.4042 KOps/s | 48.7171 KOps/s | $\color{#35bf28}+1.41\%$ |
| test_tdseq_dispatch | 63.6990μs | 38.5375μs | 25.9488 KOps/s | 25.4492 KOps/s | $\color{#35bf28}+1.96\%$ |
| test_instantiation_functorch | 1.9365ms | 1.3157ms | 760.0266 Ops/s | 756.2060 Ops/s | $\color{#35bf28}+0.51\%$ |
| test_instantiation_td | 2.2001ms | 1.0512ms | 951.2788 Ops/s | 995.4631 Ops/s | $\color{#d91a1a}-4.44\%$ |
| test_exec_functorch | 0.3436ms | 0.1598ms | 6.2565 KOps/s | 6.3006 KOps/s | $\color{#d91a1a}-0.70\%$ |
| test_exec_functional_call | 0.2777ms | 0.1537ms | 6.5061 KOps/s | 6.6822 KOps/s | $\color{#d91a1a}-2.64\%$ |
| test_exec_td | 0.2448ms | 0.1496ms | 6.6823 KOps/s | 6.8578 KOps/s | $\color{#d91a1a}-2.56\%$ |
| test_exec_td_decorator | 0.6347ms | 0.2005ms | 4.9870 KOps/s | 5.1356 KOps/s | $\color{#d91a1a}-2.89\%$ |
| test_vmap_mlp_speed[True-True] | 0.8161ms | 0.4776ms | 2.0937 KOps/s | 2.1430 KOps/s | $\color{#d91a1a}-2.30\%$ |
| test_vmap_mlp_speed[True-False] | 0.8641ms | 0.4714ms | 2.1214 KOps/s | 2.1385 KOps/s | $\color{#d91a1a}-0.80\%$ |
| test_vmap_mlp_speed[False-True] | 0.7438ms | 0.3809ms | 2.6253 KOps/s | 2.5961 KOps/s | $\color{#35bf28}+1.13\%$ |
| test_vmap_mlp_speed[False-False] | 0.4981ms | 0.3825ms | 2.6144 KOps/s | 2.5915 KOps/s | $\color{#35bf28}+0.88\%$ |
| test_vmap_mlp_speed_decorator[True-True] | 0.9997ms | 0.4927ms | 2.0298 KOps/s | 2.0180 KOps/s | $\color{#35bf28}+0.58\%$ |
| test_vmap_mlp_speed_decorator[True-False] | 0.7605ms | 0.4921ms | 2.0319 KOps/s | 2.0053 KOps/s | $\color{#35bf28}+1.33\%$ |
| test_vmap_mlp_speed_decorator[False-True] | 0.9353ms | 0.4228ms | 2.3650 KOps/s | 2.4750 KOps/s | $\color{#d91a1a}-4.44\%$ |
| test_vmap_mlp_speed_decorator[False-False] | 0.7508ms | 0.3996ms | 2.5022 KOps/s | 2.4783 KOps/s | $\color{#35bf28}+0.97\%$ |
| test_to_module_speed[True] | 2.1999ms | 1.3858ms | 721.6208 Ops/s | 732.7873 Ops/s | $\color{#d91a1a}-1.52\%$ |
| test_to_module_speed[False] | 1.5132ms | 1.3736ms | 727.9883 Ops/s | 730.5186 Ops/s | $\color{#d91a1a}-0.35\%$ |
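
For readers checking the numbers: the Change column appears consistent with the relative difference between Ops (this PR) and Ops on Repo HEAD. That reading is inferred from the reported values, not stated by the benchmark bot, and the helper name `percent_change` below is made up for illustration. A minimal sketch that recomputes two rows:

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops / ops_head - 1.0) * 100.0


# (Ops on this PR, Ops on Repo HEAD), both in KOps/s, copied from the results above.
rows = {
    "test_plain_set_nested": (57.8693, 55.6877),
    "test_keys_nested": (6.5680, 6.6749),
}

changes = {
    name: round(percent_change(ops, ops_head), 2)
    for name, (ops, ops_head) in rows.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

Both rows round to the values shown in the Change column. Most differences here are within a few percent, which is commonly the range of run-to-run noise for microbenchmarks on shared CI hardware.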

@vmoens vmoens merged commit 6804951 into main Mar 4, 2024
45 of 48 checks passed
@vmoens vmoens deleted the fix-indexing-empty branch March 4, 2024 13:24
vmoens added a commit that referenced this pull request Mar 24, 2024
vmoens added a commit that referenced this pull request Mar 25, 2024
Labels: bug, CLA Signed
2 participants