[BugFix] No fallback on TensorDictModule.__getattr__ for private attributes #579

Merged: 2 commits, Nov 27, 2023

@vmoens (Contributor) commented Nov 27, 2023

Description

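Currently, `TensorDictModule.__getattr__` falls back on the wrapped module's `__getattr__` whenever an attribute cannot be found on the container. For deepcopy, this means the `__deepcopy__` lookup is forwarded to the wrapped module, so the result is a copy of the module within, not a copy of the TDModule container.
We solve this by excluding private attributes (names starting with an underscore) from the fallback.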
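
The failure mode can be reproduced without torch. The following is a minimal sketch, not the code from this PR: `Inner`, `LeakyWrapper` and `FixedWrapper` are hypothetical stand-ins for the wrapped module and for the container before and after the fix, and the underscore-prefix check is an assumption about what "private" means here.

```python
import copy


class Inner:
    """Hypothetical stand-in for a wrapped module that defines __deepcopy__."""

    def __init__(self, value):
        self.value = value

    def __deepcopy__(self, memo):
        return Inner(copy.deepcopy(self.value, memo))


class LeakyWrapper:
    """Hypothetical container that forwards every missing attribute."""

    def __init__(self, module):
        self.module = module

    def __getattr__(self, name):
        # Only reached when normal lookup fails, which includes the
        # getattr(x, "__deepcopy__", None) probe made by copy.deepcopy.
        return getattr(self.module, name)


class FixedWrapper:
    """Hypothetical container that refuses to forward private names."""

    def __init__(self, module):
        self.module = module

    def __getattr__(self, name):
        # Assumption: "private" means the name starts with an underscore,
        # which also covers dunder hooks such as __deepcopy__.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self.module, name)


inner = Inner([1, 2])

# The leaky container resolves __deepcopy__ on the inner object,
# so the "copy" of the container is an Inner instance.
leaky_copy = copy.deepcopy(LeakyWrapper(inner))
print(type(leaky_copy).__name__)

# The fixed container is copied as a container, with a copied inner module.
fixed = FixedWrapper(inner)
fixed_copy = copy.deepcopy(fixed)
print(type(fixed_copy).__name__)

# Public attributes are still forwarded to the wrapped object.
print(fixed.value)
```

Raising `AttributeError` for underscore-prefixed names also keeps other protocol probes, such as the `__setstate__` check made while reconstructing the copy, from reaching the wrapped object.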

Related issue:
pytorch/rl#1576 (comment)

@facebook-github-bot added the CLA Signed label Nov 27, 2023
@vmoens added the bug label Nov 27, 2023
@vmoens merged commit c730421 into main Nov 27, 2023
29 of 33 checks passed
@vmoens deleted the fix_deepcopy_modules branch November 27, 2023 09:55

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 113. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 31.2190μs 15.6570μs 63.8694 KOps/s 64.5842 KOps/s $\color{#d91a1a}-1.11\%$
test_plain_set_stack_nested 0.1948ms 0.1439ms 6.9508 KOps/s 6.9894 KOps/s $\color{#d91a1a}-0.55\%$
test_plain_set_nested_inplace 44.5530μs 19.0792μs 52.4130 KOps/s 52.3299 KOps/s $\color{#35bf28}+0.16\%$
test_plain_set_stack_nested_inplace 0.2619ms 0.1741ms 5.7441 KOps/s 5.8538 KOps/s $\color{#d91a1a}-1.87\%$
test_items 18.8160μs 2.3906μs 418.2972 KOps/s 400.8915 KOps/s $\color{#35bf28}+4.34\%$
test_items_nested 1.3154ms 0.2680ms 3.7312 KOps/s 3.7218 KOps/s $\color{#35bf28}+0.25\%$
test_items_nested_locked 0.3282ms 0.2664ms 3.7539 KOps/s 3.7556 KOps/s $\color{#d91a1a}-0.04\%$
test_items_nested_leaf 0.5559ms 0.1649ms 6.0638 KOps/s 6.0174 KOps/s $\color{#35bf28}+0.77\%$
test_items_stack_nested 1.8746ms 1.4663ms 681.9770 Ops/s 669.5286 Ops/s $\color{#35bf28}+1.86\%$
test_items_stack_nested_leaf 1.4841ms 1.3511ms 740.1169 Ops/s 741.6227 Ops/s $\color{#d91a1a}-0.20\%$
test_items_stack_nested_locked 0.9871ms 0.7676ms 1.3027 KOps/s 1.3075 KOps/s $\color{#d91a1a}-0.37\%$
test_keys 18.8750μs 3.8933μs 256.8513 KOps/s 259.1817 KOps/s $\color{#d91a1a}-0.90\%$
test_keys_nested 1.4173ms 0.1396ms 7.1611 KOps/s 6.8299 KOps/s $\color{#35bf28}+4.85\%$
test_keys_nested_locked 0.2018ms 0.1406ms 7.1102 KOps/s 7.2006 KOps/s $\color{#d91a1a}-1.26\%$
test_keys_nested_leaf 0.3182ms 0.1424ms 7.0235 KOps/s 7.2410 KOps/s $\color{#d91a1a}-3.00\%$
test_keys_stack_nested 1.5083ms 1.4027ms 712.9213 Ops/s 707.3967 Ops/s $\color{#35bf28}+0.78\%$
test_keys_stack_nested_leaf 2.0855ms 1.4042ms 712.1255 Ops/s 713.8910 Ops/s $\color{#d91a1a}-0.25\%$
test_keys_stack_nested_locked 1.1008ms 0.6763ms 1.4786 KOps/s 1.4775 KOps/s $\color{#35bf28}+0.07\%$
test_values 7.3762μs 1.2139μs 823.7580 KOps/s 851.7412 KOps/s $\color{#d91a1a}-3.29\%$
test_values_nested 93.3240μs 49.7031μs 20.1195 KOps/s 20.3469 KOps/s $\color{#d91a1a}-1.12\%$
test_values_nested_locked 95.9620μs 49.1564μs 20.3432 KOps/s 20.2728 KOps/s $\color{#35bf28}+0.35\%$
test_values_nested_leaf 69.8400μs 44.3956μs 22.5248 KOps/s 22.6221 KOps/s $\color{#d91a1a}-0.43\%$
test_values_stack_nested 2.6174ms 1.2032ms 831.1246 Ops/s 832.7937 Ops/s $\color{#d91a1a}-0.20\%$
test_values_stack_nested_leaf 2.2087ms 1.1897ms 840.5677 Ops/s 832.8243 Ops/s $\color{#35bf28}+0.93\%$
test_values_stack_nested_locked 0.6582ms 0.5117ms 1.9542 KOps/s 1.9604 KOps/s $\color{#d91a1a}-0.31\%$
test_membership 13.2750μs 1.3543μs 738.3742 KOps/s 724.0232 KOps/s $\color{#35bf28}+1.98\%$
test_membership_nested 64.1300μs 2.7601μs 362.3031 KOps/s 353.0783 KOps/s $\color{#35bf28}+2.61\%$
test_membership_nested_leaf 45.5210μs 2.7524μs 363.3203 KOps/s 360.4311 KOps/s $\color{#35bf28}+0.80\%$
test_membership_stacked_nested 40.2050μs 11.6571μs 85.7846 KOps/s 85.9239 KOps/s $\color{#d91a1a}-0.16\%$
test_membership_stacked_nested_leaf 25.9390μs 11.6867μs 85.5675 KOps/s 86.3888 KOps/s $\color{#d91a1a}-0.95\%$
test_membership_nested_last 50.9080μs 5.9860μs 167.0569 KOps/s 166.7125 KOps/s $\color{#35bf28}+0.21\%$
test_membership_nested_leaf_last 23.2940μs 5.9197μs 168.9277 KOps/s 165.5562 KOps/s $\color{#35bf28}+2.04\%$
test_membership_stacked_nested_last 0.5025ms 0.1714ms 5.8350 KOps/s 6.0060 KOps/s $\color{#d91a1a}-2.85\%$
test_membership_stacked_nested_leaf_last 36.1380μs 13.7797μs 72.5703 KOps/s 72.1644 KOps/s $\color{#35bf28}+0.56\%$
test_nested_getleaf 33.5320μs 10.6606μs 93.8037 KOps/s 93.9921 KOps/s $\color{#d91a1a}-0.20\%$
test_nested_get 28.7240μs 10.0145μs 99.8553 KOps/s 101.3828 KOps/s $\color{#d91a1a}-1.51\%$
test_stacked_getleaf 1.0663ms 0.6417ms 1.5585 KOps/s 1.5450 KOps/s $\color{#35bf28}+0.87\%$
test_stacked_get 0.7404ms 0.6155ms 1.6248 KOps/s 1.5899 KOps/s $\color{#35bf28}+2.19\%$
test_nested_getitemleaf 35.1950μs 10.7804μs 92.7609 KOps/s 92.3586 KOps/s $\color{#35bf28}+0.44\%$
test_nested_getitem 31.1280μs 10.2495μs 97.5655 KOps/s 96.9019 KOps/s $\color{#35bf28}+0.68\%$
test_stacked_getitemleaf 1.1567ms 0.6527ms 1.5320 KOps/s 1.4956 KOps/s $\color{#35bf28}+2.44\%$
test_stacked_getitem 0.7371ms 0.6185ms 1.6168 KOps/s 1.6198 KOps/s $\color{#d91a1a}-0.18\%$
test_lock_nested 53.7979ms 0.5505ms 1.8166 KOps/s 1.7803 KOps/s $\color{#35bf28}+2.03\%$
test_lock_stack_nested 83.6560ms 8.4453ms 118.4095 Ops/s 196.7960 Ops/s $\textbf{\color{#d91a1a}-39.83\%}$
test_unlock_nested 58.8826ms 0.5086ms 1.9660 KOps/s 2.2723 KOps/s $\textbf{\color{#d91a1a}-13.48\%}$
test_unlock_stack_nested 71.9653ms 8.0911ms 123.5919 Ops/s 144.0157 Ops/s $\textbf{\color{#d91a1a}-14.18\%}$
test_flatten_speed 0.5509ms 0.2714ms 3.6843 KOps/s 3.6860 KOps/s $\color{#d91a1a}-0.05\%$
test_unflatten_speed 0.7762ms 0.4660ms 2.1457 KOps/s 2.1468 KOps/s $\color{#d91a1a}-0.05\%$
test_common_ops 1.2129ms 0.6747ms 1.4822 KOps/s 1.5268 KOps/s $\color{#d91a1a}-2.92\%$
test_creation 61.3450μs 2.4359μs 410.5319 KOps/s 402.6016 KOps/s $\color{#35bf28}+1.97\%$
test_creation_empty 24.3160μs 8.1844μs 122.1842 KOps/s 122.6396 KOps/s $\color{#d91a1a}-0.37\%$
test_creation_nested_1 29.9960μs 11.7232μs 85.3008 KOps/s 86.6994 KOps/s $\color{#d91a1a}-1.61\%$
test_creation_nested_2 43.0900μs 15.1149μs 66.1599 KOps/s 66.6181 KOps/s $\color{#d91a1a}-0.69\%$
test_clone 0.1687ms 13.5254μs 73.9350 KOps/s 73.6126 KOps/s $\color{#35bf28}+0.44\%$
test_getitem[int] 36.7590μs 13.0614μs 76.5614 KOps/s 74.6543 KOps/s $\color{#35bf28}+2.55\%$
test_getitem[slice_int] 60.9830μs 24.6750μs 40.5268 KOps/s 37.5405 KOps/s $\textbf{\color{#35bf28}+7.95\%}$
test_getitem[range] 93.1740μs 44.6970μs 22.3729 KOps/s 22.1413 KOps/s $\color{#35bf28}+1.05\%$
test_getitem[tuple] 54.9720μs 20.2340μs 49.4217 KOps/s 47.9384 KOps/s $\color{#35bf28}+3.09\%$
test_getitem[list] 80.4800μs 39.3779μs 25.3950 KOps/s 24.7604 KOps/s $\color{#35bf28}+2.56\%$
test_setitem_dim[int] 53.3600μs 27.5084μs 36.3525 KOps/s 35.4014 KOps/s $\color{#35bf28}+2.69\%$
test_setitem_dim[slice_int] 0.1072ms 51.9562μs 19.2470 KOps/s 19.0424 KOps/s $\color{#35bf28}+1.07\%$
test_setitem_dim[range] 0.1307ms 70.7975μs 14.1248 KOps/s 14.0786 KOps/s $\color{#35bf28}+0.33\%$
test_setitem_dim[tuple] 67.3650μs 40.9775μs 24.4036 KOps/s 23.5425 KOps/s $\color{#35bf28}+3.66\%$
test_setitem 0.1178ms 18.2642μs 54.7519 KOps/s 54.1471 KOps/s $\color{#35bf28}+1.12\%$
test_set 0.1795ms 17.8357μs 56.0673 KOps/s 56.0425 KOps/s $\color{#35bf28}+0.04\%$
test_set_shared 1.9625ms 0.1423ms 7.0298 KOps/s 7.0810 KOps/s $\color{#d91a1a}-0.72\%$
test_update 0.2084ms 19.0562μs 52.4763 KOps/s 52.6096 KOps/s $\color{#d91a1a}-0.25\%$
test_update_nested 0.2075ms 26.3972μs 37.8829 KOps/s 37.1323 KOps/s $\color{#35bf28}+2.02\%$
test_set_nested 0.1770ms 19.4636μs 51.3780 KOps/s 50.6780 KOps/s $\color{#35bf28}+1.38\%$
test_set_nested_new 0.2409ms 24.9010μs 40.1591 KOps/s 38.6171 KOps/s $\color{#35bf28}+3.99\%$
test_select 0.1947ms 50.8659μs 19.6595 KOps/s 19.3218 KOps/s $\color{#35bf28}+1.75\%$
test_unbind_speed 0.6308ms 0.3777ms 2.6479 KOps/s 2.6517 KOps/s $\color{#d91a1a}-0.15\%$
test_unbind_speed_stack0 61.4736ms 5.2946ms 188.8707 Ops/s 211.6008 Ops/s $\textbf{\color{#d91a1a}-10.74\%}$
test_unbind_speed_stack1 1.8860μs 0.6217μs 1.6085 MOps/s 1.5646 MOps/s $\color{#35bf28}+2.81\%$
test_split 55.8859ms 1.7790ms 562.1208 Ops/s 566.0262 Ops/s $\color{#d91a1a}-0.69\%$
test_chunk 62.0893ms 1.7532ms 570.3798 Ops/s 570.5141 Ops/s $\color{#d91a1a}-0.02\%$
test_creation[device0] 3.1896ms 0.3001ms 3.3323 KOps/s 3.3831 KOps/s $\color{#d91a1a}-1.50\%$
test_creation_from_tensor 56.6369ms 0.3579ms 2.7941 KOps/s 3.0685 KOps/s $\textbf{\color{#d91a1a}-8.94\%}$
test_add_one[memmap_tensor0] 74.0990μs 24.6803μs 40.5181 KOps/s 40.5548 KOps/s $\color{#d91a1a}-0.09\%$
test_contiguous[memmap_tensor0] 28.6830μs 5.7729μs 173.2228 KOps/s 171.8807 KOps/s $\color{#35bf28}+0.78\%$
test_stack[memmap_tensor0] 0.1212ms 18.9006μs 52.9084 KOps/s 52.2844 KOps/s $\color{#35bf28}+1.19\%$
test_memmaptd_index 0.6894ms 0.3996ms 2.5028 KOps/s 2.4415 KOps/s $\color{#35bf28}+2.51\%$
test_memmaptd_index_astensor 0.5462ms 0.4620ms 2.1647 KOps/s 2.1335 KOps/s $\color{#35bf28}+1.47\%$
test_memmaptd_index_op 0.8829ms 0.7109ms 1.4067 KOps/s 1.4245 KOps/s $\color{#d91a1a}-1.24\%$
test_reshape_pytree 54.1410μs 23.6248μs 42.3285 KOps/s 42.0671 KOps/s $\color{#35bf28}+0.62\%$
test_reshape_td 68.4580μs 31.1699μs 32.0823 KOps/s 31.7193 KOps/s $\color{#35bf28}+1.14\%$
test_view_pytree 57.3370μs 23.7551μs 42.0962 KOps/s 42.1809 KOps/s $\color{#d91a1a}-0.20\%$
test_view_td 31.5590μs 4.8785μs 204.9796 KOps/s 199.4701 KOps/s $\color{#35bf28}+2.76\%$
test_unbind_pytree 1.5133ms 26.8750μs 37.2092 KOps/s 37.1394 KOps/s $\color{#35bf28}+0.19\%$
test_unbind_td 0.1245ms 59.8728μs 16.7021 KOps/s 16.6475 KOps/s $\color{#35bf28}+0.33\%$
test_split_pytree 73.9180μs 27.0435μs 36.9774 KOps/s 36.3572 KOps/s $\color{#35bf28}+1.71\%$
test_split_td 0.1195ms 46.6495μs 21.4365 KOps/s 21.0237 KOps/s $\color{#35bf28}+1.96\%$
test_add_pytree 81.8630μs 32.9139μs 30.3823 KOps/s 31.2141 KOps/s $\color{#d91a1a}-2.66\%$
test_add_td 94.5870μs 44.4996μs 22.4721 KOps/s 22.9608 KOps/s $\color{#d91a1a}-2.13\%$
test_distributed 25.2870μs 5.9996μs 166.6786 KOps/s 163.3162 KOps/s $\color{#35bf28}+2.06\%$
test_tdmodule 0.1103ms 20.5530μs 48.6547 KOps/s 46.7130 KOps/s $\color{#35bf28}+4.16\%$
test_tdmodule_dispatch 0.2164ms 38.3288μs 26.0900 KOps/s 25.2173 KOps/s $\color{#35bf28}+3.46\%$
test_tdseq 65.3820μs 24.1376μs 41.4291 KOps/s 40.3441 KOps/s $\color{#35bf28}+2.69\%$
test_tdseq_dispatch 0.1476ms 42.2487μs 23.6694 KOps/s 23.5857 KOps/s $\color{#35bf28}+0.35\%$
test_instantiation_functorch 1.4692ms 1.3205ms 757.3047 Ops/s 766.8172 Ops/s $\color{#d91a1a}-1.24\%$
test_instantiation_td 1.5036ms 1.0269ms 973.7946 Ops/s 910.7149 Ops/s $\textbf{\color{#35bf28}+6.93\%}$
test_exec_functorch 0.3415ms 0.1610ms 6.2101 KOps/s 6.2456 KOps/s $\color{#d91a1a}-0.57\%$
test_exec_functional_call 0.2295ms 0.1462ms 6.8420 KOps/s 6.7705 KOps/s $\color{#35bf28}+1.06\%$
test_exec_td 0.2497ms 0.1408ms 7.1041 KOps/s 6.9987 KOps/s $\color{#35bf28}+1.51\%$
test_exec_td_decorator 0.7592ms 0.2196ms 4.5535 KOps/s 5.6097 KOps/s $\textbf{\color{#d91a1a}-18.83\%}$
test_vmap_mlp_speed[True-True] 0.9966ms 0.8832ms 1.1323 KOps/s 1.0936 KOps/s $\color{#35bf28}+3.54\%$
test_vmap_mlp_speed[True-False] 0.6973ms 0.4645ms 2.1528 KOps/s 2.1304 KOps/s $\color{#35bf28}+1.05\%$
test_vmap_mlp_speed[False-True] 0.9862ms 0.7691ms 1.3002 KOps/s 1.2757 KOps/s $\color{#35bf28}+1.93\%$
test_vmap_mlp_speed[False-False] 0.6428ms 0.3831ms 2.6102 KOps/s 2.6053 KOps/s $\color{#35bf28}+0.19\%$
test_vmap_mlp_speed_decorator[True-True] 2.4022ms 1.5483ms 645.8667 Ops/s 554.6986 Ops/s $\textbf{\color{#35bf28}+16.44\%}$
test_vmap_mlp_speed_decorator[True-False] 1.0152ms 0.5462ms 1.8307 KOps/s 1.9373 KOps/s $\textbf{\color{#d91a1a}-5.50\%}$
test_vmap_mlp_speed_decorator[False-True] 1.8666ms 1.3574ms 736.7021 Ops/s 665.7964 Ops/s $\textbf{\color{#35bf28}+10.65\%}$
test_vmap_mlp_speed_decorator[False-False] 0.8616ms 0.4236ms 2.3605 KOps/s 2.5096 KOps/s $\textbf{\color{#d91a1a}-5.94\%}$


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 91.9510μs 12.8164μs 78.0248 KOps/s 78.4511 KOps/s $\color{#d91a1a}-0.54\%$
test_plain_set_stack_nested 0.1309ms 0.1163ms 8.6001 KOps/s 8.6984 KOps/s $\color{#d91a1a}-1.13\%$
test_plain_set_nested_inplace 30.9110μs 15.6268μs 63.9926 KOps/s 65.2082 KOps/s $\color{#d91a1a}-1.86\%$
test_plain_set_stack_nested_inplace 0.1791ms 0.1440ms 6.9447 KOps/s 7.0200 KOps/s $\color{#d91a1a}-1.07\%$
test_items 25.3510μs 4.6965μs 212.9261 KOps/s 213.6690 KOps/s $\color{#d91a1a}-0.35\%$
test_items_nested 0.3601ms 0.3391ms 2.9491 KOps/s 2.9738 KOps/s $\color{#d91a1a}-0.83\%$
test_items_nested_locked 0.4013ms 0.3380ms 2.9585 KOps/s 2.9670 KOps/s $\color{#d91a1a}-0.29\%$
test_items_nested_leaf 0.2176ms 0.1984ms 5.0409 KOps/s 5.0030 KOps/s $\color{#35bf28}+0.76\%$
test_items_stack_nested 1.5555ms 1.5018ms 665.8727 Ops/s 673.5313 Ops/s $\color{#d91a1a}-1.14\%$
test_items_stack_nested_leaf 1.4024ms 1.3274ms 753.3711 Ops/s 765.9583 Ops/s $\color{#d91a1a}-1.64\%$
test_items_stack_nested_locked 0.8900ms 0.8309ms 1.2035 KOps/s 1.2053 KOps/s $\color{#d91a1a}-0.15\%$
test_keys 65.4320μs 4.5727μs 218.6908 KOps/s 218.2332 KOps/s $\color{#35bf28}+0.21\%$
test_keys_nested 1.2701ms 91.5238μs 10.9261 KOps/s 11.0511 KOps/s $\color{#d91a1a}-1.13\%$
test_keys_nested_locked 0.1180ms 91.0065μs 10.9882 KOps/s 11.0705 KOps/s $\color{#d91a1a}-0.74\%$
test_keys_nested_leaf 41.5408ms 87.6715μs 11.4062 KOps/s 12.2148 KOps/s $\textbf{\color{#d91a1a}-6.62\%}$
test_keys_stack_nested 1.3840ms 1.3156ms 760.0858 Ops/s 776.8794 Ops/s $\color{#d91a1a}-2.16\%$
test_keys_stack_nested_leaf 1.3329ms 1.3030ms 767.4336 Ops/s 788.6212 Ops/s $\color{#d91a1a}-2.69\%$
test_keys_stack_nested_locked 0.6838ms 0.6329ms 1.5801 KOps/s 1.5906 KOps/s $\color{#d91a1a}-0.66\%$
test_values 18.9307μs 1.8882μs 529.6042 KOps/s 528.1665 KOps/s $\color{#35bf28}+0.27\%$
test_values_nested 71.8810μs 43.1202μs 23.1910 KOps/s 23.2904 KOps/s $\color{#d91a1a}-0.43\%$
test_values_nested_locked 98.7720μs 43.4346μs 23.0231 KOps/s 22.0757 KOps/s $\color{#35bf28}+4.29\%$
test_values_nested_leaf 54.2110μs 37.9630μs 26.3414 KOps/s 26.8212 KOps/s $\color{#d91a1a}-1.79\%$
test_values_stack_nested 1.1787ms 1.1413ms 876.2213 Ops/s 887.1249 Ops/s $\color{#d91a1a}-1.23\%$
test_values_stack_nested_leaf 1.1782ms 1.1293ms 885.4705 Ops/s 891.9316 Ops/s $\color{#d91a1a}-0.72\%$
test_values_stack_nested_locked 0.5573ms 0.5054ms 1.9785 KOps/s 1.9826 KOps/s $\color{#d91a1a}-0.21\%$
test_membership 5.1960μs 0.9365μs 1.0678 MOps/s 933.1788 KOps/s $\textbf{\color{#35bf28}+14.43\%}$
test_membership_nested 33.5710μs 2.2268μs 449.0784 KOps/s 464.8588 KOps/s $\color{#d91a1a}-3.39\%$
test_membership_nested_leaf 16.6500μs 2.1299μs 469.4956 KOps/s 466.0673 KOps/s $\color{#35bf28}+0.74\%$
test_membership_stacked_nested 40.7410μs 10.9286μs 91.5030 KOps/s 90.8832 KOps/s $\color{#35bf28}+0.68\%$
test_membership_stacked_nested_leaf 30.3600μs 10.9118μs 91.6441 KOps/s 91.3898 KOps/s $\color{#35bf28}+0.28\%$
test_membership_nested_last 70.6310μs 4.6883μs 213.2964 KOps/s 217.5842 KOps/s $\color{#d91a1a}-1.97\%$
test_membership_nested_leaf_last 35.4910μs 4.6976μs 212.8755 KOps/s 216.3001 KOps/s $\color{#d91a1a}-1.58\%$
test_membership_stacked_nested_last 0.1828ms 0.1350ms 7.4053 KOps/s 7.3775 KOps/s $\color{#35bf28}+0.38\%$
test_membership_stacked_nested_leaf_last 46.4710μs 12.8380μs 77.8938 KOps/s 77.6597 KOps/s $\color{#35bf28}+0.30\%$
test_nested_getleaf 33.7210μs 8.4595μs 118.2097 KOps/s 118.2825 KOps/s $\color{#d91a1a}-0.06\%$
test_nested_get 62.9710μs 7.9704μs 125.4636 KOps/s 124.9316 KOps/s $\color{#35bf28}+0.43\%$
test_stacked_getleaf 0.6416ms 0.5738ms 1.7427 KOps/s 1.7848 KOps/s $\color{#d91a1a}-2.36\%$
test_stacked_get 0.6346ms 0.5387ms 1.8564 KOps/s 1.8957 KOps/s $\color{#d91a1a}-2.07\%$
test_nested_getitemleaf 22.6200μs 8.4784μs 117.9475 KOps/s 115.2818 KOps/s $\color{#35bf28}+2.31\%$
test_nested_getitem 23.7800μs 8.0400μs 124.3776 KOps/s 121.9752 KOps/s $\color{#35bf28}+1.97\%$
test_stacked_getitemleaf 0.6324ms 0.5667ms 1.7645 KOps/s 1.7638 KOps/s $\color{#35bf28}+0.04\%$
test_stacked_getitem 0.6985ms 0.5389ms 1.8555 KOps/s 1.8839 KOps/s $\color{#d91a1a}-1.51\%$
test_lock_nested 4.3155ms 0.4767ms 2.0977 KOps/s 1.8028 KOps/s $\textbf{\color{#35bf28}+16.36\%}$
test_lock_stack_nested 71.3810ms 6.7488ms 148.1742 Ops/s 137.0987 Ops/s $\textbf{\color{#35bf28}+8.08\%}$
test_unlock_nested 1.3383ms 0.4507ms 2.2188 KOps/s 2.3143 KOps/s $\color{#d91a1a}-4.13\%$
test_unlock_stack_nested 66.3812ms 7.4620ms 134.0118 Ops/s 162.1292 Ops/s $\textbf{\color{#d91a1a}-17.34\%}$
test_flatten_speed 0.5387ms 0.1869ms 5.3512 KOps/s 5.3583 KOps/s $\color{#d91a1a}-0.13\%$
test_unflatten_speed 0.3910ms 0.3609ms 2.7707 KOps/s 2.7630 KOps/s $\color{#35bf28}+0.28\%$
test_common_ops 1.1844ms 0.6274ms 1.5940 KOps/s 1.6468 KOps/s $\color{#d91a1a}-3.21\%$
test_creation 31.7200μs 1.9402μs 515.4212 KOps/s 477.0380 KOps/s $\textbf{\color{#35bf28}+8.05\%}$
test_creation_empty 79.9010μs 7.1661μs 139.5455 KOps/s 140.5428 KOps/s $\color{#d91a1a}-0.71\%$
test_creation_nested_1 27.7210μs 9.4938μs 105.3315 KOps/s 103.9401 KOps/s $\color{#35bf28}+1.34\%$
test_creation_nested_2 43.4900μs 12.2040μs 81.9402 KOps/s 81.4561 KOps/s $\color{#35bf28}+0.59\%$
test_clone 0.1024ms 15.0657μs 66.3760 KOps/s 67.5225 KOps/s $\color{#d91a1a}-1.70\%$
test_getitem[int] 32.8800μs 12.4754μs 80.1580 KOps/s 79.5894 KOps/s $\color{#35bf28}+0.71\%$
test_getitem[slice_int] 56.9110μs 24.4818μs 40.8467 KOps/s 41.2394 KOps/s $\color{#d91a1a}-0.95\%$
test_getitem[range] 68.3210μs 40.9297μs 24.4321 KOps/s 24.9540 KOps/s $\color{#d91a1a}-2.09\%$
test_getitem[tuple] 41.6110μs 20.9818μs 47.6604 KOps/s 49.3076 KOps/s $\color{#d91a1a}-3.34\%$
test_getitem[list] 0.3078ms 37.1600μs 26.9107 KOps/s 27.3691 KOps/s $\color{#d91a1a}-1.67\%$
test_setitem_dim[int] 44.3410μs 27.9239μs 35.8116 KOps/s 37.1618 KOps/s $\color{#d91a1a}-3.63\%$
test_setitem_dim[slice_int] 72.4720μs 48.4492μs 20.6402 KOps/s 21.3465 KOps/s $\color{#d91a1a}-3.31\%$
test_setitem_dim[range] 84.6510μs 66.2005μs 15.1056 KOps/s 15.8116 KOps/s $\color{#d91a1a}-4.47\%$
test_setitem_dim[tuple] 58.5710μs 41.4053μs 24.1515 KOps/s 25.7561 KOps/s $\textbf{\color{#d91a1a}-6.23\%}$
test_setitem 0.1221ms 19.4307μs 51.4649 KOps/s 54.5856 KOps/s $\textbf{\color{#d91a1a}-5.72\%}$
test_set 0.1043ms 18.9036μs 52.8999 KOps/s 55.4423 KOps/s $\color{#d91a1a}-4.59\%$
test_set_shared 2.7577ms 0.1056ms 9.4696 KOps/s 8.6519 KOps/s $\textbf{\color{#35bf28}+9.45\%}$
test_update 0.1150ms 20.2369μs 49.4147 KOps/s 51.6999 KOps/s $\color{#d91a1a}-4.42\%$
test_update_nested 0.1064ms 26.7332μs 37.4067 KOps/s 38.8327 KOps/s $\color{#d91a1a}-3.67\%$
test_set_nested 0.1013ms 19.8203μs 50.4533 KOps/s 52.6589 KOps/s $\color{#d91a1a}-4.19\%$
test_set_nested_new 99.2010μs 24.5609μs 40.7151 KOps/s 43.0316 KOps/s $\textbf{\color{#d91a1a}-5.38\%}$
test_select 82.0400μs 46.5527μs 21.4811 KOps/s 21.4841 KOps/s $\color{#d91a1a}-0.01\%$
test_to 74.5520μs 54.0235μs 18.5105 KOps/s 19.2684 KOps/s $\color{#d91a1a}-3.93\%$
test_to_nonblocking 68.3010μs 36.5012μs 27.3964 KOps/s 28.1178 KOps/s $\color{#d91a1a}-2.57\%$
test_unbind_speed 0.4065ms 0.3647ms 2.7421 KOps/s 2.7738 KOps/s $\color{#d91a1a}-1.14\%$
test_unbind_speed_stack0 61.8814ms 5.3127ms 188.2291 Ops/s 235.4946 Ops/s $\textbf{\color{#d91a1a}-20.07\%}$
test_unbind_speed_stack1 2.8131μs 0.5211μs 1.9191 MOps/s 1.9171 MOps/s $\color{#35bf28}+0.10\%$
test_split 53.9942ms 1.8921ms 528.5224 Ops/s 549.4064 Ops/s $\color{#d91a1a}-3.80\%$
test_chunk 53.5493ms 1.8755ms 533.1777 Ops/s 554.5149 Ops/s $\color{#d91a1a}-3.85\%$
test_creation[device0] 0.3957ms 0.3137ms 3.1878 KOps/s 3.2180 KOps/s $\color{#d91a1a}-0.94\%$
test_creation[device1] 0.7957ms 0.3160ms 3.1646 KOps/s 3.1931 KOps/s $\color{#d91a1a}-0.89\%$
test_creation_from_tensor 0.7107ms 0.3425ms 2.9197 KOps/s 2.9524 KOps/s $\color{#d91a1a}-1.11\%$
test_add_one[memmap_tensor0] 0.2683ms 25.8387μs 38.7017 KOps/s 30.9150 KOps/s $\textbf{\color{#35bf28}+25.19\%}$
test_add_one[memmap_tensor1] 0.2124ms 75.7946μs 13.1935 KOps/s 13.4806 KOps/s $\color{#d91a1a}-2.13\%$
test_contiguous[memmap_tensor0] 32.9010μs 6.0950μs 164.0692 KOps/s 167.8015 KOps/s $\color{#d91a1a}-2.22\%$
test_contiguous[memmap_tensor1] 48.5410μs 22.7324μs 43.9901 KOps/s 45.7072 KOps/s $\color{#d91a1a}-3.76\%$
test_stack[memmap_tensor0] 48.8200μs 20.9492μs 47.7345 KOps/s 50.7083 KOps/s $\textbf{\color{#d91a1a}-5.86\%}$
test_stack[memmap_tensor1] 0.1656ms 79.9312μs 12.5108 KOps/s 13.3205 KOps/s $\textbf{\color{#d91a1a}-6.08\%}$
test_memmaptd_index 0.4863ms 0.4401ms 2.2723 KOps/s 2.3625 KOps/s $\color{#d91a1a}-3.82\%$
test_memmaptd_index_astensor 0.5841ms 0.4897ms 2.0422 KOps/s 2.0992 KOps/s $\color{#d91a1a}-2.71\%$
test_memmaptd_index_op 0.8711ms 0.7965ms 1.2555 KOps/s 1.3445 KOps/s $\textbf{\color{#d91a1a}-6.62\%}$
test_reshape_pytree 39.1700μs 21.8565μs 45.7531 KOps/s 47.2934 KOps/s $\color{#d91a1a}-3.26\%$
test_reshape_td 60.9210μs 30.6557μs 32.6203 KOps/s 33.1787 KOps/s $\color{#d91a1a}-1.68\%$
test_view_pytree 38.8600μs 21.6646μs 46.1582 KOps/s 47.6797 KOps/s $\color{#d91a1a}-3.19\%$
test_view_td 28.1000μs 4.1121μs 243.1829 KOps/s 245.3361 KOps/s $\color{#d91a1a}-0.88\%$
test_unbind_pytree 59.1710μs 26.9094μs 37.1617 KOps/s 38.5939 KOps/s $\color{#d91a1a}-3.71\%$
test_unbind_td 88.1820μs 57.0100μs 17.5408 KOps/s 17.3435 KOps/s $\color{#35bf28}+1.14\%$
test_split_pytree 43.0510μs 25.3067μs 39.5152 KOps/s 41.7720 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_split_td 67.4910μs 45.1408μs 22.1529 KOps/s 22.1257 KOps/s $\color{#35bf28}+0.12\%$
test_add_pytree 54.4310μs 33.8197μs 29.5686 KOps/s 31.2303 KOps/s $\textbf{\color{#d91a1a}-5.32\%}$
test_add_td 71.7510μs 46.8192μs 21.3588 KOps/s 22.6601 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_distributed 35.3410μs 5.6766μs 176.1619 KOps/s 183.2206 KOps/s $\color{#d91a1a}-3.85\%$
test_tdmodule 32.0610μs 16.5579μs 60.3943 KOps/s 58.0318 KOps/s $\color{#35bf28}+4.07\%$
test_tdmodule_dispatch 0.2266ms 32.7912μs 30.4960 KOps/s 29.5104 KOps/s $\color{#35bf28}+3.34\%$
test_tdseq 36.2210μs 20.0137μs 49.9657 KOps/s 50.4885 KOps/s $\color{#d91a1a}-1.04\%$
test_tdseq_dispatch 52.3910μs 35.9836μs 27.7905 KOps/s 27.4484 KOps/s $\color{#35bf28}+1.25\%$
test_instantiation_functorch 1.8044ms 1.7252ms 579.6353 Ops/s 595.1597 Ops/s $\color{#d91a1a}-2.61\%$
test_instantiation_td 1.8611ms 1.1886ms 841.3360 Ops/s 843.4376 Ops/s $\color{#d91a1a}-0.25\%$
test_exec_functorch 0.2260ms 0.1629ms 6.1398 KOps/s 6.1901 KOps/s $\color{#d91a1a}-0.81\%$
test_exec_functional_call 0.2243ms 0.1633ms 6.1243 KOps/s 6.2605 KOps/s $\color{#d91a1a}-2.18\%$
test_exec_td 0.1849ms 0.1527ms 6.5477 KOps/s 6.6701 KOps/s $\color{#d91a1a}-1.84\%$
test_exec_td_decorator 0.7836ms 0.2256ms 4.4319 KOps/s 5.3738 KOps/s $\textbf{\color{#d91a1a}-17.53\%}$
test_vmap_mlp_speed[True-True] 1.6652ms 1.1014ms 907.9247 Ops/s 912.2151 Ops/s $\color{#d91a1a}-0.47\%$
test_vmap_mlp_speed[True-False] 0.6983ms 0.6338ms 1.5779 KOps/s 1.5816 KOps/s $\color{#d91a1a}-0.23\%$
test_vmap_mlp_speed[False-True] 1.0936ms 1.0120ms 988.1680 Ops/s 997.9713 Ops/s $\color{#d91a1a}-0.98\%$
test_vmap_mlp_speed[False-False] 0.6494ms 0.5681ms 1.7604 KOps/s 1.7917 KOps/s $\color{#d91a1a}-1.75\%$
test_vmap_mlp_speed_decorator[True-True] 2.4580ms 1.8234ms 548.4326 Ops/s 479.3301 Ops/s $\textbf{\color{#35bf28}+14.42\%}$
test_vmap_mlp_speed_decorator[True-False] 1.2426ms 0.7023ms 1.4240 KOps/s 1.4845 KOps/s $\color{#d91a1a}-4.08\%$
test_vmap_mlp_speed_decorator[False-True] 2.0866ms 1.6549ms 604.2558 Ops/s 552.3713 Ops/s $\textbf{\color{#35bf28}+9.39\%}$
test_vmap_mlp_speed_decorator[False-False] 68.5974ms 0.6349ms 1.5750 KOps/s 1.7507 KOps/s $\textbf{\color{#d91a1a}-10.04\%}$
test_vmap_transformer_speed[True-True] 13.4790ms 13.0205ms 76.8018 Ops/s 77.3726 Ops/s $\color{#d91a1a}-0.74\%$
test_vmap_transformer_speed[True-False] 8.8564ms 8.5277ms 117.2643 Ops/s 118.0877 Ops/s $\color{#d91a1a}-0.70\%$
test_vmap_transformer_speed[False-True] 13.3856ms 12.9439ms 77.2563 Ops/s 77.9876 Ops/s $\color{#d91a1a}-0.94\%$
test_vmap_transformer_speed[False-False] 8.7131ms 8.4474ms 118.3797 Ops/s 119.4513 Ops/s $\color{#d91a1a}-0.90\%$
test_vmap_transformer_speed_decorator[True-True] 45.2093ms 44.1677ms 22.6410 Ops/s 14.1246 Ops/s $\textbf{\color{#35bf28}+60.30\%}$
test_vmap_transformer_speed_decorator[True-False] 97.9210ms 22.5552ms 44.3356 Ops/s 48.9190 Ops/s $\textbf{\color{#d91a1a}-9.37\%}$
test_vmap_transformer_speed_decorator[False-True] 46.0670ms 43.9513ms 22.7524 Ops/s 16.6985 Ops/s $\textbf{\color{#35bf28}+36.25\%}$
test_vmap_transformer_speed_decorator[False-False] 0.1011s 22.1775ms 45.0907 Ops/s 49.7001 Ops/s $\textbf{\color{#d91a1a}-9.27\%}$
