[Feature] DETERMINISTIC interaction mode #824

Merged
merged 4 commits into from
Jun 20, 2024


vmoens
Contributor

@vmoens vmoens commented Jun 20, 2024

@facebook-github-bot added the CLA Signed label Jun 20, 2024

github-actions bot commented Jun 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 39.6660μs 16.9849μs 58.8759 KOps/s 58.6117 KOps/s $\color{#35bf28}+0.45\%$
test_plain_set_stack_nested 37.8420μs 17.3269μs 57.7138 KOps/s 58.0461 KOps/s $\color{#d91a1a}-0.57\%$
test_plain_set_nested_inplace 44.0250μs 19.3008μs 51.8114 KOps/s 51.2039 KOps/s $\color{#35bf28}+1.19\%$
test_plain_set_stack_nested_inplace 51.0680μs 19.1445μs 52.2344 KOps/s 51.5927 KOps/s $\color{#35bf28}+1.24\%$
test_items 19.7280μs 2.4902μs 401.5761 KOps/s 380.1613 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_items_nested 0.8149ms 0.2673ms 3.7416 KOps/s 3.7052 KOps/s $\color{#35bf28}+0.98\%$
test_items_nested_locked 0.4837ms 0.2738ms 3.6530 KOps/s 3.7240 KOps/s $\color{#d91a1a}-1.91\%$
test_items_nested_leaf 0.1465ms 77.3381μs 12.9302 KOps/s 12.8248 KOps/s $\color{#35bf28}+0.82\%$
test_items_stack_nested 1.3021ms 0.2763ms 3.6194 KOps/s 3.7266 KOps/s $\color{#d91a1a}-2.88\%$
test_items_stack_nested_leaf 0.1557ms 79.1600μs 12.6326 KOps/s 12.2386 KOps/s $\color{#35bf28}+3.22\%$
test_items_stack_nested_locked 0.4742ms 0.2699ms 3.7052 KOps/s 3.7647 KOps/s $\color{#d91a1a}-1.58\%$
test_keys 17.5340μs 3.8888μs 257.1491 KOps/s 242.1355 KOps/s $\textbf{\color{#35bf28}+6.20\%}$
test_keys_nested 0.2291ms 0.1361ms 7.3467 KOps/s 7.2717 KOps/s $\color{#35bf28}+1.03\%$
test_keys_nested_locked 0.7908ms 0.1402ms 7.1303 KOps/s 7.0092 KOps/s $\color{#35bf28}+1.73\%$
test_keys_nested_leaf 0.9050ms 0.1159ms 8.6313 KOps/s 8.5300 KOps/s $\color{#35bf28}+1.19\%$
test_keys_stack_nested 0.2716ms 0.1369ms 7.3061 KOps/s 7.2695 KOps/s $\color{#35bf28}+0.50\%$
test_keys_stack_nested_leaf 0.2041ms 0.1162ms 8.6037 KOps/s 8.5273 KOps/s $\color{#35bf28}+0.90\%$
test_keys_stack_nested_locked 0.2735ms 0.1418ms 7.0521 KOps/s 7.1248 KOps/s $\color{#d91a1a}-1.02\%$
test_values 40.0890μs 1.1901μs 840.2350 KOps/s 860.9991 KOps/s $\color{#d91a1a}-2.41\%$
test_values_nested 89.7710μs 51.2911μs 19.4966 KOps/s 19.1853 KOps/s $\color{#35bf28}+1.62\%$
test_values_nested_locked 0.1034ms 51.5501μs 19.3986 KOps/s 19.4747 KOps/s $\color{#d91a1a}-0.39\%$
test_values_nested_leaf 0.1128ms 46.8888μs 21.3270 KOps/s 21.3320 KOps/s $\color{#d91a1a}-0.02\%$
test_values_stack_nested 0.1354ms 52.5087μs 19.0445 KOps/s 18.8106 KOps/s $\color{#35bf28}+1.24\%$
test_values_stack_nested_leaf 97.8570μs 46.4477μs 21.5296 KOps/s 21.5863 KOps/s $\color{#d91a1a}-0.26\%$
test_values_stack_nested_locked 0.1003ms 52.0128μs 19.2260 KOps/s 19.3270 KOps/s $\color{#d91a1a}-0.52\%$
test_membership 0.1367ms 1.3484μs 741.6208 KOps/s 736.5655 KOps/s $\color{#35bf28}+0.69\%$
test_membership_nested 94.3290μs 3.4092μs 293.3246 KOps/s 254.0131 KOps/s $\textbf{\color{#35bf28}+15.48\%}$
test_membership_nested_leaf 22.8130μs 3.4702μs 288.1661 KOps/s 286.2878 KOps/s $\color{#35bf28}+0.66\%$
test_membership_stacked_nested 20.4890μs 3.3991μs 294.1969 KOps/s 287.8762 KOps/s $\color{#35bf28}+2.20\%$
test_membership_stacked_nested_leaf 25.1780μs 3.4267μs 291.8277 KOps/s 277.1602 KOps/s $\textbf{\color{#35bf28}+5.29\%}$
test_membership_nested_last 29.3860μs 4.1770μs 239.4065 KOps/s 222.4977 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_membership_nested_leaf_last 35.4180μs 4.2861μs 233.3135 KOps/s 233.4525 KOps/s $\color{#d91a1a}-0.06\%$
test_membership_stacked_nested_last 19.7570μs 4.2207μs 236.9279 KOps/s 232.3336 KOps/s $\color{#35bf28}+1.98\%$
test_membership_stacked_nested_leaf_last 53.0110μs 4.1817μs 239.1398 KOps/s 231.0433 KOps/s $\color{#35bf28}+3.50\%$
test_nested_getleaf 32.9520μs 10.6175μs 94.1845 KOps/s 92.4920 KOps/s $\color{#35bf28}+1.83\%$
test_nested_get 31.6710μs 10.0201μs 99.7989 KOps/s 97.6169 KOps/s $\color{#35bf28}+2.24\%$
test_stacked_getleaf 34.1650μs 10.6255μs 94.1129 KOps/s 92.5100 KOps/s $\color{#35bf28}+1.73\%$
test_stacked_get 37.9830μs 10.0271μs 99.7298 KOps/s 98.7417 KOps/s $\color{#35bf28}+1.00\%$
test_nested_getitemleaf 42.8110μs 11.2802μs 88.6511 KOps/s 88.3826 KOps/s $\color{#35bf28}+0.30\%$
test_nested_getitem 31.2200μs 10.2077μs 97.9655 KOps/s 95.9955 KOps/s $\color{#35bf28}+2.05\%$
test_stacked_getitemleaf 29.8970μs 11.0760μs 90.2855 KOps/s 89.2160 KOps/s $\color{#35bf28}+1.20\%$
test_stacked_getitem 26.8810μs 10.2213μs 97.8345 KOps/s 95.7262 KOps/s $\color{#35bf28}+2.20\%$
test_lock_nested 49.2904ms 0.3907ms 2.5598 KOps/s 2.9087 KOps/s $\textbf{\color{#d91a1a}-12.00\%}$
test_lock_stack_nested 0.5622ms 0.3110ms 3.2159 KOps/s 3.1676 KOps/s $\color{#35bf28}+1.53\%$
test_unlock_nested 1.3483ms 0.3498ms 2.8588 KOps/s 2.7286 KOps/s $\color{#35bf28}+4.77\%$
test_unlock_stack_nested 0.4765ms 0.3192ms 3.1331 KOps/s 3.1463 KOps/s $\color{#d91a1a}-0.42\%$
test_flatten_speed 0.1777ms 94.4845μs 10.5837 KOps/s 10.1931 KOps/s $\color{#35bf28}+3.83\%$
test_unflatten_speed 0.7178ms 0.4103ms 2.4372 KOps/s 2.3979 KOps/s $\color{#35bf28}+1.64\%$
test_common_ops 4.5958ms 0.7279ms 1.3738 KOps/s 1.3742 KOps/s $\color{#d91a1a}-0.03\%$
test_creation 16.5910μs 1.8956μs 527.5334 KOps/s 516.0628 KOps/s $\color{#35bf28}+2.22\%$
test_creation_empty 40.1170μs 10.5029μs 95.2118 KOps/s 98.5408 KOps/s $\color{#d91a1a}-3.38\%$
test_creation_nested_1 41.3090μs 13.2863μs 75.2653 KOps/s 76.7015 KOps/s $\color{#d91a1a}-1.87\%$
test_creation_nested_2 39.2050μs 16.5610μs 60.3828 KOps/s 61.5797 KOps/s $\color{#d91a1a}-1.94\%$
test_clone 72.2580μs 13.5703μs 73.6903 KOps/s 72.4733 KOps/s $\color{#35bf28}+1.68\%$
test_getitem[int] 36.5600μs 11.5532μs 86.5564 KOps/s 85.7385 KOps/s $\color{#35bf28}+0.95\%$
test_getitem[slice_int] 53.5720μs 23.2609μs 42.9905 KOps/s 43.0429 KOps/s $\color{#d91a1a}-0.12\%$
test_getitem[range] 80.1030μs 60.4592μs 16.5401 KOps/s 17.0975 KOps/s $\color{#d91a1a}-3.26\%$
test_getitem[tuple] 48.9630μs 19.2118μs 52.0512 KOps/s 49.0869 KOps/s $\textbf{\color{#35bf28}+6.04\%}$
test_getitem[list] 0.1014ms 42.3983μs 23.5859 KOps/s 23.1024 KOps/s $\color{#35bf28}+2.09\%$
test_setitem_dim[int] 62.4990μs 36.1535μs 27.6599 KOps/s 27.9936 KOps/s $\color{#d91a1a}-1.19\%$
test_setitem_dim[slice_int] 98.4080μs 62.3280μs 16.0441 KOps/s 15.8558 KOps/s $\color{#35bf28}+1.19\%$
test_setitem_dim[range] 0.1640ms 84.9313μs 11.7742 KOps/s 11.7140 KOps/s $\color{#35bf28}+0.51\%$
test_setitem_dim[tuple] 84.9720μs 51.5642μs 19.3933 KOps/s 19.1885 KOps/s $\color{#35bf28}+1.07\%$
test_setitem 65.8760μs 20.3972μs 49.0264 KOps/s 48.6999 KOps/s $\color{#35bf28}+0.67\%$
test_set 0.1189ms 19.8388μs 50.4063 KOps/s 49.7247 KOps/s $\color{#35bf28}+1.37\%$
test_set_shared 3.5475ms 0.1633ms 6.1227 KOps/s 6.8527 KOps/s $\textbf{\color{#d91a1a}-10.65\%}$
test_update 0.5256ms 22.1560μs 45.1344 KOps/s 44.7013 KOps/s $\color{#35bf28}+0.97\%$
test_update_nested 99.8800μs 31.4472μs 31.7993 KOps/s 32.0865 KOps/s $\color{#d91a1a}-0.89\%$
test_update__nested 83.7030μs 25.1154μs 39.8162 KOps/s 38.2660 KOps/s $\color{#35bf28}+4.05\%$
test_set_nested 59.2830μs 21.5019μs 46.5076 KOps/s 44.7693 KOps/s $\color{#35bf28}+3.88\%$
test_set_nested_new 87.3260μs 25.9071μs 38.5995 KOps/s 37.0603 KOps/s $\color{#35bf28}+4.15\%$
test_select 93.3980μs 41.0229μs 24.3766 KOps/s 23.5495 KOps/s $\color{#35bf28}+3.51\%$
test_select_nested 0.1181ms 60.2294μs 16.6032 KOps/s 16.1520 KOps/s $\color{#35bf28}+2.79\%$
test_exclude_nested 0.2731ms 0.1196ms 8.3605 KOps/s 8.0052 KOps/s $\color{#35bf28}+4.44\%$
test_empty[True] 0.4743ms 0.3898ms 2.5655 KOps/s 2.5168 KOps/s $\color{#35bf28}+1.93\%$
test_empty[False] 6.2395μs 1.1555μs 865.4305 KOps/s 860.4843 KOps/s $\color{#35bf28}+0.57\%$
test_unbind_speed 1.5573ms 0.2562ms 3.9037 KOps/s 3.8201 KOps/s $\color{#35bf28}+2.19\%$
test_unbind_speed_stack0 0.4961ms 0.2572ms 3.8879 KOps/s 3.9729 KOps/s $\color{#d91a1a}-2.14\%$
test_unbind_speed_stack1 68.8774ms 0.7348ms 1.3608 KOps/s 1.3558 KOps/s $\color{#35bf28}+0.37\%$
test_split 67.1963ms 1.6187ms 617.7833 Ops/s 613.4495 Ops/s $\color{#35bf28}+0.71\%$
test_chunk 73.1466ms 1.6415ms 609.1840 Ops/s 612.1814 Ops/s $\color{#d91a1a}-0.49\%$
test_creation[device0] 0.1947ms 85.6720μs 11.6724 KOps/s 11.7757 KOps/s $\color{#d91a1a}-0.88\%$
test_creation_from_tensor 4.1743ms 87.6779μs 11.4054 KOps/s 11.5845 KOps/s $\color{#d91a1a}-1.55\%$
test_add_one[memmap_tensor0] 77.1270μs 5.3737μs 186.0900 KOps/s 183.5800 KOps/s $\color{#35bf28}+1.37\%$
test_contiguous[memmap_tensor0] 10.4100μs 0.6625μs 1.5093 MOps/s 1.5838 MOps/s $\color{#d91a1a}-4.70\%$
test_stack[memmap_tensor0] 17.8940μs 3.6924μs 270.8237 KOps/s 278.0537 KOps/s $\color{#d91a1a}-2.60\%$
test_memmaptd_index 0.9262ms 0.2520ms 3.9686 KOps/s 3.9084 KOps/s $\color{#35bf28}+1.54\%$
test_memmaptd_index_astensor 0.6753ms 0.3250ms 3.0772 KOps/s 3.0661 KOps/s $\color{#35bf28}+0.36\%$
test_memmaptd_index_op 0.9547ms 0.6116ms 1.6350 KOps/s 1.6390 KOps/s $\color{#d91a1a}-0.24\%$
test_serialize_model 0.1712s 0.1137s 8.7953 Ops/s 8.5715 Ops/s $\color{#35bf28}+2.61\%$
test_serialize_model_pickle 0.4523s 0.3752s 2.6656 Ops/s 2.6361 Ops/s $\color{#35bf28}+1.12\%$
test_serialize_weights 0.1857s 0.1131s 8.8438 Ops/s 8.6012 Ops/s $\color{#35bf28}+2.82\%$
test_serialize_weights_returnearly 0.1995s 0.1345s 7.4373 Ops/s 7.9303 Ops/s $\textbf{\color{#d91a1a}-6.22\%}$
test_serialize_weights_pickle 0.9622s 0.5753s 1.7382 Ops/s 2.4292 Ops/s $\textbf{\color{#d91a1a}-28.45\%}$
test_serialize_weights_filesystem 0.1016s 95.0517ms 10.5206 Ops/s 10.2405 Ops/s $\color{#35bf28}+2.74\%$
test_serialize_model_filesystem 0.1657s 99.8180ms 10.0182 Ops/s 10.0564 Ops/s $\color{#d91a1a}-0.38\%$
test_reshape_pytree 53.0210μs 25.3116μs 39.5076 KOps/s 38.8694 KOps/s $\color{#35bf28}+1.64\%$
test_reshape_td 69.8620μs 34.6516μs 28.8587 KOps/s 28.8067 KOps/s $\color{#35bf28}+0.18\%$
test_view_pytree 58.1000μs 25.6374μs 39.0054 KOps/s 35.2751 KOps/s $\textbf{\color{#35bf28}+10.57\%}$
test_view_td 95.5320μs 38.4547μs 26.0046 KOps/s 25.7627 KOps/s $\color{#35bf28}+0.94\%$
test_unbind_pytree 76.3250μs 29.2941μs 34.1365 KOps/s 33.7609 KOps/s $\color{#35bf28}+1.11\%$
test_unbind_td 0.4341ms 38.0541μs 26.2784 KOps/s 25.8867 KOps/s $\color{#35bf28}+1.51\%$
test_split_pytree 65.0640μs 29.0863μs 34.3805 KOps/s 33.9467 KOps/s $\color{#35bf28}+1.28\%$
test_split_td 0.5350ms 41.0194μs 24.3787 KOps/s 24.4765 KOps/s $\color{#d91a1a}-0.40\%$
test_add_pytree 75.1830μs 34.8169μs 28.7217 KOps/s 28.5407 KOps/s $\color{#35bf28}+0.63\%$
test_add_td 0.1257ms 57.1510μs 17.4975 KOps/s 17.0805 KOps/s $\color{#35bf28}+2.44\%$
test_distributed 0.2037ms 0.1024ms 9.7612 KOps/s 9.6076 KOps/s $\color{#35bf28}+1.60\%$
test_tdmodule 39.2250μs 18.6107μs 53.7326 KOps/s 54.7645 KOps/s $\color{#d91a1a}-1.88\%$
test_tdmodule_dispatch 68.6210μs 36.0685μs 27.7250 KOps/s 20.6262 KOps/s $\textbf{\color{#35bf28}+34.42\%}$
test_tdseq 42.9520μs 21.1343μs 47.3164 KOps/s 44.9526 KOps/s $\textbf{\color{#35bf28}+5.26\%}$
test_tdseq_dispatch 72.4170μs 41.5086μs 24.0914 KOps/s 24.8587 KOps/s $\color{#d91a1a}-3.09\%$
test_instantiation_functorch 1.7581ms 1.3619ms 734.2463 Ops/s 748.7774 Ops/s $\color{#d91a1a}-1.94\%$
test_instantiation_td 1.5532ms 1.0299ms 970.9503 Ops/s 965.1835 Ops/s $\color{#35bf28}+0.60\%$
test_exec_functorch 0.3550ms 0.1638ms 6.1050 KOps/s 6.2422 KOps/s $\color{#d91a1a}-2.20\%$
test_exec_functional_call 0.2444ms 0.1513ms 6.6087 KOps/s 6.7799 KOps/s $\color{#d91a1a}-2.53\%$
test_exec_td 0.2906ms 0.1476ms 6.7751 KOps/s 6.8712 KOps/s $\color{#d91a1a}-1.40\%$
test_exec_td_decorator 0.8109ms 0.2245ms 4.4548 KOps/s 4.5450 KOps/s $\color{#d91a1a}-1.98\%$
test_vmap_mlp_speed[True-True] 0.7180ms 0.4904ms 2.0390 KOps/s 2.0731 KOps/s $\color{#d91a1a}-1.64\%$
test_vmap_mlp_speed[True-False] 0.7846ms 0.4905ms 2.0388 KOps/s 2.0018 KOps/s $\color{#35bf28}+1.85\%$
test_vmap_mlp_speed[False-True] 0.5942ms 0.3996ms 2.5024 KOps/s 2.5673 KOps/s $\color{#d91a1a}-2.53\%$
test_vmap_mlp_speed[False-False] 0.7676ms 0.3980ms 2.5126 KOps/s 2.5655 KOps/s $\color{#d91a1a}-2.06\%$
test_vmap_mlp_speed_decorator[True-True] 1.1789ms 0.5568ms 1.7960 KOps/s 1.7965 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[True-False] 0.8956ms 0.5541ms 1.8046 KOps/s 1.6819 KOps/s $\textbf{\color{#35bf28}+7.30\%}$
test_vmap_mlp_speed_decorator[False-True] 1.7430ms 0.4689ms 2.1326 KOps/s 2.1974 KOps/s $\color{#d91a1a}-2.95\%$
test_vmap_mlp_speed_decorator[False-False] 0.7452ms 0.4548ms 2.1989 KOps/s 2.2041 KOps/s $\color{#d91a1a}-0.24\%$
test_to_module_speed[True] 2.6855ms 1.7063ms 586.0497 Ops/s 595.3279 Ops/s $\color{#d91a1a}-1.56\%$
test_to_module_speed[False] 2.5758ms 1.6773ms 596.1916 Ops/s 600.6650 Ops/s $\color{#d91a1a}-0.74\%$
test_tc_init 56.7670μs 29.2001μs 34.2464 KOps/s 34.9283 KOps/s $\color{#d91a1a}-1.95\%$
test_tc_init_nested 0.1470ms 62.2777μs 16.0571 KOps/s 16.2937 KOps/s $\color{#d91a1a}-1.45\%$
test_tc_first_layer_tensor 3.8973μs 0.6846μs 1.4607 MOps/s 1.3975 MOps/s $\color{#35bf28}+4.52\%$
test_tc_first_layer_nontensor 1.9011μs 0.6738μs 1.4841 MOps/s 1.4743 MOps/s $\color{#35bf28}+0.67\%$
test_tc_second_layer_tensor 0.1232ms 1.8531μs 539.6282 KOps/s 539.0945 KOps/s $\color{#35bf28}+0.10\%$
test_tc_second_layer_nontensor 22.8530μs 1.6432μs 608.5631 KOps/s 670.5113 KOps/s $\textbf{\color{#d91a1a}-9.24\%}$
test_unbind 85.5244ms 6.4600ms 154.7994 Ops/s 187.7161 Ops/s $\textbf{\color{#d91a1a}-17.54\%}$
test_full_like 16.7672ms 11.2110ms 89.1983 Ops/s 122.4476 Ops/s $\textbf{\color{#d91a1a}-27.15\%}$
test_zeros_like 6.9307ms 6.0744ms 164.6260 Ops/s 172.4498 Ops/s $\color{#d91a1a}-4.54\%$
test_ones_like 7.4221ms 6.4057ms 156.1113 Ops/s 140.2829 Ops/s $\textbf{\color{#35bf28}+11.28\%}$
test_clone 14.2208ms 8.7043ms 114.8853 Ops/s 118.6913 Ops/s $\color{#d91a1a}-3.21\%$
test_squeeze 65.3740μs 13.8783μs 72.0551 KOps/s 68.3577 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_unsqueeze 0.4248ms 62.5165μs 15.9958 KOps/s 16.5435 KOps/s $\color{#d91a1a}-3.31\%$
test_split 0.1872ms 0.1156ms 8.6537 KOps/s 8.6700 KOps/s $\color{#d91a1a}-0.19\%$
test_permute 0.1840ms 0.1287ms 7.7700 KOps/s 7.6335 KOps/s $\color{#35bf28}+1.79\%$
test_stack 28.9411ms 23.4806ms 42.5884 Ops/s 42.6253 Ops/s $\color{#d91a1a}-0.09\%$
test_cat 27.9652ms 23.5592ms 42.4462 Ops/s 43.7791 Ops/s $\color{#d91a1a}-3.04\%$
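
The Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD. The bot does not state this; it is inferred from the numbers. A minimal sketch that checks the assumption against three rows of the table above (the function name `relative_change` is hypothetical):

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD, reported Change in %), copied from the CPU table.
# Units cancel out as long as both values use the same one (KOps/s here).
rows = {
    "test_plain_set_nested": (58.8759, 58.6117, 0.45),
    "test_items": (401.5761, 380.1613, 5.63),
    "test_lock_nested": (2.5598, 2.9087, -12.00),
}

for name, (ops, ops_head, reported) in rows.items():
    computed = relative_change(ops, ops_head)
    print(f"{name}: computed {computed:+.2f}%, reported {reported:+.2f}%")
```

A positive value means the PR branch ran more operations per second than HEAD. Several of the large swings come with a Max far above the Mean (for example `test_lock_nested`), so they may reflect outliers in the run rather than an effect of the change.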


github-actions bot commented Jun 20, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}30$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 33.4200μs 10.6606μs 93.8035 KOps/s 104.6444 KOps/s $\textbf{\color{#d91a1a}-10.36\%}$
test_plain_set_stack_nested 38.1110μs 10.8504μs 92.1622 KOps/s 104.1689 KOps/s $\textbf{\color{#d91a1a}-11.53\%}$
test_plain_set_nested_inplace 70.8920μs 11.9268μs 83.8450 KOps/s 92.9727 KOps/s $\textbf{\color{#d91a1a}-9.82\%}$
test_plain_set_stack_nested_inplace 41.1210μs 11.8214μs 84.5927 KOps/s 93.9902 KOps/s $\textbf{\color{#d91a1a}-10.00\%}$
test_items 36.2610μs 2.9200μs 342.4671 KOps/s 340.9665 KOps/s $\color{#35bf28}+0.44\%$
test_items_nested 0.3779ms 0.3017ms 3.3147 KOps/s 3.3687 KOps/s $\color{#d91a1a}-1.60\%$
test_items_nested_locked 0.3631ms 0.2965ms 3.3731 KOps/s 3.3407 KOps/s $\color{#35bf28}+0.97\%$
test_items_nested_leaf 90.9020μs 63.8468μs 15.6625 KOps/s 15.7552 KOps/s $\color{#d91a1a}-0.59\%$
test_items_stack_nested 0.3690ms 0.2971ms 3.3655 KOps/s 3.3590 KOps/s $\color{#35bf28}+0.19\%$
test_items_stack_nested_leaf 94.1220μs 64.8780μs 15.4136 KOps/s 15.4670 KOps/s $\color{#d91a1a}-0.35\%$
test_items_stack_nested_locked 0.3460ms 0.2970ms 3.3669 KOps/s 3.3152 KOps/s $\color{#35bf28}+1.56\%$
test_keys 26.9500μs 3.4463μs 290.1661 KOps/s 288.8490 KOps/s $\color{#35bf28}+0.46\%$
test_keys_nested 0.1145ms 55.3609μs 18.0633 KOps/s 18.0013 KOps/s $\color{#35bf28}+0.34\%$
test_keys_nested_locked 2.5695ms 59.4904μs 16.8094 KOps/s 16.7180 KOps/s $\color{#35bf28}+0.55\%$
test_keys_nested_leaf 76.8710μs 46.8738μs 21.3339 KOps/s 21.1253 KOps/s $\color{#35bf28}+0.99\%$
test_keys_stack_nested 92.9220μs 55.2153μs 18.1109 KOps/s 18.0042 KOps/s $\color{#35bf28}+0.59\%$
test_keys_stack_nested_leaf 93.7420μs 46.9320μs 21.3074 KOps/s 21.1178 KOps/s $\color{#35bf28}+0.90\%$
test_keys_stack_nested_locked 89.7520μs 59.8029μs 16.7216 KOps/s 16.7586 KOps/s $\color{#d91a1a}-0.22\%$
test_values 5.8418μs 0.8444μs 1.1843 MOps/s 1.1709 MOps/s $\color{#35bf28}+1.15\%$
test_values_nested 93.3920μs 29.5109μs 33.8858 KOps/s 33.5344 KOps/s $\color{#35bf28}+1.05\%$
test_values_nested_locked 57.1210μs 31.1616μs 32.0908 KOps/s 32.0068 KOps/s $\color{#35bf28}+0.26\%$
test_values_nested_leaf 0.1781ms 26.2440μs 38.1039 KOps/s 38.4405 KOps/s $\color{#d91a1a}-0.88\%$
test_values_stack_nested 57.1710μs 30.3925μs 32.9028 KOps/s 32.7746 KOps/s $\color{#35bf28}+0.39\%$
test_values_stack_nested_leaf 62.5710μs 27.2646μs 36.6776 KOps/s 36.7988 KOps/s $\color{#d91a1a}-0.33\%$
test_values_stack_nested_locked 55.1610μs 31.6830μs 31.5627 KOps/s 31.2365 KOps/s $\color{#35bf28}+1.04\%$
test_membership 1.9881μs 0.6185μs 1.6167 MOps/s 1.6102 MOps/s $\color{#35bf28}+0.40\%$
test_membership_nested 27.2110μs 2.2551μs 443.4471 KOps/s 440.5549 KOps/s $\color{#35bf28}+0.66\%$
test_membership_nested_leaf 16.0655μs 2.1960μs 455.3721 KOps/s 453.3567 KOps/s $\color{#35bf28}+0.44\%$
test_membership_stacked_nested 26.4900μs 2.2740μs 439.7607 KOps/s 439.7855 KOps/s $-0.01\%$
test_membership_stacked_nested_leaf 28.1410μs 2.2664μs 441.2338 KOps/s 440.9663 KOps/s $\color{#35bf28}+0.06\%$
test_membership_nested_last 29.2510μs 2.8025μs 356.8266 KOps/s 350.5947 KOps/s $\color{#35bf28}+1.78\%$
test_membership_nested_leaf_last 0.1832ms 2.8161μs 355.1023 KOps/s 359.0402 KOps/s $\color{#d91a1a}-1.10\%$
test_membership_stacked_nested_last 0.2087ms 4.1007μs 243.8621 KOps/s 356.4438 KOps/s $\textbf{\color{#d91a1a}-31.58\%}$
test_membership_stacked_nested_leaf_last 28.6000μs 4.1493μs 241.0056 KOps/s 360.4639 KOps/s $\textbf{\color{#d91a1a}-33.14\%}$
test_nested_getleaf 0.1687ms 6.4759μs 154.4181 KOps/s 153.2067 KOps/s $\color{#35bf28}+0.79\%$
test_nested_get 0.2040ms 6.0846μs 164.3482 KOps/s 162.8555 KOps/s $\color{#35bf28}+0.92\%$
test_stacked_getleaf 0.1883ms 6.4006μs 156.2365 KOps/s 156.1527 KOps/s $\color{#35bf28}+0.05\%$
test_stacked_get 0.1885ms 5.9896μs 166.9568 KOps/s 166.2037 KOps/s $\color{#35bf28}+0.45\%$
test_nested_getitemleaf 0.2039ms 6.5692μs 152.2253 KOps/s 150.1736 KOps/s $\color{#35bf28}+1.37\%$
test_nested_getitem 47.1610μs 6.1684μs 162.1174 KOps/s 159.9524 KOps/s $\color{#35bf28}+1.35\%$
test_stacked_getitemleaf 30.6910μs 6.5420μs 152.8578 KOps/s 152.0594 KOps/s $\color{#35bf28}+0.53\%$
test_stacked_getitem 48.5210μs 6.0775μs 164.5415 KOps/s 162.2702 KOps/s $\color{#35bf28}+1.40\%$
test_lock_nested 58.2551ms 0.3940ms 2.5383 KOps/s 2.5355 KOps/s $\color{#35bf28}+0.11\%$
test_lock_stack_nested 0.3991ms 0.2920ms 3.4242 KOps/s 3.3890 KOps/s $\color{#35bf28}+1.04\%$
test_unlock_nested 59.9287ms 0.3952ms 2.5304 KOps/s 2.5036 KOps/s $\color{#35bf28}+1.07\%$
test_unlock_stack_nested 0.4066ms 0.3001ms 3.3325 KOps/s 3.2986 KOps/s $\color{#35bf28}+1.03\%$
test_flatten_speed 0.3154ms 79.1217μs 12.6388 KOps/s 12.7795 KOps/s $\color{#d91a1a}-1.10\%$
test_unflatten_speed 0.3259ms 0.2613ms 3.8273 KOps/s 3.8300 KOps/s $\color{#d91a1a}-0.07\%$
test_common_ops 1.6978ms 0.6498ms 1.5390 KOps/s 1.6205 KOps/s $\textbf{\color{#d91a1a}-5.03\%}$
test_creation 0.1109ms 1.4652μs 682.5127 KOps/s 696.6021 KOps/s $\color{#d91a1a}-2.02\%$
test_creation_empty 93.8320μs 8.6230μs 115.9690 KOps/s 157.4603 KOps/s $\textbf{\color{#d91a1a}-26.35\%}$
test_creation_nested_1 35.3310μs 10.3140μs 96.9559 KOps/s 125.0231 KOps/s $\textbf{\color{#d91a1a}-22.45\%}$
test_creation_nested_2 0.1036ms 12.2186μs 81.8424 KOps/s 100.6253 KOps/s $\textbf{\color{#d91a1a}-18.67\%}$
test_clone 92.3520μs 12.1082μs 82.5883 KOps/s 80.4669 KOps/s $\color{#35bf28}+2.64\%$
test_getitem[int] 28.0210μs 10.9360μs 91.4407 KOps/s 92.5230 KOps/s $\color{#d91a1a}-1.17\%$
test_getitem[slice_int] 0.1534ms 21.6483μs 46.1930 KOps/s 46.6034 KOps/s $\color{#d91a1a}-0.88\%$
test_getitem[range] 64.7010μs 51.4210μs 19.4473 KOps/s 20.8312 KOps/s $\textbf{\color{#d91a1a}-6.64\%}$
test_getitem[tuple] 0.1263ms 19.2153μs 52.0419 KOps/s 51.2310 KOps/s $\color{#35bf28}+1.58\%$
test_getitem[list] 0.1310ms 36.5471μs 27.3619 KOps/s 27.5599 KOps/s $\color{#d91a1a}-0.72\%$
test_setitem_dim[int] 51.0110μs 30.1881μs 33.1257 KOps/s 36.2995 KOps/s $\textbf{\color{#d91a1a}-8.74\%}$
test_setitem_dim[slice_int] 0.1101ms 52.5998μs 19.0115 KOps/s 20.7065 KOps/s $\textbf{\color{#d91a1a}-8.19\%}$
test_setitem_dim[range] 0.2002ms 69.0351μs 14.4854 KOps/s 15.2659 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_setitem_dim[tuple] 78.5110μs 44.9620μs 22.2410 KOps/s 23.6957 KOps/s $\textbf{\color{#d91a1a}-6.14\%}$
test_setitem 53.4410μs 16.7995μs 59.5255 KOps/s 60.7306 KOps/s $\color{#d91a1a}-1.98\%$
test_set 53.8610μs 16.1390μs 61.9618 KOps/s 65.3917 KOps/s $\textbf{\color{#d91a1a}-5.25\%}$
test_set_shared 2.1275ms 0.1565ms 6.3878 KOps/s 6.3639 KOps/s $\color{#35bf28}+0.38\%$
test_update 0.5052ms 19.3314μs 51.7292 KOps/s 56.6287 KOps/s $\textbf{\color{#d91a1a}-8.65\%}$
test_update_nested 0.1560ms 24.2533μs 41.2315 KOps/s 45.4391 KOps/s $\textbf{\color{#d91a1a}-9.26\%}$
test_update__nested 0.1614ms 22.1114μs 45.2254 KOps/s 43.5262 KOps/s $\color{#35bf28}+3.90\%$
test_set_nested 0.1234ms 17.2295μs 58.0398 KOps/s 59.1436 KOps/s $\color{#d91a1a}-1.87\%$
test_set_nested_new 0.1531ms 20.4459μs 48.9095 KOps/s 51.6618 KOps/s $\textbf{\color{#d91a1a}-5.33\%}$
test_select 0.1618ms 33.0823μs 30.2277 KOps/s 31.8775 KOps/s $\textbf{\color{#d91a1a}-5.18\%}$
test_select_nested 1.0486ms 45.9656μs 21.7554 KOps/s 21.7454 KOps/s $\color{#35bf28}+0.05\%$
test_exclude_nested 0.1304ms 96.1206μs 10.4036 KOps/s 10.1362 KOps/s $\color{#35bf28}+2.64\%$
test_empty[True] 0.3563ms 0.3021ms 3.3106 KOps/s 3.2959 KOps/s $\color{#35bf28}+0.45\%$
test_empty[False] 4.2661μs 0.8403μs 1.1900 MOps/s 1.1986 MOps/s $\color{#d91a1a}-0.71\%$
test_to 0.1027ms 75.5226μs 13.2411 KOps/s 13.0539 KOps/s $\color{#35bf28}+1.43\%$
test_to_nonblocking 0.2138ms 59.8256μs 16.7152 KOps/s 16.4153 KOps/s $\color{#35bf28}+1.83\%$
test_unbind_speed 0.9182ms 0.2586ms 3.8676 KOps/s 3.7744 KOps/s $\color{#35bf28}+2.47\%$
test_unbind_speed_stack0 0.4042ms 0.2631ms 3.8006 KOps/s 3.7378 KOps/s $\color{#35bf28}+1.68\%$
test_unbind_speed_stack1 75.7023ms 0.8046ms 1.2429 KOps/s 1.2277 KOps/s $\color{#35bf28}+1.23\%$
test_split 75.9697ms 1.7349ms 576.3885 Ops/s 569.0045 Ops/s $\color{#35bf28}+1.30\%$
test_chunk 1.7057ms 1.5978ms 625.8596 Ops/s 617.6477 Ops/s $\color{#35bf28}+1.33\%$
test_creation[device0] 0.3108ms 82.0647μs 12.1855 KOps/s 12.1822 KOps/s $\color{#35bf28}+0.03\%$
test_creation_from_tensor 0.4619ms 83.0665μs 12.0385 KOps/s 12.0654 KOps/s $\color{#d91a1a}-0.22\%$
test_add_one[memmap_tensor0] 0.4652ms 7.8485μs 127.4134 KOps/s 120.7102 KOps/s $\textbf{\color{#35bf28}+5.55\%}$
test_contiguous[memmap_tensor0] 0.1956ms 0.6466μs 1.5464 MOps/s 1.5320 MOps/s $\color{#35bf28}+0.94\%$
test_stack[memmap_tensor0] 35.9100μs 5.2292μs 191.2339 KOps/s 187.3580 KOps/s $\color{#35bf28}+2.07\%$
test_memmaptd_index 1.0444ms 0.2939ms 3.4029 KOps/s 3.3637 KOps/s $\color{#35bf28}+1.17\%$
test_memmaptd_index_astensor 0.6127ms 0.3599ms 2.7786 KOps/s 2.7731 KOps/s $\color{#35bf28}+0.20\%$
test_memmaptd_index_op 78.3806ms 0.7671ms 1.3035 KOps/s 1.4840 KOps/s $\textbf{\color{#d91a1a}-12.16\%}$
test_serialize_model 0.1364s 0.1322s 7.5650 Ops/s 7.6205 Ops/s $\color{#d91a1a}-0.73\%$
test_serialize_model_pickle 1.3507s 1.2118s 0.8252 Ops/s 0.8234 Ops/s $\color{#35bf28}+0.22\%$
test_serialize_weights 0.1319s 0.1302s 7.6777 Ops/s 7.0389 Ops/s $\textbf{\color{#35bf28}+9.07\%}$
test_serialize_weights_returnearly 79.2730ms 68.7984ms 14.5352 Ops/s 12.4285 Ops/s $\textbf{\color{#35bf28}+16.95\%}$
test_serialize_weights_pickle 1.3590s 1.2131s 0.8243 Ops/s 0.8232 Ops/s $\color{#35bf28}+0.13\%$
test_reshape_pytree 0.1704ms 24.3612μs 41.0489 KOps/s 40.4989 KOps/s $\color{#35bf28}+1.36\%$
test_reshape_td 59.3110μs 28.9786μs 34.5082 KOps/s 34.2216 KOps/s $\color{#35bf28}+0.84\%$
test_view_pytree 0.1465ms 23.6826μs 42.2252 KOps/s 42.1642 KOps/s $\color{#35bf28}+0.14\%$
test_view_td 0.1539ms 32.3329μs 30.9282 KOps/s 31.1189 KOps/s $\color{#d91a1a}-0.61\%$
test_unbind_pytree 0.1250ms 33.1144μs 30.1983 KOps/s 30.0392 KOps/s $\color{#35bf28}+0.53\%$
test_unbind_td 0.4272ms 39.9321μs 25.0425 KOps/s 24.1175 KOps/s $\color{#35bf28}+3.84\%$
test_split_pytree 63.5210μs 33.3030μs 30.0273 KOps/s 29.2383 KOps/s $\color{#35bf28}+2.70\%$
test_split_td 0.4838ms 38.9099μs 25.7004 KOps/s 25.4652 KOps/s $\color{#35bf28}+0.92\%$
test_add_pytree 0.1083ms 38.1991μs 26.1786 KOps/s 25.2569 KOps/s $\color{#35bf28}+3.65\%$
test_add_td 0.1667ms 53.9143μs 18.5479 KOps/s 20.3580 KOps/s $\textbf{\color{#d91a1a}-8.89\%}$
test_distributed 0.8831ms 0.1155ms 8.6603 KOps/s 8.8916 KOps/s $\color{#d91a1a}-2.60\%$
test_tdmodule 32.3400μs 13.5320μs 73.8988 KOps/s 81.8832 KOps/s $\textbf{\color{#d91a1a}-9.75\%}$
test_tdmodule_dispatch 0.3436ms 27.7933μs 35.9799 KOps/s 41.9516 KOps/s $\textbf{\color{#d91a1a}-14.23\%}$
test_tdseq 34.7900μs 15.5044μs 64.4979 KOps/s 69.6624 KOps/s $\textbf{\color{#d91a1a}-7.41\%}$
test_tdseq_dispatch 50.6010μs 30.1488μs 33.1688 KOps/s 36.8202 KOps/s $\textbf{\color{#d91a1a}-9.92\%}$
test_instantiation_functorch 1.5820ms 1.4339ms 697.4034 Ops/s 685.4485 Ops/s $\color{#35bf28}+1.74\%$
test_instantiation_td 78.3114ms 1.0581ms 945.0585 Ops/s 1.0219 KOps/s $\textbf{\color{#d91a1a}-7.52\%}$
test_exec_functorch 0.2658ms 0.1510ms 6.6204 KOps/s 6.3741 KOps/s $\color{#35bf28}+3.86\%$
test_exec_functional_call 0.3188ms 0.1452ms 6.8862 KOps/s 6.7358 KOps/s $\color{#35bf28}+2.23\%$
test_exec_td 0.3166ms 0.1472ms 6.7932 KOps/s 6.7771 KOps/s $\color{#35bf28}+0.24\%$
test_exec_td_decorator 0.5835ms 0.2050ms 4.8778 KOps/s 4.7873 KOps/s $\color{#35bf28}+1.89\%$
test_vmap_mlp_speed[True-True] 1.3146ms 0.6247ms 1.6007 KOps/s 1.6425 KOps/s $\color{#d91a1a}-2.55\%$
test_vmap_mlp_speed[True-False] 0.8305ms 0.6249ms 1.6003 KOps/s 1.6510 KOps/s $\color{#d91a1a}-3.07\%$
test_vmap_mlp_speed[False-True] 0.8463ms 0.5547ms 1.8028 KOps/s 1.8470 KOps/s $\color{#d91a1a}-2.39\%$
test_vmap_mlp_speed[False-False] 0.7439ms 0.5627ms 1.7773 KOps/s 1.8292 KOps/s $\color{#d91a1a}-2.84\%$
test_vmap_mlp_speed_decorator[True-True] 1.0368ms 0.6662ms 1.5010 KOps/s 1.5060 KOps/s $\color{#d91a1a}-0.33\%$
test_vmap_mlp_speed_decorator[True-False] 0.8576ms 0.6645ms 1.5048 KOps/s 1.5061 KOps/s $\color{#d91a1a}-0.09\%$
test_vmap_mlp_speed_decorator[False-True] 0.7914ms 0.6027ms 1.6592 KOps/s 1.6788 KOps/s $\color{#d91a1a}-1.17\%$
test_vmap_mlp_speed_decorator[False-False] 0.8169ms 0.6016ms 1.6623 KOps/s 1.6750 KOps/s $\color{#d91a1a}-0.76\%$
test_vmap_transformer_speed[True-True] 8.7128ms 8.2808ms 120.7608 Ops/s 120.1925 Ops/s $\color{#35bf28}+0.47\%$
test_vmap_transformer_speed[True-False] 8.6301ms 8.2781ms 120.8003 Ops/s 120.6015 Ops/s $\color{#35bf28}+0.16\%$
test_vmap_transformer_speed[False-True] 8.6247ms 8.2392ms 121.3711 Ops/s 121.2077 Ops/s $\color{#35bf28}+0.13\%$
test_vmap_transformer_speed[False-False] 8.6884ms 8.1958ms 122.0134 Ops/s 121.6030 Ops/s $\color{#35bf28}+0.34\%$
test_vmap_transformer_speed_decorator[True-True] 21.0073ms 19.9852ms 50.0370 Ops/s 49.8526 Ops/s $\color{#35bf28}+0.37\%$
test_vmap_transformer_speed_decorator[True-False] 20.3819ms 19.9371ms 50.1577 Ops/s 50.0069 Ops/s $\color{#35bf28}+0.30\%$
test_vmap_transformer_speed_decorator[False-True] 20.7662ms 19.9207ms 50.1990 Ops/s 50.2138 Ops/s $\color{#d91a1a}-0.03\%$
test_vmap_transformer_speed_decorator[False-False] 20.8112ms 19.8950ms 50.2639 Ops/s 50.1948 Ops/s $\color{#35bf28}+0.14\%$
test_to_module_speed[True] 1.3929ms 1.2628ms 791.9195 Ops/s 779.6518 Ops/s $\color{#35bf28}+1.57\%$
test_to_module_speed[False] 1.3917ms 1.2709ms 786.8214 Ops/s 799.6234 Ops/s $\color{#d91a1a}-1.60\%$
test_tc_init 50.6710μs 24.3407μs 41.0835 KOps/s 54.1200 KOps/s $\textbf{\color{#d91a1a}-24.09\%}$
test_tc_init_nested 0.1064ms 49.4966μs 20.2034 KOps/s 25.3640 KOps/s $\textbf{\color{#d91a1a}-20.35\%}$
test_tc_first_layer_tensor 0.8645μs 0.3386μs 2.9530 MOps/s 2.9672 MOps/s $\color{#d91a1a}-0.48\%$
test_tc_first_layer_nontensor 2.9244μs 0.3527μs 2.8351 MOps/s 2.8676 MOps/s $\color{#d91a1a}-1.13\%$
test_tc_second_layer_tensor 5.8318μs 0.8424μs 1.1871 MOps/s 1.1874 MOps/s $\color{#d91a1a}-0.03\%$
test_tc_second_layer_nontensor 3.5010μs 0.7169μs 1.3950 MOps/s 1.3836 MOps/s $\color{#35bf28}+0.82\%$
test_unbind 0.1059s 7.6070ms 131.4585 Ops/s 208.0919 Ops/s $\textbf{\color{#d91a1a}-36.83\%}$
test_full_like 16.0835ms 15.5660ms 64.2427 Ops/s 44.4294 Ops/s $\textbf{\color{#35bf28}+44.59\%}$
test_zeros_like 17.0425ms 16.1763ms 61.8189 Ops/s 59.9711 Ops/s $\color{#35bf28}+3.08\%$
test_ones_like 16.8309ms 16.1121ms 62.0650 Ops/s 60.3122 Ops/s $\color{#35bf28}+2.91\%$
test_clone 17.8434ms 17.3053ms 57.7858 Ops/s 56.0869 Ops/s $\color{#35bf28}+3.03\%$
test_squeeze 0.1778ms 9.4526μs 105.7915 KOps/s 108.6076 KOps/s $\color{#d91a1a}-2.59\%$
test_unsqueeze 0.2247ms 50.8601μs 19.6618 KOps/s 20.5042 KOps/s $\color{#d91a1a}-4.11\%$
test_split 0.2071ms 93.5986μs 10.6839 KOps/s 10.4663 KOps/s $\color{#35bf28}+2.08\%$
test_permute 0.2351ms 0.1035ms 9.6613 KOps/s 9.0269 KOps/s $\textbf{\color{#35bf28}+7.03\%}$
test_stack 51.4211ms 50.8746ms 19.6562 Ops/s 18.6397 Ops/s $\textbf{\color{#35bf28}+5.45\%}$
test_cat 51.6123ms 50.8517ms 19.6650 Ops/s 19.0448 Ops/s $\color{#35bf28}+3.26\%$

tensordict/nn/probabilistic.py (outdated)
category=UserWarning,
)

if interaction_type is InteractionType.MODE:
Contributor


Suggested change
- if interaction_type is InteractionType.MODE:
+ elif interaction_type is InteractionType.MODE:

Contributor Author


not necessary, the function either returns or raises just before
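
The point made in this reply can be illustrated with a small, self-contained sketch. It does not reproduce the code in `tensordict/nn/probabilistic.py`; the `Kind` enum and both functions are hypothetical stand-ins. When the preceding branch always leaves the function through `return` or `raise`, a following `if` and `elif` behave the same:

```python
from enum import Enum, auto


class Kind(Enum):
    """Hypothetical stand-in; not tensordict's InteractionType."""

    DETERMINISTIC = auto()
    MODE = auto()
    OTHER = auto()


def pick_with_if(kind: Kind) -> str:
    if kind is Kind.DETERMINISTIC:
        # This branch always exits the function (a raise would do the same).
        return "deterministic"
    if kind is Kind.MODE:  # only reached when the branch above did not run
        return "mode"
    raise NotImplementedError(kind)


def pick_with_elif(kind: Kind) -> str:
    if kind is Kind.DETERMINISTIC:
        return "deterministic"
    elif kind is Kind.MODE:
        return "mode"
    raise NotImplementedError(kind)


def outcome(fn, kind: Kind):
    """Return the function's result, or the exception type it raised."""
    try:
        return fn(kind)
    except NotImplementedError:
        return NotImplementedError


# Both variants agree for every member of the enum.
same = all(outcome(pick_with_if, k) == outcome(pick_with_elif, k) for k in Kind)
print(same)
```

The choice between the two is then a matter of style. `elif` makes the mutual exclusion explicit, while a plain `if` after an early exit keeps the branches flat.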

@vmoens added the enhancement label Jun 20, 2024
@vmoens merged commit d14db1c into main Jun 20, 2024
2 of 5 checks passed
@vmoens deleted the deterministic-sample branch June 20, 2024 16:43