[Feature] Seed workers in TensorDict.map #562

Merged
43 commits merged into main from seeding-pool
Nov 24, 2023

vmoens
Contributor

@vmoens vmoens commented Nov 21, 2023

Description

Seeds the workers of the pool used by TensorDict.map.
A base seed can be passed to the method.
If no seeding is required, an unseeded pool can be passed to the method instead.

The seeding is meant to give each worker its own seed, NOT to make results reproducible. A pool's workers pick up jobs according to their availability, so the assignment of jobs to workers is inherently non-deterministic.

The tests only verify that, with torch and numpy, each worker behaves predictably when the number of jobs equals the number of workers.
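For readers unfamiliar with per-worker seeding, here is a minimal sketch of the general idea using only the standard library. It is not the code from this PR: the names `worker_seed`, `init_worker` and `make_seeded_pool` are hypothetical, the `base_seed + worker_id` derivation is an assumption, and `random` stands in for torch and numpy.

```python
import multiprocessing as mp
import random


def worker_seed(base_seed: int, worker_id: int) -> int:
    """Hypothetical derivation: offset the base seed by the worker's index."""
    return base_seed + worker_id


def init_worker(base_seed: int, id_queue) -> None:
    """Pool initializer: runs once in each worker process at start-up."""
    # Each worker claims one unique id from the shared queue.
    worker_id = id_queue.get(timeout=10)
    random.seed(worker_seed(base_seed, worker_id))
    # A torch/numpy version would seed those libraries here as well.


def make_seeded_pool(base_seed: int, num_workers: int):
    """Create a pool in which every worker starts from a distinct seed."""
    id_queue = mp.Queue()
    for worker_id in range(num_workers):
        id_queue.put(worker_id)
    return mp.Pool(
        num_workers, initializer=init_worker, initargs=(base_seed, id_queue)
    )
```

Each worker gets a distinct random stream, but which worker runs which job is still decided by the pool at run time, so results are not reproducible across runs in general.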

cc @NicolasHug

@facebook-github-bot added the CLA Signed label Nov 21, 2023
@vmoens added the enhancement label Nov 21, 2023

github-actions bot commented Nov 21, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 113. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}6$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 30.9580μs 15.9408μs 62.7320 KOps/s 62.6048 KOps/s $\color{#35bf28}+0.20\%$
test_plain_set_stack_nested 0.1844ms 0.1454ms 6.8776 KOps/s 6.8947 KOps/s $\color{#d91a1a}-0.25\%$
test_plain_set_nested_inplace 46.8980μs 19.1487μs 52.2230 KOps/s 52.2761 KOps/s $\color{#d91a1a}-0.10\%$
test_plain_set_stack_nested_inplace 0.3397ms 0.1757ms 5.6907 KOps/s 5.6508 KOps/s $\color{#35bf28}+0.71\%$
test_items 23.2540μs 2.5305μs 395.1808 KOps/s 404.0636 KOps/s $\color{#d91a1a}-2.20\%$
test_items_nested 0.3787ms 0.2676ms 3.7364 KOps/s 3.6598 KOps/s $\color{#35bf28}+2.09\%$
test_items_nested_locked 0.3568ms 0.2677ms 3.7351 KOps/s 3.6926 KOps/s $\color{#35bf28}+1.15\%$
test_items_nested_leaf 0.2297ms 0.1656ms 6.0387 KOps/s 5.9965 KOps/s $\color{#35bf28}+0.70\%$
test_items_stack_nested 1.9136ms 1.5034ms 665.1497 Ops/s 668.6592 Ops/s $\color{#d91a1a}-0.52\%$
test_items_stack_nested_leaf 1.5877ms 1.3754ms 727.0379 Ops/s 736.7743 Ops/s $\color{#d91a1a}-1.32\%$
test_items_stack_nested_locked 1.1721ms 0.7726ms 1.2944 KOps/s 1.2745 KOps/s $\color{#35bf28}+1.56\%$
test_keys 38.3820μs 3.8538μs 259.4828 KOps/s 253.2553 KOps/s $\color{#35bf28}+2.46\%$
test_keys_nested 1.3273ms 0.1380ms 7.2484 KOps/s 6.6746 KOps/s $\textbf{\color{#35bf28}+8.60\%}$
test_keys_nested_locked 0.2718ms 0.1381ms 7.2398 KOps/s 6.9712 KOps/s $\color{#35bf28}+3.85\%$
test_keys_nested_leaf 0.3089ms 0.1382ms 7.2346 KOps/s 7.0535 KOps/s $\color{#35bf28}+2.57\%$
test_keys_stack_nested 2.2633ms 1.4376ms 695.6103 Ops/s 701.2155 Ops/s $\color{#d91a1a}-0.80\%$
test_keys_stack_nested_leaf 1.6553ms 1.4333ms 697.6877 Ops/s 701.1275 Ops/s $\color{#d91a1a}-0.49\%$
test_keys_stack_nested_locked 1.2167ms 0.6949ms 1.4390 KOps/s 1.4413 KOps/s $\color{#d91a1a}-0.16\%$
test_values 7.4087μs 1.1567μs 864.4993 KOps/s 819.9144 KOps/s $\textbf{\color{#35bf28}+5.44\%}$
test_values_nested 98.4150μs 49.8749μs 20.0502 KOps/s 19.7835 KOps/s $\color{#35bf28}+1.35\%$
test_values_nested_locked 97.5830μs 49.7057μs 20.1184 KOps/s 19.9270 KOps/s $\color{#35bf28}+0.96\%$
test_values_nested_leaf 70.6630μs 44.9268μs 22.2585 KOps/s 22.2094 KOps/s $\color{#35bf28}+0.22\%$
test_values_stack_nested 1.8869ms 1.2202ms 819.5267 Ops/s 817.0424 Ops/s $\color{#35bf28}+0.30\%$
test_values_stack_nested_leaf 1.4369ms 1.1958ms 836.2269 Ops/s 827.9180 Ops/s $\color{#35bf28}+1.00\%$
test_values_stack_nested_locked 0.7713ms 0.5159ms 1.9383 KOps/s 1.8943 KOps/s $\color{#35bf28}+2.32\%$
test_membership 11.7020μs 1.3559μs 737.5241 KOps/s 716.9333 KOps/s $\color{#35bf28}+2.87\%$
test_membership_nested 20.9200μs 2.7834μs 359.2784 KOps/s 357.3345 KOps/s $\color{#35bf28}+0.54\%$
test_membership_nested_leaf 14.5880μs 2.7893μs 358.5083 KOps/s 352.8890 KOps/s $\color{#35bf28}+1.59\%$
test_membership_stacked_nested 30.1060μs 11.6810μs 85.6094 KOps/s 83.6600 KOps/s $\color{#35bf28}+2.33\%$
test_membership_stacked_nested_leaf 27.2010μs 11.7269μs 85.2743 KOps/s 83.1212 KOps/s $\color{#35bf28}+2.59\%$
test_membership_nested_last 26.1290μs 5.9708μs 167.4820 KOps/s 161.0849 KOps/s $\color{#35bf28}+3.97\%$
test_membership_nested_leaf_last 33.6030μs 5.9166μs 169.0149 KOps/s 160.9709 KOps/s $\color{#35bf28}+5.00\%$
test_membership_stacked_nested_last 0.2336ms 0.1693ms 5.9056 KOps/s 5.8756 KOps/s $\color{#35bf28}+0.51\%$
test_membership_stacked_nested_leaf_last 60.6510μs 13.7389μs 72.7863 KOps/s 70.3495 KOps/s $\color{#35bf28}+3.46\%$
test_nested_getleaf 42.0590μs 10.8070μs 92.5322 KOps/s 93.6351 KOps/s $\color{#d91a1a}-1.18\%$
test_nested_get 39.4740μs 10.1366μs 98.6521 KOps/s 97.1318 KOps/s $\color{#35bf28}+1.57\%$
test_stacked_getleaf 1.1113ms 0.6573ms 1.5213 KOps/s 1.5432 KOps/s $\color{#d91a1a}-1.42\%$
test_stacked_get 1.1633ms 0.6183ms 1.6172 KOps/s 1.6229 KOps/s $\color{#d91a1a}-0.35\%$
test_nested_getitemleaf 38.9530μs 10.8101μs 92.5057 KOps/s 92.4288 KOps/s $\color{#35bf28}+0.08\%$
test_nested_getitem 31.1890μs 10.0380μs 99.6217 KOps/s 96.3879 KOps/s $\color{#35bf28}+3.36\%$
test_stacked_getitemleaf 0.7349ms 0.6494ms 1.5398 KOps/s 1.5438 KOps/s $\color{#d91a1a}-0.26\%$
test_stacked_getitem 1.1493ms 0.6197ms 1.6136 KOps/s 1.6258 KOps/s $\color{#d91a1a}-0.75\%$
test_lock_nested 51.7627ms 0.5425ms 1.8431 KOps/s 2.0209 KOps/s $\textbf{\color{#d91a1a}-8.80\%}$
test_lock_stack_nested 72.7375ms 7.9461ms 125.8486 Ops/s 124.7967 Ops/s $\color{#35bf28}+0.84\%$
test_unlock_nested 59.2869ms 0.5080ms 1.9685 KOps/s 1.9578 KOps/s $\color{#35bf28}+0.55\%$
test_unlock_stack_nested 67.6171ms 7.7075ms 129.7434 Ops/s 203.4756 Ops/s $\textbf{\color{#d91a1a}-36.24\%}$
test_flatten_speed 0.5329ms 0.2674ms 3.7402 KOps/s 3.6668 KOps/s $\color{#35bf28}+2.00\%$
test_unflatten_speed 0.7900ms 0.4677ms 2.1380 KOps/s 2.1200 KOps/s $\color{#35bf28}+0.85\%$
test_common_ops 1.4480ms 0.6969ms 1.4350 KOps/s 1.4604 KOps/s $\color{#d91a1a}-1.74\%$
test_creation 19.7270μs 2.3983μs 416.9533 KOps/s 412.2689 KOps/s $\color{#35bf28}+1.14\%$
test_creation_empty 39.4050μs 8.0235μs 124.6333 KOps/s 124.0841 KOps/s $\color{#35bf28}+0.44\%$
test_creation_nested_1 29.1550μs 11.5212μs 86.7962 KOps/s 86.3470 KOps/s $\color{#35bf28}+0.52\%$
test_creation_nested_2 44.1830μs 14.9968μs 66.6807 KOps/s 67.1191 KOps/s $\color{#d91a1a}-0.65\%$
test_clone 0.1547ms 13.5970μs 73.5456 KOps/s 75.1772 KOps/s $\color{#d91a1a}-2.17\%$
test_getitem[int] 33.0320μs 13.4627μs 74.2795 KOps/s 75.5157 KOps/s $\color{#d91a1a}-1.64\%$
test_getitem[slice_int] 56.7670μs 25.6634μs 38.9660 KOps/s 39.4083 KOps/s $\color{#d91a1a}-1.12\%$
test_getitem[range] 0.1420ms 45.8911μs 21.7907 KOps/s 21.8665 KOps/s $\color{#d91a1a}-0.35\%$
test_getitem[tuple] 61.3160μs 21.0057μs 47.6062 KOps/s 49.0692 KOps/s $\color{#d91a1a}-2.98\%$
test_getitem[list] 87.1040μs 41.2815μs 24.2239 KOps/s 24.8325 KOps/s $\color{#d91a1a}-2.45\%$
test_setitem_dim[int] 66.5950μs 28.6779μs 34.8701 KOps/s 34.7676 KOps/s $\color{#35bf28}+0.29\%$
test_setitem_dim[slice_int] 0.1209ms 54.0120μs 18.5144 KOps/s 18.8679 KOps/s $\color{#d91a1a}-1.87\%$
test_setitem_dim[range] 0.1171ms 74.4826μs 13.4259 KOps/s 13.4047 KOps/s $\color{#35bf28}+0.16\%$
test_setitem_dim[tuple] 80.1500μs 42.0637μs 23.7734 KOps/s 24.1749 KOps/s $\color{#d91a1a}-1.66\%$
test_setitem 0.1304ms 18.8547μs 53.0370 KOps/s 54.5280 KOps/s $\color{#d91a1a}-2.73\%$
test_set 0.1451ms 18.2653μs 54.7485 KOps/s 56.8291 KOps/s $\color{#d91a1a}-3.66\%$
test_set_shared 0.9154ms 0.1387ms 7.2107 KOps/s 7.1952 KOps/s $\color{#35bf28}+0.22\%$
test_update 0.1392ms 24.4143μs 40.9596 KOps/s 42.9574 KOps/s $\color{#d91a1a}-4.65\%$
test_update_nested 0.1373ms 35.0926μs 28.4960 KOps/s 29.1560 KOps/s $\color{#d91a1a}-2.26\%$
test_set_nested 0.1408ms 20.3105μs 49.2356 KOps/s 51.2206 KOps/s $\color{#d91a1a}-3.88\%$
test_set_nested_new 0.1070ms 26.1098μs 38.2997 KOps/s 40.0847 KOps/s $\color{#d91a1a}-4.45\%$
test_select 90.8510μs 52.2570μs 19.1362 KOps/s 19.5666 KOps/s $\color{#d91a1a}-2.20\%$
test_unbind_speed 0.4897ms 0.3780ms 2.6454 KOps/s 2.6543 KOps/s $\color{#d91a1a}-0.33\%$
test_unbind_speed_stack0 61.4956ms 5.4489ms 183.5241 Ops/s 246.0125 Ops/s $\textbf{\color{#d91a1a}-25.40\%}$
test_unbind_speed_stack1 2.0553μs 0.6359μs 1.5726 MOps/s 1.5884 MOps/s $\color{#d91a1a}-0.99\%$
test_split 53.2208ms 1.7987ms 555.9613 Ops/s 551.1746 Ops/s $\color{#35bf28}+0.87\%$
test_chunk 53.1069ms 1.7527ms 570.5540 Ops/s 563.9719 Ops/s $\color{#35bf28}+1.17\%$
test_creation[device0] 0.4528ms 0.2932ms 3.4110 KOps/s 3.3707 KOps/s $\color{#35bf28}+1.19\%$
test_creation_from_tensor 3.6900ms 0.3325ms 3.0076 KOps/s 2.9909 KOps/s $\color{#35bf28}+0.56\%$
test_add_one[memmap_tensor0] 74.9210μs 25.8971μs 38.6143 KOps/s 38.8723 KOps/s $\color{#d91a1a}-0.66\%$
test_contiguous[memmap_tensor0] 31.1180μs 5.8753μs 170.2033 KOps/s 172.5945 KOps/s $\color{#d91a1a}-1.39\%$
test_stack[memmap_tensor0] 70.9040μs 19.8307μs 50.4269 KOps/s 51.4771 KOps/s $\color{#d91a1a}-2.04\%$
test_memmaptd_index 0.7629ms 0.4234ms 2.3620 KOps/s 4.9891 KOps/s $\textbf{\color{#d91a1a}-52.66\%}$
test_memmaptd_index_astensor 1.0997ms 0.4810ms 2.0790 KOps/s 3.8499 KOps/s $\textbf{\color{#d91a1a}-46.00\%}$
test_memmaptd_index_op 0.8232ms 0.7129ms 1.4028 KOps/s 1.8582 KOps/s $\textbf{\color{#d91a1a}-24.51\%}$
test_reshape_pytree 58.0590μs 23.4317μs 42.6772 KOps/s 43.1005 KOps/s $\color{#d91a1a}-0.98\%$
test_reshape_td 86.0220μs 32.1147μs 31.1384 KOps/s 31.4175 KOps/s $\color{#d91a1a}-0.89\%$
test_view_pytree 0.4083ms 23.1747μs 43.1506 KOps/s 42.6875 KOps/s $\color{#35bf28}+1.08\%$
test_view_td 20.8890μs 4.8888μs 204.5501 KOps/s 202.0564 KOps/s $\color{#35bf28}+1.23\%$
test_unbind_pytree 74.5500μs 26.1496μs 38.2415 KOps/s 37.4838 KOps/s $\color{#35bf28}+2.02\%$
test_unbind_td 0.1042ms 60.6776μs 16.4805 KOps/s 16.7622 KOps/s $\color{#d91a1a}-1.68\%$
test_split_pytree 69.0100μs 26.2547μs 38.0884 KOps/s 37.9882 KOps/s $\color{#35bf28}+0.26\%$
test_split_td 93.3160μs 47.4781μs 21.0624 KOps/s 21.3423 KOps/s $\color{#d91a1a}-1.31\%$
test_add_pytree 92.9350μs 31.8399μs 31.4071 KOps/s 31.3965 KOps/s $\color{#35bf28}+0.03\%$
test_add_td 94.5470μs 44.9459μs 22.2489 KOps/s 22.0066 KOps/s $\color{#35bf28}+1.10\%$
test_distributed 18.3250μs 6.1787μs 161.8461 KOps/s 165.1348 KOps/s $\color{#d91a1a}-1.99\%$
test_tdmodule 0.1632ms 21.5397μs 46.4259 KOps/s 43.5717 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_tdmodule_dispatch 0.1731ms 38.7492μs 25.8070 KOps/s 25.9545 KOps/s $\color{#d91a1a}-0.57\%$
test_tdseq 42.4700μs 24.6012μs 40.6485 KOps/s 41.2058 KOps/s $\color{#d91a1a}-1.35\%$
test_tdseq_dispatch 0.1558ms 43.2593μs 23.1164 KOps/s 23.2081 KOps/s $\color{#d91a1a}-0.40\%$
test_instantiation_functorch 1.9651ms 1.3093ms 763.7872 Ops/s 765.8959 Ops/s $\color{#d91a1a}-0.28\%$
test_instantiation_td 63.9799ms 1.0903ms 917.1595 Ops/s 911.9377 Ops/s $\color{#35bf28}+0.57\%$
test_exec_functorch 0.2440ms 0.1629ms 6.1402 KOps/s 6.1182 KOps/s $\color{#35bf28}+0.36\%$
test_exec_functional_call 0.3563ms 0.1497ms 6.6806 KOps/s 6.6556 KOps/s $\color{#35bf28}+0.38\%$
test_exec_td 0.2291ms 0.1474ms 6.7862 KOps/s 6.7570 KOps/s $\color{#35bf28}+0.43\%$
test_exec_td_decorator 0.9088ms 0.2241ms 4.4624 KOps/s 4.5002 KOps/s $\color{#d91a1a}-0.84\%$
test_vmap_mlp_speed[True-True] 1.1984ms 0.8870ms 1.1273 KOps/s 1.1102 KOps/s $\color{#35bf28}+1.54\%$
test_vmap_mlp_speed[True-False] 0.7180ms 0.4675ms 2.1392 KOps/s 2.1256 KOps/s $\color{#35bf28}+0.64\%$
test_vmap_mlp_speed[False-True] 0.8987ms 0.7703ms 1.2981 KOps/s 1.2783 KOps/s $\color{#35bf28}+1.55\%$
test_vmap_mlp_speed[False-False] 0.6042ms 0.3845ms 2.6009 KOps/s 2.5753 KOps/s $\color{#35bf28}+0.99\%$
test_vmap_mlp_speed_decorator[True-True] 3.3344ms 1.5827ms 631.8494 Ops/s 634.2327 Ops/s $\color{#d91a1a}-0.38\%$
test_vmap_mlp_speed_decorator[True-False] 1.0141ms 0.5490ms 1.8215 KOps/s 1.8101 KOps/s $\color{#35bf28}+0.63\%$
test_vmap_mlp_speed_decorator[False-True] 2.4471ms 1.3755ms 727.0078 Ops/s 725.8930 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_mlp_speed_decorator[False-False] 0.8009ms 0.4277ms 2.3381 KOps/s 2.3250 KOps/s $\color{#35bf28}+0.56\%$
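
The Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. This reading is inferred from the rows themselves, not stated by the benchmark bot, and the helper name `relative_change` is hypothetical. A small sketch that reproduces two of the values reported above:

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops / ops_head - 1.0) * 100.0


# Values copied from the CPU table above.
keys_nested = relative_change(7.2484, 6.6746)        # table reports +8.60%
unlock_stack = relative_change(129.7434, 203.4756)   # table reports -36.24%

print(f"test_keys_nested: {keys_nested:+.2f}%")
print(f"test_unlock_stack_nested: {unlock_stack:+.2f}%")
```

A negative value means the PR branch measured fewer operations per second than HEAD in that run.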

github-actions bot commented Nov 21, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}4$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 0.4582ms 12.7039μs 78.7161 KOps/s 79.0844 KOps/s $\color{#d91a1a}-0.47\%$
test_plain_set_stack_nested 0.1546ms 0.1155ms 8.6603 KOps/s 8.4284 KOps/s $\color{#35bf28}+2.75\%$
test_plain_set_nested_inplace 31.5110μs 15.1810μs 65.8720 KOps/s 66.4994 KOps/s $\color{#d91a1a}-0.94\%$
test_plain_set_stack_nested_inplace 0.1755ms 0.1415ms 7.0681 KOps/s 7.1160 KOps/s $\color{#d91a1a}-0.67\%$
test_items 18.4900μs 4.6489μs 215.1033 KOps/s 210.0346 KOps/s $\color{#35bf28}+2.41\%$
test_items_nested 0.3865ms 0.3374ms 2.9640 KOps/s 2.9790 KOps/s $\color{#d91a1a}-0.50\%$
test_items_nested_locked 0.3776ms 0.3369ms 2.9681 KOps/s 2.9707 KOps/s $\color{#d91a1a}-0.09\%$
test_items_nested_leaf 0.2379ms 0.1968ms 5.0813 KOps/s 5.0579 KOps/s $\color{#35bf28}+0.46\%$
test_items_stack_nested 1.5501ms 1.4628ms 683.6393 Ops/s 674.4099 Ops/s $\color{#35bf28}+1.37\%$
test_items_stack_nested_leaf 1.3864ms 1.3021ms 768.0048 Ops/s 758.4717 Ops/s $\color{#35bf28}+1.26\%$
test_items_stack_nested_locked 1.7892ms 0.8045ms 1.2430 KOps/s 1.2384 KOps/s $\color{#35bf28}+0.37\%$
test_keys 80.4620μs 4.5616μs 219.2230 KOps/s 216.5815 KOps/s $\color{#35bf28}+1.22\%$
test_keys_nested 0.4920ms 90.5490μs 11.0437 KOps/s 11.0320 KOps/s $\color{#35bf28}+0.11\%$
test_keys_nested_locked 0.1227ms 90.0600μs 11.1037 KOps/s 11.0856 KOps/s $\color{#35bf28}+0.16\%$
test_keys_nested_leaf 42.1598ms 87.3543μs 11.4476 KOps/s 12.1434 KOps/s $\textbf{\color{#d91a1a}-5.73\%}$
test_keys_stack_nested 1.4061ms 1.2815ms 780.3644 Ops/s 768.1502 Ops/s $\color{#35bf28}+1.59\%$
test_keys_stack_nested_leaf 1.4420ms 1.2749ms 784.3483 Ops/s 762.4644 Ops/s $\color{#35bf28}+2.87\%$
test_keys_stack_nested_locked 0.6547ms 0.6029ms 1.6586 KOps/s 1.6345 KOps/s $\color{#35bf28}+1.47\%$
test_values 27.2207μs 1.8793μs 532.1184 KOps/s 531.5517 KOps/s $\color{#35bf28}+0.11\%$
test_values_nested 67.4520μs 43.1490μs 23.1755 KOps/s 23.2652 KOps/s $\color{#d91a1a}-0.39\%$
test_values_nested_locked 0.1056ms 43.3181μs 23.0850 KOps/s 23.0749 KOps/s $\color{#35bf28}+0.04\%$
test_values_nested_leaf 56.4910μs 37.2415μs 26.8518 KOps/s 26.8537 KOps/s $-0.01\%$
test_values_stack_nested 1.2014ms 1.1346ms 881.4065 Ops/s 882.8706 Ops/s $\color{#d91a1a}-0.17\%$
test_values_stack_nested_leaf 1.1570ms 1.1118ms 899.4421 Ops/s 881.6044 Ops/s $\color{#35bf28}+2.02\%$
test_values_stack_nested_locked 0.5297ms 0.4795ms 2.0856 KOps/s 2.0511 KOps/s $\color{#35bf28}+1.68\%$
test_membership 4.7502μs 0.9281μs 1.0774 MOps/s 1.0486 MOps/s $\color{#35bf28}+2.75\%$
test_membership_nested 34.9210μs 2.1223μs 471.1787 KOps/s 453.7762 KOps/s $\color{#35bf28}+3.84\%$
test_membership_nested_leaf 9.9505μs 2.1245μs 470.7094 KOps/s 472.8513 KOps/s $\color{#d91a1a}-0.45\%$
test_membership_stacked_nested 31.1300μs 10.9056μs 91.6959 KOps/s 91.9553 KOps/s $\color{#d91a1a}-0.28\%$
test_membership_stacked_nested_leaf 38.6900μs 10.8793μs 91.9173 KOps/s 92.3375 KOps/s $\color{#d91a1a}-0.46\%$
test_membership_nested_last 31.6810μs 4.6452μs 215.2778 KOps/s 217.2020 KOps/s $\color{#d91a1a}-0.89\%$
test_membership_nested_leaf_last 33.5000μs 4.6616μs 214.5186 KOps/s 217.0923 KOps/s $\color{#d91a1a}-1.19\%$
test_membership_stacked_nested_last 0.1634ms 0.1342ms 7.4537 KOps/s 7.5277 KOps/s $\color{#d91a1a}-0.98\%$
test_membership_stacked_nested_leaf_last 32.0310μs 12.6571μs 79.0069 KOps/s 78.6241 KOps/s $\color{#35bf28}+0.49\%$
test_nested_getleaf 23.3120μs 8.3831μs 119.2876 KOps/s 119.3633 KOps/s $\color{#d91a1a}-0.06\%$
test_nested_get 30.3300μs 7.9446μs 125.8710 KOps/s 126.2973 KOps/s $\color{#d91a1a}-0.34\%$
test_stacked_getleaf 0.6463ms 0.5769ms 1.7334 KOps/s 1.7355 KOps/s $\color{#d91a1a}-0.12\%$
test_stacked_get 0.5829ms 0.5390ms 1.8554 KOps/s 1.8253 KOps/s $\color{#35bf28}+1.65\%$
test_nested_getitemleaf 23.1400μs 8.4164μs 118.8163 KOps/s 118.1700 KOps/s $\color{#35bf28}+0.55\%$
test_nested_getitem 30.9310μs 7.9845μs 125.2434 KOps/s 125.4791 KOps/s $\color{#d91a1a}-0.19\%$
test_stacked_getitemleaf 0.7129ms 0.5659ms 1.7671 KOps/s 1.7419 KOps/s $\color{#35bf28}+1.44\%$
test_stacked_getitem 0.5904ms 0.5402ms 1.8511 KOps/s 1.8278 KOps/s $\color{#35bf28}+1.28\%$
test_lock_nested 4.3387ms 0.4562ms 2.1921 KOps/s 2.1997 KOps/s $\color{#d91a1a}-0.34\%$
test_lock_stack_nested 67.6783ms 6.4961ms 153.9392 Ops/s 151.7888 Ops/s $\color{#35bf28}+1.42\%$
test_unlock_nested 1.2997ms 0.4301ms 2.3251 KOps/s 2.0408 KOps/s $\textbf{\color{#35bf28}+13.94\%}$
test_unlock_stack_nested 64.1600ms 7.2311ms 138.2907 Ops/s 138.0729 Ops/s $\color{#35bf28}+0.16\%$
test_flatten_speed 0.5220ms 0.1864ms 5.3646 KOps/s 5.3734 KOps/s $\color{#d91a1a}-0.16\%$
test_unflatten_speed 0.4280ms 0.3641ms 2.7469 KOps/s 2.7567 KOps/s $\color{#d91a1a}-0.36\%$
test_common_ops 1.0275ms 0.6005ms 1.6653 KOps/s 1.6626 KOps/s $\color{#35bf28}+0.16\%$
test_creation 31.8610μs 1.9259μs 519.2460 KOps/s 513.1785 KOps/s $\color{#35bf28}+1.18\%$
test_creation_empty 25.6310μs 6.9564μs 143.7523 KOps/s 144.4028 KOps/s $\color{#d91a1a}-0.45\%$
test_creation_nested_1 25.7400μs 9.3962μs 106.4256 KOps/s 107.7651 KOps/s $\color{#d91a1a}-1.24\%$
test_creation_nested_2 37.5410μs 12.0218μs 83.1824 KOps/s 85.4216 KOps/s $\color{#d91a1a}-2.62\%$
test_clone 96.2330μs 13.6734μs 73.1349 KOps/s 73.0799 KOps/s $\color{#35bf28}+0.08\%$
test_getitem[int] 32.5500μs 11.9378μs 83.7678 KOps/s 83.7998 KOps/s $\color{#d91a1a}-0.04\%$
test_getitem[slice_int] 50.5010μs 22.6078μs 44.2325 KOps/s 43.4643 KOps/s $\color{#35bf28}+1.77\%$
test_getitem[range] 61.1120μs 39.2741μs 25.4621 KOps/s 26.4861 KOps/s $\color{#d91a1a}-3.87\%$
test_getitem[tuple] 42.3710μs 19.5123μs 51.2497 KOps/s 51.2009 KOps/s $\color{#35bf28}+0.10\%$
test_getitem[list] 0.3043ms 36.2562μs 27.5815 KOps/s 27.9323 KOps/s $\color{#d91a1a}-1.26\%$
test_setitem_dim[int] 42.2000μs 25.6071μs 39.0517 KOps/s 35.2022 KOps/s $\textbf{\color{#35bf28}+10.94\%}$
test_setitem_dim[slice_int] 63.0010μs 44.8482μs 22.2974 KOps/s 20.7262 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_setitem_dim[range] 80.6930μs 61.4956μs 16.2613 KOps/s 15.0415 KOps/s $\textbf{\color{#35bf28}+8.11\%}$
test_setitem_dim[tuple] 55.2610μs 37.4179μs 26.7252 KOps/s 24.1902 KOps/s $\textbf{\color{#35bf28}+10.48\%}$
test_setitem 0.1030ms 17.6066μs 56.7970 KOps/s 52.2010 KOps/s $\textbf{\color{#35bf28}+8.80\%}$
test_set 0.1011ms 16.6359μs 60.1110 KOps/s 53.9754 KOps/s $\textbf{\color{#35bf28}+11.37\%}$
test_set_shared 2.7792ms 0.1012ms 9.8836 KOps/s 9.9869 KOps/s $\color{#d91a1a}-1.03\%$
test_update 0.1102ms 21.2578μs 47.0415 KOps/s 47.0751 KOps/s $\color{#d91a1a}-0.07\%$
test_update_nested 0.1138ms 30.1055μs 33.2165 KOps/s 33.0897 KOps/s $\color{#35bf28}+0.38\%$
test_set_nested 0.1196ms 17.9900μs 55.5865 KOps/s 55.4981 KOps/s $\color{#35bf28}+0.16\%$
test_set_nested_new 98.0610μs 22.6818μs 44.0882 KOps/s 41.6487 KOps/s $\textbf{\color{#35bf28}+5.86\%}$
test_select 65.5910μs 44.9243μs 22.2597 KOps/s 20.3029 KOps/s $\textbf{\color{#35bf28}+9.64\%}$
test_to 72.8210μs 51.6260μs 19.3701 KOps/s 19.7629 KOps/s $\color{#d91a1a}-1.99\%$
test_to_nonblocking 62.8510μs 33.8826μs 29.5137 KOps/s 30.1761 KOps/s $\color{#d91a1a}-2.20\%$
test_unbind_speed 0.3978ms 0.3435ms 2.9112 KOps/s 2.8708 KOps/s $\color{#35bf28}+1.41\%$
test_unbind_speed_stack0 60.0348ms 5.0587ms 197.6778 Ops/s 196.4227 Ops/s $\color{#35bf28}+0.64\%$
test_unbind_speed_stack1 1.5555μs 0.5252μs 1.9041 MOps/s 1.9178 MOps/s $\color{#d91a1a}-0.71\%$
test_split 53.6222ms 1.7785ms 562.2798 Ops/s 563.1767 Ops/s $\color{#d91a1a}-0.16\%$
test_chunk 52.8331ms 1.7634ms 567.0809 Ops/s 570.8887 Ops/s $\color{#d91a1a}-0.67\%$
test_creation[device0] 0.3805ms 0.3097ms 3.2285 KOps/s 3.2263 KOps/s $\color{#35bf28}+0.07\%$
test_creation[device1] 0.7781ms 0.3130ms 3.1950 KOps/s 3.2231 KOps/s $\color{#d91a1a}-0.87\%$
test_creation_from_tensor 0.6495ms 0.3382ms 2.9570 KOps/s 2.7536 KOps/s $\textbf{\color{#35bf28}+7.39\%}$
test_add_one[memmap_tensor0] 0.2820ms 23.1385μs 43.2180 KOps/s 43.8870 KOps/s $\color{#d91a1a}-1.52\%$
test_add_one[memmap_tensor1] 0.2089ms 71.5599μs 13.9743 KOps/s 14.1890 KOps/s $\color{#d91a1a}-1.51\%$
test_contiguous[memmap_tensor0] 26.0010μs 5.7691μs 173.3383 KOps/s 174.8155 KOps/s $\color{#d91a1a}-0.84\%$
test_contiguous[memmap_tensor1] 43.9510μs 21.2340μs 47.0943 KOps/s 48.1601 KOps/s $\color{#d91a1a}-2.21\%$
test_stack[memmap_tensor0] 55.7410μs 19.6338μs 50.9325 KOps/s 53.2372 KOps/s $\color{#d91a1a}-4.33\%$
test_stack[memmap_tensor1] 0.1609ms 71.5558μs 13.9751 KOps/s 14.1485 KOps/s $\color{#d91a1a}-1.23\%$
test_memmaptd_index 0.4582ms 0.4065ms 2.4600 KOps/s 4.6239 KOps/s $\textbf{\color{#d91a1a}-46.80\%}$
test_memmaptd_index_astensor 0.5492ms 0.4657ms 2.1474 KOps/s 3.6563 KOps/s $\textbf{\color{#d91a1a}-41.27\%}$
test_memmaptd_index_op 0.8065ms 0.7226ms 1.3840 KOps/s 1.8992 KOps/s $\textbf{\color{#d91a1a}-27.13\%}$
test_reshape_pytree 54.9210μs 20.6385μs 48.4530 KOps/s 48.5723 KOps/s $\color{#d91a1a}-0.25\%$
test_reshape_td 52.8610μs 29.4492μs 33.9567 KOps/s 34.5154 KOps/s $\color{#d91a1a}-1.62\%$
test_view_pytree 43.9510μs 20.8054μs 48.0644 KOps/s 48.8272 KOps/s $\color{#d91a1a}-1.56\%$
test_view_td 19.6910μs 4.0667μs 245.8975 KOps/s 247.6290 KOps/s $\color{#d91a1a}-0.70\%$
test_unbind_pytree 49.3310μs 25.8164μs 38.7351 KOps/s 38.8197 KOps/s $\color{#d91a1a}-0.22\%$
test_unbind_td 85.1120μs 55.4390μs 18.0379 KOps/s 18.4076 KOps/s $\color{#d91a1a}-2.01\%$
test_split_pytree 42.0800μs 23.9767μs 41.7071 KOps/s 41.9559 KOps/s $\color{#d91a1a}-0.59\%$
test_split_td 72.3820μs 44.7467μs 22.3480 KOps/s 23.3951 KOps/s $\color{#d91a1a}-4.48\%$
test_add_pytree 62.9610μs 30.7401μs 32.5308 KOps/s 31.7088 KOps/s $\color{#35bf28}+2.59\%$
test_add_td 83.9220μs 42.4370μs 23.5644 KOps/s 23.5036 KOps/s $\color{#35bf28}+0.26\%$
test_distributed 66.6310μs 5.5818μs 179.1540 KOps/s 181.9685 KOps/s $\color{#d91a1a}-1.55\%$
test_tdmodule 89.9920μs 16.5684μs 60.3557 KOps/s 60.1574 KOps/s $\color{#35bf28}+0.33\%$
test_tdmodule_dispatch 0.2015ms 32.4835μs 30.7848 KOps/s 29.4525 KOps/s $\color{#35bf28}+4.52\%$
test_tdseq 36.4510μs 19.9953μs 50.0119 KOps/s 45.3747 KOps/s $\textbf{\color{#35bf28}+10.22\%}$
test_tdseq_dispatch 0.1337ms 35.9394μs 27.8246 KOps/s 27.1470 KOps/s $\color{#35bf28}+2.50\%$
test_instantiation_functorch 1.7614ms 1.6521ms 605.2979 Ops/s 592.6381 Ops/s $\color{#35bf28}+2.14\%$
test_instantiation_td 1.7828ms 1.1804ms 847.1393 Ops/s 853.4532 Ops/s $\color{#d91a1a}-0.74\%$
test_exec_functorch 0.2040ms 0.1574ms 6.3541 KOps/s 6.3066 KOps/s $\color{#35bf28}+0.75\%$
test_exec_functional_call 0.2168ms 0.1546ms 6.4692 KOps/s 6.6313 KOps/s $\color{#d91a1a}-2.44\%$
test_exec_td 0.1913ms 0.1428ms 7.0050 KOps/s 7.0924 KOps/s $\color{#d91a1a}-1.23\%$
test_exec_td_decorator 0.8983ms 0.2151ms 4.6484 KOps/s 4.6796 KOps/s $\color{#d91a1a}-0.67\%$
test_vmap_mlp_speed[True-True] 1.1420ms 1.0959ms 912.4932 Ops/s 942.0305 Ops/s $\color{#d91a1a}-3.14\%$
test_vmap_mlp_speed[True-False] 0.6766ms 0.6050ms 1.6528 KOps/s 1.6650 KOps/s $\color{#d91a1a}-0.73\%$
test_vmap_mlp_speed[False-True] 1.1219ms 0.9947ms 1.0053 KOps/s 1.0324 KOps/s $\color{#d91a1a}-2.62\%$
test_vmap_mlp_speed[False-False] 0.6175ms 0.5471ms 1.8278 KOps/s 1.8860 KOps/s $\color{#d91a1a}-3.09\%$
test_vmap_mlp_speed_decorator[True-True] 2.6304ms 1.7877ms 559.3631 Ops/s 567.6835 Ops/s $\color{#d91a1a}-1.47\%$
test_vmap_mlp_speed_decorator[True-False] 1.1945ms 0.6820ms 1.4663 KOps/s 1.4852 KOps/s $\color{#d91a1a}-1.27\%$
test_vmap_mlp_speed_decorator[False-True] 2.0636ms 1.5838ms 631.4026 Ops/s 629.6422 Ops/s $\color{#35bf28}+0.28\%$
test_vmap_mlp_speed_decorator[False-False] 1.0074ms 0.5661ms 1.7666 KOps/s 1.7580 KOps/s $\color{#35bf28}+0.49\%$
test_vmap_transformer_speed[True-True] 12.4382ms 12.2734ms 81.4771 Ops/s 81.1908 Ops/s $\color{#35bf28}+0.35\%$
test_vmap_transformer_speed[True-False] 8.3122ms 8.0072ms 124.8883 Ops/s 124.4708 Ops/s $\color{#35bf28}+0.34\%$
test_vmap_transformer_speed[False-True] 12.7647ms 12.3246ms 81.1383 Ops/s 81.5649 Ops/s $\color{#d91a1a}-0.52\%$
test_vmap_transformer_speed[False-False] 8.2564ms 7.9653ms 125.5448 Ops/s 125.6122 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_transformer_speed_decorator[True-True] 43.2497ms 42.1658ms 23.7159 Ops/s 23.7761 Ops/s $\color{#d91a1a}-0.25\%$
test_vmap_transformer_speed_decorator[True-False] 95.2215ms 21.2052ms 47.1583 Ops/s 47.0390 Ops/s $\color{#35bf28}+0.25\%$
test_vmap_transformer_speed_decorator[False-True] 44.5876ms 42.1671ms 23.7152 Ops/s 23.9880 Ops/s $\color{#d91a1a}-1.14\%$
test_vmap_transformer_speed_decorator[False-False] 98.5064ms 20.9570ms 47.7168 Ops/s 47.8588 Ops/s $\color{#d91a1a}-0.30\%$

Member

@NicolasHug NicolasHug left a comment

Just sharing pointers to the equivalent APIs in the PyTorch DataLoader for reference, in case you want to align the APIs.

tensordict/base.py (outdated, resolved)
tensordict/base.py (resolved)
tensordict/utils.py (outdated, resolved)
@vmoens vmoens marked this pull request as ready for review November 22, 2023 13:02
tensordict/base.py (outdated, resolved)
@vmoens vmoens merged commit a2591ff into main Nov 24, 2023
26 of 45 checks passed
@vmoens vmoens deleted the seeding-pool branch November 24, 2023 11:21
4 participants