[Benchmark] Memmap tensordict benchmarks #432

Merged
merged 1 commit into from
Jun 19, 2023
@vmoens (Contributor) commented Jun 19, 2023

No description provided.

@facebook-github-bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jun 19, 2023
@vmoens merged commit 6ddf43e into main Jun 19, 2023
@vmoens deleted the benchmark_memmap branch June 19, 2023 08:27
@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 77. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}20$.

Detailed results
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_items | 13.0000μs | 3.6489μs | 274.0515 KOps/s | 315.4834 KOps/s | $\textbf{\color{#d91a1a}-13.13\%}$ |
| test_items_nested | 1.2939ms | 0.4872ms | 2.0526 KOps/s | 2.0460 KOps/s | $\color{#35bf28}+0.32\%$ |
| test_items_nested_leaf | 0.3508ms | 0.2892ms | 3.4581 KOps/s | 3.4315 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_items_stack_nested | 26.2477ms | 23.9985ms | 41.6693 Ops/s | 42.0800 Ops/s | $\color{#d91a1a}-0.98\%$ |
| test_items_stack_nested_leaf | 17.5271ms | 13.2491ms | 75.4766 Ops/s | 76.7638 Ops/s | $\color{#d91a1a}-1.68\%$ |
| test_keys | 74.2010μs | 7.3868μs | 135.3768 KOps/s | 143.3242 KOps/s | $\textbf{\color{#d91a1a}-5.55\%}$ |
| test_keys_nested | 0.2875ms | 0.1866ms | 5.3586 KOps/s | 5.4647 KOps/s | $\color{#d91a1a}-1.94\%$ |
| test_keys_nested_leaf | 0.2949ms | 0.1844ms | 5.4232 KOps/s | 5.5108 KOps/s | $\color{#d91a1a}-1.59\%$ |
| test_keys_stack_nested | 1.9570ms | 1.7542ms | 570.0714 Ops/s | 572.4613 Ops/s | $\color{#d91a1a}-0.42\%$ |
| test_keys_stack_nested_leaf | 1.9514ms | 1.7536ms | 570.2491 Ops/s | 577.8733 Ops/s | $\color{#d91a1a}-1.32\%$ |
| test_values | 5.6000μs | 2.2827μs | 438.0741 KOps/s | 451.2768 KOps/s | $\color{#d91a1a}-2.93\%$ |
| test_values_nested | 0.5404ms | 0.4841ms | 2.0659 KOps/s | 2.0484 KOps/s | $\color{#35bf28}+0.85\%$ |
| test_values_nested_leaf | 0.3625ms | 0.2891ms | 3.4592 KOps/s | 3.4393 KOps/s | $\color{#35bf28}+0.58\%$ |
| test_values_stack_nested | 27.1391ms | 23.9204ms | 41.8053 Ops/s | 42.8161 Ops/s | $\color{#d91a1a}-2.36\%$ |
| test_values_stack_nested_leaf | 14.9192ms | 13.1568ms | 76.0065 Ops/s | 79.1617 Ops/s | $\color{#d91a1a}-3.99\%$ |
| test_membership | 16.9000μs | 3.8552μs | 259.3873 KOps/s | 300.0166 KOps/s | $\textbf{\color{#d91a1a}-13.54\%}$ |
| test_membership_nested | 31.9010μs | 7.2286μs | 138.3399 KOps/s | 148.2362 KOps/s | $\textbf{\color{#d91a1a}-6.68\%}$ |
| test_membership_nested_leaf | 19.7000μs | 6.8661μs | 145.6437 KOps/s | 155.6863 KOps/s | $\textbf{\color{#d91a1a}-6.45\%}$ |
| test_membership_stacked_nested | 35.9000μs | 8.1831μs | 122.2029 KOps/s | 103.6075 KOps/s | $\textbf{\color{#35bf28}+17.95\%}$ |
| test_membership_stacked_nested_leaf | 24.2010μs | 8.1982μs | 121.9781 KOps/s | 100.7832 KOps/s | $\textbf{\color{#35bf28}+21.03\%}$ |
| test_stacked_getleaf | 1.3514ms | 1.1684ms | 855.8855 Ops/s | 888.1331 Ops/s | $\color{#d91a1a}-3.63\%$ |
| test_stacked_get | 1.2871ms | 1.1161ms | 895.9566 Ops/s | 937.3480 Ops/s | $\color{#d91a1a}-4.42\%$ |
| test_common_ops | 0.9909ms | 0.9749ms | 1.0257 KOps/s | 1.0420 KOps/s | $\color{#d91a1a}-1.56\%$ |
| test_creation | 4.6901μs | 4.2761μs | 233.8581 KOps/s | 232.7368 KOps/s | $\color{#35bf28}+0.48\%$ |
| test_creation_empty | 12.1981μs | 11.1793μs | 89.4511 KOps/s | 92.1893 KOps/s | $\color{#d91a1a}-2.97\%$ |
| test_creation_nested_1 | 26.9043μs | 21.2555μs | 47.0467 KOps/s | 49.7255 KOps/s | $\textbf{\color{#d91a1a}-5.39\%}$ |
| test_creation_nested_2 | 22.7642μs | 22.0919μs | 45.2654 KOps/s | 47.4513 KOps/s | $\color{#d91a1a}-4.61\%$ |
| test_clone | 25.2383μs | 20.8537μs | 47.9530 KOps/s | 49.3350 KOps/s | $\color{#d91a1a}-2.80\%$ |
| test_getitem[int] | 30.0285μs | 26.3154μs | 38.0005 KOps/s | 38.4724 KOps/s | $\color{#d91a1a}-1.23\%$ |
| test_getitem[slice_int] | 64.4604μs | 60.0198μs | 16.6612 KOps/s | 16.7869 KOps/s | $\color{#d91a1a}-0.75\%$ |
| test_getitem[range] | 65.8843μs | 59.7825μs | 16.7273 KOps/s | 17.0300 KOps/s | $\color{#d91a1a}-1.78\%$ |
| test_getitem[tuple] | 58.7836μs | 55.4191μs | 18.0443 KOps/s | 17.9770 KOps/s | $\color{#35bf28}+0.37\%$ |
| test_getitem[list] | 55.8915μs | 52.6602μs | 18.9897 KOps/s | 19.2527 KOps/s | $\color{#d91a1a}-1.37\%$ |
| test_setitem_dim[int] | 83.4010μs | 42.6600μs | 23.4412 KOps/s | 24.8577 KOps/s | $\textbf{\color{#d91a1a}-5.70\%}$ |
| test_setitem_dim[slice_int] | 0.1253ms | 80.7280μs | 12.3873 KOps/s | 12.8439 KOps/s | $\color{#d91a1a}-3.56\%$ |
| test_setitem_dim[range] | 0.1496ms | 75.0665μs | 13.3215 KOps/s | 14.1239 KOps/s | $\textbf{\color{#d91a1a}-5.68\%}$ |
| test_setitem_dim[tuple] | 0.1281ms | 73.0613μs | 13.6871 KOps/s | 14.1808 KOps/s | $\color{#d91a1a}-3.48\%$ |
| test_setitem | 27.7493μs | 26.4340μs | 37.8301 KOps/s | 39.4136 KOps/s | $\color{#d91a1a}-4.02\%$ |
| test_set | 26.9932μs | 25.8746μs | 38.6480 KOps/s | 40.6163 KOps/s | $\color{#d91a1a}-4.85\%$ |
| test_set_shared | 0.1485ms | 0.1457ms | 6.8612 KOps/s | 6.3373 KOps/s | $\textbf{\color{#35bf28}+8.27\%}$ |
| test_update | 31.5123μs | 29.3079μs | 34.1205 KOps/s | 35.0615 KOps/s | $\color{#d91a1a}-2.68\%$ |
| test_update_nested | 44.6154μs | 43.4254μs | 23.0280 KOps/s | 23.7707 KOps/s | $\color{#d91a1a}-3.12\%$ |
| test_set_nested | 38.4394μs | 36.3819μs | 27.4862 KOps/s | 28.5471 KOps/s | $\color{#d91a1a}-3.72\%$ |
| test_set_nested_new | 53.5255μs | 52.2223μs | 19.1489 KOps/s | 19.5300 KOps/s | $\color{#d91a1a}-1.95\%$ |
| test_select | 95.7009μs | 86.2516μs | 11.5940 KOps/s | 11.8463 KOps/s | $\color{#d91a1a}-2.13\%$ |
| test_creation[device0] | 1.3187ms | 0.5016ms | 1.9936 KOps/s | 2.0583 KOps/s | $\color{#d91a1a}-3.14\%$ |
| test_creation_from_tensor | 0.5852ms | 0.4577ms | 2.1848 KOps/s | 2.1970 KOps/s | $\color{#d91a1a}-0.56\%$ |
| test_add_one[memmap_tensor0] | 34.2503μs | 31.8379μs | 31.4091 KOps/s | 32.3545 KOps/s | $\color{#d91a1a}-2.92\%$ |
| test_contiguous[memmap_tensor0] | 9.0571μs | 8.3819μs | 119.3054 KOps/s | 120.6115 KOps/s | $\color{#d91a1a}-1.08\%$ |
| test_stack[memmap_tensor0] | 0.1583ms | 41.9203μs | 23.8548 KOps/s | 25.2523 KOps/s | $\textbf{\color{#d91a1a}-5.53\%}$ |
| test_reshape_pytree | 38.4154μs | 35.9348μs | 27.8282 KOps/s | 28.3052 KOps/s | $\color{#d91a1a}-1.69\%$ |
| test_reshape_td | 41.4604μs | 39.1378μs | 25.5507 KOps/s | 26.0853 KOps/s | $\color{#d91a1a}-2.05\%$ |
| test_view_pytree | 34.2933μs | 32.9657μs | 30.3346 KOps/s | 31.1175 KOps/s | $\color{#d91a1a}-2.52\%$ |
| test_view_td | 9.8981μs | 9.0522μs | 110.4709 KOps/s | 112.4788 KOps/s | $\color{#d91a1a}-1.79\%$ |
| test_unbind_pytree | 38.4123μs | 37.0029μs | 27.0249 KOps/s | 27.6981 KOps/s | $\color{#d91a1a}-2.43\%$ |
| test_unbind_td | 0.1215ms | 0.1195ms | 8.3658 KOps/s | 8.4587 KOps/s | $\color{#d91a1a}-1.10\%$ |
| test_split_pytree | 43.9444μs | 41.9206μs | 23.8546 KOps/s | 24.0195 KOps/s | $\color{#d91a1a}-0.69\%$ |
| test_split_td | 0.1046ms | 0.1007ms | 9.9346 KOps/s | 10.0078 KOps/s | $\color{#d91a1a}-0.73\%$ |
| test_add_pytree | 46.6314μs | 45.1509μs | 22.1479 KOps/s | 22.7410 KOps/s | $\color{#d91a1a}-2.61\%$ |
| test_add_td | 53.6605μs | 51.8624μs | 19.2818 KOps/s | 19.7142 KOps/s | $\color{#d91a1a}-2.19\%$ |
| test_distributed | 88.8000μs | 88.8000μs | 11.2613 KOps/s | 14.1044 KOps/s | $\textbf{\color{#d91a1a}-20.16\%}$ |
| test_tdmodule | 64.2010μs | 23.7975μs | 42.0211 KOps/s | 44.5772 KOps/s | $\textbf{\color{#d91a1a}-5.73\%}$ |
| test_tdmodule_dispatch | 0.2694ms | 54.2964μs | 18.4174 KOps/s | 20.0196 KOps/s | $\textbf{\color{#d91a1a}-8.00\%}$ |
| test_tdseq | 0.2415ms | 34.4640μs | 29.0158 KOps/s | 35.0184 KOps/s | $\textbf{\color{#d91a1a}-17.14\%}$ |
| test_tdseq_dispatch | 0.2352ms | 65.6611μs | 15.2297 KOps/s | 14.7098 KOps/s | $\color{#35bf28}+3.53\%$ |
| test_instantiation_functorch | 10.5584ms | 1.6598ms | 602.4846 Ops/s | 639.7883 Ops/s | $\textbf{\color{#d91a1a}-5.83\%}$ |
| test_instantiation_td | 1.4014ms | 1.2153ms | 822.8716 Ops/s | 828.2471 Ops/s | $\color{#d91a1a}-0.65\%$ |
| test_exec_functorch | 0.1898ms | 0.1846ms | 5.4157 KOps/s | 5.6378 KOps/s | $\color{#d91a1a}-3.94\%$ |
| test_exec_td | 0.1763ms | 0.1720ms | 5.8156 KOps/s | 5.9759 KOps/s | $\color{#d91a1a}-2.68\%$ |
| test_vmap_mlp_speed[True-True] | 1.5639ms | 1.4735ms | 678.6638 Ops/s | 714.6856 Ops/s | $\textbf{\color{#d91a1a}-5.04\%}$ |
| test_vmap_mlp_speed[True-False] | 0.7010ms | 0.6707ms | 1.4910 KOps/s | 1.5872 KOps/s | $\textbf{\color{#d91a1a}-6.06\%}$ |
| test_vmap_mlp_speed[False-True] | 1.6535ms | 1.2664ms | 789.6526 Ops/s | 835.5476 Ops/s | $\textbf{\color{#d91a1a}-5.49\%}$ |
| test_vmap_mlp_speed[False-False] | 0.6962ms | 0.5570ms | 1.7953 KOps/s | 1.8778 KOps/s | $\color{#d91a1a}-4.40\%$ |
| test_vmap_transformer_speed[True-True] | 21.6093ms | 17.6126ms | 56.7776 Ops/s | 60.7536 Ops/s | $\textbf{\color{#d91a1a}-6.54\%}$ |
| test_vmap_transformer_speed[True-False] | 9.0201ms | 8.7367ms | 114.4591 Ops/s | 120.9275 Ops/s | $\textbf{\color{#d91a1a}-5.35\%}$ |
| test_vmap_transformer_speed[False-True] | 17.4961ms | 16.7805ms | 59.5931 Ops/s | 62.4206 Ops/s | $\color{#d91a1a}-4.53\%$ |
| test_vmap_transformer_speed[False-False] | 9.0050ms | 8.7104ms | 114.8049 Ops/s | 122.8104 Ops/s | $\textbf{\color{#d91a1a}-6.52\%}$ |
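
The Change column above is consistent with the relative difference in throughput between this PR and repo HEAD, and the bolded rows match the Improved/Worsened counts in the summary. The following is a minimal sketch of that arithmetic, not the benchmark workflow's actual code. The function names are hypothetical, and the 5% cutoff is an assumption inferred from which rows are bold (for example, -5.04% is bold while -4.85% is not); the report itself does not state the threshold.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR run relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a benchmark; the 5% threshold is an assumption, not a documented value."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from three rows of the table.
rows = {
    "test_items": (274.0515, 315.4834),
    "test_membership_stacked_nested": (122.2029, 103.6075),
    "test_set": (38.6480, 40.6163),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because both columns share the same unit within a row, the unit cancels and the values can be passed in as KOps/s or Ops/s without conversion.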
