[Feature] Better constructors for MemoryMappedTensors #557

Merged
merged 1 commit on Nov 15, 2023

@vmoens commented Nov 15, 2023

No description provided.

@facebook-github-bot added the CLA Signed label Nov 15, 2023
@vmoens added the enhancement label Nov 15, 2023
@vmoens marked this pull request as ready for review November 15, 2023 09:40
@vmoens merged commit b862fe2 into main Nov 15, 2023
24 of 31 checks passed

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 105. Improved: $\large\color{#35bf28}2$. Worsened: $\large\color{#d91a1a}4$.

Expand to view detailed results
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 49.5770μs | 15.0034μs | 66.6517 KOps/s | 66.9385 KOps/s | $\color{#d91a1a}-0.43\%$ |
| test_plain_set_stack_nested | 0.2158ms | 0.1418ms | 7.0516 KOps/s | 7.1802 KOps/s | $\color{#d91a1a}-1.79\%$ |
| test_plain_set_nested_inplace | 43.0610μs | 18.3963μs | 54.3588 KOps/s | 54.5733 KOps/s | $\color{#d91a1a}-0.39\%$ |
| test_plain_set_stack_nested_inplace | 0.2464ms | 0.1713ms | 5.8364 KOps/s | 5.8324 KOps/s | $\color{#35bf28}+0.07\%$ |
| test_items | 16.8210μs | 2.3924μs | 417.9973 KOps/s | 408.2788 KOps/s | $\color{#35bf28}+2.38\%$ |
| test_items_nested | 0.5803ms | 0.2733ms | 3.6589 KOps/s | 3.7029 KOps/s | $\color{#d91a1a}-1.19\%$ |
| test_items_nested_locked | 0.5176ms | 0.2733ms | 3.6592 KOps/s | 3.6904 KOps/s | $\color{#d91a1a}-0.84\%$ |
| test_items_nested_leaf | 0.5499ms | 0.1685ms | 5.9330 KOps/s | 5.9520 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_items_stack_nested | 2.1430ms | 1.4270ms | 700.7485 Ops/s | 704.1781 Ops/s | $\color{#d91a1a}-0.49\%$ |
| test_items_stack_nested_leaf | 1.9580ms | 1.2951ms | 772.1617 Ops/s | 784.8335 Ops/s | $\color{#d91a1a}-1.61\%$ |
| test_items_stack_nested_locked | 1.9311ms | 0.7895ms | 1.2667 KOps/s | 1.3289 KOps/s | $\color{#d91a1a}-4.68\%$ |
| test_keys | 29.0950μs | 3.9176μs | 255.2559 KOps/s | 257.2691 KOps/s | $\color{#d91a1a}-0.78\%$ |
| test_keys_nested | 0.5376ms | 0.1419ms | 7.0461 KOps/s | 6.8223 KOps/s | $\color{#35bf28}+3.28\%$ |
| test_keys_nested_locked | 0.2040ms | 0.1416ms | 7.0606 KOps/s | 7.2703 KOps/s | $\color{#d91a1a}-2.88\%$ |
| test_keys_nested_leaf | 0.2562ms | 0.1389ms | 7.2000 KOps/s | 7.3942 KOps/s | $\color{#d91a1a}-2.63\%$ |
| test_keys_stack_nested | 1.9739ms | 1.3278ms | 753.1257 Ops/s | 764.6482 Ops/s | $\color{#d91a1a}-1.51\%$ |
| test_keys_stack_nested_leaf | 2.0783ms | 1.3157ms | 760.0616 Ops/s | 754.1106 Ops/s | $\color{#35bf28}+0.79\%$ |
| test_keys_stack_nested_locked | 1.1557ms | 0.6445ms | 1.5516 KOps/s | 1.5914 KOps/s | $\color{#d91a1a}-2.50\%$ |
| test_values | 8.8143μs | 1.5512μs | 644.6579 KOps/s | 810.9513 KOps/s | $\textbf{\color{#d91a1a}-20.51\%}$ |
| test_values_nested | 0.1124ms | 48.1413μs | 20.7722 KOps/s | 20.8244 KOps/s | $\color{#d91a1a}-0.25\%$ |
| test_values_nested_locked | 2.9861ms | 47.7219μs | 20.9547 KOps/s | 20.9031 KOps/s | $\color{#35bf28}+0.25\%$ |
| test_values_nested_leaf | 0.1235ms | 42.6115μs | 23.4678 KOps/s | 23.3578 KOps/s | $\color{#35bf28}+0.47\%$ |
| test_values_stack_nested | 1.8059ms | 1.1419ms | 875.7100 Ops/s | 895.0286 Ops/s | $\color{#d91a1a}-2.16\%$ |
| test_values_stack_nested_leaf | 1.8848ms | 1.1293ms | 885.5248 Ops/s | 898.6621 Ops/s | $\color{#d91a1a}-1.46\%$ |
| test_values_stack_nested_locked | 0.8753ms | 0.5063ms | 1.9752 KOps/s | 2.0291 KOps/s | $\color{#d91a1a}-2.66\%$ |
| test_membership | 37.9110μs | 1.3376μs | 747.5977 KOps/s | 721.9954 KOps/s | $\color{#35bf28}+3.55\%$ |
| test_membership_nested | 44.5240μs | 2.8328μs | 353.0139 KOps/s | 350.6209 KOps/s | $\color{#35bf28}+0.68\%$ |
| test_membership_nested_leaf | 48.2010μs | 2.7994μs | 357.2220 KOps/s | 349.0479 KOps/s | $\color{#35bf28}+2.34\%$ |
| test_membership_stacked_nested | 32.9620μs | 11.9821μs | 83.4578 KOps/s | 84.4788 KOps/s | $\color{#d91a1a}-1.21\%$ |
| test_membership_stacked_nested_leaf | 59.3510μs | 11.9861μs | 83.4303 KOps/s | 83.8562 KOps/s | $\color{#d91a1a}-0.51\%$ |
| test_membership_nested_last | 27.6720μs | 6.0089μs | 166.4191 KOps/s | 165.0857 KOps/s | $\color{#35bf28}+0.81\%$ |
| test_membership_nested_leaf_last | 52.4580μs | 5.9524μs | 168.0008 KOps/s | 166.2531 KOps/s | $\color{#35bf28}+1.05\%$ |
| test_membership_stacked_nested_last | 0.3633ms | 0.1828ms | 5.4711 KOps/s | 5.5288 KOps/s | $\color{#d91a1a}-1.04\%$ |
| test_membership_stacked_nested_leaf_last | 63.3900μs | 14.0068μs | 71.3937 KOps/s | 72.1600 KOps/s | $\color{#d91a1a}-1.06\%$ |
| test_nested_getleaf | 54.1920μs | 12.3024μs | 81.2851 KOps/s | 84.8031 KOps/s | $\color{#d91a1a}-4.15\%$ |
| test_nested_get | 0.1448ms | 12.1614μs | 82.2273 KOps/s | 89.3471 KOps/s | $\textbf{\color{#d91a1a}-7.97\%}$ |
| test_stacked_getleaf | 4.0822ms | 0.5982ms | 1.6717 KOps/s | 1.7192 KOps/s | $\color{#d91a1a}-2.76\%$ |
| test_stacked_get | 0.6554ms | 0.5667ms | 1.7647 KOps/s | 1.7978 KOps/s | $\color{#d91a1a}-1.84\%$ |
| test_nested_getitemleaf | 52.8090μs | 12.2102μs | 81.8988 KOps/s | 84.9556 KOps/s | $\color{#d91a1a}-3.60\%$ |
| test_nested_getitem | 51.0550μs | 11.5442μs | 86.6239 KOps/s | 88.8755 KOps/s | $\color{#d91a1a}-2.53\%$ |
| test_stacked_getitemleaf | 0.6913ms | 0.5917ms | 1.6901 KOps/s | 1.7185 KOps/s | $\color{#d91a1a}-1.65\%$ |
| test_stacked_getitem | 0.7083ms | 0.5676ms | 1.7619 KOps/s | 1.8011 KOps/s | $\color{#d91a1a}-2.18\%$ |
| test_lock_nested | 55.0422ms | 0.9519ms | 1.0505 KOps/s | 1.1026 KOps/s | $\color{#d91a1a}-4.72\%$ |
| test_lock_stack_nested | 73.9215ms | 12.9765ms | 77.0624 Ops/s | 73.7082 Ops/s | $\color{#35bf28}+4.55\%$ |
| test_unlock_nested | 60.8130ms | 0.9551ms | 1.0470 KOps/s | 1.0385 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_unlock_stack_nested | 73.7778ms | 13.3221ms | 75.0634 Ops/s | 71.5841 Ops/s | $\color{#35bf28}+4.86\%$ |
| test_flatten_speed | 0.7556ms | 0.6709ms | 1.4906 KOps/s | 1.4988 KOps/s | $\color{#d91a1a}-0.55\%$ |
| test_unflatten_speed | 1.7884ms | 1.1552ms | 865.6493 Ops/s | 846.1624 Ops/s | $\color{#35bf28}+2.30\%$ |
| test_common_ops | 4.4978ms | 0.6394ms | 1.5640 KOps/s | 1.5491 KOps/s | $\color{#35bf28}+0.96\%$ |
| test_creation | 20.6690μs | 2.3972μs | 417.1559 KOps/s | 459.2437 KOps/s | $\textbf{\color{#d91a1a}-9.16\%}$ |
| test_creation_empty | 28.8650μs | 7.6291μs | 131.0767 KOps/s | 133.3224 KOps/s | $\color{#d91a1a}-1.68\%$ |
| test_creation_nested_1 | 57.3180μs | 11.4825μs | 87.0892 KOps/s | 88.3592 KOps/s | $\color{#d91a1a}-1.44\%$ |
| test_creation_nested_2 | 49.4630μs | 13.9948μs | 71.4550 KOps/s | 72.8035 KOps/s | $\color{#d91a1a}-1.85\%$ |
| test_clone | 73.9190μs | 11.0974μs | 90.1111 KOps/s | 91.5881 KOps/s | $\color{#d91a1a}-1.61\%$ |
| test_getitem[int] | 55.8250μs | 13.1627μs | 75.9721 KOps/s | 74.2224 KOps/s | $\color{#35bf28}+2.36\%$ |
| test_getitem[slice_int] | 0.1023ms | 31.0524μs | 32.2036 KOps/s | 31.9534 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_getitem[range] | 0.2366ms | 56.0277μs | 17.8483 KOps/s | 17.3124 KOps/s | $\color{#35bf28}+3.10\%$ |
| test_getitem[tuple] | 78.5380μs | 24.4100μs | 40.9668 KOps/s | 40.5660 KOps/s | $\color{#35bf28}+0.99\%$ |
| test_getitem[list] | 0.1789ms | 51.0360μs | 19.5940 KOps/s | 19.4798 KOps/s | $\color{#35bf28}+0.59\%$ |
| test_setitem_dim[int] | 48.6010μs | 26.4224μs | 37.8467 KOps/s | 35.8906 KOps/s | $\textbf{\color{#35bf28}+5.45\%}$ |
| test_setitem_dim[slice_int] | 73.5580μs | 51.7245μs | 19.3332 KOps/s | 18.9343 KOps/s | $\color{#35bf28}+2.11\%$ |
| test_setitem_dim[range] | 0.1232ms | 71.8395μs | 13.9199 KOps/s | 13.6826 KOps/s | $\color{#35bf28}+1.73\%$ |
| test_setitem_dim[tuple] | 64.6720μs | 39.4843μs | 25.3265 KOps/s | 23.5584 KOps/s | $\textbf{\color{#35bf28}+7.51\%}$ |
| test_setitem | 70.2930μs | 15.6961μs | 63.7100 KOps/s | 64.5123 KOps/s | $\color{#d91a1a}-1.24\%$ |
| test_set | 81.7040μs | 14.8969μs | 67.1278 KOps/s | 68.4468 KOps/s | $\color{#d91a1a}-1.93\%$ |
| test_set_shared | 2.9923ms | 0.1580ms | 6.3282 KOps/s | 6.2245 KOps/s | $\color{#35bf28}+1.67\%$ |
| test_update | 0.1027ms | 19.2952μs | 51.8264 KOps/s | 51.1228 KOps/s | $\color{#35bf28}+1.38\%$ |
| test_update_nested | 95.7610μs | 27.9529μs | 35.7744 KOps/s | 35.1622 KOps/s | $\color{#35bf28}+1.74\%$ |
| test_set_nested | 92.2540μs | 16.8636μs | 59.2992 KOps/s | 59.7416 KOps/s | $\color{#d91a1a}-0.74\%$ |
| test_set_nested_new | 0.1826ms | 22.7466μs | 43.9626 KOps/s | 43.4780 KOps/s | $\color{#35bf28}+1.11\%$ |
| test_select | 0.1428ms | 47.0911μs | 21.2354 KOps/s | 21.0612 KOps/s | $\color{#35bf28}+0.83\%$ |
| test_unbind_speed | 0.4334ms | 0.2889ms | 3.4616 KOps/s | 3.4458 KOps/s | $\color{#35bf28}+0.46\%$ |
| test_unbind_speed_stack0 | 60.2022ms | 4.5054ms | 221.9548 Ops/s | 213.9566 Ops/s | $\color{#35bf28}+3.74\%$ |
| test_unbind_speed_stack1 | 2.6224μs | 0.6039μs | 1.6559 MOps/s | 1.5949 MOps/s | $\color{#35bf28}+3.82\%$ |
| test_creation[device0] | 0.8271ms | 0.2954ms | 3.3857 KOps/s | 3.3469 KOps/s | $\color{#35bf28}+1.16\%$ |
| test_creation_from_tensor | 3.2824ms | 0.3300ms | 3.0301 KOps/s | 3.0472 KOps/s | $\color{#d91a1a}-0.56\%$ |
| test_add_one[memmap_tensor0] | 0.3504ms | 25.5810μs | 39.0915 KOps/s | 38.4633 KOps/s | $\color{#35bf28}+1.63\%$ |
| test_contiguous[memmap_tensor0] | 19.4060μs | 5.7997μs | 172.4238 KOps/s | 173.6567 KOps/s | $\color{#d91a1a}-0.71\%$ |
| test_stack[memmap_tensor0] | 79.8210μs | 18.9484μs | 52.7750 KOps/s | 52.0012 KOps/s | $\color{#35bf28}+1.49\%$ |
| test_memmaptd_index | 0.4343ms | 0.1806ms | 5.5362 KOps/s | 5.4563 KOps/s | $\color{#35bf28}+1.47\%$ |
| test_memmaptd_index_astensor | 0.4876ms | 0.2446ms | 4.0889 KOps/s | 4.0645 KOps/s | $\color{#35bf28}+0.60\%$ |
| test_memmaptd_index_op | 0.5785ms | 0.4685ms | 2.1344 KOps/s | 2.1062 KOps/s | $\color{#35bf28}+1.34\%$ |
| test_reshape_pytree | 71.7960μs | 23.7824μs | 42.0480 KOps/s | 42.5962 KOps/s | $\color{#d91a1a}-1.29\%$ |
| test_reshape_td | 49.3030μs | 21.3349μs | 46.8714 KOps/s | 46.9074 KOps/s | $\color{#d91a1a}-0.08\%$ |
| test_view_pytree | 76.0330μs | 23.7858μs | 42.0420 KOps/s | 42.0078 KOps/s | $\color{#35bf28}+0.08\%$ |
| test_view_td | 19.1460μs | 4.3215μs | 231.4011 KOps/s | 237.3761 KOps/s | $\color{#d91a1a}-2.52\%$ |
| test_unbind_pytree | 60.6440μs | 27.1319μs | 36.8569 KOps/s | 36.8750 KOps/s | $\color{#d91a1a}-0.05\%$ |
| test_unbind_td | 94.7580μs | 40.8368μs | 24.4877 KOps/s | 24.6971 KOps/s | $\color{#d91a1a}-0.85\%$ |
| test_split_pytree | 56.6670μs | 27.0940μs | 36.9086 KOps/s | 37.4536 KOps/s | $\color{#d91a1a}-1.46\%$ |
| test_split_td | 0.1935ms | 77.4311μs | 12.9147 KOps/s | 13.0971 KOps/s | $\color{#d91a1a}-1.39\%$ |
| test_add_pytree | 96.6320μs | 32.3007μs | 30.9591 KOps/s | 30.8401 KOps/s | $\color{#35bf28}+0.39\%$ |
| test_add_td | 0.1047ms | 43.3412μs | 23.0727 KOps/s | 23.4394 KOps/s | $\color{#d91a1a}-1.56\%$ |
| test_distributed | 29.0750μs | 6.1272μs | 163.2058 KOps/s | 160.4231 KOps/s | $\color{#35bf28}+1.73\%$ |
| test_tdmodule | 0.1752ms | 21.4167μs | 46.6925 KOps/s | 46.4292 KOps/s | $\color{#35bf28}+0.57\%$ |
| test_tdmodule_dispatch | 0.2187ms | 38.6934μs | 25.8442 KOps/s | 26.2124 KOps/s | $\color{#d91a1a}-1.40\%$ |
| test_tdseq | 0.1160ms | 23.6338μs | 42.3123 KOps/s | 42.1087 KOps/s | $\color{#35bf28}+0.48\%$ |
| test_tdseq_dispatch | 0.4573ms | 42.0480μs | 23.7823 KOps/s | 23.8048 KOps/s | $\color{#d91a1a}-0.09\%$ |
| test_instantiation_functorch | 2.0653ms | 1.3272ms | 753.4718 Ops/s | 774.1282 Ops/s | $\color{#d91a1a}-2.67\%$ |
| test_instantiation_td | 63.8733ms | 1.1183ms | 894.1873 Ops/s | 961.2880 Ops/s | $\textbf{\color{#d91a1a}-6.98\%}$ |
| test_exec_functorch | 0.2216ms | 0.1468ms | 6.8131 KOps/s | 6.6068 KOps/s | $\color{#35bf28}+3.12\%$ |
| test_exec_td | 0.2211ms | 0.1434ms | 6.9719 KOps/s | 6.7297 KOps/s | $\color{#35bf28}+3.60\%$ |
| test_vmap_mlp_speed[True-True] | 1.0218ms | 0.8453ms | 1.1830 KOps/s | 1.1554 KOps/s | $\color{#35bf28}+2.38\%$ |
| test_vmap_mlp_speed[True-False] | 0.5533ms | 0.4652ms | 2.1495 KOps/s | 2.1188 KOps/s | $\color{#35bf28}+1.45\%$ |
| test_vmap_mlp_speed[False-True] | 1.2076ms | 0.7358ms | 1.3590 KOps/s | 1.3347 KOps/s | $\color{#35bf28}+1.82\%$ |
| test_vmap_mlp_speed[False-False] | 0.7416ms | 0.3879ms | 2.5780 KOps/s | 2.5602 KOps/s | $\color{#35bf28}+0.70\%$ |
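
For readers checking the numbers, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below recomputes it for a few rows copied from the results above. The function names and the 5% cutoff are assumptions: the cutoff is inferred only from which rows are shown in bold, not from the benchmark tooling itself.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


# Assumed cutoff: rows beyond +/-5% are the ones rendered in bold above.
THRESHOLD = 5.0


def classify(change: float, threshold: float = THRESHOLD) -> str:
    """Label a percentage change as improved, worsened, or unchanged."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops on this PR, Ops on repo HEAD), both in KOps/s, from the results above.
rows = [
    ("test_values", 644.6579, 810.9513),
    ("test_creation", 417.1559, 459.2437),
    ("test_setitem_dim[tuple]", 25.3265, 23.5584),
    ("test_items", 417.9973, 408.2788),
]

for name, ops_pr, ops_head in rows:
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under that assumed cutoff, counting the rows beyond ±5% gives the same totals as the summary line (2 improved, 4 worsened).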

@vmoens deleted the memmap_constructor_kwargs branch November 15, 2023 12:03