[Performance] Faster set #619
Merged
facebook-github-bot added the CLA Signed label on Jan 15, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1474ms | 17.4049μs | 57.4551 KOps/s | 55.0453 KOps/s | |
test_plain_set_stack_nested | 0.1987ms | 0.1456ms | 6.8668 KOps/s | 6.6853 KOps/s | |
test_plain_set_nested_inplace | 66.8750μs | 19.6244μs | 50.9571 KOps/s | 49.4518 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3188ms | 0.1799ms | 5.5599 KOps/s | 5.4782 KOps/s | |
test_items | 18.5340μs | 2.4148μs | 414.1109 KOps/s | 403.4641 KOps/s | |
test_items_nested | 0.4429ms | 0.2654ms | 3.7676 KOps/s | 3.6988 KOps/s | |
test_items_nested_locked | 0.6331ms | 0.2798ms | 3.5737 KOps/s | 3.6817 KOps/s | |
test_items_nested_leaf | 0.2909ms | 0.1644ms | 6.0812 KOps/s | 5.9162 KOps/s | |
test_items_stack_nested | 1.7300ms | 1.3195ms | 757.8714 Ops/s | 705.8108 Ops/s | |
test_items_stack_nested_leaf | 1.3097ms | 1.1929ms | 838.2958 Ops/s | 828.1432 Ops/s | |
test_items_stack_nested_locked | 1.1261ms | 0.8722ms | 1.1466 KOps/s | 1.1441 KOps/s | |
test_keys | 20.1780μs | 3.8007μs | 263.1101 KOps/s | 261.8893 KOps/s | |
test_keys_nested | 59.6221ms | 0.1581ms | 6.3252 KOps/s | 6.8204 KOps/s | |
test_keys_nested_locked | 0.2666ms | 0.1479ms | 6.7625 KOps/s | 6.8350 KOps/s | |
test_keys_nested_leaf | 0.2277ms | 0.1304ms | 7.6696 KOps/s | 7.8440 KOps/s | |
test_keys_stack_nested | 1.7626ms | 1.2749ms | 784.3967 Ops/s | 764.7685 Ops/s | |
test_keys_stack_nested_leaf | 1.6094ms | 1.2599ms | 793.7245 Ops/s | 763.9428 Ops/s | |
test_keys_stack_nested_locked | 1.0252ms | 0.7982ms | 1.2528 KOps/s | 1.2363 KOps/s | |
test_values | 7.3716μs | 1.0551μs | 947.7851 KOps/s | 880.9837 KOps/s | |
test_values_nested | 99.4550μs | 51.9548μs | 19.2475 KOps/s | 19.1889 KOps/s | |
test_values_nested_locked | 97.4020μs | 52.7308μs | 18.9643 KOps/s | 19.0777 KOps/s | |
test_values_nested_leaf | 0.1029ms | 47.6777μs | 20.9741 KOps/s | 21.5859 KOps/s | |
test_values_stack_nested | 1.3010ms | 1.0509ms | 951.5260 Ops/s | 942.7298 Ops/s | |
test_values_stack_nested_leaf | 1.2065ms | 1.0385ms | 962.9303 Ops/s | 957.4539 Ops/s | |
test_values_stack_nested_locked | 3.4413ms | 0.6157ms | 1.6241 KOps/s | 1.6339 KOps/s | |
test_membership | 37.5200μs | 1.3521μs | 739.5755 KOps/s | 733.5604 KOps/s | |
test_membership_nested | 35.4060μs | 2.8890μs | 346.1413 KOps/s | 344.0494 KOps/s | |
test_membership_nested_leaf | 39.3230μs | 2.9555μs | 338.3569 KOps/s | 345.0321 KOps/s | |
test_membership_stacked_nested | 30.0860μs | 12.0192μs | 83.2001 KOps/s | 83.4028 KOps/s | |
test_membership_stacked_nested_leaf | 61.7650μs | 11.9799μs | 83.4735 KOps/s | 83.2538 KOps/s | |
test_membership_nested_last | 37.0980μs | 6.1050μs | 163.7989 KOps/s | 167.1081 KOps/s | |
test_membership_nested_leaf_last | 39.3030μs | 6.1398μs | 162.8730 KOps/s | 166.3374 KOps/s | |
test_membership_stacked_nested_last | 0.3059ms | 0.1689ms | 5.9212 KOps/s | 5.9165 KOps/s | |
test_membership_stacked_nested_leaf_last | 50.2740μs | 14.0437μs | 71.2063 KOps/s | 72.1471 KOps/s | |
test_nested_getleaf | 55.2530μs | 10.7147μs | 93.3297 KOps/s | 94.8429 KOps/s | |
test_nested_get | 42.1290μs | 10.0794μs | 99.2118 KOps/s | 97.9070 KOps/s | |
test_stacked_getleaf | 0.5442ms | 0.4033ms | 2.4795 KOps/s | 2.4401 KOps/s | |
test_stacked_get | 0.6758ms | 0.3682ms | 2.7159 KOps/s | 2.6773 KOps/s | |
test_nested_getitemleaf | 39.9640μs | 10.6326μs | 94.0500 KOps/s | 94.5793 KOps/s | |
test_nested_getitem | 43.4310μs | 10.0771μs | 99.2351 KOps/s | 100.0480 KOps/s | |
test_stacked_getitemleaf | 0.6179ms | 0.4062ms | 2.4616 KOps/s | 2.4236 KOps/s | |
test_stacked_getitem | 0.6182ms | 0.3680ms | 2.7177 KOps/s | 2.6582 KOps/s | |
test_lock_nested | 1.3040ms | 0.4175ms | 2.3951 KOps/s | 2.4080 KOps/s | |
test_lock_stack_nested | 88.6757ms | 6.9351ms | 144.1935 Ops/s | 144.7434 Ops/s | |
test_unlock_nested | 69.0728ms | 0.4903ms | 2.0395 KOps/s | 2.3658 KOps/s | |
test_unlock_stack_nested | 82.2344ms | 6.4046ms | 156.1384 Ops/s | 155.5763 Ops/s | |
test_flatten_speed | 0.7273ms | 0.3727ms | 2.6833 KOps/s | 2.6550 KOps/s | |
test_unflatten_speed | 0.6560ms | 0.4584ms | 2.1815 KOps/s | 2.1728 KOps/s | |
test_common_ops | 1.4277ms | 0.7007ms | 1.4271 KOps/s | 1.3917 KOps/s | |
test_creation | 85.8590μs | 2.0069μs | 498.2833 KOps/s | 494.4449 KOps/s | |
test_creation_empty | 46.9070μs | 11.0634μs | 90.3885 KOps/s | 85.2331 KOps/s | |
test_creation_nested_1 | 56.8650μs | 14.1503μs | 70.6698 KOps/s | 67.9512 KOps/s | |
test_creation_nested_2 | 53.8000μs | 17.3278μs | 57.7107 KOps/s | 50.7208 KOps/s | |
test_clone | 0.2299ms | 12.3308μs | 81.0978 KOps/s | 82.2104 KOps/s | |
test_getitem[int] | 43.3710μs | 11.7848μs | 84.8553 KOps/s | 84.4453 KOps/s | |
test_getitem[slice_int] | 55.8440μs | 22.8609μs | 43.7427 KOps/s | 42.6405 KOps/s | |
test_getitem[range] | 99.8860μs | 44.4289μs | 22.5079 KOps/s | 23.9788 KOps/s | |
test_getitem[tuple] | 71.4930μs | 18.8429μs | 53.0705 KOps/s | 52.6004 KOps/s | |
test_getitem[list] | 0.4974ms | 38.0287μs | 26.2960 KOps/s | 26.5253 KOps/s | |
test_setitem_dim[int] | 82.3630μs | 30.4845μs | 32.8035 KOps/s | 29.8512 KOps/s | |
test_setitem_dim[slice_int] | 91.1290μs | 56.9866μs | 17.5480 KOps/s | 16.9031 KOps/s | |
test_setitem_dim[range] | 0.1229ms | 78.5496μs | 12.7308 KOps/s | 12.9518 KOps/s | |
test_setitem_dim[tuple] | 98.2130μs | 45.3928μs | 22.0299 KOps/s | 21.2960 KOps/s | |
test_setitem | 0.2509ms | 18.9518μs | 52.7654 KOps/s | 50.9169 KOps/s | |
test_set | 0.2126ms | 18.1843μs | 54.9925 KOps/s | 52.2237 KOps/s | |
test_set_shared | 2.1551ms | 0.1390ms | 7.1919 KOps/s | 7.0258 KOps/s | |
test_update | 0.2128ms | 21.8317μs | 45.8051 KOps/s | 44.5790 KOps/s | |
test_update_nested | 0.2282ms | 28.9964μs | 34.4871 KOps/s | 33.7546 KOps/s | |
test_set_nested | 0.2172ms | 19.9508μs | 50.1233 KOps/s | 48.1916 KOps/s | |
test_set_nested_new | 0.2175ms | 25.0202μs | 39.9676 KOps/s | 40.3067 KOps/s | |
test_select | 0.1202ms | 47.9843μs | 20.8402 KOps/s | 20.4530 KOps/s | |
test_unbind_speed | 0.5067ms | 0.3423ms | 2.9216 KOps/s | 2.9574 KOps/s | |
test_unbind_speed_stack0 | 69.9919ms | 4.5807ms | 218.3086 Ops/s | 234.2013 Ops/s | |
test_unbind_speed_stack1 | 2.5757μs | 0.6200μs | 1.6129 MOps/s | 1.5195 MOps/s | |
test_split | 2.3586ms | 1.5571ms | 642.2103 Ops/s | 589.0819 Ops/s | |
test_chunk | 69.4639ms | 1.6630ms | 601.3186 Ops/s | 598.0948 Ops/s | |
test_creation[device0] | 0.2261ms | 0.1007ms | 9.9288 KOps/s | 9.9772 KOps/s | |
test_creation_from_tensor | 3.1541ms | 80.7516μs | 12.3837 KOps/s | 12.2576 KOps/s | |
test_add_one[memmap_tensor0] | 0.4102ms | 5.1425μs | 194.4593 KOps/s | 194.2104 KOps/s | |
test_contiguous[memmap_tensor0] | 13.0740μs | 0.6407μs | 1.5607 MOps/s | 1.5950 MOps/s | |
test_stack[memmap_tensor0] | 71.1420μs | 3.4960μs | 286.0436 KOps/s | 293.1548 KOps/s | |
test_memmaptd_index | 0.4047ms | 0.1997ms | 5.0064 KOps/s | 5.0950 KOps/s | |
test_memmaptd_index_astensor | 0.9717ms | 0.2614ms | 3.8259 KOps/s | 3.8652 KOps/s | |
test_memmaptd_index_op | 0.8038ms | 0.5451ms | 1.8346 KOps/s | 1.7924 KOps/s | |
test_serialize_model | 0.1021s | 98.9067ms | 10.1105 Ops/s | 8.9330 Ops/s | |
test_serialize_model_pickle | 0.4650s | 0.3799s | 2.6320 Ops/s | 2.5846 Ops/s | |
test_serialize_weights | 0.1731s | 0.1036s | 9.6551 Ops/s | 9.3534 Ops/s | |
test_serialize_weights_returnearly | 0.1798s | 0.1288s | 7.7658 Ops/s | 7.3567 Ops/s | |
test_serialize_weights_pickle | 1.0858s | 0.6132s | 1.6307 Ops/s | 1.5945 Ops/s | |
test_serialize_weights_filesystem | 0.1588s | 95.6355ms | 10.4564 Ops/s | 10.6125 Ops/s | |
test_serialize_model_filesystem | 98.1990ms | 91.4000ms | 10.9409 Ops/s | 10.1290 Ops/s | |
test_reshape_pytree | 66.3530μs | 23.1842μs | 43.1328 KOps/s | 42.4549 KOps/s | |
test_reshape_td | 62.5070μs | 31.0382μs | 32.2184 KOps/s | 31.7562 KOps/s | |
test_view_pytree | 55.7640μs | 23.0998μs | 43.2904 KOps/s | 43.8069 KOps/s | |
test_view_td | 34.8550μs | 4.8768μs | 205.0533 KOps/s | 201.5123 KOps/s | |
test_unbind_pytree | 60.6330μs | 26.2771μs | 38.0560 KOps/s | 37.6472 KOps/s | |
test_unbind_td | 0.1181ms | 54.8436μs | 18.2337 KOps/s | 18.1446 KOps/s | |
test_split_pytree | 74.1380μs | 26.6978μs | 37.4563 KOps/s | 38.2040 KOps/s | |
test_split_td | 0.5223ms | 43.4233μs | 23.0291 KOps/s | 22.8524 KOps/s | |
test_add_pytree | 93.4240μs | 31.7574μs | 31.4887 KOps/s | 31.3653 KOps/s | |
test_add_td | 0.1524ms | 51.7880μs | 19.3095 KOps/s | 19.5074 KOps/s | |
test_distributed | 0.2183ms | 97.4719μs | 10.2594 KOps/s | 9.6886 KOps/s | |
test_tdmodule | 0.1122ms | 23.1429μs | 43.2099 KOps/s | 41.8353 KOps/s | |
test_tdmodule_dispatch | 0.2270ms | 42.4961μs | 23.5316 KOps/s | 23.1495 KOps/s | |
test_tdseq | 54.5520μs | 26.4907μs | 37.7491 KOps/s | 36.7227 KOps/s | |
test_tdseq_dispatch | 0.1443ms | 46.8900μs | 21.3265 KOps/s | 20.8761 KOps/s | |
test_instantiation_functorch | 1.5165ms | 1.2998ms | 769.3472 Ops/s | 760.7691 Ops/s | |
test_instantiation_td | 1.6187ms | 1.0275ms | 973.2514 Ops/s | 977.2524 Ops/s | |
test_exec_functorch | 0.2939ms | 0.1605ms | 6.2293 KOps/s | 6.3072 KOps/s | |
test_exec_functional_call | 0.2941ms | 0.1487ms | 6.7242 KOps/s | 6.8808 KOps/s | |
test_exec_td | 0.2658ms | 0.1433ms | 6.9759 KOps/s | 7.1467 KOps/s | |
test_exec_td_decorator | 0.7582ms | 0.1818ms | 5.4997 KOps/s | 5.6534 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1650ms | 0.8804ms | 1.1359 KOps/s | 1.1381 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6894ms | 0.4711ms | 2.1229 KOps/s | 2.1433 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.2347ms | 0.7765ms | 1.2878 KOps/s | 1.3189 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5563ms | 0.3863ms | 2.5886 KOps/s | 2.6602 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.2518ms | 2.4571ms | 406.9840 Ops/s | 377.5929 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8686ms | 0.5242ms | 1.9078 KOps/s | 1.9273 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 79.1764ms | 2.1393ms | 467.4368 Ops/s | 501.9978 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6989ms | 0.3994ms | 2.5040 KOps/s | 2.5507 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1331ms | 14.1728μs | 70.5576 KOps/s | 75.6005 KOps/s | |
test_plain_set_stack_nested | 0.1400ms | 0.1182ms | 8.4604 KOps/s | 8.6574 KOps/s | |
test_plain_set_nested_inplace | 32.6000μs | 15.3839μs | 65.0032 KOps/s | 68.8844 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1862ms | 0.1446ms | 6.9146 KOps/s | 6.9789 KOps/s | |
test_items | 24.0200μs | 4.7008μs | 212.7310 KOps/s | 211.7866 KOps/s | |
test_items_nested | 0.3785ms | 0.3396ms | 2.9450 KOps/s | 2.9682 KOps/s | |
test_items_nested_locked | 0.3927ms | 0.3398ms | 2.9426 KOps/s | 2.9510 KOps/s | |
test_items_nested_leaf | 0.2209ms | 0.1984ms | 5.0400 KOps/s | 5.0811 KOps/s | |
test_items_stack_nested | 1.3855ms | 1.2672ms | 789.1410 Ops/s | 768.6245 Ops/s | |
test_items_stack_nested_leaf | 1.2710ms | 1.1086ms | 902.0275 Ops/s | 886.6226 Ops/s | |
test_items_stack_nested_locked | 1.0286ms | 0.8871ms | 1.1273 KOps/s | 1.1050 KOps/s | |
test_keys | 24.4110μs | 4.5835μs | 218.1762 KOps/s | 216.9873 KOps/s | |
test_keys_nested | 0.8489ms | 95.2553μs | 10.4981 KOps/s | 10.6618 KOps/s | |
test_keys_nested_locked | 0.1416ms | 94.1783μs | 10.6182 KOps/s | 10.6983 KOps/s | |
test_keys_nested_leaf | 0.1964ms | 77.9798μs | 12.8238 KOps/s | 12.9729 KOps/s | |
test_keys_stack_nested | 1.1761ms | 1.1141ms | 897.6200 Ops/s | 886.7466 Ops/s | |
test_keys_stack_nested_leaf | 1.1421ms | 1.0791ms | 926.6861 Ops/s | 903.3344 Ops/s | |
test_keys_stack_nested_locked | 0.7951ms | 0.7035ms | 1.4215 KOps/s | 1.3932 KOps/s | |
test_values | 8.5003μs | 1.8893μs | 529.2931 KOps/s | 529.8359 KOps/s | |
test_values_nested | 78.7910μs | 45.1076μs | 22.1692 KOps/s | 22.0144 KOps/s | |
test_values_nested_locked | 61.6910μs | 47.4534μs | 21.0733 KOps/s | 20.9445 KOps/s | |
test_values_nested_leaf | 53.2510μs | 39.4725μs | 25.3341 KOps/s | 25.2484 KOps/s | |
test_values_stack_nested | 0.9868ms | 0.9263ms | 1.0796 KOps/s | 1.0493 KOps/s | |
test_values_stack_nested_leaf | 1.0227ms | 0.9212ms | 1.0856 KOps/s | 1.0592 KOps/s | |
test_values_stack_nested_locked | 0.6559ms | 0.5653ms | 1.7691 KOps/s | 1.7342 KOps/s | |
test_membership | 3.7060μs | 0.9419μs | 1.0617 MOps/s | 919.2465 KOps/s | |
test_membership_nested | 11.1505μs | 2.2367μs | 447.0778 KOps/s | 438.6079 KOps/s | |
test_membership_nested_leaf | 13.1105μs | 2.1953μs | 455.5275 KOps/s | 457.5116 KOps/s | |
test_membership_stacked_nested | 31.7210μs | 11.0444μs | 90.5434 KOps/s | 92.2149 KOps/s | |
test_membership_stacked_nested_leaf | 30.1810μs | 11.2039μs | 89.2547 KOps/s | 91.8081 KOps/s | |
test_membership_nested_last | 17.9490μs | 4.7848μs | 208.9971 KOps/s | 211.5276 KOps/s | |
test_membership_nested_leaf_last | 23.2500μs | 4.8503μs | 206.1742 KOps/s | 209.9464 KOps/s | |
test_membership_stacked_nested_last | 0.1677ms | 0.1350ms | 7.4059 KOps/s | 7.3920 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.6600μs | 13.0503μs | 76.6269 KOps/s | 78.4232 KOps/s | |
test_nested_getleaf | 49.6410μs | 8.4721μs | 118.0346 KOps/s | 120.3244 KOps/s | |
test_nested_get | 0.1950ms | 7.9575μs | 125.6674 KOps/s | 127.2676 KOps/s | |
test_stacked_getleaf | 0.5176ms | 0.3151ms | 3.1741 KOps/s | 3.1507 KOps/s | |
test_stacked_get | 0.4920ms | 0.2860ms | 3.4960 KOps/s | 3.4682 KOps/s | |
test_nested_getitemleaf | 0.2056ms | 8.4662μs | 118.1171 KOps/s | 119.3899 KOps/s | |
test_nested_getitem | 47.3920μs | 7.9774μs | 125.3547 KOps/s | 126.2123 KOps/s | |
test_stacked_getitemleaf | 0.5251ms | 0.3137ms | 3.1873 KOps/s | 3.1395 KOps/s | |
test_stacked_getitem | 0.4905ms | 0.2843ms | 3.5171 KOps/s | 3.5312 KOps/s | |
test_lock_nested | 4.6507ms | 0.4211ms | 2.3745 KOps/s | 2.4201 KOps/s | |
test_lock_stack_nested | 84.1382ms | 6.4732ms | 154.4829 Ops/s | 154.1364 Ops/s | |
test_unlock_nested | 0.8045ms | 0.4096ms | 2.4413 KOps/s | 2.4389 KOps/s | |
test_unlock_stack_nested | 83.0853ms | 6.8593ms | 145.7884 Ops/s | 145.4958 Ops/s | |
test_flatten_speed | 0.4963ms | 0.2631ms | 3.8006 KOps/s | 3.7987 KOps/s | |
test_unflatten_speed | 0.5391ms | 0.3516ms | 2.8442 KOps/s | 2.8206 KOps/s | |
test_common_ops | 1.0392ms | 0.5922ms | 1.6886 KOps/s | 1.6935 KOps/s | |
test_creation | 19.6600μs | 1.5656μs | 638.7213 KOps/s | 629.4467 KOps/s | |
test_creation_empty | 36.9810μs | 9.2121μs | 108.5528 KOps/s | 131.9425 KOps/s | |
test_creation_nested_1 | 0.2147ms | 11.1678μs | 89.5428 KOps/s | 105.1924 KOps/s | |
test_creation_nested_2 | 38.9500μs | 13.6286μs | 73.3753 KOps/s | 70.6287 KOps/s | |
test_clone | 0.1355ms | 12.8384μs | 77.8911 KOps/s | 74.9827 KOps/s | |
test_getitem[int] | 0.2224ms | 11.1672μs | 89.5483 KOps/s | 89.2292 KOps/s | |
test_getitem[slice_int] | 46.4210μs | 20.7799μs | 48.1234 KOps/s | 47.9290 KOps/s | |
test_getitem[range] | 61.5510μs | 35.3104μs | 28.3203 KOps/s | 28.1072 KOps/s | |
test_getitem[tuple] | 0.2290ms | 18.6283μs | 53.6818 KOps/s | 53.5024 KOps/s | |
test_getitem[list] | 57.5210μs | 31.8430μs | 31.4040 KOps/s | 30.3369 KOps/s | |
test_setitem_dim[int] | 43.7610μs | 27.6357μs | 36.1851 KOps/s | 38.5296 KOps/s | |
test_setitem_dim[slice_int] | 73.7710μs | 48.6710μs | 20.5461 KOps/s | 21.7423 KOps/s | |
test_setitem_dim[range] | 95.4320μs | 61.7136μs | 16.2039 KOps/s | 16.7766 KOps/s | |
test_setitem_dim[tuple] | 61.5510μs | 42.8583μs | 23.3327 KOps/s | 25.2647 KOps/s | |
test_setitem | 0.2142ms | 18.2271μs | 54.8633 KOps/s | 58.3249 KOps/s | |
test_set | 0.1280ms | 17.5148μs | 57.0947 KOps/s | 60.4818 KOps/s | |
test_set_shared | 2.6933ms | 0.1001ms | 9.9948 KOps/s | 9.9486 KOps/s | |
test_update | 0.2447ms | 20.1657μs | 49.5891 KOps/s | 52.6379 KOps/s | |
test_update_nested | 0.1453ms | 26.7801μs | 37.3412 KOps/s | 40.3925 KOps/s | |
test_set_nested | 0.2367ms | 18.6114μs | 53.7306 KOps/s | 55.5180 KOps/s | |
test_set_nested_new | 0.1254ms | 21.9051μs | 45.6516 KOps/s | 48.0688 KOps/s | |
test_select | 0.2500ms | 42.6711μs | 23.4351 KOps/s | 24.5786 KOps/s | |
test_to | 93.6010μs | 56.5308μs | 17.6895 KOps/s | 18.7841 KOps/s | |
test_to_nonblocking | 67.7810μs | 32.2609μs | 30.9973 KOps/s | 30.4862 KOps/s | |
test_unbind_speed | 0.3777ms | 0.3240ms | 3.0860 KOps/s | 3.0472 KOps/s | |
test_unbind_speed_stack0 | 79.5677ms | 3.7558ms | 266.2564 Ops/s | 263.0733 Ops/s | |
test_unbind_speed_stack1 | 1.4930μs | 0.5300μs | 1.8869 MOps/s | 1.8728 MOps/s | |
test_split | 74.6590ms | 1.7087ms | 585.2569 Ops/s | 576.9215 Ops/s | |
test_chunk | 1.6166ms | 1.5644ms | 639.2424 Ops/s | 587.3949 Ops/s | |
test_creation[device0] | 0.1438ms | 71.5275μs | 13.9806 KOps/s | 14.5282 KOps/s | |
test_creation_from_tensor | 0.1303ms | 55.6750μs | 17.9614 KOps/s | 19.3336 KOps/s | |
test_add_one[memmap_tensor0] | 0.1202ms | 6.6279μs | 150.8783 KOps/s | 149.2507 KOps/s | |
test_contiguous[memmap_tensor0] | 25.3810μs | 0.6302μs | 1.5867 MOps/s | 1.5950 MOps/s | |
test_stack[memmap_tensor0] | 37.8720μs | 4.3904μs | 227.7697 KOps/s | 234.7277 KOps/s | |
test_memmaptd_index | 0.2632ms | 0.2301ms | 4.3464 KOps/s | 4.3325 KOps/s | |
test_memmaptd_index_astensor | 0.3197ms | 0.2880ms | 3.4722 KOps/s | 3.4627 KOps/s | |
test_memmaptd_index_op | 0.7643ms | 0.5802ms | 1.7235 KOps/s | 1.7946 KOps/s | |
test_serialize_model | 0.1713s | 96.2412ms | 10.3906 Ops/s | 9.6232 Ops/s | |
test_serialize_model_pickle | 1.3488s | 1.2368s | 0.8086 Ops/s | 0.8065 Ops/s | |
test_serialize_weights | 88.9779ms | 86.1719ms | 11.6047 Ops/s | 9.8866 Ops/s | |
test_serialize_weights_returnearly | 0.2500s | 77.6090ms | 12.8851 Ops/s | 14.8368 Ops/s | |
test_serialize_weights_pickle | 1.3519s | 1.2451s | 0.8032 Ops/s | 0.8088 Ops/s | |
test_reshape_pytree | 50.7810μs | 23.9955μs | 41.6746 KOps/s | 41.9168 KOps/s | |
test_reshape_td | 46.2310μs | 27.7701μs | 36.0099 KOps/s | 35.1486 KOps/s | |
test_view_pytree | 45.7910μs | 23.4807μs | 42.5881 KOps/s | 42.6964 KOps/s | |
test_view_td | 21.5800μs | 4.0309μs | 248.0866 KOps/s | 242.5299 KOps/s | |
test_unbind_pytree | 58.2110μs | 29.7987μs | 33.5585 KOps/s | 33.2560 KOps/s | |
test_unbind_td | 76.1600μs | 50.5159μs | 19.7957 KOps/s | 19.1499 KOps/s | |
test_split_pytree | 42.6610μs | 27.3592μs | 36.5507 KOps/s | 36.0397 KOps/s | |
test_split_td | 0.6566ms | 38.9585μs | 25.6683 KOps/s | 25.0382 KOps/s | |
test_add_pytree | 59.7820μs | 34.9105μs | 28.6447 KOps/s | 28.3057 KOps/s | |
test_add_td | 67.7920μs | 49.1326μs | 20.3531 KOps/s | 23.2013 KOps/s | |
test_distributed | 0.2641ms | 70.6544μs | 14.1534 KOps/s | 14.6141 KOps/s | |
test_tdmodule | 0.1077ms | 18.3946μs | 54.3639 KOps/s | 57.8624 KOps/s | |
test_tdmodule_dispatch | 0.2680ms | 35.5234μs | 28.1505 KOps/s | 30.1569 KOps/s | |
test_tdseq | 0.1815ms | 21.4833μs | 46.5479 KOps/s | 49.5362 KOps/s | |
test_tdseq_dispatch | 0.1623ms | 37.2535μs | 26.8431 KOps/s | 27.4810 KOps/s | |
test_instantiation_functorch | 1.6961ms | 1.6353ms | 611.5133 Ops/s | 609.3928 Ops/s | |
test_instantiation_td | 1.7098ms | 1.1524ms | 867.7469 Ops/s | 865.4654 Ops/s | |
test_exec_functorch | 0.1864ms | 0.1550ms | 6.4504 KOps/s | 6.4701 KOps/s | |
test_exec_functional_call | 0.1803ms | 0.1541ms | 6.4880 KOps/s | 6.4784 KOps/s | |
test_exec_td | 0.1767ms | 0.1447ms | 6.9127 KOps/s | 6.7361 KOps/s | |
test_exec_td_decorator | 0.8928ms | 0.1864ms | 5.3640 KOps/s | 5.3831 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1581ms | 1.0714ms | 933.3856 Ops/s | 925.8897 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7640ms | 0.6435ms | 1.5540 KOps/s | 1.5340 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0979ms | 0.9925ms | 1.0076 KOps/s | 993.7144 Ops/s | |
test_vmap_mlp_speed[False-False] | 0.6189ms | 0.5748ms | 1.7396 KOps/s | 1.6832 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.1181ms | 2.4689ms | 405.0393 Ops/s | 406.2466 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0199ms | 0.6933ms | 1.4423 KOps/s | 1.4300 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.5513ms | 2.1267ms | 470.2117 Ops/s | 485.8885 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9752ms | 0.6150ms | 1.6261 KOps/s | 1.6670 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.7093ms | 12.2856ms | 81.3960 Ops/s | 82.3253 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.4520ms | 8.1461ms | 122.7579 Ops/s | 123.7864 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.3532ms | 12.3299ms | 81.1034 Ops/s | 83.0202 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.4429ms | 8.1203ms | 123.1486 Ops/s | 125.2735 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 0.1654s | 82.4073ms | 12.1348 Ops/s | 12.4329 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 21.4881ms | 19.7849ms | 50.5437 Ops/s | 51.4946 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 70.8819ms | 68.6744ms | 14.5615 Ops/s | 14.9704 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.0174ms | 19.3594ms | 51.6546 Ops/s | 47.9706 Ops/s |
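
The Change column is empty in both tables as captured here. As a rough aid for reading them, the sketch below computes a relative difference between the Ops and Ops on Repo HEAD columns. The helper names `parse_ops` and `relative_change` are hypothetical, and treating the change as a percentage relative to HEAD is an assumption, not necessarily how the benchmark bot defines the column.

```python
import re

# Multipliers for the throughput units that appear in the tables.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '57.4551 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(MOps/s|KOps/s|Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent difference of the PR throughput relative to repo HEAD (assumed definition)."""
    head = parse_ops(ops_head)
    return 100.0 * (parse_ops(ops) - head) / head


# test_plain_set_nested, first table
print(f"{relative_change('57.4551 KOps/s', '55.0453 KOps/s'):+.2f}%")
# test_membership, second table: the two columns use different units
print(f"{relative_change('1.0617 MOps/s', '919.2465 KOps/s'):+.2f}%")
```

Normalising to operations per second before comparing matters because some rows, such as test_membership in the second table, report the two columns in different units. Single-run differences of a few percent are likely within benchmark noise, so small values should be read with caution.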
Labels: CLA Signed, Performance
cc @dubuqa