[BugFix] Sync cuda only if initialized #767
Merged
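The diff for this PR is not reproduced on this page, so the following is only a minimal sketch of the behavior the title describes: synchronize CUDA only when a CUDA context already exists. The function name `sync_cuda_if_initialized` is hypothetical and not taken from the PR; `torch.cuda.is_available()`, `torch.cuda.is_initialized()` and `torch.cuda.synchronize()` are existing PyTorch calls.

```python
try:
    import torch  # optional: without PyTorch the sketch degrades to a no-op
except ImportError:
    torch = None


def sync_cuda_if_initialized() -> bool:
    """Synchronize CUDA only if a CUDA context has already been created.

    Hypothetical helper (not the PR's code). Returns True if a
    synchronization was performed, False otherwise.
    """
    if torch is None or not torch.cuda.is_available():
        return False
    if not torch.cuda.is_initialized():
        # torch.cuda.synchronize() lazily initializes CUDA, so skipping it
        # here avoids creating a context as a side effect.
        return False
    torch.cuda.synchronize()
    return True


if __name__ == "__main__":
    print("synchronized:", sync_cuda_if_initialized())
```

One plausible reason for such a guard is that an unconditional synchronize would initialize CUDA in processes that never touched the GPU; whether that was the motivation here is an assumption.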
facebook-github-bot added the CLA Signed label Apr 30, 2024
(This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 51.3460μs | 17.5902μs | 56.8498 KOps/s | 59.1602 KOps/s | |
test_plain_set_stack_nested | 60.2720μs | 17.8178μs | 56.1237 KOps/s | 58.5934 KOps/s | |
test_plain_set_nested_inplace | 79.9200μs | 20.1602μs | 49.6026 KOps/s | 51.7059 KOps/s | |
test_plain_set_stack_nested_inplace | 66.8250μs | 19.9491μs | 50.1275 KOps/s | 51.8456 KOps/s | |
test_items | 14.3660μs | 2.7144μs | 368.4059 KOps/s | 386.9707 KOps/s | |
test_items_nested | 0.8325ms | 0.2658ms | 3.7624 KOps/s | 3.7301 KOps/s | |
test_items_nested_locked | 0.5580ms | 0.2635ms | 3.7944 KOps/s | 3.6584 KOps/s | |
test_items_nested_leaf | 0.1604ms | 76.7687μs | 13.0261 KOps/s | 12.8469 KOps/s | |
test_items_stack_nested | 0.4526ms | 0.2631ms | 3.8015 KOps/s | 3.7275 KOps/s | |
test_items_stack_nested_leaf | 0.5344ms | 79.9748μs | 12.5039 KOps/s | 12.8925 KOps/s | |
test_items_stack_nested_locked | 0.4093ms | 0.2634ms | 3.7967 KOps/s | 3.7091 KOps/s | |
test_keys | 29.4950μs | 3.9614μs | 252.4364 KOps/s | 258.9075 KOps/s | |
test_keys_nested | 0.2620ms | 0.1397ms | 7.1596 KOps/s | 7.2182 KOps/s | |
test_keys_nested_locked | 2.5352ms | 0.1433ms | 6.9769 KOps/s | 6.9100 KOps/s | |
test_keys_nested_leaf | 0.2077ms | 0.1243ms | 8.0422 KOps/s | 8.4206 KOps/s | |
test_keys_stack_nested | 0.2612ms | 0.1392ms | 7.1832 KOps/s | 7.2497 KOps/s | |
test_keys_stack_nested_leaf | 0.1999ms | 0.1171ms | 8.5366 KOps/s | 8.4373 KOps/s | |
test_keys_stack_nested_locked | 0.2659ms | 0.1443ms | 6.9283 KOps/s | 6.9704 KOps/s | |
test_values | 10.9078μs | 1.1533μs | 867.0551 KOps/s | 867.5608 KOps/s | |
test_values_nested | 93.3550μs | 50.1960μs | 19.9219 KOps/s | 19.2481 KOps/s | |
test_values_nested_locked | 0.1199ms | 50.3084μs | 19.8774 KOps/s | 19.3296 KOps/s | |
test_values_nested_leaf | 84.0660μs | 45.3785μs | 22.0369 KOps/s | 21.3802 KOps/s | |
test_values_stack_nested | 99.6020μs | 50.3565μs | 19.8584 KOps/s | 19.4871 KOps/s | |
test_values_stack_nested_leaf | 96.2400μs | 45.7002μs | 21.8817 KOps/s | 21.4426 KOps/s | |
test_values_stack_nested_locked | 0.1025ms | 50.4478μs | 19.8225 KOps/s | 19.5070 KOps/s | |
test_membership | 16.1700μs | 1.3268μs | 753.6913 KOps/s | 741.0437 KOps/s | |
test_membership_nested | 35.0050μs | 3.3887μs | 295.0964 KOps/s | 288.4450 KOps/s | |
test_membership_nested_leaf | 47.3420μs | 3.4141μs | 292.9024 KOps/s | 283.3889 KOps/s | |
test_membership_stacked_nested | 42.5400μs | 3.3816μs | 295.7195 KOps/s | 292.1257 KOps/s | |
test_membership_stacked_nested_leaf | 55.4830μs | 3.4315μs | 291.4174 KOps/s | 290.5896 KOps/s | |
test_membership_nested_last | 25.4080μs | 4.1016μs | 243.8088 KOps/s | 239.2836 KOps/s | |
test_membership_nested_leaf_last | 38.7730μs | 4.1377μs | 241.6828 KOps/s | 237.5970 KOps/s | |
test_membership_stacked_nested_last | 22.4620μs | 4.1058μs | 243.5604 KOps/s | 231.5966 KOps/s | |
test_membership_stacked_nested_leaf_last | 47.1680μs | 4.1456μs | 241.2189 KOps/s | 240.2639 KOps/s | |
test_nested_getleaf | 53.5600μs | 10.7445μs | 93.0712 KOps/s | 92.6895 KOps/s | |
test_nested_get | 51.3350μs | 10.3167μs | 96.9300 KOps/s | 97.3031 KOps/s | |
test_stacked_getleaf | 31.3180μs | 10.7220μs | 93.2661 KOps/s | 91.7729 KOps/s | |
test_stacked_get | 51.8770μs | 10.1525μs | 98.4982 KOps/s | 97.4394 KOps/s | |
test_nested_getitemleaf | 34.1740μs | 11.1318μs | 89.8324 KOps/s | 85.5681 KOps/s | |
test_nested_getitem | 52.4480μs | 10.2063μs | 97.9786 KOps/s | 93.2217 KOps/s | |
test_stacked_getitemleaf | 54.1410μs | 10.9186μs | 91.5870 KOps/s | 85.8679 KOps/s | |
test_stacked_getitem | 51.4960μs | 10.1876μs | 98.1589 KOps/s | 93.7947 KOps/s | |
test_lock_nested | 50.7226ms | 0.4096ms | 2.4413 KOps/s | 2.8112 KOps/s | |
test_lock_stack_nested | 0.4337ms | 0.3112ms | 3.2139 KOps/s | 3.1258 KOps/s | |
test_unlock_nested | 0.9028ms | 0.3507ms | 2.8516 KOps/s | 2.4447 KOps/s | |
test_unlock_stack_nested | 0.5180ms | 0.3197ms | 3.1283 KOps/s | 3.0557 KOps/s | |
test_flatten_speed | 0.2131ms | 95.2863μs | 10.4947 KOps/s | 10.2598 KOps/s | |
test_unflatten_speed | 0.6887ms | 0.4142ms | 2.4142 KOps/s | 2.3796 KOps/s | |
test_common_ops | 1.5337ms | 0.7503ms | 1.3327 KOps/s | 1.3691 KOps/s | |
test_creation | 70.8120μs | 1.8981μs | 526.8305 KOps/s | 515.4778 KOps/s | |
test_creation_empty | 38.0200μs | 11.8530μs | 84.3671 KOps/s | 97.5822 KOps/s | |
test_creation_nested_1 | 62.8070μs | 14.4362μs | 69.2704 KOps/s | 78.0829 KOps/s | |
test_creation_nested_2 | 88.4250μs | 17.7949μs | 56.1959 KOps/s | 60.7620 KOps/s | |
test_clone | 0.1208ms | 13.5011μs | 74.0679 KOps/s | 74.6047 KOps/s | |
test_getitem[int] | 29.6060μs | 11.6951μs | 85.5060 KOps/s | 83.8444 KOps/s | |
test_getitem[slice_int] | 94.6100μs | 22.7199μs | 44.0144 KOps/s | 41.9315 KOps/s | |
test_getitem[range] | 80.7010μs | 58.7554μs | 17.0197 KOps/s | 16.1083 KOps/s | |
test_getitem[tuple] | 78.4060μs | 18.6917μs | 53.4998 KOps/s | 50.3216 KOps/s | |
test_getitem[list] | 0.1018ms | 40.3404μs | 24.7891 KOps/s | 24.1511 KOps/s | |
test_setitem_dim[int] | 52.4080μs | 34.7614μs | 28.7675 KOps/s | 27.3647 KOps/s | |
test_setitem_dim[slice_int] | 0.1065ms | 62.7724μs | 15.9306 KOps/s | 15.5368 KOps/s | |
test_setitem_dim[range] | 0.1174ms | 84.6696μs | 11.8106 KOps/s | 11.8245 KOps/s | |
test_setitem_dim[tuple] | 95.2180μs | 52.0605μs | 19.2084 KOps/s | 19.1217 KOps/s | |
test_setitem | 72.1750μs | 21.5136μs | 46.4823 KOps/s | 49.1063 KOps/s | |
test_set | 77.1240μs | 20.5549μs | 48.6501 KOps/s | 50.7804 KOps/s | |
test_set_shared | 2.7877ms | 0.1432ms | 6.9846 KOps/s | 7.0510 KOps/s | |
test_update | 0.1235ms | 23.3399μs | 42.8450 KOps/s | 47.3442 KOps/s | |
test_update_nested | 0.1186ms | 31.8294μs | 31.4175 KOps/s | 33.0725 KOps/s | |
test_update__nested | 70.3520μs | 24.5922μs | 40.6633 KOps/s | 39.4629 KOps/s | |
test_set_nested | 0.1246ms | 22.6045μs | 44.2390 KOps/s | 46.4292 KOps/s | |
test_set_nested_new | 0.1294ms | 26.5008μs | 37.7347 KOps/s | 38.8582 KOps/s | |
test_select | 0.1566ms | 40.7334μs | 24.5499 KOps/s | 24.4026 KOps/s | |
test_select_nested | 4.9209ms | 59.6746μs | 16.7575 KOps/s | 16.2626 KOps/s | |
test_exclude_nested | 0.1821ms | 0.1185ms | 8.4383 KOps/s | 8.2956 KOps/s | |
test_empty[True] | 0.5802ms | 0.3892ms | 2.5691 KOps/s | 2.5280 KOps/s | |
test_empty[False] | 11.2176μs | 1.0524μs | 950.2217 KOps/s | 918.6940 KOps/s | |
test_unbind_speed | 0.4953ms | 0.2557ms | 3.9114 KOps/s | 3.8106 KOps/s | |
test_unbind_speed_stack0 | 0.4150ms | 0.2535ms | 3.9440 KOps/s | 3.8892 KOps/s | |
test_unbind_speed_stack1 | 65.6232ms | 0.7239ms | 1.3813 KOps/s | 1.2717 KOps/s | |
test_split | 64.8744ms | 1.5915ms | 628.3466 Ops/s | 638.2248 Ops/s | |
test_chunk | 68.0651ms | 1.5967ms | 626.2971 Ops/s | 590.7942 Ops/s | |
test_creation[device0] | 0.2431ms | 0.1045ms | 9.5659 KOps/s | 9.6265 KOps/s | |
test_creation_from_tensor | 3.5599ms | 83.0619μs | 12.0392 KOps/s | 11.9752 KOps/s | |
test_add_one[memmap_tensor0] | 0.1108ms | 5.5116μs | 181.4350 KOps/s | 182.4363 KOps/s | |
test_contiguous[memmap_tensor0] | 12.3830μs | 0.6344μs | 1.5763 MOps/s | 1.5826 MOps/s | |
test_stack[memmap_tensor0] | 26.8100μs | 3.6289μs | 275.5643 KOps/s | 279.3413 KOps/s | |
test_memmaptd_index | 1.0684ms | 0.2369ms | 4.2217 KOps/s | 4.0702 KOps/s | |
test_memmaptd_index_astensor | 0.7564ms | 0.3162ms | 3.1630 KOps/s | 3.0637 KOps/s | |
test_memmaptd_index_op | 1.0359ms | 0.6112ms | 1.6363 KOps/s | 1.6375 KOps/s | |
test_serialize_model | 0.1092s | 0.1026s | 9.7498 Ops/s | 9.0720 Ops/s | |
test_serialize_model_pickle | 0.4492s | 0.3774s | 2.6495 Ops/s | 2.5841 Ops/s | |
test_serialize_weights | 0.1639s | 0.1085s | 9.2163 Ops/s | 9.1932 Ops/s | |
test_serialize_weights_returnearly | 0.1959s | 0.1300s | 7.6895 Ops/s | 7.8823 Ops/s | |
test_serialize_weights_pickle | 1.0403s | 0.5855s | 1.7079 Ops/s | 2.4440 Ops/s | |
test_serialize_weights_filesystem | 0.1606s | 97.4265ms | 10.2642 Ops/s | 10.4443 Ops/s | |
test_serialize_model_filesystem | 0.1587s | 99.7421ms | 10.0259 Ops/s | 10.2932 Ops/s | |
test_reshape_pytree | 55.8340μs | 25.1231μs | 39.8041 KOps/s | 37.8913 KOps/s | |
test_reshape_td | 0.1021ms | 33.4331μs | 29.9105 KOps/s | 30.1468 KOps/s | |
test_view_pytree | 85.2590μs | 24.8933μs | 40.1714 KOps/s | 37.4267 KOps/s | |
test_view_td | 88.0350μs | 37.1383μs | 26.9264 KOps/s | 26.4547 KOps/s | |
test_unbind_pytree | 89.0760μs | 29.2336μs | 34.2072 KOps/s | 32.9465 KOps/s | |
test_unbind_td | 0.4338ms | 38.1794μs | 26.1921 KOps/s | 26.0581 KOps/s | |
test_split_pytree | 65.0010μs | 28.8446μs | 34.6685 KOps/s | 32.8383 KOps/s | |
test_split_td | 0.1395ms | 40.2698μs | 24.8325 KOps/s | 23.8393 KOps/s | |
test_add_pytree | 86.1910μs | 34.6904μs | 28.8265 KOps/s | 28.0174 KOps/s | |
test_add_td | 0.1215ms | 54.7270μs | 18.2725 KOps/s | 17.6494 KOps/s | |
test_distributed | 0.2148ms | 98.3830μs | 10.1644 KOps/s | 9.8461 KOps/s | |
test_tdmodule | 44.4230μs | 18.2911μs | 54.6715 KOps/s | 56.7570 KOps/s | |
test_tdmodule_dispatch | 63.6390μs | 36.6912μs | 27.2545 KOps/s | 29.4070 KOps/s | |
test_tdseq | 51.6260μs | 21.6714μs | 46.1438 KOps/s | 48.0649 KOps/s | |
test_tdseq_dispatch | 79.2180μs | 44.3015μs | 22.5726 KOps/s | 24.5834 KOps/s | |
test_instantiation_functorch | 2.3318ms | 1.2857ms | 777.7943 Ops/s | 749.2025 Ops/s | |
test_instantiation_td | 1.5963ms | 1.0207ms | 979.7271 Ops/s | 910.1248 Ops/s | |
test_exec_functorch | 0.2972ms | 0.1610ms | 6.2104 KOps/s | 6.1197 KOps/s | |
test_exec_functional_call | 0.2899ms | 0.1499ms | 6.6715 KOps/s | 6.6117 KOps/s | |
test_exec_td | 0.2453ms | 0.1469ms | 6.8094 KOps/s | 6.8668 KOps/s | |
test_exec_td_decorator | 0.5602ms | 0.2219ms | 4.5074 KOps/s | 4.4745 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7688ms | 0.4928ms | 2.0291 KOps/s | 2.0400 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.5882ms | 0.4851ms | 2.0614 KOps/s | 2.0833 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5630ms | 0.3952ms | 2.5307 KOps/s | 2.5743 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7099ms | 0.3945ms | 2.5349 KOps/s | 2.5663 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0779ms | 0.5535ms | 1.8066 KOps/s | 1.8142 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8679ms | 0.5538ms | 1.8058 KOps/s | 1.8153 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7247ms | 0.4515ms | 2.2149 KOps/s | 2.2135 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6995ms | 0.4522ms | 2.2112 KOps/s | 2.2175 KOps/s | |
test_to_module_speed[True] | 2.5198ms | 1.6661ms | 600.1983 Ops/s | 592.7601 Ops/s | |
test_to_module_speed[False] | 2.2313ms | 1.6362ms | 611.1741 Ops/s | 603.9089 Ops/s |
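The Change column is empty in the table above. As a rough aid for reading it, the sketch below computes the relative throughput difference between the Ops and Ops on Repo HEAD columns. The helper names `parse_ops` and `relative_change` are illustrative and not part of the benchmark tooling, and the formula (PR minus HEAD, divided by HEAD) is an assumption about what the column would report.

```python
# Hypothetical helpers for reading the benchmark table; not part of the PR.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a throughput cell such as '56.8498 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Relative change of the PR run versus repo HEAD (positive means faster)."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head


# Two rows copied verbatim from the table above.
rows = {
    "test_plain_set_nested": ("56.8498 KOps/s", "59.1602 KOps/s"),
    "test_serialize_weights_pickle": ("1.7079 Ops/s", "2.4440 Ops/s"),
}

if __name__ == "__main__":
    for name, (pr, head) in rows.items():
        print(f"{name}: {relative_change(pr, head):+.1%}")
```

Single-run differences of a few percent are often within benchmark noise, so a computed change is best read together with the Max and Mean columns.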
Labels: bug (Something isn't working), CLA Signed
No description provided.