[BugFix, CI] Fix GPU benchmarks #611
Merged
facebook-github-bot added the CLA Signed label on Jan 5, 2024. This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 33.1620μs | 17.4007μs | 57.4690 KOps/s | 59.1184 KOps/s | |
test_plain_set_stack_nested | 0.2043ms | 0.1435ms | 6.9700 KOps/s | 6.9449 KOps/s | |
test_plain_set_nested_inplace | 61.4540μs | 19.5420μs | 51.1720 KOps/s | 51.8037 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3318ms | 0.1780ms | 5.6182 KOps/s | 5.6081 KOps/s | |
test_items | 18.8950μs | 2.4880μs | 401.9256 KOps/s | 408.1212 KOps/s | |
test_items_nested | 0.4379ms | 0.2682ms | 3.7281 KOps/s | 3.6571 KOps/s | |
test_items_nested_locked | 0.4506ms | 0.2695ms | 3.7110 KOps/s | 3.6771 KOps/s | |
test_items_nested_leaf | 0.5457ms | 0.1673ms | 5.9767 KOps/s | 5.9699 KOps/s | |
test_items_stack_nested | 1.4700ms | 1.3231ms | 755.8266 Ops/s | 747.9916 Ops/s | |
test_items_stack_nested_leaf | 1.9284ms | 1.2087ms | 827.3298 Ops/s | 836.1607 Ops/s | |
test_items_stack_nested_locked | 1.0088ms | 0.7734ms | 1.2930 KOps/s | 1.2977 KOps/s | |
test_keys | 37.2300μs | 4.1909μs | 238.6136 KOps/s | 239.8858 KOps/s | |
test_keys_nested | 55.5088ms | 0.1593ms | 6.2762 KOps/s | 6.4632 KOps/s | |
test_keys_nested_locked | 0.3189ms | 0.1460ms | 6.8472 KOps/s | 6.4919 KOps/s | |
test_keys_nested_leaf | 0.2135ms | 0.1294ms | 7.7307 KOps/s | 7.5377 KOps/s | |
test_keys_stack_nested | 1.5008ms | 1.2720ms | 786.1468 Ops/s | 772.6035 Ops/s | |
test_keys_stack_nested_leaf | 1.6517ms | 1.2620ms | 792.4217 Ops/s | 777.8889 Ops/s | |
test_keys_stack_nested_locked | 0.8915ms | 0.6944ms | 1.4401 KOps/s | 1.4133 KOps/s | |
test_values | 6.7766μs | 1.1392μs | 877.8436 KOps/s | 851.5000 KOps/s | |
test_values_nested | 92.8930μs | 53.8541μs | 18.5687 KOps/s | 18.3720 KOps/s | |
test_values_nested_locked | 0.1051ms | 54.2026μs | 18.4493 KOps/s | 18.3374 KOps/s | |
test_values_nested_leaf | 98.9240μs | 48.7197μs | 20.5256 KOps/s | 20.7840 KOps/s | |
test_values_stack_nested | 1.2811ms | 1.0546ms | 948.2380 Ops/s | 936.5131 Ops/s | |
test_values_stack_nested_leaf | 1.2237ms | 1.0348ms | 966.4084 Ops/s | 948.7646 Ops/s | |
test_values_stack_nested_locked | 0.7484ms | 0.5172ms | 1.9334 KOps/s | 1.9054 KOps/s | |
test_membership | 15.2690μs | 1.4040μs | 712.2559 KOps/s | 699.6134 KOps/s | |
test_membership_nested | 45.8350μs | 2.8800μs | 347.2187 KOps/s | 322.5696 KOps/s | |
test_membership_nested_leaf | 19.7460μs | 2.9073μs | 343.9562 KOps/s | 315.2099 KOps/s | |
test_membership_stacked_nested | 57.0460μs | 11.8387μs | 84.4687 KOps/s | 82.8571 KOps/s | |
test_membership_stacked_nested_leaf | 35.7370μs | 11.8455μs | 84.4200 KOps/s | 82.3453 KOps/s | |
test_membership_nested_last | 44.3920μs | 6.0849μs | 164.3422 KOps/s | 157.1372 KOps/s | |
test_membership_nested_leaf_last | 43.8420μs | 6.1198μs | 163.4039 KOps/s | 157.2650 KOps/s | |
test_membership_stacked_nested_last | 0.2726ms | 0.1689ms | 5.9213 KOps/s | 5.8885 KOps/s | |
test_membership_stacked_nested_leaf_last | 0.1787ms | 14.5510μs | 68.7236 KOps/s | 70.8581 KOps/s | |
test_nested_getleaf | 43.8110μs | 10.8067μs | 92.5351 KOps/s | 92.9297 KOps/s | |
test_nested_get | 45.7450μs | 10.1839μs | 98.1938 KOps/s | 98.3789 KOps/s | |
test_stacked_getleaf | 0.7038ms | 0.4672ms | 2.1405 KOps/s | 2.1036 KOps/s | |
test_stacked_get | 0.5850ms | 0.4390ms | 2.2779 KOps/s | 2.2483 KOps/s | |
test_nested_getitemleaf | 27.5410μs | 10.8743μs | 91.9598 KOps/s | 91.8678 KOps/s | |
test_nested_getitem | 48.8110μs | 10.2185μs | 97.8619 KOps/s | 97.3062 KOps/s | |
test_stacked_getitemleaf | 0.6449ms | 0.4702ms | 2.1267 KOps/s | 2.0745 KOps/s | |
test_stacked_getitem | 0.6509ms | 0.4407ms | 2.2693 KOps/s | 2.2290 KOps/s | |
test_lock_nested | 1.3721ms | 0.4106ms | 2.4357 KOps/s | 2.2759 KOps/s | |
test_lock_stack_nested | 85.7205ms | 6.7450ms | 148.2590 Ops/s | 128.3678 Ops/s | |
test_unlock_nested | 75.2142ms | 0.4909ms | 2.0371 KOps/s | 2.1818 KOps/s | |
test_unlock_stack_nested | 82.5595ms | 6.3325ms | 157.9165 Ops/s | 137.2645 Ops/s | |
test_flatten_speed | 0.6315ms | 0.3636ms | 2.7505 KOps/s | 2.6929 KOps/s | |
test_unflatten_speed | 0.7021ms | 0.4522ms | 2.2116 KOps/s | 2.2209 KOps/s | |
test_common_ops | 2.9020ms | 0.7025ms | 1.4234 KOps/s | 1.3930 KOps/s | |
test_creation | 19.2260μs | 1.9934μs | 501.6487 KOps/s | 488.8750 KOps/s | |
test_creation_empty | 31.1180μs | 11.1392μs | 89.7734 KOps/s | 102.1292 KOps/s | |
test_creation_nested_1 | 97.9920μs | 14.0809μs | 71.0182 KOps/s | 78.3109 KOps/s | |
test_creation_nested_2 | 47.7690μs | 19.1138μs | 52.3182 KOps/s | 54.7666 KOps/s | |
test_clone | 0.1348ms | 12.1890μs | 82.0413 KOps/s | 79.1777 KOps/s | |
test_getitem[int] | 42.4990μs | 11.6985μs | 85.4808 KOps/s | 81.3152 KOps/s | |
test_getitem[slice_int] | 60.6930μs | 22.7272μs | 44.0002 KOps/s | 40.6790 KOps/s | |
test_getitem[range] | 83.9870μs | 40.6970μs | 24.5718 KOps/s | 22.2015 KOps/s | |
test_getitem[tuple] | 53.0590μs | 18.7999μs | 53.1919 KOps/s | 50.3760 KOps/s | |
test_getitem[list] | 0.1011ms | 35.9526μs | 27.8144 KOps/s | 24.9452 KOps/s | |
test_setitem_dim[int] | 67.7660μs | 31.1082μs | 32.1458 KOps/s | 31.6479 KOps/s | |
test_setitem_dim[slice_int] | 0.1015ms | 56.8023μs | 17.6049 KOps/s | 17.3871 KOps/s | |
test_setitem_dim[range] | 0.1137ms | 74.6982μs | 13.3872 KOps/s | 11.7668 KOps/s | |
test_setitem_dim[tuple] | 85.4290μs | 46.0773μs | 21.7027 KOps/s | 21.0369 KOps/s | |
test_setitem | 0.1523ms | 19.1389μs | 52.2495 KOps/s | 53.9652 KOps/s | |
test_set | 0.1142ms | 18.3929μs | 54.3687 KOps/s | 55.4491 KOps/s | |
test_set_shared | 3.2345ms | 0.1370ms | 7.3019 KOps/s | 6.8422 KOps/s | |
test_update | 0.1446ms | 21.7032μs | 46.0761 KOps/s | 47.0377 KOps/s | |
test_update_nested | 0.1841ms | 28.8695μs | 34.6387 KOps/s | 35.2382 KOps/s | |
test_set_nested | 0.1340ms | 20.3465μs | 49.1486 KOps/s | 50.5157 KOps/s | |
test_set_nested_new | 0.2011ms | 24.9036μs | 40.1548 KOps/s | 42.1297 KOps/s | |
test_select | 0.2098ms | 49.0579μs | 20.3841 KOps/s | 20.4928 KOps/s | |
test_unbind_speed | 0.6291ms | 0.3361ms | 2.9756 KOps/s | 2.9002 KOps/s | |
test_unbind_speed_stack0 | 78.9038ms | 4.4025ms | 227.1428 Ops/s | 227.6393 Ops/s | |
test_unbind_speed_stack1 | 2.5963μs | 0.6486μs | 1.5417 MOps/s | 1.4873 MOps/s | |
test_split | 70.5956ms | 1.6798ms | 595.3247 Ops/s | 590.1273 Ops/s | |
test_chunk | 69.0545ms | 1.6434ms | 608.4763 Ops/s | 606.8120 Ops/s | |
test_creation[device0] | 0.6034ms | 0.2999ms | 3.3349 KOps/s | 3.3359 KOps/s | |
test_creation_from_tensor | 4.0362ms | 0.3412ms | 2.9309 KOps/s | 3.0125 KOps/s | |
test_add_one[memmap_tensor0] | 0.4437ms | 26.8323μs | 37.2686 KOps/s | 40.1310 KOps/s | |
test_contiguous[memmap_tensor0] | 28.0720μs | 5.9548μs | 167.9315 KOps/s | 173.8288 KOps/s | |
test_stack[memmap_tensor0] | 55.2730μs | 20.4794μs | 48.8295 KOps/s | 51.7539 KOps/s | |
test_memmaptd_index | 0.3149ms | 0.1995ms | 5.0136 KOps/s | 5.0154 KOps/s | |
test_memmaptd_index_astensor | 0.6127ms | 0.2583ms | 3.8716 KOps/s | 3.8430 KOps/s | |
test_memmaptd_index_op | 0.8338ms | 0.5508ms | 1.8154 KOps/s | 1.8966 KOps/s | |
test_serialize_model | 0.1773s | 0.1122s | 8.9142 Ops/s | 9.0243 Ops/s | |
test_serialize_model_pickle | 0.4468s | 0.3743s | 2.6717 Ops/s | 2.5670 Ops/s | |
test_serialize_weights | 0.1669s | 0.1058s | 9.4511 Ops/s | 9.1583 Ops/s | |
test_serialize_weights_returnearly | 0.1917s | 0.1301s | 7.6881 Ops/s | 8.2327 Ops/s | |
test_serialize_weights_pickle | 1.0429s | 0.6191s | 1.6153 Ops/s | 2.4452 Ops/s | |
test_serialize_weights_filesystem | 0.1588s | 97.8462ms | 10.2201 Ops/s | 10.6509 Ops/s | |
test_serialize_model_filesystem | 97.7528ms | 92.7128ms | 10.7860 Ops/s | 9.6574 Ops/s | |
test_reshape_pytree | 72.4170μs | 22.7098μs | 44.0339 KOps/s | 42.2356 KOps/s | |
test_reshape_td | 76.6320μs | 29.7785μs | 33.5812 KOps/s | 33.4378 KOps/s | |
test_view_pytree | 81.0080μs | 22.8504μs | 43.7629 KOps/s | 42.9622 KOps/s | |
test_view_td | 25.4770μs | 4.8589μs | 205.8079 KOps/s | 199.7121 KOps/s | |
test_unbind_pytree | 63.8090μs | 26.1668μs | 38.2164 KOps/s | 35.9609 KOps/s | |
test_unbind_td | 0.1260ms | 53.8304μs | 18.5769 KOps/s | 17.3659 KOps/s | |
test_split_pytree | 82.7340μs | 26.1600μs | 38.2263 KOps/s | 36.9322 KOps/s | |
test_split_td | 0.5785ms | 42.3182μs | 23.6305 KOps/s | 22.8579 KOps/s | |
test_add_pytree | 75.2690μs | 32.1385μs | 31.1153 KOps/s | 30.6513 KOps/s | |
test_add_td | 0.1004ms | 49.3366μs | 20.2689 KOps/s | 21.4296 KOps/s | |
test_distributed | 27.1600μs | 5.9601μs | 167.7814 KOps/s | 163.4396 KOps/s | |
test_tdmodule | 0.3486ms | 23.2029μs | 43.0980 KOps/s | 45.4925 KOps/s | |
test_tdmodule_dispatch | 0.1875ms | 43.0371μs | 23.2358 KOps/s | 24.4844 KOps/s | |
test_tdseq | 0.1287ms | 26.4491μs | 37.8085 KOps/s | 38.9412 KOps/s | |
test_tdseq_dispatch | 0.1366ms | 46.6884μs | 21.4186 KOps/s | 22.0511 KOps/s | |
test_instantiation_functorch | 1.9472ms | 1.2980ms | 770.3986 Ops/s | 764.9018 Ops/s | |
test_instantiation_td | 1.6311ms | 1.0057ms | 994.3012 Ops/s | 991.2855 Ops/s | |
test_exec_functorch | 0.2887ms | 0.1557ms | 6.4235 KOps/s | 6.2931 KOps/s | |
test_exec_functional_call | 0.3714ms | 0.1464ms | 6.8304 KOps/s | 6.8423 KOps/s | |
test_exec_td | 0.2803ms | 0.1413ms | 7.0749 KOps/s | 6.9198 KOps/s | |
test_exec_td_decorator | 0.8991ms | 0.1758ms | 5.6872 KOps/s | 5.6753 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.4874ms | 0.8804ms | 1.1358 KOps/s | 1.1069 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7210ms | 0.4730ms | 2.1141 KOps/s | 2.1380 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.9439ms | 0.7564ms | 1.3220 KOps/s | 1.2836 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6310ms | 0.3819ms | 2.6185 KOps/s | 2.5822 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 3.2742ms | 1.7902ms | 558.6117 Ops/s | 566.5992 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0537ms | 0.5294ms | 1.8890 KOps/s | 1.8835 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.2263ms | 1.5150ms | 660.0778 Ops/s | 674.4944 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 83.6564ms | 0.4405ms | 2.2699 KOps/s | 2.5156 KOps/s |
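
The Change column in the table above is empty. As a rough sketch, a relative change can be derived from the two throughput columns. This assumes that "Ops" is the throughput measured on this PR and that change is defined as (Ops - Ops on Repo HEAD) / Ops on Repo HEAD. Both the definition and the helper names `parse_ops` and `relative_change` are illustrative assumptions, not something the benchmark bot is known to use.

```python
# Unit multipliers for the throughput strings that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text):
    """Convert a string such as '57.4690 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(ops_pr, ops_head):
    """Relative throughput change of the PR against repo HEAD (assumed definition)."""
    pr = parse_ops(ops_pr)
    head = parse_ops(ops_head)
    return (pr - head) / head


# A few rows copied from the table: (Ops, Ops on Repo HEAD).
rows = {
    "test_plain_set_nested": ("57.4690 KOps/s", "59.1184 KOps/s"),
    "test_lock_stack_nested": ("148.2590 Ops/s", "128.3678 Ops/s"),
    "test_serialize_weights_pickle": ("1.6153 Ops/s", "2.4452 Ops/s"),
}

for name, (ops_pr, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops_pr, ops_head):+.1%}")
```

Under this definition a positive value means the PR is faster than HEAD. Several rows have a Max far above the Mean (for example test_keys_nested), which suggests outliers within a run, so small differences are probably within noise.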
Result of GPU Benchmark Tests
Labels: Benchmarks, bug, CI, CLA Signed
No description provided.