[Benchmark] Fix GPU benchmark #386
Merged
facebook-github-bot added the CLA Signed label on May 18, 2023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_common_ops | 1.2503ms | 1.2189ms | 820.4166 Ops/s | 814.4906 Ops/s | |
test_creation | 4.4521μs | 4.2131μs | 237.3556 KOps/s | 235.9696 KOps/s | |
test_creation_empty | 16.7522μs | 15.9980μs | 62.5079 KOps/s | 60.6001 KOps/s | |
test_creation_nested_1 | 30.2674μs | 29.1668μs | 34.2855 KOps/s | 33.9479 KOps/s | |
test_creation_nested_2 | 29.7674μs | 29.0069μs | 34.4746 KOps/s | 33.9148 KOps/s | |
test_clone | 28.7124μs | 26.6301μs | 37.5516 KOps/s | 37.6257 KOps/s | |
test_getitem[int] | 35.9529μs | 32.9372μs | 30.3608 KOps/s | 30.9120 KOps/s | |
test_getitem[slice_int] | 71.1245μs | 66.7213μs | 14.9877 KOps/s | 14.8163 KOps/s | |
test_getitem[range] | 71.4600μs | 69.8234μs | 14.3218 KOps/s | 14.3239 KOps/s | |
test_getitem[tuple] | 65.8509μs | 62.3835μs | 16.0299 KOps/s | 16.1172 KOps/s | |
test_getitem[list] | 62.1116μs | 61.3578μs | 16.2978 KOps/s | 16.2365 KOps/s | |
test_setitem_dim[int] | 76.8010μs | 45.0438μs | 22.2006 KOps/s | 22.0569 KOps/s | |
test_setitem_dim[slice_int] | 0.1201ms | 80.5670μs | 12.4120 KOps/s | 12.1437 KOps/s | |
test_setitem_dim[range] | 0.1361ms | 77.8661μs | 12.8426 KOps/s | 12.8794 KOps/s | |
test_setitem_dim[tuple] | 0.1767ms | 73.4513μs | 13.6145 KOps/s | 13.4568 KOps/s | |
test_setitem | 40.0867μs | 38.6362μs | 25.8825 KOps/s | 25.9811 KOps/s | |
test_set | 38.9716μs | 37.4837μs | 26.6783 KOps/s | 26.6821 KOps/s | |
test_set_shared | 0.1791ms | 0.1763ms | 5.6735 KOps/s | 5.7000 KOps/s | |
test_update | 48.3478μs | 47.3787μs | 21.1065 KOps/s | 20.8845 KOps/s | |
test_update_nested | 69.6671μs | 67.3231μs | 14.8537 KOps/s | 14.7222 KOps/s | |
test_set_nested | 48.6898μs | 47.4263μs | 21.0853 KOps/s | 21.1141 KOps/s | |
test_set_nested_new | 67.1911μs | 65.9336μs | 15.1668 KOps/s | 15.1975 KOps/s | |
test_select | 0.1050ms | 0.1032ms | 9.6866 KOps/s | 9.5879 KOps/s | |
test_creation[device0] | 1.2986ms | 0.4974ms | 2.0105 KOps/s | 2.0330 KOps/s | |
test_creation_from_tensor | 0.5813ms | 0.4653ms | 2.1490 KOps/s | 1.8832 KOps/s | |
test_add_one[memmap_tensor0] | 37.9736μs | 30.5349μs | 32.7494 KOps/s | 32.8098 KOps/s | |
test_contiguous[memmap_tensor0] | 8.7532μs | 8.2149μs | 121.7307 KOps/s | 124.2277 KOps/s | |
test_stack[memmap_tensor0] | 0.1772ms | 43.0607μs | 23.2230 KOps/s | 23.9288 KOps/s | |
test_reshape_pytree | 38.3856μs | 35.4480μs | 28.2104 KOps/s | 27.8865 KOps/s | |
test_reshape_td | 50.7698μs | 48.7519μs | 20.5120 KOps/s | 20.4975 KOps/s | |
test_view_pytree | 34.1546μs | 32.8233μs | 30.4662 KOps/s | 30.0866 KOps/s | |
test_view_td | 9.6352μs | 9.0758μs | 110.1831 KOps/s | 112.1925 KOps/s | |
test_unbind_pytree | 37.9536μs | 36.7563μs | 27.2062 KOps/s | 27.0192 KOps/s | |
test_unbind_td | 0.1867ms | 0.1848ms | 5.4106 KOps/s | 5.3398 KOps/s | |
test_split_pytree | 43.7767μs | 41.7395μs | 23.9581 KOps/s | 24.1833 KOps/s | |
test_split_td | 0.1182ms | 0.1147ms | 8.7203 KOps/s | 8.7664 KOps/s | |
test_add_pytree | 47.2278μs | 45.3378μs | 22.0567 KOps/s | 22.0406 KOps/s | |
test_add_td | 78.6043μs | 76.4401μs | 13.0821 KOps/s | 12.9257 KOps/s | |
test_distributed | 73.5010μs | 73.5010μs | 13.6053 KOps/s | 13.8311 KOps/s | |
test_tdmodule | 87.4010μs | 28.0538μs | 35.6458 KOps/s | 35.6721 KOps/s | |
test_tdmodule_dispatch | 60.7421ms | 66.3092μs | 15.0809 KOps/s | 16.3723 KOps/s | |
test_tdseq | 0.1887ms | 38.3915μs | 26.0474 KOps/s | 26.0454 KOps/s | |
test_tdseq_dispatch | 0.1206ms | 70.3281μs | 14.2191 KOps/s | 14.0452 KOps/s | |
test_instantiation_functorch | 1.7216ms | 1.5811ms | 632.4843 Ops/s | 639.3993 Ops/s | |
test_instantiation_td | 7.8393ms | 1.2629ms | 791.8403 Ops/s | 830.9754 Ops/s | |
test_exec_functorch | 0.1860ms | 0.1810ms | 5.5249 KOps/s | 5.5310 KOps/s | |
test_exec_td | 0.3335ms | 0.3306ms | 3.0244 KOps/s | 3.0394 KOps/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_common_ops | 1.1097ms | 1.0707ms | 933.9646 Ops/s | 925.5185 Ops/s | |
test_creation | 3.3590μs | 3.1660μs | 315.8553 KOps/s | 313.5322 KOps/s | |
test_creation_empty | 13.7972μs | 13.2702μs | 75.3569 KOps/s | 74.7068 KOps/s | |
test_creation_nested_1 | 23.6043μs | 22.5789μs | 44.2891 KOps/s | 45.4928 KOps/s | |
test_creation_nested_2 | 25.9054μs | 23.5424μs | 42.4765 KOps/s | 42.3503 KOps/s | |
test_clone | 22.7513μs | 21.4895μs | 46.5343 KOps/s | 45.7436 KOps/s | |
test_getitem[int] | 26.7899μs | 25.7370μs | 38.8545 KOps/s | 39.2289 KOps/s | |
test_getitem[slice_int] | 57.1179μs | 53.1039μs | 18.8310 KOps/s | 19.1551 KOps/s | |
test_getitem[range] | 65.0189μs | 59.7029μs | 16.7496 KOps/s | 17.0419 KOps/s | |
test_getitem[tuple] | 53.3438μs | 49.4914μs | 20.2055 KOps/s | 20.6274 KOps/s | |
test_getitem[list] | 59.0771μs | 53.8204μs | 18.5803 KOps/s | 19.2811 KOps/s | |
test_setitem_dim[int] | 68.8010μs | 38.2819μs | 26.1220 KOps/s | 26.5803 KOps/s | |
test_setitem_dim[slice_int] | 0.1436ms | 68.4015μs | 14.6196 KOps/s | 14.7240 KOps/s | |
test_setitem_dim[range] | 0.1424ms | 69.6866μs | 14.3500 KOps/s | 14.6778 KOps/s | |
test_setitem_dim[tuple] | 95.8010μs | 61.2950μs | 16.3145 KOps/s | 16.4174 KOps/s | |
test_setitem | 31.8324μs | 29.9289μs | 33.4125 KOps/s | 33.0491 KOps/s | |
test_set | 30.9124μs | 29.2409μs | 34.1987 KOps/s | 33.8203 KOps/s | |
test_set_shared | 0.1713ms | 0.1666ms | 6.0027 KOps/s | 5.9589 KOps/s | |
test_update | 39.4416μs | 37.3758μs | 26.7552 KOps/s | 26.5816 KOps/s | |
test_update_nested | 54.5058μs | 53.4388μs | 18.7130 KOps/s | 18.5926 KOps/s | |
test_set_nested | 38.4466μs | 36.8375μs | 27.1463 KOps/s | 26.8862 KOps/s | |
test_set_nested_new | 52.7938μs | 51.2965μs | 19.4945 KOps/s | 19.1044 KOps/s | |
test_select | 83.8602μs | 81.5532μs | 12.2619 KOps/s | 12.3248 KOps/s | |
test_creation[device0] | 1.2133ms | 0.4971ms | 2.0117 KOps/s | 2.0000 KOps/s | |
test_creation_from_tensor | 0.5867ms | 0.4685ms | 2.1346 KOps/s | 2.1217 KOps/s | |
test_add_one[memmap_tensor0] | 48.1687μs | 29.4526μs | 33.9529 KOps/s | 33.8414 KOps/s | |
test_contiguous[memmap_tensor0] | 8.3231μs | 7.8694μs | 127.0749 KOps/s | 120.6105 KOps/s | |
test_stack[memmap_tensor0] | 0.1907ms | 44.5407μs | 22.4514 KOps/s | 23.2880 KOps/s | |
test_reshape_pytree | 31.1724μs | 28.4134μs | 35.1946 KOps/s | 34.6509 KOps/s | |
test_reshape_td | 41.9076μs | 39.4706μs | 25.3353 KOps/s | 25.5044 KOps/s | |
test_view_pytree | 27.2894μs | 26.0390μs | 38.4040 KOps/s | 38.1078 KOps/s | |
test_view_td | 7.9931μs | 6.9969μs | 142.9208 KOps/s | 142.8573 KOps/s | |
test_unbind_pytree | 32.1215μs | 30.5991μs | 32.6807 KOps/s | 33.3162 KOps/s | |
test_unbind_td | 0.1537ms | 0.1509ms | 6.6281 KOps/s | 7.0089 KOps/s | |
test_split_pytree | 36.0085μs | 34.0331μs | 29.3831 KOps/s | 29.4610 KOps/s | |
test_split_td | 97.4744μs | 94.6659μs | 10.5635 KOps/s | 10.8312 KOps/s | |
test_add_pytree | 40.0206μs | 37.7422μs | 26.4956 KOps/s | 26.5014 KOps/s | |
test_add_td | 64.4949μs | 61.9605μs | 16.1393 KOps/s | 16.4946 KOps/s | |
test_distributed | 73.1010μs | 73.1010μs | 13.6797 KOps/s | 12.0770 KOps/s | |
test_tdmodule | 46.9010μs | 24.2864μs | 41.1753 KOps/s | 42.2632 KOps/s | |
test_tdmodule_dispatch | 0.2217ms | 53.5443μs | 18.6761 KOps/s | 19.1907 KOps/s | |
test_tdseq | 98.6010μs | 32.4940μs | 30.7749 KOps/s | 31.5865 KOps/s | |
test_tdseq_dispatch | 0.1135ms | 63.1244μs | 15.8417 KOps/s | 15.9736 KOps/s | |
test_instantiation_functorch | 1.3406ms | 1.2724ms | 785.8951 Ops/s | 781.9152 Ops/s | |
test_instantiation_td | 1.1561ms | 0.9940ms | 1.0061 KOps/s | 1.0198 KOps/s | |
test_exec_functorch | 0.1874ms | 0.1579ms | 6.3343 KOps/s | 6.2909 KOps/s | |
test_exec_td | 0.2786ms | 0.2723ms | 3.6729 KOps/s | 3.6314 KOps/s |
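
The Change column is empty in both tables. As a rough aid for reading them, the sketch below shows one way a relative change could be computed from the Ops and Ops on Repo HEAD columns. The definition used here (percent difference relative to HEAD) and the helper names `parse_ops` and `relative_change` are assumptions for illustration, not something taken from the benchmark tooling. The unit handling assumes K means 1e3.

```python
import re

# Multipliers for the throughput strings in the tables (assumption: K = 1e3, M = 1e6).
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '237.3556 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percent change of this PR's throughput relative to repo HEAD (hypothetical definition)."""
    head = parse_ops(ops_head)
    return 100.0 * (parse_ops(ops) - head) / head


# Two rows copied from the tables above: (Ops, Ops on Repo HEAD).
rows = {
    "test_common_ops": ("820.4166 Ops/s", "814.4906 Ops/s"),
    "test_instantiation_td": ("1.0061 KOps/s", "1.0198 KOps/s"),
}
for name, (ops, head) in rows.items():
    print(f"{name}: {relative_change(ops, head):+.2f}%")
```

A positive value means the PR branch ran more operations per second than HEAD. Differences of a percent or so are likely within run-to-run noise for microbenchmarks like these.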
Labels: Benchmarks, CLA Signed
No description provided.