[BugFix] Fix h5 auto batch size #798
Merged
facebook-github-bot added the CLA Signed label on May 30, 2024.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 39.8840μs | 16.9042μs | 59.1569 KOps/s | 54.9812 KOps/s | |
test_plain_set_stack_nested | 37.3200μs | 17.0638μs | 58.6036 KOps/s | 53.8870 KOps/s | |
test_plain_set_nested_inplace | 57.1960μs | 19.2225μs | 52.0224 KOps/s | 48.0917 KOps/s | |
test_plain_set_stack_nested_inplace | 60.5830μs | 19.2570μs | 51.9292 KOps/s | 48.8966 KOps/s | |
test_items | 17.2420μs | 2.4914μs | 401.3818 KOps/s | 366.4193 KOps/s | |
test_items_nested | 0.4358ms | 0.2742ms | 3.6471 KOps/s | 3.8047 KOps/s | |
test_items_nested_locked | 1.1363ms | 0.2762ms | 3.6210 KOps/s | 3.7464 KOps/s | |
test_items_nested_leaf | 0.1629ms | 78.7390μs | 12.7002 KOps/s | 12.6152 KOps/s | |
test_items_stack_nested | 1.7757ms | 0.2764ms | 3.6174 KOps/s | 3.7383 KOps/s | |
test_items_stack_nested_leaf | 0.1476ms | 79.0930μs | 12.6433 KOps/s | 12.3574 KOps/s | |
test_items_stack_nested_locked | 0.4160ms | 0.2737ms | 3.6531 KOps/s | 3.7767 KOps/s | |
test_keys | 16.8310μs | 3.9968μs | 250.1998 KOps/s | 259.4504 KOps/s | |
test_keys_nested | 0.2410ms | 0.1398ms | 7.1518 KOps/s | 7.0534 KOps/s | |
test_keys_nested_locked | 0.6922ms | 0.1436ms | 6.9629 KOps/s | 6.7488 KOps/s | |
test_keys_nested_leaf | 0.2124ms | 0.1185ms | 8.4391 KOps/s | 8.4939 KOps/s | |
test_keys_stack_nested | 0.2448ms | 0.1405ms | 7.1198 KOps/s | 6.9813 KOps/s | |
test_keys_stack_nested_leaf | 0.2358ms | 0.1181ms | 8.4661 KOps/s | 8.1772 KOps/s | |
test_keys_stack_nested_locked | 0.2114ms | 0.1438ms | 6.9560 KOps/s | 6.7030 KOps/s | |
test_values | 4.6446μs | 1.1389μs | 878.0284 KOps/s | 850.7945 KOps/s | |
test_values_nested | 89.2760μs | 50.7111μs | 19.7195 KOps/s | 18.8829 KOps/s | |
test_values_nested_locked | 88.0740μs | 50.9424μs | 19.6300 KOps/s | 18.9382 KOps/s | |
test_values_nested_leaf | 76.2230μs | 45.9874μs | 21.7451 KOps/s | 20.7698 KOps/s | |
test_values_stack_nested | 87.3230μs | 50.9253μs | 19.6366 KOps/s | 18.7600 KOps/s | |
test_values_stack_nested_leaf | 92.8130μs | 45.6432μs | 21.9091 KOps/s | 21.0965 KOps/s | |
test_values_stack_nested_locked | 0.1053ms | 50.8282μs | 19.6741 KOps/s | 18.4395 KOps/s | |
test_membership | 11.4310μs | 1.3400μs | 746.2479 KOps/s | 742.7522 KOps/s | |
test_membership_nested | 19.2860μs | 3.5288μs | 283.3828 KOps/s | 288.1074 KOps/s | |
test_membership_nested_leaf | 28.0330μs | 3.5635μs | 280.6266 KOps/s | 286.8593 KOps/s | |
test_membership_stacked_nested | 20.6180μs | 3.4917μs | 286.3929 KOps/s | 286.6680 KOps/s | |
test_membership_stacked_nested_leaf | 29.3840μs | 3.5519μs | 281.5380 KOps/s | 288.7939 KOps/s | |
test_membership_nested_last | 25.9390μs | 4.3427μs | 230.2691 KOps/s | 234.9774 KOps/s | |
test_membership_nested_leaf_last | 24.8870μs | 4.3930μs | 227.6366 KOps/s | 234.1534 KOps/s | |
test_membership_stacked_nested_last | 18.1940μs | 4.3385μs | 230.4954 KOps/s | 239.9229 KOps/s | |
test_membership_stacked_nested_leaf_last | 19.6060μs | 4.3733μs | 228.6582 KOps/s | 235.5802 KOps/s | |
test_nested_getleaf | 36.1180μs | 10.5870μs | 94.4557 KOps/s | 93.8552 KOps/s | |
test_nested_get | 38.8830μs | 10.0031μs | 99.9690 KOps/s | 101.0254 KOps/s | |
test_stacked_getleaf | 29.9460μs | 10.5405μs | 94.8724 KOps/s | 95.4387 KOps/s | |
test_stacked_get | 33.3320μs | 10.1562μs | 98.4623 KOps/s | 101.9049 KOps/s | |
test_nested_getitemleaf | 32.7710μs | 11.2108μs | 89.1993 KOps/s | 91.4450 KOps/s | |
test_nested_getitem | 29.2040μs | 10.3851μs | 96.2920 KOps/s | 97.7728 KOps/s | |
test_stacked_getitemleaf | 33.0410μs | 11.1298μs | 89.8487 KOps/s | 91.3947 KOps/s | |
test_stacked_getitem | 29.8960μs | 10.2874μs | 97.2065 KOps/s | 98.8913 KOps/s | |
test_lock_nested | 52.2181ms | 0.4099ms | 2.4395 KOps/s | 2.8178 KOps/s | |
test_lock_stack_nested | 0.4472ms | 0.3164ms | 3.1603 KOps/s | 3.1435 KOps/s | |
test_unlock_nested | 0.7355ms | 0.3540ms | 2.8251 KOps/s | 2.4072 KOps/s | |
test_unlock_stack_nested | 0.3964ms | 0.3255ms | 3.0719 KOps/s | 3.0804 KOps/s | |
test_flatten_speed | 0.1805ms | 0.1002ms | 9.9781 KOps/s | 10.2295 KOps/s | |
test_unflatten_speed | 0.5820ms | 0.4149ms | 2.4103 KOps/s | 2.3663 KOps/s | |
test_common_ops | 4.5301ms | 0.7228ms | 1.3836 KOps/s | 1.3395 KOps/s | |
test_creation | 19.1150μs | 1.9481μs | 513.3234 KOps/s | 508.7629 KOps/s | |
test_creation_empty | 23.2830μs | 10.9334μs | 91.4631 KOps/s | 86.2039 KOps/s | |
test_creation_nested_1 | 78.8170μs | 13.7863μs | 72.5355 KOps/s | 67.6625 KOps/s | |
test_creation_nested_2 | 41.0870μs | 17.1754μs | 58.2227 KOps/s | 55.3384 KOps/s | |
test_clone | 66.9650μs | 13.9756μs | 71.5532 KOps/s | 71.6028 KOps/s | |
test_getitem[int] | 39.4040μs | 11.8298μs | 84.5321 KOps/s | 86.2776 KOps/s | |
test_getitem[slice_int] | 55.7640μs | 23.1956μs | 43.1116 KOps/s | 42.4458 KOps/s | |
test_getitem[range] | 86.8010μs | 61.1618μs | 16.3501 KOps/s | 16.6228 KOps/s | |
test_getitem[tuple] | 58.5590μs | 19.5286μs | 51.2070 KOps/s | 52.0666 KOps/s | |
test_getitem[list] | 93.5440μs | 42.0249μs | 23.7954 KOps/s | 24.1733 KOps/s | |
test_setitem_dim[int] | 67.4460μs | 35.6789μs | 28.0278 KOps/s | 27.4384 KOps/s | |
test_setitem_dim[slice_int] | 0.1123ms | 62.3747μs | 16.0321 KOps/s | 15.6062 KOps/s | |
test_setitem_dim[range] | 0.1303ms | 84.5542μs | 11.8267 KOps/s | 11.7214 KOps/s | |
test_setitem_dim[tuple] | 76.0420μs | 51.7563μs | 19.3213 KOps/s | 19.1554 KOps/s | |
test_setitem | 60.1620μs | 21.3618μs | 46.8125 KOps/s | 47.5460 KOps/s | |
test_set | 57.3770μs | 20.7092μs | 48.2878 KOps/s | 48.1711 KOps/s | |
test_set_shared | 1.5910ms | 0.1400ms | 7.1411 KOps/s | 7.0319 KOps/s | |
test_update | 0.1251ms | 22.8014μs | 43.8570 KOps/s | 43.0510 KOps/s | |
test_update_nested | 71.8540μs | 31.4308μs | 31.8159 KOps/s | 31.3856 KOps/s | |
test_update__nested | 66.8250μs | 26.4980μs | 37.7387 KOps/s | 38.5816 KOps/s | |
test_set_nested | 64.9810μs | 22.6472μs | 44.1556 KOps/s | 44.2922 KOps/s | |
test_set_nested_new | 78.4270μs | 27.6934μs | 36.1096 KOps/s | 37.4120 KOps/s | |
test_select | 0.1033ms | 42.5794μs | 23.4855 KOps/s | 23.5324 KOps/s | |
test_select_nested | 0.1299ms | 61.6022μs | 16.2332 KOps/s | 16.4699 KOps/s | |
test_exclude_nested | 0.2603ms | 0.1246ms | 8.0282 KOps/s | 8.0862 KOps/s | |
test_empty[True] | 0.7271ms | 0.4005ms | 2.4969 KOps/s | 2.4522 KOps/s | |
test_empty[False] | 7.7445μs | 1.2132μs | 824.2481 KOps/s | 854.1842 KOps/s | |
test_unbind_speed | 0.3339ms | 0.2667ms | 3.7497 KOps/s | 3.7327 KOps/s | |
test_unbind_speed_stack0 | 4.3208ms | 0.2649ms | 3.7755 KOps/s | 3.8542 KOps/s | |
test_unbind_speed_stack1 | 67.3243ms | 0.7400ms | 1.3514 KOps/s | 1.2967 KOps/s | |
test_split | 66.9079ms | 1.6449ms | 607.9273 Ops/s | 612.7234 Ops/s | |
test_chunk | 67.8922ms | 1.6336ms | 612.1582 Ops/s | 616.2573 Ops/s | |
test_creation[device0] | 0.1636ms | 83.4964μs | 11.9766 KOps/s | 11.6825 KOps/s | |
test_creation_from_tensor | 3.2344ms | 85.4653μs | 11.7007 KOps/s | 11.5131 KOps/s | |
test_add_one[memmap_tensor0] | 70.5220μs | 5.3572μs | 186.6630 KOps/s | 182.1874 KOps/s | |
test_contiguous[memmap_tensor0] | 10.0380μs | 0.6372μs | 1.5694 MOps/s | 1.5277 MOps/s | |
test_stack[memmap_tensor0] | 17.2720μs | 3.6345μs | 275.1396 KOps/s | 279.1328 KOps/s | |
test_memmaptd_index | 0.9440ms | 0.2565ms | 3.8989 KOps/s | 3.8838 KOps/s | |
test_memmaptd_index_astensor | 0.7734ms | 0.3302ms | 3.0287 KOps/s | 2.9856 KOps/s | |
test_memmaptd_index_op | 0.8723ms | 0.6221ms | 1.6073 KOps/s | 1.5496 KOps/s | |
test_serialize_model | 0.1861s | 0.1166s | 8.5750 Ops/s | 8.3742 Ops/s | |
test_serialize_model_pickle | 0.4694s | 0.3768s | 2.6543 Ops/s | 2.6380 Ops/s | |
test_serialize_weights | 0.1072s | 0.1015s | 9.8562 Ops/s | 8.6474 Ops/s | |
test_serialize_weights_returnearly | 0.1979s | 0.1386s | 7.2150 Ops/s | 7.7724 Ops/s | |
test_serialize_weights_pickle | 0.7530s | 0.5111s | 1.9567 Ops/s | 1.5691 Ops/s | |
test_serialize_weights_filesystem | 98.1820ms | 92.5309ms | 10.8072 Ops/s | 10.7730 Ops/s | |
test_serialize_model_filesystem | 0.1581s | 0.1019s | 9.8123 Ops/s | 9.8609 Ops/s | |
test_reshape_pytree | 76.3630μs | 25.9817μs | 38.4886 KOps/s | 37.7756 KOps/s | |
test_reshape_td | 89.6270μs | 35.7596μs | 27.9645 KOps/s | 29.4512 KOps/s | |
test_view_pytree | 72.6150μs | 25.9395μs | 38.5513 KOps/s | 37.7384 KOps/s | |
test_view_td | 82.9240μs | 40.2937μs | 24.8178 KOps/s | 26.1042 KOps/s | |
test_unbind_pytree | 77.3140μs | 29.4605μs | 33.9438 KOps/s | 33.5687 KOps/s | |
test_unbind_td | 0.3982ms | 39.1518μs | 25.5416 KOps/s | 25.9114 KOps/s | |
test_split_pytree | 94.9070μs | 30.1545μs | 33.1625 KOps/s | 33.1057 KOps/s | |
test_split_td | 0.1312ms | 41.8029μs | 23.9218 KOps/s | 24.3492 KOps/s | |
test_add_pytree | 0.1316ms | 35.6944μs | 28.0156 KOps/s | 27.6750 KOps/s | |
test_add_td | 0.1298ms | 56.3849μs | 17.7353 KOps/s | 17.7132 KOps/s | |
test_distributed | 0.2193ms | 0.1043ms | 9.5909 KOps/s | 9.7540 KOps/s | |
test_tdmodule | 50.4140μs | 17.4178μs | 57.4125 KOps/s | 53.0395 KOps/s | |
test_tdmodule_dispatch | 56.5860μs | 35.1021μs | 28.4884 KOps/s | 26.6778 KOps/s | |
test_tdseq | 38.2910μs | 20.7489μs | 48.1954 KOps/s | 45.7738 KOps/s | |
test_tdseq_dispatch | 79.4080μs | 40.8601μs | 24.4737 KOps/s | 23.1413 KOps/s | |
test_instantiation_functorch | 1.6106ms | 1.3301ms | 751.8091 Ops/s | 745.5155 Ops/s | |
test_instantiation_td | 1.4828ms | 1.0445ms | 957.3548 Ops/s | 968.3705 Ops/s | |
test_exec_functorch | 0.3546ms | 0.1611ms | 6.2060 KOps/s | 6.1904 KOps/s | |
test_exec_functional_call | 0.2322ms | 0.1493ms | 6.6985 KOps/s | 6.6469 KOps/s | |
test_exec_td | 0.2310ms | 0.1474ms | 6.7863 KOps/s | 6.9352 KOps/s | |
test_exec_td_decorator | 0.3037ms | 0.2201ms | 4.5441 KOps/s | 4.4947 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7693ms | 0.4927ms | 2.0297 KOps/s | 2.0176 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7294ms | 0.4896ms | 2.0423 KOps/s | 2.0268 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.8265ms | 0.3989ms | 2.5066 KOps/s | 2.5226 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6342ms | 0.3976ms | 2.5151 KOps/s | 2.5193 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.6645ms | 0.5596ms | 1.7870 KOps/s | 1.7759 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8672ms | 0.5622ms | 1.7786 KOps/s | 1.7735 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6817ms | 0.4602ms | 2.1730 KOps/s | 2.1513 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.5785ms | 0.4594ms | 2.1769 KOps/s | 2.1664 KOps/s | |
test_to_module_speed[True] | 1.7909ms | 1.7029ms | 587.2507 Ops/s | 576.9294 Ops/s | |
test_to_module_speed[False] | 1.7391ms | 1.6659ms | 600.2683 Ops/s | 585.6720 Ops/s | |
test_tc_init | 56.0450μs | 28.7631μs | 34.7667 KOps/s | 32.0511 KOps/s | |
test_tc_init_nested | 96.7510μs | 61.6728μs | 16.2146 KOps/s | 15.7168 KOps/s | |
test_tc_first_layer_tensor | 1.6776μs | 0.6832μs | 1.4638 MOps/s | 1.4257 MOps/s | |
test_tc_first_layer_nontensor | 1.9636μs | 0.6771μs | 1.4768 MOps/s | 1.4425 MOps/s | |
test_tc_second_layer_tensor | 14.3670μs | 1.8437μs | 542.3916 KOps/s | 538.7432 KOps/s | |
test_tc_second_layer_nontensor | 8.1953μs | 1.5342μs | 651.8146 KOps/s | 674.5839 KOps/s | |
test_unbind | 79.1462ms | 6.8547ms | 145.8848 Ops/s | 154.3698 Ops/s | |
test_full_like | 18.0895ms | 10.9475ms | 91.3452 Ops/s | 83.6799 Ops/s | |
test_zeros_like | 10.3735ms | 5.9990ms | 166.6958 Ops/s | 158.0867 Ops/s | |
test_ones_like | 14.6328ms | 6.6459ms | 150.4690 Ops/s | 158.2326 Ops/s | |
test_clone | 14.8330ms | 8.3505ms | 119.7538 Ops/s | 122.2998 Ops/s | |
test_squeeze | 79.6690μs | 14.6946μs | 68.0520 KOps/s | 64.5233 KOps/s | |
test_unsqueeze | 73.3677ms | 88.7009μs | 11.2738 KOps/s | 14.4112 KOps/s | |
test_split | 0.2185ms | 0.1141ms | 8.7648 KOps/s | 8.6529 KOps/s | |
test_permute | 0.2458ms | 0.1407ms | 7.1091 KOps/s | 7.2022 KOps/s | |
test_stack | 24.9314ms | 24.2079ms | 41.3088 Ops/s | 41.8391 Ops/s | |
test_cat | 28.4918ms | 24.5079ms | 40.8031 Ops/s | 41.7958 Ops/s | |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.5087ms | 12.9486μs | 77.2287 KOps/s | 79.0829 KOps/s | |
test_plain_set_stack_nested | 40.9510μs | 13.0513μs | 76.6207 KOps/s | 78.7070 KOps/s | |
test_plain_set_nested_inplace | 49.0510μs | 14.0674μs | 71.0865 KOps/s | 71.9462 KOps/s | |
test_plain_set_stack_nested_inplace | 56.4920μs | 14.2086μs | 70.3797 KOps/s | 71.3893 KOps/s | |
test_items | 26.2000μs | 4.6318μs | 215.9006 KOps/s | 210.2393 KOps/s | |
test_items_nested | 0.3958ms | 0.3433ms | 2.9128 KOps/s | 2.9061 KOps/s | |
test_items_nested_locked | 0.4223ms | 0.3525ms | 2.8366 KOps/s | 2.8576 KOps/s | |
test_items_nested_leaf | 0.1041ms | 83.1412μs | 12.0277 KOps/s | 12.0233 KOps/s | |
test_items_stack_nested | 0.4123ms | 0.3526ms | 2.8361 KOps/s | 2.8582 KOps/s | |
test_items_stack_nested_leaf | 0.1242ms | 85.5698μs | 11.6864 KOps/s | 11.9961 KOps/s | |
test_items_stack_nested_locked | 0.4142ms | 0.3576ms | 2.7962 KOps/s | 2.8536 KOps/s | |
test_keys | 18.8000μs | 4.4064μs | 226.9438 KOps/s | 229.6933 KOps/s | |
test_keys_nested | 97.2220μs | 67.2066μs | 14.8795 KOps/s | 14.9336 KOps/s | |
test_keys_nested_locked | 0.7832ms | 72.2174μs | 13.8471 KOps/s | 13.7619 KOps/s | |
test_keys_nested_leaf | 91.3320μs | 57.4265μs | 17.4136 KOps/s | 17.3117 KOps/s | |
test_keys_stack_nested | 0.1025ms | 67.6897μs | 14.7733 KOps/s | 15.0243 KOps/s | |
test_keys_stack_nested_leaf | 90.8620μs | 58.0075μs | 17.2391 KOps/s | 17.2595 KOps/s | |
test_keys_stack_nested_locked | 97.6810μs | 72.6458μs | 13.7654 KOps/s | 14.1516 KOps/s | |
test_values | 8.5300μs | 1.8157μs | 550.7565 KOps/s | 554.4631 KOps/s | |
test_values_nested | 62.3610μs | 34.9287μs | 28.6298 KOps/s | 28.6734 KOps/s | |
test_values_nested_locked | 60.1310μs | 37.1891μs | 26.8896 KOps/s | 26.9573 KOps/s | |
test_values_nested_leaf | 51.3410μs | 31.0512μs | 32.2048 KOps/s | 32.1957 KOps/s | |
test_values_stack_nested | 60.2210μs | 35.5422μs | 28.1356 KOps/s | 27.7653 KOps/s | |
test_values_stack_nested_leaf | 60.9910μs | 31.6568μs | 31.5888 KOps/s | 31.3352 KOps/s | |
test_values_stack_nested_locked | 58.0010μs | 37.6011μs | 26.5950 KOps/s | 26.4220 KOps/s | |
test_membership | 2.0050μs | 0.7172μs | 1.3942 MOps/s | 1.4006 MOps/s | |
test_membership_nested | 29.8700μs | 2.5828μs | 387.1773 KOps/s | 392.3771 KOps/s | |
test_membership_nested_leaf | 20.6210μs | 2.6126μs | 382.7536 KOps/s | 386.7910 KOps/s | |
test_membership_stacked_nested | 14.4210μs | 2.5997μs | 384.6649 KOps/s | 389.4211 KOps/s | |
test_membership_stacked_nested_leaf | 33.6610μs | 2.6368μs | 379.2514 KOps/s | 386.8834 KOps/s | |
test_membership_nested_last | 21.6890μs | 3.1101μs | 321.5331 KOps/s | 325.3120 KOps/s | |
test_membership_nested_leaf_last | 34.3100μs | 3.1265μs | 319.8467 KOps/s | 324.5298 KOps/s | |
test_membership_stacked_nested_last | 23.4010μs | 3.5981μs | 277.9229 KOps/s | 160.0610 KOps/s | |
test_membership_stacked_nested_leaf_last | 34.5200μs | 3.5715μs | 279.9907 KOps/s | 159.4279 KOps/s | |
test_nested_getleaf | 28.1800μs | 8.3800μs | 119.3320 KOps/s | 119.2979 KOps/s | |
test_nested_get | 35.4910μs | 7.9052μs | 126.4997 KOps/s | 127.1482 KOps/s | |
test_stacked_getleaf | 36.3510μs | 8.4285μs | 118.6455 KOps/s | 119.1471 KOps/s | |
test_stacked_get | 25.8310μs | 7.9241μs | 126.1976 KOps/s | 126.4059 KOps/s | |
test_nested_getitemleaf | 33.5100μs | 8.5570μs | 116.8635 KOps/s | 117.0851 KOps/s | |
test_nested_getitem | 30.0010μs | 8.0682μs | 123.9434 KOps/s | 124.0931 KOps/s | |
test_stacked_getitemleaf | 32.1210μs | 8.6246μs | 115.9478 KOps/s | 116.1020 KOps/s | |
test_stacked_getitem | 37.5210μs | 8.0493μs | 124.2340 KOps/s | 123.5406 KOps/s | |
test_lock_nested | 58.6341ms | 0.4187ms | 2.3885 KOps/s | 2.3522 KOps/s | |
test_lock_stack_nested | 0.3490ms | 0.3146ms | 3.1785 KOps/s | 3.1814 KOps/s | |
test_unlock_nested | 0.7349ms | 0.3596ms | 2.7812 KOps/s | 2.7429 KOps/s | |
test_unlock_stack_nested | 0.3501ms | 0.3212ms | 3.1133 KOps/s | 3.1142 KOps/s | |
test_flatten_speed | 0.1853ms | 0.1021ms | 9.7941 KOps/s | 9.8931 KOps/s | |
test_unflatten_speed | 0.3282ms | 0.2919ms | 3.4261 KOps/s | 3.4503 KOps/s | |
test_common_ops | 1.1758ms | 0.5997ms | 1.6674 KOps/s | 1.7091 KOps/s | |
test_creation | 33.5500μs | 1.6718μs | 598.1561 KOps/s | 604.8181 KOps/s | |
test_creation_empty | 39.8010μs | 8.7883μs | 113.7880 KOps/s | 122.0020 KOps/s | |
test_creation_nested_1 | 31.0510μs | 10.4932μs | 95.2994 KOps/s | 98.6446 KOps/s | |
test_creation_nested_2 | 30.9410μs | 12.5667μs | 79.5752 KOps/s | 80.7912 KOps/s | |
test_clone | 85.5100μs | 12.1442μs | 82.3441 KOps/s | 78.7927 KOps/s | |
test_getitem[int] | 27.4810μs | 11.4487μs | 87.3460 KOps/s | 86.7855 KOps/s | |
test_getitem[slice_int] | 40.9200μs | 21.7532μs | 45.9703 KOps/s | 45.8073 KOps/s | |
test_getitem[range] | 68.1010μs | 51.1966μs | 19.5325 KOps/s | 20.0541 KOps/s | |
test_getitem[tuple] | 71.9910μs | 19.4603μs | 51.3867 KOps/s | 50.9324 KOps/s | |
test_getitem[list] | 0.1308ms | 35.7500μs | 27.9720 KOps/s | 27.1341 KOps/s | |
test_setitem_dim[int] | 47.5010μs | 30.7123μs | 32.5602 KOps/s | 30.8863 KOps/s | |
test_setitem_dim[slice_int] | 74.0810μs | 51.0646μs | 19.5831 KOps/s | 18.4935 KOps/s | |
test_setitem_dim[range] | 87.5810μs | 68.5389μs | 14.5903 KOps/s | 13.7388 KOps/s | |
test_setitem_dim[tuple] | 87.3020μs | 44.9667μs | 22.2387 KOps/s | 21.0890 KOps/s | |
test_setitem | 46.1000μs | 17.5346μs | 57.0302 KOps/s | 53.4372 KOps/s | |
test_set | 49.7100μs | 16.9124μs | 59.1281 KOps/s | 54.5767 KOps/s | |
test_set_shared | 1.2967ms | 0.1003ms | 9.9741 KOps/s | 10.0812 KOps/s | |
test_update | 87.1220μs | 18.9685μs | 52.7190 KOps/s | 50.5892 KOps/s | |
test_update_nested | 63.1010μs | 24.0689μs | 41.5474 KOps/s | 40.1110 KOps/s | |
test_update__nested | 57.1210μs | 23.3213μs | 42.8792 KOps/s | 38.4539 KOps/s | |
test_set_nested | 77.6210μs | 18.1812μs | 55.0020 KOps/s | 52.3960 KOps/s | |
test_set_nested_new | 53.0710μs | 20.8883μs | 47.8737 KOps/s | 46.3749 KOps/s | |
test_select | 67.9610μs | 33.6312μs | 29.7343 KOps/s | 28.9451 KOps/s | |
test_select_nested | 0.5384ms | 54.6517μs | 18.2977 KOps/s | 18.5457 KOps/s | |
test_exclude_nested | 0.1532ms | 0.1108ms | 9.0272 KOps/s | 8.8895 KOps/s | |
test_empty[True] | 0.4183ms | 0.3528ms | 2.8343 KOps/s | 2.8504 KOps/s | |
test_empty[False] | 3.1251μs | 0.9191μs | 1.0881 MOps/s | 1.0897 MOps/s | |
test_to | 0.1043ms | 76.9391μs | 12.9973 KOps/s | 12.8627 KOps/s | |
test_to_nonblocking | 0.1008ms | 62.3691μs | 16.0336 KOps/s | 15.5832 KOps/s | |
test_unbind_speed | 0.3202ms | 0.2773ms | 3.6067 KOps/s | 3.5513 KOps/s | |
test_unbind_speed_stack0 | 0.3139ms | 0.2799ms | 3.5730 KOps/s | 3.5934 KOps/s | |
test_unbind_speed_stack1 | 75.4955ms | 0.8324ms | 1.2013 KOps/s | 1.2255 KOps/s | |
test_split | 75.7234ms | 1.7017ms | 587.6613 Ops/s | 579.1127 Ops/s | |
test_chunk | 75.5595ms | 1.7068ms | 585.8750 Ops/s | 576.6418 Ops/s | |
test_creation[device0] | 0.1333ms | 62.1641μs | 16.0864 KOps/s | 15.9254 KOps/s | |
test_creation_from_tensor | 0.1315ms | 58.7775μs | 17.0133 KOps/s | 16.4219 KOps/s | |
test_add_one[memmap_tensor0] | 68.9510μs | 7.4886μs | 133.5358 KOps/s | 131.3234 KOps/s | |
test_contiguous[memmap_tensor0] | 25.0810μs | 0.6771μs | 1.4768 MOps/s | 1.4707 MOps/s | |
test_stack[memmap_tensor0] | 36.3110μs | 4.9785μs | 200.8648 KOps/s | 200.5286 KOps/s | |
test_memmaptd_index | 1.1360ms | 0.2984ms | 3.3518 KOps/s | 3.1444 KOps/s | |
test_memmaptd_index_astensor | 0.7067ms | 0.3696ms | 2.7059 KOps/s | 2.6634 KOps/s | |
test_memmaptd_index_op | 1.1790ms | 0.6933ms | 1.4423 KOps/s | 1.4307 KOps/s | |
test_serialize_model | 0.1827s | 0.1114s | 8.9793 Ops/s | 8.5946 Ops/s | |
test_serialize_model_pickle | 1.3499s | 1.2355s | 0.8094 Ops/s | 0.8084 Ops/s | |
test_serialize_weights | 0.1808s | 0.1095s | 9.1331 Ops/s | 8.7115 Ops/s | |
test_serialize_weights_returnearly | 0.2485s | 0.1007s | 9.9276 Ops/s | 10.2564 Ops/s | |
test_serialize_weights_pickle | 1.3743s | 1.2542s | 0.7973 Ops/s | 0.7983 Ops/s | |
test_reshape_pytree | 55.9800μs | 26.7315μs | 37.4090 KOps/s | 37.6085 KOps/s | |
test_reshape_td | 60.5710μs | 32.5192μs | 30.7511 KOps/s | 30.8583 KOps/s | |
test_view_pytree | 0.2608ms | 26.5862μs | 37.6135 KOps/s | 38.0519 KOps/s | |
test_view_td | 60.6910μs | 37.4968μs | 26.6690 KOps/s | 27.1947 KOps/s | |
test_unbind_pytree | 0.2282ms | 33.1357μs | 30.1789 KOps/s | 30.0098 KOps/s | |
test_unbind_td | 0.4250ms | 43.5101μs | 22.9831 KOps/s | 21.8757 KOps/s | |
test_split_pytree | 70.5020μs | 36.9178μs | 27.0872 KOps/s | 27.7575 KOps/s | |
test_split_td | 0.4965ms | 42.1038μs | 23.7508 KOps/s | 23.1926 KOps/s | |
test_add_pytree | 0.2393ms | 40.0350μs | 24.9781 KOps/s | 24.9722 KOps/s | |
test_add_td | 83.8910μs | 52.6096μs | 19.0079 KOps/s | 18.4559 KOps/s | |
test_distributed | 1.6277ms | 68.7884μs | 14.5373 KOps/s | 11.4025 KOps/s | |
test_tdmodule | 30.2110μs | 14.5790μs | 68.5916 KOps/s | 67.7339 KOps/s | |
test_tdmodule_dispatch | 44.4420μs | 28.6593μs | 34.8927 KOps/s | 34.1641 KOps/s | |
test_tdseq | 31.5410μs | 16.8016μs | 59.5182 KOps/s | 58.3701 KOps/s | |
test_tdseq_dispatch | 54.2800μs | 32.3959μs | 30.8681 KOps/s | 30.6300 KOps/s | |
test_instantiation_functorch | 1.7459ms | 1.5641ms | 639.3429 Ops/s | 635.7864 Ops/s | |
test_instantiation_td | 1.5808ms | 1.0757ms | 929.5855 Ops/s | 927.7306 Ops/s | |
test_exec_functorch | 0.1996ms | 0.1554ms | 6.4353 KOps/s | 6.4280 KOps/s | |
test_exec_functional_call | 0.2140ms | 0.1443ms | 6.9299 KOps/s | 6.8789 KOps/s | |
test_exec_td | 0.1983ms | 0.1460ms | 6.8515 KOps/s | 6.9176 KOps/s | |
test_exec_td_decorator | 0.7028ms | 0.2178ms | 4.5914 KOps/s | 4.6914 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.6956ms | 0.6097ms | 1.6403 KOps/s | 1.6500 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6787ms | 0.6079ms | 1.6451 KOps/s | 1.6544 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6213ms | 0.5615ms | 1.7811 KOps/s | 1.8373 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6296ms | 0.5642ms | 1.7725 KOps/s | 1.8499 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.5088ms | 0.6746ms | 1.4823 KOps/s | 1.5043 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7818ms | 0.6668ms | 1.4998 KOps/s | 1.5020 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7152ms | 0.5929ms | 1.6866 KOps/s | 1.6811 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7031ms | 0.6110ms | 1.6366 KOps/s | 1.6665 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.2149ms | 8.1024ms | 123.4197 Ops/s | 122.9363 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.1591ms | 8.0841ms | 123.7000 Ops/s | 122.2583 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.1480ms | 8.0387ms | 124.3977 Ops/s | 124.1601 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.3889ms | 8.0407ms | 124.3666 Ops/s | 124.5204 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 20.5266ms | 19.7107ms | 50.7339 Ops/s | 50.7390 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7376ms | 19.6267ms | 50.9511 Ops/s | 50.9657 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.2623ms | 19.5328ms | 51.1959 Ops/s | 51.1612 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.3291ms | 19.6007ms | 51.0185 Ops/s | 51.3245 Ops/s | |
test_to_module_speed[True] | 1.6399ms | 1.5141ms | 660.4553 Ops/s | 645.9906 Ops/s | |
test_to_module_speed[False] | 1.6464ms | 1.4934ms | 669.6290 Ops/s | 663.9585 Ops/s | |
test_tc_init | 50.6310μs | 24.2625μs | 41.2159 KOps/s | 41.9176 KOps/s | |
test_tc_init_nested | 91.6320μs | 51.9301μs | 19.2566 KOps/s | 20.3631 KOps/s | |
test_tc_first_layer_tensor | 0.8145μs | 0.3561μs | 2.8084 MOps/s | 2.8093 MOps/s | |
test_tc_first_layer_nontensor | 1.5985μs | 0.3859μs | 2.5912 MOps/s | 2.5761 MOps/s | |
test_tc_second_layer_tensor | 14.9500μs | 1.0739μs | 931.2265 KOps/s | 939.2171 KOps/s | |
test_tc_second_layer_nontensor | 6.1942μs | 0.8260μs | 1.2106 MOps/s | 1.2322 MOps/s | |
test_unbind | 0.1017s | 8.1962ms | 122.0078 Ops/s | 123.7158 Ops/s | |
test_full_like | 13.7560ms | 13.2656ms | 75.3829 Ops/s | 87.6672 Ops/s | |
test_zeros_like | 96.5304ms | 8.2778ms | 120.8056 Ops/s | 142.1129 Ops/s | |
test_ones_like | 7.9737ms | 7.8280ms | 127.7462 Ops/s | 126.3143 Ops/s | |
test_clone | 9.7805ms | 9.5722ms | 104.4687 Ops/s | 105.1390 Ops/s | |
test_squeeze | 63.6610μs | 10.9105μs | 91.6546 KOps/s | 83.6619 KOps/s | |
test_unsqueeze | 0.1143ms | 61.5723μs | 16.2411 KOps/s | 15.8346 KOps/s | |
test_split | 0.1586ms | 0.1010ms | 9.9031 KOps/s | 9.7364 KOps/s | |
test_permute | 0.2050ms | 0.1278ms | 7.8234 KOps/s | 7.9535 KOps/s | |
test_stack | 27.4730ms | 27.2812ms | 36.6553 Ops/s | 36.1249 Ops/s | |
test_cat | 27.6724ms | 27.1845ms | 36.7857 Ops/s | 36.3662 Ops/s | |
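
The Change column is empty in both tables as captured here. A minimal sketch of how it could be filled in, assuming it is meant to hold the relative difference between Ops (this PR) and Ops on Repo HEAD. That reading is an assumption, and the helper names `parse_ops` and `change_percent` are hypothetical, not part of the benchmark tooling.

```python
# Multipliers for the throughput units that appear in the tables.
_UNIT = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '59.1569 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNIT[unit]


def change_percent(row: str) -> float:
    """Relative change of Ops versus Ops on Repo HEAD, in percent.

    Assumes the column order used above:
    Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    """
    cells = [c.strip() for c in row.split("|")]
    ops, head = parse_ops(cells[3]), parse_ops(cells[4])
    return (ops - head) / head * 100.0


# One row copied from the first table.
row = "test_plain_set_nested | 39.8840μs | 16.9042μs | 59.1569 KOps/s | 54.9812 KOps/s | |"
print(f"{change_percent(row):+.2f}%")
```

A positive value means the PR branch measured more operations per second than HEAD. Converting units first matters because a row may, in principle, report one side in KOps/s and the other in MOps/s.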
Labels: bug, CLA Signed
No description provided.