[Misc] Move to PyTorch compat #533

Merged (1 commit) on Oct 4, 2023

@vmoens (Contributor) commented Oct 4, 2023

No description provided.

@facebook-github-bot added the "CLA Signed" label Oct 4, 2023
@matteobettini (Contributor) left a comment

Let's do it

github-actions bot commented Oct 4, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 109. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_plain_set_nested | 38.7000μs | 20.1209μs | 49.6997 KOps/s | 49.7980 KOps/s | $\color{#d91a1a}-0.20\%$ |
| test_plain_set_stack_nested | 0.2456ms | 0.1863ms | 5.3672 KOps/s | 5.3736 KOps/s | $\color{#d91a1a}-0.12\%$ |
| test_plain_set_nested_inplace | 46.3000μs | 23.6539μs | 42.2763 KOps/s | 42.0374 KOps/s | $\color{#35bf28}+0.57\%$ |
| test_plain_set_stack_nested_inplace | 0.3015ms | 0.2210ms | 4.5255 KOps/s | 4.5133 KOps/s | $\color{#35bf28}+0.27\%$ |
| test_items | 27.9000μs | 3.5120μs | 284.7393 KOps/s | 284.0587 KOps/s | $\color{#35bf28}+0.24\%$ |
| test_items_nested | 0.4393ms | 0.3593ms | 2.7831 KOps/s | 2.7821 KOps/s | $\color{#35bf28}+0.04\%$ |
| test_items_nested_locked | 0.4441ms | 0.3597ms | 2.7805 KOps/s | 2.7679 KOps/s | $\color{#35bf28}+0.45\%$ |
| test_items_nested_leaf | 1.1367ms | 0.2200ms | 4.5448 KOps/s | 4.5240 KOps/s | $\color{#35bf28}+0.46\%$ |
| test_items_stack_nested | 2.0800ms | 1.9732ms | 506.7827 Ops/s | 499.0962 Ops/s | $\color{#35bf28}+1.54\%$ |
| test_items_stack_nested_leaf | 3.0856ms | 1.8058ms | 553.7844 Ops/s | 552.8438 Ops/s | $\color{#35bf28}+0.17\%$ |
| test_items_stack_nested_locked | 1.0852ms | 0.9729ms | 1.0278 KOps/s | 1.0154 KOps/s | $\color{#35bf28}+1.23\%$ |
| test_keys | 24.5000μs | 5.1450μs | 194.3642 KOps/s | 196.0496 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_keys_nested | 0.8495ms | 0.1815ms | 5.5082 KOps/s | 5.4275 KOps/s | $\color{#35bf28}+1.49\%$ |
| test_keys_nested_locked | 0.2007ms | 0.1796ms | 5.5665 KOps/s | 5.4739 KOps/s | $\color{#35bf28}+1.69\%$ |
| test_keys_nested_leaf | 0.3380ms | 0.1743ms | 5.7357 KOps/s | 5.3428 KOps/s | $\textbf{\color{#35bf28}+7.35\%}$ |
| test_keys_stack_nested | 1.9427ms | 1.8267ms | 547.4472 Ops/s | 549.3894 Ops/s | $\color{#d91a1a}-0.35\%$ |
| test_keys_stack_nested_leaf | 1.8999ms | 1.8155ms | 550.8054 Ops/s | 544.1661 Ops/s | $\color{#35bf28}+1.22\%$ |
| test_keys_stack_nested_locked | 0.9108ms | 0.8101ms | 1.2344 KOps/s | 1.2040 KOps/s | $\color{#35bf28}+2.52\%$ |
| test_values | 16.9010μs | 1.5614μs | 640.4584 KOps/s | 645.7839 KOps/s | $\color{#d91a1a}-0.82\%$ |
| test_values_nested | 0.1441ms | 66.8391μs | 14.9613 KOps/s | 14.8298 KOps/s | $\color{#35bf28}+0.89\%$ |
| test_values_nested_locked | 0.1267ms | 67.4459μs | 14.8267 KOps/s | 14.8403 KOps/s | $\color{#d91a1a}-0.09\%$ |
| test_values_nested_leaf | 90.5020μs | 59.3122μs | 16.8599 KOps/s | 16.9616 KOps/s | $\color{#d91a1a}-0.60\%$ |
| test_values_stack_nested | 1.8046ms | 1.5977ms | 625.9025 Ops/s | 627.4367 Ops/s | $\color{#d91a1a}-0.24\%$ |
| test_values_stack_nested_leaf | 1.6612ms | 1.5812ms | 632.4488 Ops/s | 631.4076 Ops/s | $\color{#35bf28}+0.16\%$ |
| test_values_stack_nested_locked | 0.7344ms | 0.6373ms | 1.5690 KOps/s | 1.5281 KOps/s | $\color{#35bf28}+2.68\%$ |
| test_membership | 67.2010μs | 1.8716μs | 534.2892 KOps/s | 514.1434 KOps/s | $\color{#35bf28}+3.92\%$ |
| test_membership_nested | 25.2010μs | 3.6478μs | 274.1348 KOps/s | 271.7659 KOps/s | $\color{#35bf28}+0.87\%$ |
| test_membership_nested_leaf | 25.6010μs | 3.6988μs | 270.3572 KOps/s | 270.3293 KOps/s | $\color{#35bf28}+0.01\%$ |
| test_membership_stacked_nested | 75.8020μs | 14.4871μs | 69.0271 KOps/s | 67.9700 KOps/s | $\color{#35bf28}+1.56\%$ |
| test_membership_stacked_nested_leaf | 39.2000μs | 14.6208μs | 68.3956 KOps/s | 65.9332 KOps/s | $\color{#35bf28}+3.73\%$ |
| test_membership_nested_last | 82.4010μs | 7.4281μs | 134.6237 KOps/s | 130.7049 KOps/s | $\color{#35bf28}+3.00\%$ |
| test_membership_nested_leaf_last | 30.8000μs | 7.4844μs | 133.6105 KOps/s | 132.5495 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_membership_stacked_nested_last | 0.3037ms | 0.2239ms | 4.4664 KOps/s | 4.4288 KOps/s | $\color{#35bf28}+0.85\%$ |
| test_membership_stacked_nested_leaf_last | 84.4020μs | 16.8141μs | 59.4740 KOps/s | 57.5974 KOps/s | $\color{#35bf28}+3.26\%$ |
| test_nested_getleaf | 34.5000μs | 15.5854μs | 64.1627 KOps/s | 64.7167 KOps/s | $\color{#d91a1a}-0.86\%$ |
| test_nested_get | 77.1010μs | 14.7972μs | 67.5803 KOps/s | 68.0019 KOps/s | $\color{#d91a1a}-0.62\%$ |
| test_stacked_getleaf | 1.0391ms | 0.9139ms | 1.0942 KOps/s | 1.1460 KOps/s | $\color{#d91a1a}-4.52\%$ |
| test_stacked_get | 0.9515ms | 0.8456ms | 1.1825 KOps/s | 1.2022 KOps/s | $\color{#d91a1a}-1.64\%$ |
| test_nested_getitemleaf | 88.0010μs | 15.5695μs | 64.2283 KOps/s | 64.4951 KOps/s | $\color{#d91a1a}-0.41\%$ |
| test_nested_getitem | 41.6010μs | 14.7874μs | 67.6253 KOps/s | 67.9381 KOps/s | $\color{#d91a1a}-0.46\%$ |
| test_stacked_getitemleaf | 1.0063ms | 0.8890ms | 1.1248 KOps/s | 1.1449 KOps/s | $\color{#d91a1a}-1.75\%$ |
| test_stacked_getitem | 1.3423ms | 0.8527ms | 1.1728 KOps/s | 1.1857 KOps/s | $\color{#d91a1a}-1.09\%$ |
| test_lock_nested | 64.6933ms | 1.5474ms | 646.2322 Ops/s | 687.1872 Ops/s | $\textbf{\color{#d91a1a}-5.96\%}$ |
| test_lock_stack_nested | 84.0632ms | 19.3632ms | 51.6444 Ops/s | 54.7614 Ops/s | $\textbf{\color{#d91a1a}-5.69\%}$ |
| test_unlock_nested | 65.0110ms | 1.5035ms | 665.1076 Ops/s | 645.0327 Ops/s | $\color{#35bf28}+3.11\%$ |
| test_unlock_stack_nested | 87.8747ms | 19.7777ms | 50.5620 Ops/s | 53.4077 Ops/s | $\textbf{\color{#d91a1a}-5.33\%}$ |
| test_flatten_speed | 1.1124ms | 1.0202ms | 980.1552 Ops/s | 989.7748 Ops/s | $\color{#d91a1a}-0.97\%$ |
| test_unflatten_speed | 2.1868ms | 1.8367ms | 544.4417 Ops/s | 551.4475 Ops/s | $\color{#d91a1a}-1.27\%$ |
| test_common_ops | 5.2210ms | 1.0858ms | 920.9528 Ops/s | 908.6677 Ops/s | $\color{#35bf28}+1.35\%$ |
| test_creation | 23.8010μs | 6.2447μs | 160.1350 KOps/s | 157.1279 KOps/s | $\color{#35bf28}+1.91\%$ |
| test_creation_empty | 98.1020μs | 13.9054μs | 71.9144 KOps/s | 73.2623 KOps/s | $\color{#d91a1a}-1.84\%$ |
| test_creation_nested_1 | 59.2010μs | 25.1146μs | 39.8176 KOps/s | 39.6522 KOps/s | $\color{#35bf28}+0.42\%$ |
| test_creation_nested_2 | 56.4010μs | 27.5743μs | 36.2657 KOps/s | 36.6744 KOps/s | $\color{#d91a1a}-1.11\%$ |
| test_clone | 0.1190ms | 24.1019μs | 41.4905 KOps/s | 40.2419 KOps/s | $\color{#35bf28}+3.10\%$ |
| test_getitem[int] | 51.4010μs | 27.4748μs | 36.3969 KOps/s | 35.4851 KOps/s | $\color{#35bf28}+2.57\%$ |
| test_getitem[slice_int] | 98.3020μs | 54.3731μs | 18.3914 KOps/s | 18.6386 KOps/s | $\color{#d91a1a}-1.33\%$ |
| test_getitem[range] | 0.1431ms | 81.0025μs | 12.3453 KOps/s | 12.3042 KOps/s | $\color{#35bf28}+0.33\%$ |
| test_getitem[tuple] | 0.1124ms | 45.0359μs | 22.2045 KOps/s | 22.1739 KOps/s | $\color{#35bf28}+0.14\%$ |
| test_getitem[list] | 0.2613ms | 76.6085μs | 13.0534 KOps/s | 13.0219 KOps/s | $\color{#35bf28}+0.24\%$ |
| test_setitem_dim[int] | 51.2010μs | 32.5329μs | 30.7381 KOps/s | 30.4676 KOps/s | $\color{#35bf28}+0.89\%$ |
| test_setitem_dim[slice_int] | 88.5010μs | 58.5506μs | 17.0792 KOps/s | 17.3529 KOps/s | $\color{#d91a1a}-1.58\%$ |
| test_setitem_dim[range] | 0.1074ms | 77.7841μs | 12.8561 KOps/s | 12.6959 KOps/s | $\color{#35bf28}+1.26\%$ |
| test_setitem_dim[tuple] | 66.3010μs | 48.0570μs | 20.8086 KOps/s | 20.8834 KOps/s | $\color{#d91a1a}-0.36\%$ |
| test_setitem | 0.1191ms | 32.3602μs | 30.9022 KOps/s | 30.4003 KOps/s | $\color{#35bf28}+1.65\%$ |
| test_set | 0.1236ms | 30.8783μs | 32.3852 KOps/s | 31.5016 KOps/s | $\color{#35bf28}+2.81\%$ |
| test_set_shared | 0.3386ms | 0.1746ms | 5.7258 KOps/s | 5.6253 KOps/s | $\color{#35bf28}+1.79\%$ |
| test_update | 0.1346ms | 35.2658μs | 28.3561 KOps/s | 28.2614 KOps/s | $\color{#35bf28}+0.34\%$ |
| test_update_nested | 0.2283ms | 52.2246μs | 19.1481 KOps/s | 19.1595 KOps/s | $\color{#d91a1a}-0.06\%$ |
| test_set_nested | 0.1173ms | 34.0935μs | 29.3311 KOps/s | 28.6200 KOps/s | $\color{#35bf28}+2.48\%$ |
| test_set_nested_new | 0.1336ms | 53.9838μs | 18.5241 KOps/s | 18.5608 KOps/s | $\color{#d91a1a}-0.20\%$ |
| test_select | 0.2286ms | 96.5973μs | 10.3523 KOps/s | 10.2703 KOps/s | $\color{#35bf28}+0.80\%$ |
| test_unbind_speed | 2.7978ms | 0.6517ms | 1.5344 KOps/s | 1.5291 KOps/s | $\color{#35bf28}+0.35\%$ |
| test_unbind_speed_stack0 | 76.5213ms | 8.4876ms | 117.8187 Ops/s | 111.4966 Ops/s | $\textbf{\color{#35bf28}+5.67\%}$ |
| test_unbind_speed_stack1 | 10.2668μs | 0.9221μs | 1.0844 MOps/s | 1.0814 MOps/s | $\color{#35bf28}+0.28\%$ |
| test_creation[device0] | 0.5074ms | 0.4366ms | 2.2906 KOps/s | 2.2479 KOps/s | $\color{#35bf28}+1.90\%$ |
| test_creation_from_tensor | 2.5720ms | 0.4928ms | 2.0294 KOps/s | 2.0046 KOps/s | $\color{#35bf28}+1.24\%$ |
| test_add_one[memmap_tensor0] | 1.7723ms | 32.1399μs | 31.1140 KOps/s | 30.5371 KOps/s | $\color{#35bf28}+1.89\%$ |
| test_contiguous[memmap_tensor0] | 55.0010μs | 8.7631μs | 114.1154 KOps/s | 112.6069 KOps/s | $\color{#35bf28}+1.34\%$ |
| test_stack[memmap_tensor0] | 69.6020μs | 26.6645μs | 37.5031 KOps/s | 36.8283 KOps/s | $\color{#35bf28}+1.83\%$ |
| test_memmaptd_index | 0.4078ms | 0.3137ms | 3.1875 KOps/s | 3.1209 KOps/s | $\color{#35bf28}+2.13\%$ |
| test_memmaptd_index_astensor | 1.4315ms | 1.3607ms | 734.8982 Ops/s | 732.5633 Ops/s | $\color{#35bf28}+0.32\%$ |
| test_memmaptd_index_op | 2.7055ms | 2.6178ms | 381.9932 Ops/s | 380.6433 Ops/s | $\color{#35bf28}+0.35\%$ |
| test_reshape_pytree | 0.1217ms | 38.1268μs | 26.2283 KOps/s | 26.7993 KOps/s | $\color{#d91a1a}-2.13\%$ |
| test_reshape_td | 98.6020μs | 45.3412μs | 22.0550 KOps/s | 22.2445 KOps/s | $\color{#d91a1a}-0.85\%$ |
| test_view_pytree | 92.0010μs | 35.3557μs | 28.2840 KOps/s | 28.6227 KOps/s | $\color{#d91a1a}-1.18\%$ |
| test_view_td | 72.6010μs | 8.9310μs | 111.9692 KOps/s | 107.2153 KOps/s | $\color{#35bf28}+4.43\%$ |
| test_unbind_pytree | 78.4010μs | 38.7743μs | 25.7903 KOps/s | 25.7845 KOps/s | $\color{#35bf28}+0.02\%$ |
| test_unbind_td | 0.1906ms | 94.2529μs | 10.6098 KOps/s | 10.1641 KOps/s | $\color{#35bf28}+4.38\%$ |
| test_split_pytree | 89.3020μs | 44.8417μs | 22.3006 KOps/s | 22.6763 KOps/s | $\color{#d91a1a}-1.66\%$ |
| test_split_td | 2.8345ms | 0.1145ms | 8.7338 KOps/s | 8.6729 KOps/s | $\color{#35bf28}+0.70\%$ |
| test_add_pytree | 0.1209ms | 48.0348μs | 20.8183 KOps/s | 21.0040 KOps/s | $\color{#d91a1a}-0.88\%$ |
| test_add_td | 0.1056ms | 76.1557μs | 13.1310 KOps/s | 13.2562 KOps/s | $\color{#d91a1a}-0.94\%$ |
| test_distributed | 21.7000μs | 8.7202μs | 114.6760 KOps/s | 115.4931 KOps/s | $\color{#d91a1a}-0.71\%$ |
| test_tdmodule | 0.1904ms | 28.9944μs | 34.4894 KOps/s | 34.3971 KOps/s | $\color{#35bf28}+0.27\%$ |
| test_tdmodule_dispatch | 0.2636ms | 54.5453μs | 18.3334 KOps/s | 17.6506 KOps/s | $\color{#35bf28}+3.87\%$ |
| test_tdseq | 0.4798ms | 32.5165μs | 30.7536 KOps/s | 30.5025 KOps/s | $\color{#35bf28}+0.82\%$ |
| test_tdseq_dispatch | 0.1860ms | 64.6937μs | 15.4574 KOps/s | 14.8325 KOps/s | $\color{#35bf28}+4.21\%$ |
| test_instantiation_functorch | 1.7352ms | 1.6157ms | 618.9405 Ops/s | 601.8014 Ops/s | $\color{#35bf28}+2.85\%$ |
| test_instantiation_td | 1.8603ms | 1.3366ms | 748.1842 Ops/s | 689.0345 Ops/s | $\textbf{\color{#35bf28}+8.58\%}$ |
| test_exec_functorch | 0.2247ms | 0.1867ms | 5.3555 KOps/s | 5.3433 KOps/s | $\color{#35bf28}+0.23\%$ |
| test_exec_td | 0.2381ms | 0.1771ms | 5.6474 KOps/s | 5.5460 KOps/s | $\color{#35bf28}+1.83\%$ |
| test_vmap_mlp_speed[True-True] | 14.8307ms | 1.3547ms | 738.1556 Ops/s | 799.6821 Ops/s | $\textbf{\color{#d91a1a}-7.69\%}$ |
| test_vmap_mlp_speed[True-False] | 5.8516ms | 0.6086ms | 1.6432 KOps/s | 1.6429 KOps/s | $\color{#35bf28}+0.01\%$ |
| test_vmap_mlp_speed[False-True] | 7.7272ms | 0.9908ms | 1.0093 KOps/s | 995.2477 Ops/s | $\color{#35bf28}+1.41\%$ |
| test_vmap_mlp_speed[False-False] | 3.3826ms | 0.4476ms | 2.2341 KOps/s | 2.2089 KOps/s | $\color{#35bf28}+1.14\%$ |
| test_vmap_transformer_speed[True-True] | 15.4386ms | 12.8744ms | 77.6735 Ops/s | 77.2821 Ops/s | $\color{#35bf28}+0.51\%$ |
| test_vmap_transformer_speed[True-False] | 21.4110ms | 8.4613ms | 118.1851 Ops/s | 118.8686 Ops/s | $\color{#d91a1a}-0.58\%$ |
| test_vmap_transformer_speed[False-True] | 66.3503ms | 14.7072ms | 67.9941 Ops/s | 67.5386 Ops/s | $\color{#35bf28}+0.67\%$ |
| test_vmap_transformer_speed[False-False] | 14.1045ms | 8.1559ms | 122.6110 Ops/s | 121.8870 Ops/s | $\color{#35bf28}+0.59\%$ |
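
For readers checking the numbers, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns. The sketch below reproduces a few rows under that reading. The helper names (`parse_ops`, `relative_change`, `classify`) are hypothetical, and the 5% threshold is an assumption inferred from which rows are bolded in the table; none of this is taken from the benchmark bot's own code.

```python
# Multipliers for the throughput units that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '1.0093 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not the bot's setting."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the table above.
rows = {
    "test_plain_set_nested": ("49.6997 KOps/s", "49.7980 KOps/s"),
    "test_lock_nested": ("646.2322 Ops/s", "687.1872 Ops/s"),
    "test_vmap_mlp_speed[False-True]": ("1.0093 KOps/s", "995.2477 Ops/s"),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, the rows whose change exceeds 5% in magnitude match the summary counts of 3 improved and 4 worsened. Note that the two Ops columns can use different units in the same row, so values must be normalized before comparing.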

@vmoens merged commit 302c342 into main on Oct 4, 2023
26 of 27 checks passed
@vmoens deleted the pytorch_move branch on October 4, 2023 at 11:19