[Feature] isfinite, isnan, isreal #829

Merged: vmoens merged 1 commit into main from isfinite-isnan on Jun 24, 2024

Conversation

@vmoens (Contributor) commented Jun 24, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jun 24, 2024
@vmoens merged commit aa2d0bc into main Jun 24, 2024
16 of 30 checks passed
@vmoens added the enhancement label Jun 24, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}4$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 48.6610μs 17.0311μs 58.7161 KOps/s 58.7779 KOps/s $\color{#d91a1a}-0.11\%$
test_plain_set_stack_nested 63.8890μs 16.8716μs 59.2711 KOps/s 58.4651 KOps/s $\color{#35bf28}+1.38\%$
test_plain_set_nested_inplace 52.3380μs 19.2349μs 51.9887 KOps/s 51.3505 KOps/s $\color{#35bf28}+1.24\%$
test_plain_set_stack_nested_inplace 51.7380μs 19.3119μs 51.7816 KOps/s 52.2317 KOps/s $\color{#d91a1a}-0.86\%$
test_items 27.2110μs 2.6813μs 372.9483 KOps/s 381.1554 KOps/s $\color{#d91a1a}-2.15\%$
test_items_nested 0.6451ms 0.2613ms 3.8268 KOps/s 3.8112 KOps/s $\color{#35bf28}+0.41\%$
test_items_nested_locked 1.3287ms 0.2673ms 3.7408 KOps/s 3.7990 KOps/s $\color{#d91a1a}-1.53\%$
test_items_nested_leaf 0.1545ms 75.9850μs 13.1605 KOps/s 12.9573 KOps/s $\color{#35bf28}+1.57\%$
test_items_stack_nested 1.3084ms 0.2637ms 3.7920 KOps/s 3.7243 KOps/s $\color{#35bf28}+1.82\%$
test_items_stack_nested_leaf 0.1765ms 77.6108μs 12.8848 KOps/s 12.8330 KOps/s $\color{#35bf28}+0.40\%$
test_items_stack_nested_locked 0.6481ms 0.2669ms 3.7471 KOps/s 3.7696 KOps/s $\color{#d91a1a}-0.60\%$
test_keys 39.7040μs 3.9978μs 250.1398 KOps/s 261.9994 KOps/s $\color{#d91a1a}-4.53\%$
test_keys_nested 0.2325ms 0.1420ms 7.0439 KOps/s 7.2538 KOps/s $\color{#d91a1a}-2.89\%$
test_keys_nested_locked 0.7745ms 0.1426ms 7.0150 KOps/s 6.9955 KOps/s $\color{#35bf28}+0.28\%$
test_keys_nested_leaf 0.2379ms 0.1166ms 8.5756 KOps/s 8.4556 KOps/s $\color{#35bf28}+1.42\%$
test_keys_stack_nested 0.2804ms 0.1380ms 7.2480 KOps/s 7.2539 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_stack_nested_leaf 0.1963ms 0.1164ms 8.5945 KOps/s 8.5402 KOps/s $\color{#35bf28}+0.64\%$
test_keys_stack_nested_locked 0.2837ms 0.1423ms 7.0257 KOps/s 7.0299 KOps/s $\color{#d91a1a}-0.06\%$
test_values 8.8648μs 1.0969μs 911.6447 KOps/s 853.7171 KOps/s $\textbf{\color{#35bf28}+6.79\%}$
test_values_nested 0.1027ms 50.0029μs 19.9988 KOps/s 19.4923 KOps/s $\color{#35bf28}+2.60\%$
test_values_nested_locked 95.3590μs 49.8475μs 20.0612 KOps/s 19.6988 KOps/s $\color{#35bf28}+1.84\%$
test_values_nested_leaf 0.1051ms 45.4938μs 21.9810 KOps/s 21.5592 KOps/s $\color{#35bf28}+1.96\%$
test_values_stack_nested 0.1220ms 50.2963μs 19.8822 KOps/s 19.3372 KOps/s $\color{#35bf28}+2.82\%$
test_values_stack_nested_leaf 0.1021ms 45.0160μs 22.2143 KOps/s 21.5754 KOps/s $\color{#35bf28}+2.96\%$
test_values_stack_nested_locked 0.1043ms 50.7901μs 19.6889 KOps/s 19.3681 KOps/s $\color{#35bf28}+1.66\%$
test_membership 16.6010μs 1.3002μs 769.1295 KOps/s 747.2082 KOps/s $\color{#35bf28}+2.93\%$
test_membership_nested 49.0820μs 3.3385μs 299.5347 KOps/s 296.0044 KOps/s $\color{#35bf28}+1.19\%$
test_membership_nested_leaf 24.9270μs 3.3608μs 297.5517 KOps/s 296.3995 KOps/s $\color{#35bf28}+0.39\%$
test_membership_stacked_nested 18.0740μs 3.3580μs 297.7945 KOps/s 289.5132 KOps/s $\color{#35bf28}+2.86\%$
test_membership_stacked_nested_leaf 22.6130μs 3.3667μs 297.0241 KOps/s 274.1521 KOps/s $\textbf{\color{#35bf28}+8.34\%}$
test_membership_nested_last 42.6100μs 4.1385μs 241.6336 KOps/s 241.6755 KOps/s $\color{#d91a1a}-0.02\%$
test_membership_nested_leaf_last 21.2900μs 4.1274μs 242.2851 KOps/s 239.6705 KOps/s $\color{#35bf28}+1.09\%$
test_membership_stacked_nested_last 28.1740μs 4.0882μs 244.6076 KOps/s 211.2584 KOps/s $\textbf{\color{#35bf28}+15.79\%}$
test_membership_stacked_nested_leaf_last 52.3590μs 4.1976μs 238.2286 KOps/s 211.3614 KOps/s $\textbf{\color{#35bf28}+12.71\%}$
test_nested_getleaf 51.6870μs 10.3820μs 96.3202 KOps/s 94.5560 KOps/s $\color{#35bf28}+1.87\%$
test_nested_get 50.9660μs 9.8595μs 101.4246 KOps/s 99.2941 KOps/s $\color{#35bf28}+2.15\%$
test_stacked_getleaf 41.0670μs 10.2482μs 97.5780 KOps/s 93.9447 KOps/s $\color{#35bf28}+3.87\%$
test_stacked_get 53.8220μs 9.7955μs 102.0875 KOps/s 99.8142 KOps/s $\color{#35bf28}+2.28\%$
test_nested_getitemleaf 54.0520μs 10.9444μs 91.3713 KOps/s 87.7545 KOps/s $\color{#35bf28}+4.12\%$
test_nested_getitem 31.6190μs 10.0401μs 99.6010 KOps/s 97.4620 KOps/s $\color{#35bf28}+2.19\%$
test_stacked_getitemleaf 54.9840μs 10.8104μs 92.5034 KOps/s 89.4098 KOps/s $\color{#35bf28}+3.46\%$
test_stacked_getitem 46.9680μs 9.9556μs 100.4460 KOps/s 97.7666 KOps/s $\color{#35bf28}+2.74\%$
test_lock_nested 53.5371ms 0.3862ms 2.5892 KOps/s 2.9508 KOps/s $\textbf{\color{#d91a1a}-12.25\%}$
test_lock_stack_nested 0.5314ms 0.3042ms 3.2876 KOps/s 3.2108 KOps/s $\color{#35bf28}+2.39\%$
test_unlock_nested 0.7507ms 0.3398ms 2.9427 KOps/s 2.8693 KOps/s $\color{#35bf28}+2.56\%$
test_unlock_stack_nested 0.5579ms 0.3119ms 3.2059 KOps/s 3.1179 KOps/s $\color{#35bf28}+2.82\%$
test_flatten_speed 0.2150ms 93.9011μs 10.6495 KOps/s 10.4415 KOps/s $\color{#35bf28}+1.99\%$
test_unflatten_speed 0.7543ms 0.4090ms 2.4447 KOps/s 2.4529 KOps/s $\color{#d91a1a}-0.33\%$
test_common_ops 3.3730ms 0.7037ms 1.4211 KOps/s 1.3549 KOps/s $\color{#35bf28}+4.89\%$
test_creation 75.5420μs 1.8684μs 535.2221 KOps/s 526.1483 KOps/s $\color{#35bf28}+1.72\%$
test_creation_empty 35.9980μs 10.3560μs 96.5627 KOps/s 91.6479 KOps/s $\textbf{\color{#35bf28}+5.36\%}$
test_creation_nested_1 58.4800μs 13.1540μs 76.0224 KOps/s 74.0811 KOps/s $\color{#35bf28}+2.62\%$
test_creation_nested_2 69.8110μs 16.2326μs 61.6043 KOps/s 60.1861 KOps/s $\color{#35bf28}+2.36\%$
test_clone 62.3680μs 12.7898μs 78.1872 KOps/s 72.9238 KOps/s $\textbf{\color{#35bf28}+7.22\%}$
test_getitem[int] 46.2070μs 11.1799μs 89.4463 KOps/s 86.5366 KOps/s $\color{#35bf28}+3.36\%$
test_getitem[slice_int] 0.1253ms 23.1054μs 43.2800 KOps/s 42.9992 KOps/s $\color{#35bf28}+0.65\%$
test_getitem[range] 79.6300μs 59.4320μs 16.8260 KOps/s 14.6193 KOps/s $\textbf{\color{#35bf28}+15.09\%}$
test_getitem[tuple] 74.3900μs 18.6165μs 53.7158 KOps/s 51.9231 KOps/s $\color{#35bf28}+3.45\%$
test_getitem[list] 0.1754ms 43.4475μs 23.0163 KOps/s 23.8338 KOps/s $\color{#d91a1a}-3.43\%$
test_setitem_dim[int] 72.7770μs 35.6559μs 28.0459 KOps/s 26.9667 KOps/s $\color{#35bf28}+4.00\%$
test_setitem_dim[slice_int] 0.1210ms 61.8902μs 16.1576 KOps/s 15.2962 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_setitem_dim[range] 0.1439ms 86.5126μs 11.5590 KOps/s 11.6808 KOps/s $\color{#d91a1a}-1.04\%$
test_setitem_dim[tuple] 81.7540μs 50.1620μs 19.9354 KOps/s 18.7416 KOps/s $\textbf{\color{#35bf28}+6.37\%}$
test_setitem 50.9250μs 19.5021μs 51.2765 KOps/s 48.7086 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_set 85.1600μs 19.0165μs 52.5858 KOps/s 49.8422 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_set_shared 4.2664ms 0.1423ms 7.0266 KOps/s 6.8636 KOps/s $\color{#35bf28}+2.37\%$
test_update 0.1046ms 21.6956μs 46.0923 KOps/s 43.5196 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_update_nested 71.6140μs 31.0756μs 32.1796 KOps/s 31.6844 KOps/s $\color{#35bf28}+1.56\%$
test_update__nested 73.8190μs 24.6510μs 40.5663 KOps/s 39.6903 KOps/s $\color{#35bf28}+2.21\%$
test_set_nested 0.1006ms 20.8010μs 48.0746 KOps/s 45.7102 KOps/s $\textbf{\color{#35bf28}+5.17\%}$
test_set_nested_new 0.1144ms 25.3581μs 39.4351 KOps/s 37.9260 KOps/s $\color{#35bf28}+3.98\%$
test_select 0.1103ms 39.2029μs 25.5083 KOps/s 24.1212 KOps/s $\textbf{\color{#35bf28}+5.75\%}$
test_select_nested 0.1344ms 59.5135μs 16.8029 KOps/s 16.6949 KOps/s $\color{#35bf28}+0.65\%$
test_exclude_nested 0.1722ms 0.1178ms 8.4893 KOps/s 8.3227 KOps/s $\color{#35bf28}+2.00\%$
test_empty[True] 0.5280ms 0.3943ms 2.5363 KOps/s 2.5604 KOps/s $\color{#d91a1a}-0.94\%$
test_empty[False] 9.3296μs 1.1377μs 878.9846 KOps/s 866.4650 KOps/s $\color{#35bf28}+1.44\%$
test_unbind_speed 1.4896ms 0.2514ms 3.9783 KOps/s 3.9319 KOps/s $\color{#35bf28}+1.18\%$
test_unbind_speed_stack0 0.5250ms 0.2467ms 4.0533 KOps/s 3.9284 KOps/s $\color{#35bf28}+3.18\%$
test_unbind_speed_stack1 65.9413ms 0.7173ms 1.3941 KOps/s 1.3690 KOps/s $\color{#35bf28}+1.83\%$
test_split 67.3989ms 1.5683ms 637.6387 Ops/s 628.1218 Ops/s $\color{#35bf28}+1.52\%$
test_chunk 67.7845ms 1.5684ms 637.5819 Ops/s 628.7403 Ops/s $\color{#35bf28}+1.41\%$
test_creation[device0] 0.1639ms 83.4716μs 11.9801 KOps/s 12.0420 KOps/s $\color{#d91a1a}-0.51\%$
test_creation_from_tensor 0.1702ms 83.8968μs 11.9194 KOps/s 11.5215 KOps/s $\color{#35bf28}+3.45\%$
test_add_one[memmap_tensor0] 68.3090μs 5.4457μs 183.6319 KOps/s 170.6813 KOps/s $\textbf{\color{#35bf28}+7.59\%}$
test_contiguous[memmap_tensor0] 8.8570μs 0.6589μs 1.5176 MOps/s 1.5891 MOps/s $\color{#d91a1a}-4.50\%$
test_stack[memmap_tensor0] 22.1720μs 3.5729μs 279.8821 KOps/s 270.9593 KOps/s $\color{#35bf28}+3.29\%$
test_memmaptd_index 0.8978ms 0.2489ms 4.0176 KOps/s 3.9094 KOps/s $\color{#35bf28}+2.77\%$
test_memmaptd_index_astensor 0.7133ms 0.3221ms 3.1049 KOps/s 3.0138 KOps/s $\color{#35bf28}+3.02\%$
test_memmaptd_index_op 1.2366ms 0.6010ms 1.6639 KOps/s 1.5996 KOps/s $\color{#35bf28}+4.02\%$
test_serialize_model 0.1787s 0.1163s 8.6020 Ops/s 8.3653 Ops/s $\color{#35bf28}+2.83\%$
test_serialize_model_pickle 0.4591s 0.3756s 2.6623 Ops/s 2.6299 Ops/s $\color{#35bf28}+1.23\%$
test_serialize_weights 0.1805s 0.1136s 8.8060 Ops/s 9.2686 Ops/s $\color{#d91a1a}-4.99\%$
test_serialize_weights_returnearly 0.1890s 0.1366s 7.3196 Ops/s 7.1652 Ops/s $\color{#35bf28}+2.16\%$
test_serialize_weights_pickle 0.7540s 0.5083s 1.9674 Ops/s 2.5043 Ops/s $\textbf{\color{#d91a1a}-21.44\%}$
test_serialize_weights_filesystem 0.1661s 0.1037s 9.6450 Ops/s 10.3347 Ops/s $\textbf{\color{#d91a1a}-6.67\%}$
test_serialize_model_filesystem 0.1015s 95.4289ms 10.4790 Ops/s 9.3008 Ops/s $\textbf{\color{#35bf28}+12.67\%}$
test_reshape_pytree 64.7220μs 25.6917μs 38.9230 KOps/s 39.1068 KOps/s $\color{#d91a1a}-0.47\%$
test_reshape_td 74.7310μs 33.9057μs 29.4936 KOps/s 29.1652 KOps/s $\color{#35bf28}+1.13\%$
test_view_pytree 66.5460μs 26.0595μs 38.3737 KOps/s 38.9884 KOps/s $\color{#d91a1a}-1.58\%$
test_view_td 81.1320μs 38.7062μs 25.8357 KOps/s 26.1625 KOps/s $\color{#d91a1a}-1.25\%$
test_unbind_pytree 62.7180μs 29.0960μs 34.3689 KOps/s 33.6092 KOps/s $\color{#35bf28}+2.26\%$
test_unbind_td 0.3728ms 36.9646μs 27.0529 KOps/s 26.5094 KOps/s $\color{#35bf28}+2.05\%$
test_split_pytree 72.0660μs 29.5704μs 33.8176 KOps/s 34.1338 KOps/s $\color{#d91a1a}-0.93\%$
test_split_td 0.4814ms 40.4453μs 24.7248 KOps/s 24.8920 KOps/s $\color{#d91a1a}-0.67\%$
test_add_pytree 96.5610μs 34.9679μs 28.5977 KOps/s 27.9370 KOps/s $\color{#35bf28}+2.37\%$
test_add_td 0.1036ms 56.1724μs 17.8023 KOps/s 15.3725 KOps/s $\textbf{\color{#35bf28}+15.81\%}$
test_distributed 0.1880ms 99.4981μs 10.0504 KOps/s 9.6278 KOps/s $\color{#35bf28}+4.39\%$
test_tdmodule 64.1110μs 17.9250μs 55.7879 KOps/s 56.1525 KOps/s $\color{#d91a1a}-0.65\%$
test_tdmodule_dispatch 63.1990μs 34.6276μs 28.8787 KOps/s 28.0085 KOps/s $\color{#35bf28}+3.11\%$
test_tdseq 37.4810μs 20.5198μs 48.7333 KOps/s 47.1379 KOps/s $\color{#35bf28}+3.38\%$
test_tdseq_dispatch 62.8080μs 40.0863μs 24.9462 KOps/s 24.4114 KOps/s $\color{#35bf28}+2.19\%$
test_instantiation_functorch 1.9846ms 1.2939ms 772.8511 Ops/s 751.9698 Ops/s $\color{#35bf28}+2.78\%$
test_instantiation_td 1.8913ms 1.0095ms 990.5604 Ops/s 981.2611 Ops/s $\color{#35bf28}+0.95\%$
test_exec_functorch 0.2317ms 0.1603ms 6.2369 KOps/s 6.1197 KOps/s $\color{#35bf28}+1.91\%$
test_exec_functional_call 0.3007ms 0.1529ms 6.5409 KOps/s 6.5507 KOps/s $\color{#d91a1a}-0.15\%$
test_exec_td 0.2514ms 0.1457ms 6.8616 KOps/s 6.7673 KOps/s $\color{#35bf28}+1.39\%$
test_exec_td_decorator 0.5280ms 0.2236ms 4.4724 KOps/s 4.2357 KOps/s $\textbf{\color{#35bf28}+5.59\%}$
test_vmap_mlp_speed[True-True] 0.6566ms 0.4814ms 2.0772 KOps/s 1.9817 KOps/s $\color{#35bf28}+4.82\%$
test_vmap_mlp_speed[True-False] 0.7553ms 0.4804ms 2.0817 KOps/s 2.0096 KOps/s $\color{#35bf28}+3.59\%$
test_vmap_mlp_speed[False-True] 0.7066ms 0.3912ms 2.5560 KOps/s 2.4667 KOps/s $\color{#35bf28}+3.62\%$
test_vmap_mlp_speed[False-False] 0.7150ms 0.3913ms 2.5559 KOps/s 2.4649 KOps/s $\color{#35bf28}+3.69\%$
test_vmap_mlp_speed_decorator[True-True] 1.1784ms 0.5518ms 1.8124 KOps/s 1.7543 KOps/s $\color{#35bf28}+3.31\%$
test_vmap_mlp_speed_decorator[True-False] 0.7559ms 0.5499ms 1.8184 KOps/s 1.7552 KOps/s $\color{#35bf28}+3.60\%$
test_vmap_mlp_speed_decorator[False-True] 0.9327ms 0.4681ms 2.1364 KOps/s 2.1415 KOps/s $\color{#d91a1a}-0.24\%$
test_vmap_mlp_speed_decorator[False-False] 0.7219ms 0.4519ms 2.2128 KOps/s 2.1420 KOps/s $\color{#35bf28}+3.30\%$
test_to_module_speed[True] 1.7845ms 1.6837ms 593.9285 Ops/s 591.5736 Ops/s $\color{#35bf28}+0.40\%$
test_to_module_speed[False] 73.4108ms 1.8065ms 553.5414 Ops/s 598.2313 Ops/s $\textbf{\color{#d91a1a}-7.47\%}$
test_tc_init 56.3260μs 28.9522μs 34.5396 KOps/s 33.3328 KOps/s $\color{#35bf28}+3.62\%$
test_tc_init_nested 0.2950ms 61.9870μs 16.1324 KOps/s 16.3872 KOps/s $\color{#d91a1a}-1.55\%$
test_tc_first_layer_tensor 4.4238μs 0.6547μs 1.5274 MOps/s 1.3886 MOps/s $\textbf{\color{#35bf28}+9.99\%}$
test_tc_first_layer_nontensor 1.8425μs 0.6571μs 1.5219 MOps/s 1.4058 MOps/s $\textbf{\color{#35bf28}+8.26\%}$
test_tc_second_layer_tensor 15.9190μs 1.7999μs 555.5817 KOps/s 536.2918 KOps/s $\color{#35bf28}+3.60\%$
test_tc_second_layer_nontensor 9.5390μs 1.4950μs 668.8794 KOps/s 591.0397 KOps/s $\textbf{\color{#35bf28}+13.17\%}$
test_unbind 81.9796ms 6.4618ms 154.7559 Ops/s 119.1458 Ops/s $\textbf{\color{#35bf28}+29.89\%}$
test_full_like 16.7082ms 11.1019ms 90.0750 Ops/s 81.3076 Ops/s $\textbf{\color{#35bf28}+10.78\%}$
test_zeros_like 11.8589ms 6.1032ms 163.8482 Ops/s 169.4897 Ops/s $\color{#d91a1a}-3.33\%$
test_ones_like 14.7536ms 6.3519ms 157.4344 Ops/s 153.9457 Ops/s $\color{#35bf28}+2.27\%$
test_clone 14.9932ms 8.1656ms 122.4644 Ops/s 119.1169 Ops/s $\color{#35bf28}+2.81\%$
test_squeeze 62.4580μs 13.7181μs 72.8963 KOps/s 72.0764 KOps/s $\color{#35bf28}+1.14\%$
test_unsqueeze 0.1056ms 58.5993μs 17.0650 KOps/s 16.7967 KOps/s $\color{#35bf28}+1.60\%$
test_split 0.1818ms 0.1112ms 8.9905 KOps/s 8.7989 KOps/s $\color{#35bf28}+2.18\%$
test_permute 0.2673ms 0.1251ms 7.9921 KOps/s 7.8419 KOps/s $\color{#35bf28}+1.92\%$
test_stack 31.6435ms 24.9571ms 40.0688 Ops/s 41.4781 Ops/s $\color{#d91a1a}-3.40\%$
test_cat 32.5874ms 23.9787ms 41.7037 Ops/s 41.9609 Ops/s $\color{#d91a1a}-0.61\%$
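
For readers checking the table, the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. The sketch below is a reading of the numbers and not the benchmark job's actual code: the function name `relative_change` is hypothetical, and the three rows are copied from the CPU table above (all in KOps/s).

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change in throughput of the PR run relative to the HEAD run."""
    return (ops_pr / ops_head - 1.0) * 100.0


# (name, Ops, Ops on Repo HEAD) copied from the CPU table, in KOps/s.
rows = [
    ("test_plain_set_nested", 58.7161, 58.7779),
    ("test_membership_stacked_nested_last", 244.6076, 211.2584),
    ("test_lock_nested", 2.5892, 2.9508),
]

for name, ops_pr, ops_head in rows:
    # Format with an explicit sign and two decimals, matching the table.
    print(f"{name}: {relative_change(ops_pr, ops_head):+.2f}%")
```

For these rows the formula gives -0.11%, +15.79% and -12.25%, which matches the reported values. Because the table only shows rounded throughputs, other rows may differ from the reported change in the last digit.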


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}26$.

Expand to view detailed results
Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 89.1560μs 13.6089μs 73.4814 KOps/s 81.0840 KOps/s $\textbf{\color{#d91a1a}-9.38\%}$
test_plain_set_stack_nested 29.6820μs 13.7858μs 72.5382 KOps/s 79.5744 KOps/s $\textbf{\color{#d91a1a}-8.84\%}$
test_plain_set_nested_inplace 38.0320μs 14.9142μs 67.0501 KOps/s 72.8772 KOps/s $\textbf{\color{#d91a1a}-8.00\%}$
test_plain_set_stack_nested_inplace 34.5320μs 15.1664μs 65.9353 KOps/s 72.2722 KOps/s $\textbf{\color{#d91a1a}-8.77\%}$
test_items 20.1010μs 4.6904μs 213.2031 KOps/s 209.1590 KOps/s $\color{#35bf28}+1.93\%$
test_items_nested 0.3849ms 0.3321ms 3.0110 KOps/s 2.9956 KOps/s $\color{#35bf28}+0.51\%$
test_items_nested_locked 0.3684ms 0.3324ms 3.0082 KOps/s 2.9747 KOps/s $\color{#35bf28}+1.13\%$
test_items_nested_leaf 0.1044ms 82.6199μs 12.1036 KOps/s 12.1354 KOps/s $\color{#d91a1a}-0.26\%$
test_items_stack_nested 0.3704ms 0.3348ms 2.9867 KOps/s 2.9540 KOps/s $\color{#35bf28}+1.11\%$
test_items_stack_nested_leaf 0.1014ms 84.6548μs 11.8127 KOps/s 12.1003 KOps/s $\color{#d91a1a}-2.38\%$
test_items_stack_nested_locked 0.3796ms 0.3346ms 2.9888 KOps/s 2.9894 KOps/s $\color{#d91a1a}-0.02\%$
test_keys 18.9510μs 4.3819μs 228.2132 KOps/s 229.8667 KOps/s $\color{#d91a1a}-0.72\%$
test_keys_nested 90.0160μs 66.4307μs 15.0533 KOps/s 14.9295 KOps/s $\color{#35bf28}+0.83\%$
test_keys_nested_locked 2.2581ms 71.5098μs 13.9841 KOps/s 13.9121 KOps/s $\color{#35bf28}+0.52\%$
test_keys_nested_leaf 76.9540μs 57.1897μs 17.4857 KOps/s 17.4348 KOps/s $\color{#35bf28}+0.29\%$
test_keys_stack_nested 93.9650μs 66.4606μs 15.0465 KOps/s 15.0196 KOps/s $\color{#35bf28}+0.18\%$
test_keys_stack_nested_leaf 85.4750μs 57.1231μs 17.5060 KOps/s 17.5052 KOps/s $+0.01\%$
test_keys_stack_nested_locked 95.8870μs 71.6914μs 13.9487 KOps/s 14.1072 KOps/s $\color{#d91a1a}-1.12\%$
test_values 7.0373μs 1.8121μs 551.8539 KOps/s 551.4764 KOps/s $\color{#35bf28}+0.07\%$
test_values_nested 52.0340μs 35.6302μs 28.0661 KOps/s 28.5009 KOps/s $\color{#d91a1a}-1.53\%$
test_values_nested_locked 52.5030μs 37.1594μs 26.9111 KOps/s 26.7105 KOps/s $\color{#35bf28}+0.75\%$
test_values_nested_leaf 54.5730μs 31.6593μs 31.5863 KOps/s 31.9009 KOps/s $\color{#d91a1a}-0.99\%$
test_values_stack_nested 54.2530μs 36.1997μs 27.6246 KOps/s 27.8226 KOps/s $\color{#d91a1a}-0.71\%$
test_values_stack_nested_leaf 49.8530μs 32.0999μs 31.1527 KOps/s 31.2806 KOps/s $\color{#d91a1a}-0.41\%$
test_values_stack_nested_locked 59.4240μs 37.8934μs 26.3898 KOps/s 26.3491 KOps/s $\color{#35bf28}+0.15\%$
test_membership 3.7701μs 0.7328μs 1.3646 MOps/s 1.3908 MOps/s $\color{#d91a1a}-1.88\%$
test_membership_nested 18.8910μs 2.5597μs 390.6775 KOps/s 393.6767 KOps/s $\color{#d91a1a}-0.76\%$
test_membership_nested_leaf 18.6110μs 2.5913μs 385.9018 KOps/s 390.3211 KOps/s $\color{#d91a1a}-1.13\%$
test_membership_stacked_nested 19.3320μs 2.5527μs 391.7384 KOps/s 392.4667 KOps/s $\color{#d91a1a}-0.19\%$
test_membership_stacked_nested_leaf 19.7320μs 2.5741μs 388.4871 KOps/s 392.9096 KOps/s $\color{#d91a1a}-1.13\%$
test_membership_nested_last 16.9610μs 3.1136μs 321.1684 KOps/s 323.2986 KOps/s $\color{#d91a1a}-0.66\%$
test_membership_nested_leaf_last 20.4510μs 3.0991μs 322.6746 KOps/s 326.5475 KOps/s $\color{#d91a1a}-1.19\%$
test_membership_stacked_nested_last 21.7110μs 3.1203μs 320.4825 KOps/s 161.8014 KOps/s $\textbf{\color{#35bf28}+98.07\%}$
test_membership_stacked_nested_leaf_last 22.8510μs 3.0862μs 324.0240 KOps/s 161.9297 KOps/s $\textbf{\color{#35bf28}+100.10\%}$
test_nested_getleaf 32.1020μs 8.3645μs 119.5526 KOps/s 118.4097 KOps/s $\color{#35bf28}+0.97\%$
test_nested_get 28.0910μs 7.8670μs 127.1134 KOps/s 127.3764 KOps/s $\color{#d91a1a}-0.21\%$
test_stacked_getleaf 24.0820μs 8.4245μs 118.7019 KOps/s 118.6877 KOps/s $\color{#35bf28}+0.01\%$
test_stacked_get 27.8520μs 7.8792μs 126.9168 KOps/s 126.7126 KOps/s $\color{#35bf28}+0.16\%$
test_nested_getitemleaf 24.2920μs 8.5219μs 117.3442 KOps/s 116.0052 KOps/s $\color{#35bf28}+1.15\%$
test_nested_getitem 23.8710μs 7.9943μs 125.0893 KOps/s 124.3628 KOps/s $\color{#35bf28}+0.58\%$
test_stacked_getitemleaf 31.1520μs 8.6015μs 116.2589 KOps/s 116.1665 KOps/s $\color{#35bf28}+0.08\%$
test_stacked_getitem 23.4320μs 8.0558μs 124.1346 KOps/s 123.9143 KOps/s $\color{#35bf28}+0.18\%$
test_lock_nested 59.9783ms 0.3996ms 2.5026 KOps/s 2.5001 KOps/s $\color{#35bf28}+0.10\%$
test_lock_stack_nested 0.3265ms 0.2949ms 3.3911 KOps/s 3.3805 KOps/s $\color{#35bf28}+0.31\%$
test_unlock_nested 61.4732ms 0.4026ms 2.4837 KOps/s 2.4792 KOps/s $\color{#35bf28}+0.18\%$
test_unlock_stack_nested 0.3369ms 0.3043ms 3.2860 KOps/s 3.2919 KOps/s $\color{#d91a1a}-0.18\%$
test_flatten_speed 0.3246ms 0.1029ms 9.7196 KOps/s 9.8817 KOps/s $\color{#d91a1a}-1.64\%$
test_unflatten_speed 0.3241ms 0.2903ms 3.4442 KOps/s 3.4933 KOps/s $\color{#d91a1a}-1.41\%$
test_common_ops 1.0580ms 0.6071ms 1.6471 KOps/s 1.7331 KOps/s $\color{#d91a1a}-4.96\%$
test_creation 20.8210μs 1.6091μs 621.4464 KOps/s 609.6306 KOps/s $\color{#35bf28}+1.94\%$
test_creation_empty 25.0520μs 10.4234μs 95.9377 KOps/s 128.6695 KOps/s $\textbf{\color{#d91a1a}-25.44\%}$
test_creation_nested_1 32.2920μs 12.2170μs 81.8531 KOps/s 104.6673 KOps/s $\textbf{\color{#d91a1a}-21.80\%}$
test_creation_nested_2 34.5320μs 14.4259μs 69.3196 KOps/s 84.8905 KOps/s $\textbf{\color{#d91a1a}-18.34\%}$
test_clone 88.5760μs 11.5785μs 86.3667 KOps/s 83.7819 KOps/s $\color{#35bf28}+3.09\%$
test_getitem[int] 26.6010μs 10.6253μs 94.1149 KOps/s 93.1921 KOps/s $\color{#35bf28}+0.99\%$
test_getitem[slice_int] 46.1020μs 20.3247μs 49.2011 KOps/s 47.7140 KOps/s $\color{#35bf28}+3.12\%$
test_getitem[range] 62.6740μs 46.6110μs 21.4542 KOps/s 21.6870 KOps/s $\color{#d91a1a}-1.07\%$
test_getitem[tuple] 42.2030μs 18.2221μs 54.8784 KOps/s 53.3428 KOps/s $\color{#35bf28}+2.88\%$
test_getitem[list] 0.1479ms 32.8200μs 30.4693 KOps/s 29.6940 KOps/s $\color{#35bf28}+2.61\%$
test_setitem_dim[int] 47.9930μs 30.9972μs 32.2610 KOps/s 36.0811 KOps/s $\textbf{\color{#d91a1a}-10.59\%}$
test_setitem_dim[slice_int] 70.6140μs 51.6686μs 19.3541 KOps/s 20.5066 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_setitem_dim[range] 86.8850μs 68.0071μs 14.7044 KOps/s 15.1854 KOps/s $\color{#d91a1a}-3.17\%$
test_setitem_dim[tuple] 76.3250μs 45.3148μs 22.0678 KOps/s 23.5600 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_setitem 52.1430μs 17.4536μs 57.2949 KOps/s 60.9935 KOps/s $\textbf{\color{#d91a1a}-6.06\%}$
test_set 45.8030μs 16.7872μs 59.5694 KOps/s 63.7317 KOps/s $\textbf{\color{#d91a1a}-6.53\%}$
test_set_shared 1.5239ms 97.9368μs 10.2107 KOps/s 9.3159 KOps/s $\textbf{\color{#35bf28}+9.60\%}$
test_update 67.7040μs 20.5409μs 48.6834 KOps/s 56.3205 KOps/s $\textbf{\color{#d91a1a}-13.56\%}$
test_update_nested 72.5340μs 25.9737μs 38.5005 KOps/s 43.5279 KOps/s $\textbf{\color{#d91a1a}-11.55\%}$
test_update__nested 52.6530μs 21.9698μs 45.5170 KOps/s 44.9216 KOps/s $\color{#35bf28}+1.33\%$
test_set_nested 62.6150μs 17.8759μs 55.9414 KOps/s 59.0603 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_set_nested_new 59.2940μs 20.9385μs 47.7590 KOps/s 51.1451 KOps/s $\textbf{\color{#d91a1a}-6.62\%}$
test_select 63.5240μs 34.3755μs 29.0905 KOps/s 31.3760 KOps/s $\textbf{\color{#d91a1a}-7.28\%}$
test_select_nested 84.9350μs 55.2493μs 18.0998 KOps/s 18.1062 KOps/s $\color{#d91a1a}-0.04\%$
test_exclude_nested 0.1525ms 0.1104ms 9.0611 KOps/s 9.1904 KOps/s $\color{#d91a1a}-1.41\%$
test_empty[True] 0.3654ms 0.3431ms 2.9147 KOps/s 2.9279 KOps/s $\color{#d91a1a}-0.45\%$
test_empty[False] 2.4432μs 0.9183μs 1.0890 MOps/s 1.0691 MOps/s $\color{#35bf28}+1.87\%$
test_to 0.1013ms 77.5726μs 12.8912 KOps/s 13.2435 KOps/s $\color{#d91a1a}-2.66\%$
test_to_nonblocking 89.0350μs 61.0740μs 16.3736 KOps/s 15.6664 KOps/s $\color{#35bf28}+4.51\%$
test_unbind_speed 1.7655ms 0.2601ms 3.8444 KOps/s 3.8280 KOps/s $\color{#35bf28}+0.43\%$
test_unbind_speed_stack0 0.3114ms 0.2613ms 3.8277 KOps/s 3.8204 KOps/s $\color{#35bf28}+0.19\%$
test_unbind_speed_stack1 76.7897ms 0.8032ms 1.2451 KOps/s 1.1690 KOps/s $\textbf{\color{#35bf28}+6.51\%}$
test_split 77.2836ms 1.6633ms 601.2022 Ops/s 632.4772 Ops/s $\color{#d91a1a}-4.94\%$
test_chunk 77.7161ms 1.6568ms 603.5892 Ops/s 588.4782 Ops/s $\color{#35bf28}+2.57\%$
test_creation[device0] 0.1298ms 57.5175μs 17.3860 KOps/s 17.2701 KOps/s $\color{#35bf28}+0.67\%$
test_creation_from_tensor 0.1309ms 54.1210μs 18.4771 KOps/s 18.3370 KOps/s $\color{#35bf28}+0.76\%$
test_add_one[memmap_tensor0] 57.5930μs 6.9860μs 143.1439 KOps/s 146.4005 KOps/s $\color{#d91a1a}-2.22\%$
test_contiguous[memmap_tensor0] 13.4700μs 0.6628μs 1.5087 MOps/s 1.4984 MOps/s $\color{#35bf28}+0.69\%$
test_stack[memmap_tensor0] 43.8930μs 4.7298μs 211.4260 KOps/s 210.8152 KOps/s $\color{#35bf28}+0.29\%$
test_memmaptd_index 1.1302ms 0.2868ms 3.4868 KOps/s 3.5046 KOps/s $\color{#d91a1a}-0.51\%$
test_memmaptd_index_astensor 0.6582ms 0.3555ms 2.8126 KOps/s 2.8368 KOps/s $\color{#d91a1a}-0.85\%$
test_memmaptd_index_op 1.2737ms 0.6850ms 1.4600 KOps/s 1.5561 KOps/s $\textbf{\color{#d91a1a}-6.18\%}$
test_serialize_model 0.1867s 0.1110s 9.0092 Ops/s 8.5587 Ops/s $\textbf{\color{#35bf28}+5.26\%}$
test_serialize_model_pickle 1.3547s 1.2368s 0.8085 Ops/s 0.8082 Ops/s $\color{#35bf28}+0.03\%$
test_serialize_weights 0.1834s 0.1087s 9.1987 Ops/s 8.7617 Ops/s $\color{#35bf28}+4.99\%$
test_serialize_weights_returnearly 0.3143s 0.1063s 9.4097 Ops/s 10.6608 Ops/s $\textbf{\color{#d91a1a}-11.74\%}$
test_serialize_weights_pickle 1.3512s 1.2483s 0.8011 Ops/s 0.8010 Ops/s $+0.01\%$
test_reshape_pytree 58.3630μs 26.3559μs 37.9422 KOps/s 39.1253 KOps/s $\color{#d91a1a}-3.02\%$
test_reshape_td 0.2468ms 31.5457μs 31.7001 KOps/s 32.6892 KOps/s $\color{#d91a1a}-3.03\%$
test_view_pytree 50.1530μs 25.7653μs 38.8119 KOps/s 39.5696 KOps/s $\color{#d91a1a}-1.91\%$
test_view_td 0.2477ms 35.7918μs 27.9394 KOps/s 28.7575 KOps/s $\color{#d91a1a}-2.84\%$
test_unbind_pytree 60.9940μs 31.7452μs 31.5008 KOps/s 31.3762 KOps/s $\color{#35bf28}+0.40\%$
test_unbind_td 0.4734ms 41.0735μs 24.3466 KOps/s 25.0926 KOps/s $\color{#d91a1a}-2.97\%$
test_split_pytree 0.1750ms 35.4407μs 28.2161 KOps/s 30.0618 KOps/s $\textbf{\color{#d91a1a}-6.14\%}$
test_split_td 0.2533ms 40.3395μs 24.7896 KOps/s 25.4928 KOps/s $\color{#d91a1a}-2.76\%$
test_add_pytree 66.3140μs 37.5061μs 26.6623 KOps/s 27.0342 KOps/s $\color{#d91a1a}-1.38\%$
test_add_td 0.2690ms 52.9074μs 18.9010 KOps/s 21.5958 KOps/s $\textbf{\color{#d91a1a}-12.48\%}$
test_distributed 2.7256ms 77.4117μs 12.9179 KOps/s 11.1184 KOps/s $\textbf{\color{#35bf28}+16.19\%}$
test_tdmodule 30.5430μs 15.4661μs 64.6575 KOps/s 67.2938 KOps/s $\color{#d91a1a}-3.92\%$
test_tdmodule_dispatch 61.9640μs 30.6974μs 32.5761 KOps/s 34.4364 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_tdseq 34.1820μs 17.6292μs 56.7242 KOps/s 61.7728 KOps/s $\textbf{\color{#d91a1a}-8.17\%}$
test_tdseq_dispatch 54.1430μs 34.4202μs 29.0527 KOps/s 32.1112 KOps/s $\textbf{\color{#d91a1a}-9.52\%}$
test_instantiation_functorch 1.6070ms 1.5192ms 658.2244 Ops/s 655.4882 Ops/s $\color{#35bf28}+0.42\%$
test_instantiation_td 1.5687ms 1.0457ms 956.3243 Ops/s 883.1898 Ops/s $\textbf{\color{#35bf28}+8.28\%}$
test_exec_functorch 0.1833ms 0.1503ms 6.6525 KOps/s 6.4655 KOps/s $\color{#35bf28}+2.89\%$
test_exec_functional_call 0.1841ms 0.1416ms 7.0602 KOps/s 6.9080 KOps/s $\color{#35bf28}+2.20\%$
test_exec_td 0.1718ms 0.1403ms 7.1256 KOps/s 6.9707 KOps/s $\color{#35bf28}+2.22\%$
test_exec_td_decorator 0.6882ms 0.2110ms 4.7404 KOps/s 4.6859 KOps/s $\color{#35bf28}+1.16\%$
test_vmap_mlp_speed[True-True] 0.8113ms 0.5837ms 1.7132 KOps/s 1.5230 KOps/s $\textbf{\color{#35bf28}+12.49\%}$
test_vmap_mlp_speed[True-False] 0.6229ms 0.5806ms 1.7224 KOps/s 1.6681 KOps/s $\color{#35bf28}+3.26\%$
test_vmap_mlp_speed[False-True] 0.5575ms 0.5090ms 1.9646 KOps/s 1.8965 KOps/s $\color{#35bf28}+3.59\%$
test_vmap_mlp_speed[False-False] 0.5567ms 0.5083ms 1.9672 KOps/s 1.9315 KOps/s $\color{#35bf28}+1.85\%$
test_vmap_mlp_speed_decorator[True-True] 0.7642ms 0.6518ms 1.5342 KOps/s 1.5327 KOps/s $\color{#35bf28}+0.10\%$
test_vmap_mlp_speed_decorator[True-False] 0.8095ms 0.6512ms 1.5356 KOps/s 1.5361 KOps/s $\color{#d91a1a}-0.03\%$
test_vmap_mlp_speed_decorator[False-True] 0.6980ms 0.5727ms 1.7462 KOps/s 1.7303 KOps/s $\color{#35bf28}+0.91\%$
test_vmap_mlp_speed_decorator[False-False] 0.6973ms 0.5723ms 1.7474 KOps/s 1.7021 KOps/s $\color{#35bf28}+2.66\%$
test_vmap_transformer_speed[True-True] 7.8435ms 7.6326ms 131.0178 Ops/s 128.1716 Ops/s $\color{#35bf28}+2.22\%$
test_vmap_transformer_speed[True-False] 8.1346ms 7.6191ms 131.2489 Ops/s 128.6306 Ops/s $\color{#35bf28}+2.04\%$
test_vmap_transformer_speed[False-True] 7.7792ms 7.5633ms 132.2181 Ops/s 129.2206 Ops/s $\color{#35bf28}+2.32\%$
test_vmap_transformer_speed[False-False] 7.7024ms 7.5321ms 132.7645 Ops/s 130.3366 Ops/s $\color{#35bf28}+1.86\%$
test_vmap_transformer_speed_decorator[True-True] 19.4242ms 18.5984ms 53.7680 Ops/s 53.0899 Ops/s $\color{#35bf28}+1.28\%$
test_vmap_transformer_speed_decorator[True-False] 18.8239ms 18.5368ms 53.9468 Ops/s 53.1150 Ops/s $\color{#35bf28}+1.57\%$
test_vmap_transformer_speed_decorator[False-True] 18.7148ms 18.4298ms 54.2600 Ops/s 53.4323 Ops/s $\color{#35bf28}+1.55\%$
test_vmap_transformer_speed_decorator[False-False] 18.7723ms 18.4264ms 54.2701 Ops/s 53.3780 Ops/s $\color{#35bf28}+1.67\%$
test_to_module_speed[True] 2.9169ms 1.5416ms 648.6960 Ops/s 651.3503 Ops/s $\color{#d91a1a}-0.41\%$
test_to_module_speed[False] 1.9867ms 1.5060ms 664.0155 Ops/s 656.0729 Ops/s $\color{#35bf28}+1.21\%$
test_tc_init 53.7530μs 29.2225μs 34.2203 KOps/s 43.7067 KOps/s $\textbf{\color{#d91a1a}-21.70\%}$
test_tc_init_nested 90.9250μs 63.5446μs 15.7370 KOps/s 21.4004 KOps/s $\textbf{\color{#d91a1a}-26.46\%}$
test_tc_first_layer_tensor 0.8690μs 0.3561μs 2.8083 MOps/s 2.7526 MOps/s $\color{#35bf28}+2.03\%$
test_tc_first_layer_nontensor 1.8502μs 0.3857μs 2.5928 MOps/s 2.5157 MOps/s $\color{#35bf28}+3.07\%$
test_tc_second_layer_tensor 4.5204μs 0.9635μs 1.0379 MOps/s 1.0198 MOps/s $\color{#35bf28}+1.78\%$
test_tc_second_layer_nontensor 1.6031μs 0.7931μs 1.2609 MOps/s 1.1951 MOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_unbind 0.1122s 6.7419ms 148.3266 Ops/s 121.4644 Ops/s $\textbf{\color{#35bf28}+22.12\%}$
test_full_like 12.1184ms 11.5033ms 86.9317 Ops/s 74.7096 Ops/s $\textbf{\color{#35bf28}+16.36\%}$
test_zeros_like 8.1063ms 7.8404ms 127.5441 Ops/s 128.7999 Ops/s $\color{#d91a1a}-0.97\%$
test_ones_like 8.1181ms 7.8720ms 127.0330 Ops/s 128.0808 Ops/s $\color{#d91a1a}-0.82\%$
test_clone 9.9149ms 9.6315ms 103.8263 Ops/s 103.7775 Ops/s $\color{#35bf28}+0.05\%$
test_squeeze 60.9440μs 10.9487μs 91.3351 KOps/s 88.8049 KOps/s $\color{#35bf28}+2.85\%$
test_unsqueeze 94.2970μs 51.2008μs 19.5309 KOps/s 19.7687 KOps/s $\color{#d91a1a}-1.20\%$
test_split 0.1540ms 95.9064μs 10.4268 KOps/s 10.2227 KOps/s $\color{#35bf28}+2.00\%$
test_permute 0.1817ms 0.1100ms 9.0888 KOps/s 9.2606 KOps/s $\color{#d91a1a}-1.85\%$
test_stack 28.7988ms 28.0356ms 35.6690 Ops/s 35.7285 Ops/s $\color{#d91a1a}-0.17\%$
test_cat 28.9295ms 27.9367ms 35.7953 Ops/s 35.9579 Ops/s $\color{#d91a1a}-0.45\%$
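
The Improved and Worsened totals in both reports match the number of rows whose change is shown in bold, and every bold row differs from HEAD by more than 5%. The sketch below assumes that a ±5% cutoff is what drives the totals. That cutoff is inferred from the tables and is not a documented setting, and `THRESHOLD_PCT` and `classify` are hypothetical names. The sample rows are copied from the GPU table above.

```python
from collections import Counter

# Assumed cutoff: inferred from which rows are bold, not taken from the CI configuration.
THRESHOLD_PCT = 5.0


def classify(ops_pr: float, ops_head: float, threshold_pct: float = THRESHOLD_PCT) -> str:
    """Label a benchmark by its throughput change relative to the HEAD run."""
    change = (ops_pr / ops_head - 1.0) * 100.0
    if change > threshold_pct:
        return "improved"
    if change < -threshold_pct:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD) copied from the GPU table; units match within each row.
rows = [
    ("test_creation_empty", 95.9377, 128.6695),
    ("test_items", 213.2031, 209.1590),
    ("test_membership_stacked_nested_leaf_last", 324.0240, 161.9297),
    ("test_set_nested", 55.9414, 59.0603),
    ("test_serialize_weights", 9.1987, 8.7617),
]

labels = {name: classify(ops_pr, ops_head) for name, ops_pr, ops_head in rows}
summary = Counter(labels.values())
print(labels)
print(dict(summary))
```

Under this assumption `test_serialize_weights` (+4.99%) is not counted as improved while `test_set_nested` (-5.28%) is counted as worsened, which agrees with the formatting in the table.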

@vmoens deleted the isfinite-isnan branch October 21, 2024 14:02
Labels: CLA Signed, enhancement