[Feature] Registering a tensordict as a module buffer #395

Merged (15 commits) on May 26, 2023

@vmoens (Contributor) commented May 25, 2023

If a tensordict is registered as a module buffer, it must support the calls that the module forwards to its buffers, such as `double`, `float`, etc.
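To illustrate why a buffer needs these methods, here is a minimal pure-Python sketch. It is not the code from this PR and it does not use torch or tensordict: `ToyLeaf`, `ToyTensorDict` and `ToyModule` are hypothetical stand-ins. The sketch assumes only the general pattern that a module's `float()` / `double()` apply a cast function to every registered buffer, so any object stored as a buffer has to answer `is_floating_point()`, `float()` and `double()`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class ToyLeaf:
    """Hypothetical stand-in for a tensor: a value tagged with a dtype name."""

    value: float
    dtype: str = "float32"

    def is_floating_point(self) -> bool:
        return self.dtype.startswith("float")

    def cast(self, dtype: str) -> "ToyLeaf":
        return ToyLeaf(self.value, dtype)


@dataclass
class ToyTensorDict:
    """Hypothetical dict-like container that forwards casts to its leaves."""

    leaves: Dict[str, ToyLeaf] = field(default_factory=dict)

    def is_floating_point(self) -> bool:
        # True if at least one leaf is floating point, so casts get attempted.
        return any(leaf.is_floating_point() for leaf in self.leaves.values())

    def _cast_floats(self, dtype: str) -> "ToyTensorDict":
        # Cast floating-point leaves only; integer leaves are left untouched.
        return ToyTensorDict(
            {
                key: (leaf.cast(dtype) if leaf.is_floating_point() else leaf)
                for key, leaf in self.leaves.items()
            }
        )

    def float(self) -> "ToyTensorDict":
        return self._cast_floats("float32")

    def double(self) -> "ToyTensorDict":
        return self._cast_floats("float64")


class ToyModule:
    """Hypothetical module that applies a function to each of its buffers."""

    def __init__(self) -> None:
        self._buffers: Dict[str, object] = {}

    def register_buffer(self, name: str, buffer: object) -> None:
        self._buffers[name] = buffer

    def _apply(self, fn: Callable) -> "ToyModule":
        # Replace every buffer with the result of fn(buffer).
        for name, buffer in list(self._buffers.items()):
            self._buffers[name] = fn(buffer)
        return self

    def float(self) -> "ToyModule":
        return self._apply(lambda b: b.float() if b.is_floating_point() else b)

    def double(self) -> "ToyModule":
        return self._apply(lambda b: b.double() if b.is_floating_point() else b)


# Usage: the container is stored as a buffer and follows the module's casts.
module = ToyModule()
module.register_buffer(
    "stats",
    ToyTensorDict({"mean": ToyLeaf(0.5), "count": ToyLeaf(3, "int64")}),
)
module.double()
print(module._buffers["stats"].leaves["mean"].dtype)   # float64
print(module._buffers["stats"].leaves["count"].dtype)  # int64
```

If `ToyTensorDict` lacked `double` or `is_floating_point`, the `module.double()` call would raise an `AttributeError`, which is the kind of failure the description above is about.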

@vmoens requested a review from tcbegley May 25, 2023 13:39
@facebook-github-bot added the CLA Signed label May 25, 2023
@vmoens added the enhancement label May 25, 2023

@tcbegley (Contributor) left a comment

LGTM

@apbard (Contributor) left a comment

LGTM, just added a minor comment.

Review comment on tensordict/tensordict.py (outdated, resolved)

github-actions bot commented May 25, 2023

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 47. Improved: 1. Worsened: 1.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_common_ops | 1.0693ms | 1.0449ms | 957.0082 Ops/s | 978.6754 Ops/s | -2.21% |
| test_creation | 3.6820μs | 3.3770μs | 296.1246 KOps/s | 296.9089 KOps/s | -0.26% |
| test_creation_empty | 14.9340μs | 14.2796μs | 70.0300 KOps/s | 69.2013 KOps/s | +1.20% |
| test_creation_nested_1 | 24.3010μs | 23.2652μs | 42.9827 KOps/s | 43.2597 KOps/s | -0.64% |
| test_creation_nested_2 | 25.7450μs | 24.9310μs | 40.1107 KOps/s | 40.0040 KOps/s | +0.27% |
| test_clone | 24.8400μs | 22.5884μs | 44.2704 KOps/s | 44.7619 KOps/s | -1.10% |
| test_getitem[int] | 27.6471μs | 26.8732μs | 37.2118 KOps/s | 37.6791 KOps/s | -1.24% |
| test_getitem[slice_int] | 57.9287μs | 53.7824μs | 18.5935 KOps/s | 19.1320 KOps/s | -2.81% |
| test_getitem[range] | 58.3984μs | 57.2980μs | 17.4526 KOps/s | 17.8415 KOps/s | -2.18% |
| test_getitem[tuple] | 50.6427μs | 50.1386μs | 19.9447 KOps/s | 21.2128 KOps/s | **-5.98%** |
| test_getitem[list] | 50.4754μs | 49.1450μs | 20.3480 KOps/s | 20.6265 KOps/s | -1.35% |
| test_setitem_dim[int] | 0.1149ms | 39.1730μs | 25.5278 KOps/s | 26.0188 KOps/s | -1.89% |
| test_setitem_dim[slice_int] | 0.1047ms | 69.5087μs | 14.3867 KOps/s | 14.6059 KOps/s | -1.50% |
| test_setitem_dim[range] | 0.1539ms | 66.7577μs | 14.9795 KOps/s | 15.0005 KOps/s | -0.14% |
| test_setitem_dim[tuple] | 97.8000μs | 63.6265μs | 15.7167 KOps/s | 15.8491 KOps/s | -0.84% |
| test_setitem | 31.5589μs | 30.0036μs | 33.3293 KOps/s | 33.9409 KOps/s | -1.80% |
| test_set | 30.1199μs | 29.2582μs | 34.1784 KOps/s | 34.4743 KOps/s | -0.86% |
| test_set_shared | 0.1509ms | 0.1473ms | 6.7909 KOps/s | 6.8223 KOps/s | -0.46% |
| test_update | 39.4768μs | 37.3155μs | 26.7985 KOps/s | 27.1286 KOps/s | -1.22% |
| test_update_nested | 56.3398μs | 55.0461μs | 18.1666 KOps/s | 18.5479 KOps/s | -2.06% |
| test_set_nested | 38.7678μs | 37.5881μs | 26.6041 KOps/s | 27.0531 KOps/s | -1.66% |
| test_set_nested_new | 54.7898μs | 53.5556μs | 18.6722 KOps/s | 18.9072 KOps/s | -1.24% |
| test_select | 88.0576μs | 86.4625μs | 11.5657 KOps/s | 11.6217 KOps/s | -0.48% |
| test_creation[device0] | 1.0614ms | 0.4331ms | 2.3091 KOps/s | 2.3350 KOps/s | -1.11% |
| test_creation_from_tensor | 0.5608ms | 0.4145ms | 2.4123 KOps/s | 2.3962 KOps/s | +0.67% |
| test_add_one[memmap_tensor0] | 79.6537μs | 28.0827μs | 35.6092 KOps/s | 35.4163 KOps/s | +0.54% |
| test_contiguous[memmap_tensor0] | 8.4640μs | 7.7743μs | 128.6292 KOps/s | 126.4842 KOps/s | +1.70% |
| test_stack[memmap_tensor0] | 0.1591ms | 40.9898μs | 24.3963 KOps/s | 23.8949 KOps/s | +2.10% |
| test_reshape_pytree | 32.2619μs | 29.7367μs | 33.6285 KOps/s | 33.3103 KOps/s | +0.96% |
| test_reshape_td | 43.5228μs | 40.2094μs | 24.8698 KOps/s | 24.7949 KOps/s | +0.30% |
| test_view_pytree | 28.3519μs | 27.4736μs | 36.3986 KOps/s | 36.3381 KOps/s | +0.17% |
| test_view_td | 7.8200μs | 7.1987μs | 138.9142 KOps/s | 136.7587 KOps/s | +1.58% |
| test_unbind_pytree | 33.5869μs | 31.9598μs | 31.2893 KOps/s | 31.3914 KOps/s | -0.33% |
| test_unbind_td | 0.1565ms | 0.1544ms | 6.4773 KOps/s | 6.8182 KOps/s | -5.00% |
| test_split_pytree | 37.2178μs | 35.6760μs | 28.0300 KOps/s | 28.3678 KOps/s | -1.19% |
| test_split_td | 0.1016ms | 0.1004ms | 9.9635 KOps/s | 10.3136 KOps/s | -3.39% |
| test_add_pytree | 40.5999μs | 38.7788μs | 25.7873 KOps/s | 25.9094 KOps/s | -0.47% |
| test_add_td | 62.1558μs | 60.8510μs | 16.4336 KOps/s | 16.4307 KOps/s | +0.02% |
| test_distributed | 61.2000μs | 61.2000μs | 16.3399 KOps/s | 12.6263 KOps/s | **+29.41%** |
| test_tdmodule | 0.1206ms | 24.0952μs | 41.5021 KOps/s | 41.2831 KOps/s | +0.53% |
| test_tdmodule_dispatch | 0.2656ms | 51.9998μs | 19.2309 KOps/s | 19.5137 KOps/s | -1.45% |
| test_tdseq | 89.3000μs | 28.4109μs | 35.1978 KOps/s | 35.6899 KOps/s | -1.38% |
| test_tdseq_dispatch | 0.1204ms | 53.5969μs | 18.6578 KOps/s | 18.8779 KOps/s | -1.17% |
| test_instantiation_functorch | 1.3164ms | 1.2472ms | 801.8111 Ops/s | 779.9557 Ops/s | +2.80% |
| test_instantiation_td | 1.0298ms | 0.9673ms | 1.0338 KOps/s | 1.0266 KOps/s | +0.70% |
| test_exec_functorch | 0.1616ms | 0.1556ms | 6.4281 KOps/s | 6.4592 KOps/s | -0.48% |
| test_exec_td | 0.2745ms | 0.2723ms | 3.6728 KOps/s | 3.6446 KOps/s | +0.77% |

github-actions bot commented May 25, 2023

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 47. Improved: 6. Worsened: 10.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_common_ops | 2.2033ms | 1.9578ms | 510.7645 Ops/s | 497.7788 Ops/s | +2.61% |
| test_creation | 7.6760μs | 5.1334μs | 194.8017 KOps/s | 177.9433 KOps/s | **+9.47%** |
| test_creation_empty | 28.7360μs | 20.9950μs | 47.6304 KOps/s | 46.7616 KOps/s | +1.86% |
| test_creation_nested_1 | 61.8101μs | 39.2979μs | 25.4467 KOps/s | 26.0900 KOps/s | -2.47% |
| test_creation_nested_2 | 52.0661μs | 38.1444μs | 26.2162 KOps/s | 26.3135 KOps/s | -0.37% |
| test_clone | 75.7631μs | 37.2311μs | 26.8593 KOps/s | 25.4927 KOps/s | **+5.36%** |
| test_getitem[int] | 68.2920μs | 47.8104μs | 20.9160 KOps/s | 20.6328 KOps/s | +1.37% |
| test_getitem[slice_int] | 0.1336ms | 0.1046ms | 9.5603 KOps/s | 9.1228 KOps/s | +4.80% |
| test_getitem[range] | 0.1900ms | 0.1270ms | 7.8758 KOps/s | 7.0895 KOps/s | **+11.09%** |
| test_getitem[tuple] | 0.1226ms | 88.4319μs | 11.3081 KOps/s | 10.8899 KOps/s | +3.84% |
| test_getitem[list] | 0.1647ms | 0.1087ms | 9.2027 KOps/s | 8.9767 KOps/s | +2.52% |
| test_setitem_dim[int] | 4.2576ms | 77.0591μs | 12.9771 KOps/s | 13.4811 KOps/s | -3.74% |
| test_setitem_dim[slice_int] | 2.9383ms | 0.1414ms | 7.0741 KOps/s | 7.2222 KOps/s | -2.05% |
| test_setitem_dim[range] | 4.4607ms | 0.1406ms | 7.1144 KOps/s | 7.1849 KOps/s | -0.98% |
| test_setitem_dim[tuple] | 2.8586ms | 0.1211ms | 8.2596 KOps/s | 8.3056 KOps/s | -0.55% |
| test_setitem | 71.8160μs | 58.6816μs | 17.0411 KOps/s | 15.2702 KOps/s | **+11.60%** |
| test_set | 0.1081ms | 58.4781μs | 17.1004 KOps/s | 16.3858 KOps/s | +4.36% |
| test_set_shared | 0.4247ms | 0.3203ms | 3.1223 KOps/s | 3.1250 KOps/s | -0.09% |
| test_update | 95.8501μs | 80.4897μs | 12.4240 KOps/s | 13.2675 KOps/s | **-6.36%** |
| test_update_nested | 0.1799ms | 0.1084ms | 9.2232 KOps/s | 10.0678 KOps/s | **-8.39%** |
| test_set_nested | 0.1059ms | 72.2482μs | 13.8412 KOps/s | 13.9138 KOps/s | -0.52% |
| test_set_nested_new | 0.1378ms | 98.4094μs | 10.1616 KOps/s | 9.9812 KOps/s | +1.81% |
| test_select | 0.2033ms | 0.1527ms | 6.5493 KOps/s | 6.5433 KOps/s | +0.09% |
| test_creation[device0] | 1.3306ms | 0.5976ms | 1.6734 KOps/s | 1.6926 KOps/s | -1.13% |
| test_creation_from_tensor | 0.8120ms | 0.5991ms | 1.6693 KOps/s | 1.7280 KOps/s | -3.40% |
| test_add_one[memmap_tensor0] | 0.1066ms | 58.3539μs | 17.1368 KOps/s | 17.6367 KOps/s | -2.83% |
| test_contiguous[memmap_tensor0] | 58.6561μs | 12.8624μs | 77.7460 KOps/s | 78.9309 KOps/s | -1.50% |
| test_stack[memmap_tensor0] | 0.2374ms | 69.7201μs | 14.3431 KOps/s | 15.1929 KOps/s | **-5.59%** |
| test_reshape_pytree | 86.4052μs | 49.5196μs | 20.1940 KOps/s | 21.5877 KOps/s | **-6.46%** |
| test_reshape_td | 0.1500ms | 72.0488μs | 13.8795 KOps/s | 14.3610 KOps/s | -3.35% |
| test_view_pytree | 45.7351μs | 42.1856μs | 23.7048 KOps/s | 21.3986 KOps/s | **+10.78%** |
| test_view_td | 15.6490μs | 11.2489μs | 88.8973 KOps/s | 89.2098 KOps/s | -0.35% |
| test_unbind_pytree | 83.1551μs | 47.3251μs | 21.1304 KOps/s | 17.4449 KOps/s | **+21.13%** |
| test_unbind_td | 0.3600ms | 0.2898ms | 3.4505 KOps/s | 3.4086 KOps/s | +1.23% |
| test_split_pytree | 77.5341μs | 54.4399μs | 18.3689 KOps/s | 18.0118 KOps/s | +1.98% |
| test_split_td | 0.2504ms | 0.1763ms | 5.6736 KOps/s | 5.6680 KOps/s | +0.10% |
| test_add_pytree | 96.3421μs | 66.6139μs | 15.0119 KOps/s | 16.4819 KOps/s | **-8.92%** |
| test_add_td | 0.1748ms | 0.1359ms | 7.3564 KOps/s | 7.7865 KOps/s | **-5.52%** |
| test_distributed | 0.1148ms | 0.1148ms | 8.7108 KOps/s | 13.6238 KOps/s | **-36.06%** |
| test_tdmodule | 5.4064ms | 46.3670μs | 21.5671 KOps/s | 22.7822 KOps/s | **-5.33%** |
| test_tdmodule_dispatch | 56.3322ms | 0.1033ms | 9.6849 KOps/s | 10.4196 KOps/s | **-7.05%** |
| test_tdseq | 2.4169ms | 59.6202μs | 16.7728 KOps/s | 17.7425 KOps/s | **-5.47%** |
| test_tdseq_dispatch | 5.9943ms | 0.1100ms | 9.0919 KOps/s | 9.5514 KOps/s | -4.81% |
| test_instantiation_functorch | 3.1672ms | 2.2591ms | 442.6457 Ops/s | 440.8187 Ops/s | +0.41% |
| test_instantiation_td | 8.2024ms | 1.7224ms | 580.5988 Ops/s | 594.7146 Ops/s | -2.37% |
| test_exec_functorch | 0.3738ms | 0.3080ms | 3.2471 KOps/s | 3.4083 KOps/s | -4.73% |
| test_exec_td | 0.6822ms | 0.5607ms | 1.7835 KOps/s | 1.7965 KOps/s | -0.72% |

@vmoens merged commit 19beb2b into main May 26, 2023
@vmoens deleted the buffer_compatibility branch May 26, 2023 08:38