-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update PythonZipper action to use CommandLineItem.CapturingMapFn #15037
Update PythonZipper action to use CommandLineItem.CapturingMapFn #15037
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). For more information, open the CLA check for this pull request. |
b2235c5
to
b25bb53
Compare
Apologies for the review request spam. It appears that force-pushing and updating the base branch caused some CODEOWNERS auto-requests to go through, and I don't have permission to remove the requests. edit: If anyone sees this and has permissions to remove review requests, please remove all of them. |
@rickeylev WDYT? |
LGTM, I guess! I'm not familiar with the Java APIs to do this, but it looks like it does the deferring of expansion. |
### Summary This PR attempts to address bazelbuild#14890. This updates the PythonZipper action to use a `CommandLineItem.CapturingMapFn` to defer expanding arguments until after analysis. I used `CapturingMapFn`, as it seemed like the only way to go about implementing the arg filtering that is done by the current code to exclude the `executable` and `zipFile` artifacts. ### Test Plan Use the example from bazelbuild#14890 of looking at memory usage in tensorflow before/after this patch is applied on top of version `5.0.0` Initial Setup ``` git clone [email protected]:tensorflow/tensorflow.git cd tensorflow python3 -m venv venv source venv/bin/activate pip install numpy ``` View memory usage at 5.0.0: ```bash # STARTUP_FLAGS from https://bazel.build/rules/performance#enabling-memory-tracking $ bazel ${STARTUP_FLAGS[@]} build --nobuild //tensorflow/python/... $ bazel ${STARTUP_FLAGS[@]} dump --rules | egrep '(RULE|py_)' RULE COUNT ACTIONS BYTES EACH py_test 1,714 15,390 413,338,632 241,154 py_library 1,102 0 2,097,152 1,903 py_binary 33 198 8,914,840 270,146 py_runtime 6 0 0 0 py_runtime_pair 3 0 0 0 _concat_flatbuffer_py_srcs 2 2 0 0 $ bazel ${STARTUP_FLAGS[@]} info used-heap-size-after-gc 635MB ``` View memory usage at 5.0.0 with this patch applied: ```bash $ ~/code/bazel/bazel-bin/src/bazel ${STARTUP_FLAGS[@]} dump --rules | egrep '(RULE|py_)' RULE COUNT ACTIONS BYTES EACH py_test 1,714 15,390 65,323,312 38,111 py_library 1,102 0 2,359,576 2,141 py_binary 33 198 524,400 15,890 py_runtime 6 0 0 0 py_runtime_pair 3 0 0 0 _concat_flatbuffer_py_srcs 2 2 0 0 $ ~/code/bazel/bazel-bin/src/bazel ${STARTUP_FLAGS[@]} info used-heap-size-after-gc 283MB ``` Ensure that the generated actions have not changed: ```bash $ bazel aquery --build_python_zip 'mnemonic("PythonZipper", //tensorflow/python/...)' > /tmp/bazel_5_out $ ~/code/bazel/bazel-bin/src/bazel aquery --build_python_zip 'mnemonic("PythonZipper", //tensorflow/python/...)' > /tmp/bazel_5_patched_out $ diff /tmp/bazel_5_out /tmp/bazel_5_patched_out # note: no diff output ``` Closes bazelbuild#15037. PiperOrigin-RevId: 441257695 (cherry picked from commit 9610ae8)
### Summary This PR attempts to address bazelbuild#14890. This updates the PythonZipper action to use a `CommandLineItem.CapturingMapFn` to defer expanding arguments until after analysis. I used `CapturingMapFn`, as it seemed like the only way to go about implementing the arg filtering that is done by the current code to exclude the `executable` and `zipFile` artifacts. ### Test Plan Use the example from bazelbuild#14890 of looking at memory usage in tensorflow before/after this patch is applied on top of version `5.0.0` Initial Setup ``` git clone [email protected]:tensorflow/tensorflow.git cd tensorflow python3 -m venv venv source venv/bin/activate pip install numpy ``` View memory usage at 5.0.0: ```bash # STARTUP_FLAGS from https://bazel.build/rules/performance#enabling-memory-tracking $ bazel ${STARTUP_FLAGS[@]} build --nobuild //tensorflow/python/... $ bazel ${STARTUP_FLAGS[@]} dump --rules | egrep '(RULE|py_)' RULE COUNT ACTIONS BYTES EACH py_test 1,714 15,390 413,338,632 241,154 py_library 1,102 0 2,097,152 1,903 py_binary 33 198 8,914,840 270,146 py_runtime 6 0 0 0 py_runtime_pair 3 0 0 0 _concat_flatbuffer_py_srcs 2 2 0 0 $ bazel ${STARTUP_FLAGS[@]} info used-heap-size-after-gc 635MB ``` View memory usage at 5.0.0 with this patch applied: ```bash $ ~/code/bazel/bazel-bin/src/bazel ${STARTUP_FLAGS[@]} dump --rules | egrep '(RULE|py_)' RULE COUNT ACTIONS BYTES EACH py_test 1,714 15,390 65,323,312 38,111 py_library 1,102 0 2,359,576 2,141 py_binary 33 198 524,400 15,890 py_runtime 6 0 0 0 py_runtime_pair 3 0 0 0 _concat_flatbuffer_py_srcs 2 2 0 0 $ ~/code/bazel/bazel-bin/src/bazel ${STARTUP_FLAGS[@]} info used-heap-size-after-gc 283MB ``` Ensure that the generated actions have not changed: ```bash $ bazel aquery --build_python_zip 'mnemonic("PythonZipper", //tensorflow/python/...)' > /tmp/bazel_5_out $ ~/code/bazel/bazel-bin/src/bazel aquery --build_python_zip 'mnemonic("PythonZipper", //tensorflow/python/...)' > /tmp/bazel_5_patched_out $ diff /tmp/bazel_5_out /tmp/bazel_5_patched_out # note: no diff output ``` Closes bazelbuild#15037. PiperOrigin-RevId: 441257695 (cherry picked from commit 9610ae8)
) ### Summary This PR attempts to address #14890. This updates the PythonZipper action to use a `CommandLineItem.CapturingMapFn` to defer expanding arguments until after analysis. I used `CapturingMapFn`, as it seemed like the only way to go about implementing the arg filtering that is done by the current code to exclude the `executable` and `zipFile` artifacts. ### Test Plan Use the example from #14890 of looking at memory usage in tensorflow before/after this patch is applied on top of version `5.0.0` Initial Setup ``` git clone [email protected]:tensorflow/tensorflow.git cd tensorflow python3 -m venv venv source venv/bin/activate pip install numpy ``` View memory usage at 5.0.0: ```bash # STARTUP_FLAGS from https://bazel.build/rules/performance#enabling-memory-tracking $ bazel ${STARTUP_FLAGS[@]} build --nobuild //tensorflow/python/... $ bazel ${STARTUP_FLAGS[@]} dump --rules | egrep '(RULE|py_)' RULE COUNT ACTIONS BYTES EACH py_test 1,714 15,390 413,338,632 241,154 py_library 1,102 0 2,097,152 1,903 py_binary 33 198 8,914,840 270,146 py_runtime 6 0 0 0 py_runtime_pair 3 0 0 0 _concat_flatbuffer_py_srcs 2 2 0 0 $ bazel ${STARTUP_FLAGS[@]} info used-heap-size-after-gc 635MB ``` View memory usage at 5.0.0 with this patch applied: ```bash $ ~/code/bazel/bazel-bin/src/bazel ${STARTUP_FLAGS[@]} dump --rules | egrep '(RULE|py_)' RULE COUNT ACTIONS BYTES EACH py_test 1,714 15,390 65,323,312 38,111 py_library 1,102 0 2,359,576 2,141 py_binary 33 198 524,400 15,890 py_runtime 6 0 0 0 py_runtime_pair 3 0 0 0 _concat_flatbuffer_py_srcs 2 2 0 0 $ ~/code/bazel/bazel-bin/src/bazel ${STARTUP_FLAGS[@]} info used-heap-size-after-gc 283MB ``` Ensure that the generated actions have not changed: ```bash $ bazel aquery --build_python_zip 'mnemonic("PythonZipper", //tensorflow/python/...)' > /tmp/bazel_5_out $ ~/code/bazel/bazel-bin/src/bazel aquery --build_python_zip 'mnemonic("PythonZipper", //tensorflow/python/...)' > /tmp/bazel_5_patched_out $ diff /tmp/bazel_5_out /tmp/bazel_5_patched_out # note: no diff output ``` Closes #15037. PiperOrigin-RevId: 441257695 (cherry picked from commit 9610ae8) Co-authored-by: Alex Torok <[email protected]>
Summary
This PR attempts to address #14890.
This updates the PythonZipper action to use a
CommandLineItem.CapturingMapFn
to defer expanding arguments until after analysis. I usedCapturingMapFn
, as it seemed like the only way to go about implementing the arg filtering that is done by the current code to exclude theexecutable
andzipFile
artifacts.Test Plan
Use the example from #14890 of looking at memory usage in tensorflow before/after this patch is applied on top of version
5.0.0
Initial Setup
View memory usage at 5.0.0:
View memory usage at 5.0.0 with this patch applied:
Ensure that the generated actions have not changed: