-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Packaging rules should support --stamp flag #287
Comments
AFAIK, my particular use case does not have a preference WRT preserving mtimes, but some things that immediately come to mind are:
|
The least surprising behavior is that all the top level pkg rules implement stamp
#288 demonstrates how the default behavior could be done. That is, I can get the build time for stamping if --stamp is set on the command line. I didn't do the 0 and 1 cases, nor implement for every rule. An orthogonal feature to that is if we change the time stamps of files expanded out of archives (if I am reading #265 right). That is it's own debate. The request I have seen in the past is along the lines of
I think we need provision for that kind of behavior. I don't have a design in mind right now. I don't think it is a blocker for --stamp support, as long as we design with the idea in mind that individual files might need to say they are immune to stamping Then there is the question of using the time stamp in the archive name This is difficult for many reasons and we should not block basic stamping on that. |
Thanks for following this up and your comments. My main motivation is to set a more dynamically mtime for the input files. At the moment either the default behavior with
If I understand this correctly with "stamp = 1" the rebuild is only triggered if the input of the specific rule change, i.e. the BUILD_TIMESTAMP is relatively close to the recent mtime of the input files. The only drawbacks are fresh builds, e.g. if no cache is available. #285 had an other intention, but if the stamp behavior is closer to the build-in rules and less complex I will have a closer look into #288. |
Maybe we should start from needs for stamping, and then work back to features to achieve that.
Let's say there is a user for case 4. A hypothetical is that I am cutting a product release on Feb. 4, and I want the stamps to be 2020-02-04 33:22:11. Maybe I used the date as is, but encode something in the hh:mm:ss. We obviously don't want to have to set that in each rule, so the value has to come from the command line in some way. We would either pass the value (which requires a blaze change) or use workspace status commant (which does not). Question 1: Can workspace status just replace BUILD_TIMESTAMP? If so, then that seems to be the way to do this. |
Status update: #288 is ready to go
If we need to stamp with arbitrary times other than build time, the technique could
|
…1) (#288) Add stamp attribute to pkg_tar (in the style of cc_binary(stamp=-1,0,1) Part 1 of #287 This is not pretty, because Bazel does not make --stamp available directly - add a config setting to mimic --stamp: private_stamp_detect - This is a bit of a hack, until the value of stamp is available directly from starlark ctx See: bazelbuild/bazel#11164 - use the config setting to pass a bool down to the packaging rule to see that --stamp was set - add the stamp attribute to pkg_tar - use combination of stamp & private_stamp_detect to invoke stamping - this uses an undocumented feature to get to bazel-out/volatile-status.txt - that is bad in theory - In practice Bazel will eventually have admit the cat is out of the bag and document the files. Future work: - implement for all 4 package types. - contain a test where a stamped package is used from a workspace that includes rules_pkg. RELNOTES: Add stamp to pkg_tar.
Advances bazelbuild#287
Advances bazelbuild#287
Advances bazelbuild#287
* Add --stamp to zip. Advances #287 Now both pkg_tar and pkg_zip support --stamp behavior in the same manor as *_binary rules.
So most of the comments here are about using stamping to determine mtimes for the files within a package. What about support though to use values from the workspace status command as part of variable inputs to the packaging rules? I'm mostly looking for a simple way to generate this version string for our debian package git tag "v0.0.$(git show -s --format=%ct HEAD)-$(git rev-parse --short HEAD)" Pretty easy to output this as part of the workspace status command. The challenge I'm having is getting that into a place the I am currently using a custom rule to generate a version file and use that. While the version correctly gets injected into the control file of the debian package, it is not used as part of the package name itself. I'm not sure if this is just a bug or a specific choice. |
It's a specific choice, but more on Bazel itself rather than rules_pkg. We need to predict file names during analysis, but we don't let you read the info file containing the variable until execution phase. To have this work in the easiest to understand way, we would need Bazel to be able to have the equivalent of a workspace status command that created variables that got put directly into a configuration fragment. This would be useful for a handful of people (doing packaging) and incredibly dangerous foot-gun where you could ruin performance by having everything rebuild all the time. Take a look at the examples in: https://github.com/bazelbuild/rules_pkg/tree/main/examples/naming_package_files
|
Thanks for the feedback. Setting up a toolchain seems like an interesting idea. I did look at the naming_package_files but couldn't figure out how to generate the PackageVariablesInfo provider since rules cannot read files or (at least from what I can see) get the stdout of a command they run. From that example I didn't think to try to create a custom toolchain just to get the git revision, I will have to try that next. One thing that was confusing was the version_file attr in pkg_deb not being used in the package name but is embedded in the debian control file. I'm not sure from your answer if you are saying that is also a specific choice or if you were just talking about not being able to use the variables from workspace status. I currently do have a custom genrule that generates that version file its just was really surprising when it was not used as part of the package name. I actually thought my genrule was broken and not being picked up but only after looking at the source for pkg_deb did I realize only the version attr was used in the package name. Lastly, it seems like other bazel rules do seem to support utilizing the workspace status variables IFF the Again appreciate the feedback and explanation. It helps me understand bazel a little bit more. |
The fundamental bazel behavior is that no file reads other than ..bzl and BUILD files can impact the analysis phase, and that all action input and output files are defined during the analysis phase. Thus it is fundamentally impossible to read a version file and have it change the name of an output file.
I think they hit the same wall. You can't use the variables from the workspace file in an output file name. |
This is generally not a blocker. It's not hard to have a CI system do something like bazel build //:my_output.tar //:my_dest.txt
cp bazel-bin/my_output.tar "${ARCHIVE_DIRECTORY}/$(<bazel-bin/my_dest.txt)" where the build for One case that is problematic for this is if you want the name of a file within an archive to depend on stamp information. We have the option for a file providing a prefix for every file in the archive, but it would be useful to have the option to e.g. provide a json file (which could be built by a thing that uses stamp information) with a set of renames, rather than the current situation which is that renames need to all be specified in the |
You can make the renames in a .bzl file that you generate before the bazel build, then load that from BUILD files where you want to rename things. |
Are you suggesting running a bazel build that modifies the source tree to generate a bzl file using stamping information, and then running a second build that uses that? That sounds error prone and difficult to enforce that the generator step is in sync with what the rest of the build is seeing (even worse if the bzl file was being generated by a tool that wasn't part of the bazel build configuration) as well as being slow and inconvenient compared to a solution that only requires a single bazel invocation. It also is problematic because we're talking about a map between bazel Fixing this more cleanly wouldn't be terribly difficult; just allow a user to supply an executable target which can modify the manifest file before it's handed to the tarball packaging tool. |
No. I meant that a --workspace_status_command script would generate status data and a .bzl file (in your source tree) that can be loaded by the BUILD file. Then you turn that into PackageVariables which can be injected as needed. That does not completely help you, but I can easily imagine extending pkg_files so you can take inputs and regex rename them to new things, using the variable read from the .bzl file. |
That sounds rather dangerous to me, as I'm not aware of any documented contract that says that the workspace status command will always be evaluated before the loading phase, or in fact that it would be evaluated at all if no other build actions requested its output. |
Sure. But what other choices do you have? Bazel doesn't make the time stamp accessible directly to Starlark at analysis time, so I can't get it into the file name. The same for the git hash, or workspace name, or any variety of transient values. |
Your statement of the problem makes the solution clear - starlark does not have access to the required information, which means that the renames can't be done in starlark. But that's ok - starlark produces a manifest file that's consumed by the tarball builder action; there's no reason another build action couldn't be interposed between the starlark and the tarball builder, to either rewrite the starlark-generated manifest file or supply a separate renames file that the tarball builder would be able to merge with the one written out by the starlark. Unlike starlark, that action can depend on the workspace status files. That is, something like def _pkg_tar_impl(ctx):
...
write_manifest(ctx, manifest_file, content_map)
if ctx.executables.manifest_transformer:
modified_manifest = ctx.actions.declare_file(ctx.label.name + ".mod.manifest")
ctx.actions.run(
outputs = [modifed_manifest],
inputs = [manifest_file, ctx.version_file, ctx.info_file],
executable = ctx.executables.manifest_transformer,
arguments = [manifest_file.path, ctx.version_file.path, ctx.info_file.path, modified_manifest.path],
)
manifest_file = modifed_manifest
... |
OK. I see how what you are proposing could be possible. That would solve the case of changing file names to include time stamps or commit, or whatever you want to write into the info file with the workspace status command. |
#285 requested a feature to preserve mtime's in tar files. @aiuto brought up the below WRT how this might be assembled:
Originally posted by @aiuto in #285 (comment)
This issue tracks the design of a "stamping" mechanism for packages prior to implementation, especially with regards to fixing/clamping file modifications. Whether the
--stamp
argument should control this is also a question to consider.The text was updated successfully, but these errors were encountered: