Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent Pants from inserting lock file header with local machine binary path #18202

Closed
rafrafek opened this issue Feb 8, 2023 · 21 comments
Closed
Labels
backend: JVM JVM backend-related issues backend: Python Python backend-related issues

Comments

@rafrafek
Copy link

rafrafek commented Feb 8, 2023

How can I prevent Pants from inserting this into the lock file?

// This lockfile was autogenerated by Pants. To regenerate, run:
//
//    /Users/my_user_name/Projects/my_project_name/scie-pants-macos-aarch64 generate-lockfiles --resolve=python-default
//
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---

I want the lock file to be the same no matter which machine created it. And I don't want to store my local machine paths in git repo.

I've tried with:

./scie-pants-macos-aarch64 generate-lockfiles --generate-lockfiles-custom-command=./pants

But it will create instruction that is not valid:

// This lockfile was autogenerated by Pants. To regenerate, run:
//
//    ./pants
//
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---

Trying to fix it using --generate-lockfiles-custom-command would lead to something like --generate-lockfiles-custom-command="./pants generate-lockfiles --generate-lockfiles-custom-command='recursion here?'"

@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

@rafrafek I'm going to transfer this over to the Pants repo - this will need to be solved in the Pants codebase unless I'm mistaken.

@jsirois jsirois transferred this issue from pantsbuild/scie-pants Feb 8, 2023
@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

The lock file generation helper code here:

lockfile_with_header = metadata.add_header_to_lockfile(
initial_lockfile_digest_contents[0].content,
regenerate_command=(
generate_lockfiles_subsystem.custom_command
or f"{bin_name()} generate-lockfiles --resolve={req.resolve_name}"
),

Uses this utility:

def bin_name() -> str:
"""Return the Pants binary name, e.g. './pants'."""
# NB: This will be called at import-time in several files to define static help strings
# (e.g. "help=f'run `{bin_name()} fmt`").
#
# Ideally, we'd assert this is set unconditionally before Pants imports any of the files which
# use it, to give us complete confidence we won't be returning "./pants" in our help strings.
#
# However, this assumption really breaks down when we go to test pants (or a plugin author goes
# to test their plugin). Therefore we give a fallback and have integration test(s) to assert
# we've set this at the right point in time.
return os.environ.get("PANTS_BIN_NAME", "./pants") # noqa: PANTSBIN

@rafrafek you can, of course, work around by noting the or in the 1st code snip and setting: https://www.pantsbuild.org/docs/reference-generate-lockfiles#custom_command

@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

I think the bin_name fix here would be to check if the PANTS_BIN_NAME is a path that resolves to a file under the build root, and if so, ensure it is relativized to the build root - this covers the existing ./pants case.

If PANTS_BIN_NAME resolves to an absolute path elsewhere, check if its dirname is on the PATH and use its basename if so. That covers the recommended way of installing scie-pants (https://www.pantsbuild.org/v2.15/docs/installation) which installs in as the platform-neutral pants (as opposed to scie-pants-macos-aarch64) and on the PATH.

Finally, if PANTS_BIN_NAME resolves to an absolute path elsewhere that is not on the PATH, just use it as-is and expect users to use the [generate-lockfiles] custom_command to work around.

@jsirois jsirois added backend: JVM JVM backend-related issues backend: Python Python backend-related issues labels Feb 8, 2023
@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

This affects both Python and JVM lock file generation since both add headers and both use the bin_name utility.

@rafrafek
Copy link
Author

rafrafek commented Feb 8, 2023

I've renamed scie-pants-macos-aarch64 to pants then ran:

export PANTS_BIN_NAME=./pants
./pants generate-lockfiles 

Result:

// This lockfile was autogenerated by Pants. To regenerate, run:
//
//    /Users/my_user_name/Projects/my_project_name/pants generate-lockfiles --resolve=python-default
//
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2023

Checking the PATH seems more trouble than it's worth. I think we can always take the basename (and add ./ if the full path is in the repo). This will cover almost all cases, and after all this is just a hint to the user, who presumably knows how to run Pants. It's more important that we don't put local paths in lockfiles than that we get the right thing every time.

I'll work up a PR.

PS I noticed this issue in pants help output as well. Obviously less important in that case, but still annoying.

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2023

Thanks for the report @rafrafek

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2023

Actually, does this need to be fixed in Pants? Looks like scie-pants sets --pants-bin-name : https://github.com/pantsbuild/scie-pants/blob/a83d66d2070c101b44fc31ada0bbd9e9c961c59e/package/scie-pants.lift.json#L99

And probably should set it to just pants

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2023

Unfortunately, as @rafrafek has discovered, you can't override this by setting PANTS_BIN_NAME. In this case that env var is just an internal way of communicating the --pants-bin-name option value to code that can't load options. It is the same name as the env var you would use to set that option externally, but since scie-pants is setting it via flag, that overrides the external env var.

@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

Well, pants should be defensive and not care how its called. PANTS_BIN_NAME can be set by anybody to anything as this shows. Let me look at scie-pants though.

The obvious aside here is Pants should simply not be adding headers to lock files! An issue tracks that though.

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2023

True on both counts, but in this case scie-pants shouldn't be setting that option to the full path, since the intent of the option is "display this to end users in help messages and so on".

Apparently at some point along the way this ended up in lockfile headers (which, I agree, should not be a thing) and that is where we should have gotten defensive.

I suggest both changing scie-pants to set --pants-bin-name=pants, at least until we default to that in bin_name(). And also I will look at being defensive.

@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

I'll do better and do the right thing in scie-pants and set the bin-name to the basename if PATH / abs path otherwise. If the basename is actually pants and on PATH, great. if not, as in the OP case it will be ./scie-pants-macos-aarch64 since its checked in to the repo apparently. I do not think the PATH calculation is more trouble than its worth and I think the right answer here is better than claiming pants which is wrong in the OP case.

@benjyw
Copy link
Contributor

benjyw commented Feb 8, 2023

Also #18204

@jsirois
Copy link
Contributor

jsirois commented Feb 8, 2023

@rafrafek if you upgrade scie-pants to 0.5.0 (SCIE_BOOT=update <scie-pants-binary> is one way), and you put that binary on your PATH as - say - pants, then you can invoke Pants via pants ... and the lockfile re-gen should now reflect this.

@rafrafek
Copy link
Author

rafrafek commented Feb 9, 2023

I've ran SCIE_BOOT=update ./pants then I've moved pants to /Users/my_user_name/.local/bin/ and then ran pants generate-lockfiles.

Result:

// This lockfile was autogenerated by Pants. To regenerate, run:
//
//    pants generate-lockfiles --resolve=python-default
//
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---

It is still different than the one from example project:

// This lockfile was autogenerated by Pants. To regenerate, run:
//
//    ./pants generate-lockfiles --resolve=python-default
//
// --- BEGIN PANTS LOCKFILE METADATA: DO NOT EDIT OR REMOVE ---

@rafrafek
Copy link
Author

rafrafek commented Feb 9, 2023

I think it is duplicate of #15740

I don't understand why we need this header at all. Having metadata like "generated_with_requirements", instructions and other stuff in a lock file which is supposed to be handled by tools does not make sense to me.

I don't understand Pants lock file because it is not lock file when you cannot validate it against pinned dependencies nor update it without bumping every dependency versions. There should be two files in a project repository, one containing pinned dependencies (.txt/.toml file etc.) and one containing locked versions of dependencies and sub-dependencies (lock file).

Poetry is one example of how lock files can be handled. See #15723 (comment)

@jsirois
Copy link
Contributor

jsirois commented Feb 9, 2023

It is still different than the one from example project:

@rafrafek if I understand correctly, being different from the example is OK. What you care about is consistency across machines. The scie-pants 0.5.0 fix only gets you this if all your users install pants in the recommended way. So the ultimate fix will need to come from Pants indeed. @benjyw's work in #18204 won't get you that assurance either IIUC. You can have that assurance today with the workaround I described above, check in the following to your pants.toml:

[generate-lockfiles]
custom_command = "whatever fixed string you'd like to have"

Clearly Pants needs to fix this to provide guaranteed uniformity though in the longer run.

@jsirois
Copy link
Contributor

jsirois commented Feb 9, 2023

There should be two files in a project repository, one containing pinned dependencies (.txt/.toml file etc.) and one containing locked versions of dependencies and sub-dependencies (lock file).

This does already exist, I think the issue is in how the generate-lockfiles goal behaves (by default) today. Instead of 2 files there are N + 1 file though. In Pants you can write down your requirements in N files; plenty of folks use - often legacy - multiple different requirement.txt files scattered through the repo to define requirements like ansicolors or requests>=2, etc. and these are all gathered by Pants to generate 1 lockfile with pinned and hashed versions; thus N + 1 instead of strictly 2.

@rafrafek
Copy link
Author

... check in the following to your pants.toml:

[generate-lockfiles]
custom_command = "whatever fixed string you'd like to have"

Thanks! The toml config is much better for providing consistency across machines than the --generate-lockfiles-custom-command cmd parameter .

@rafrafek
Copy link
Author

This does already exist, I think the issue is in how the generate-lockfiles goal behaves (by default) today. Instead of 2 files there are N + 1 file though. In Pants you can write down your requirements in N files; plenty of folks use - often legacy - multiple different requirement.txt files scattered through the repo to define requirements like ansicolors or requests>=2, etc. and these are all gathered by Pants to generate 1 lockfile with pinned and hashed versions; thus N + 1 instead of strictly 2.

Thanks! Now I see how it helps with multiple requirements.txt files. Coming from Poetry it wasn't clear for me.

@rafrafek
Copy link
Author

rafrafek commented Feb 10, 2023

The issue is solved for me, thank you @jsirois and @benjyw for your help!

I'm closing the issue, feel free to reopen it if needed. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: JVM JVM backend-related issues backend: Python Python backend-related issues
Projects
None yet
Development

No branches or pull requests

3 participants