Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable --remap-path-prefix for absolute paths by default #40552

Closed
mzji opened this issue Mar 15, 2017 · 31 comments
Closed

Enable --remap-path-prefix for absolute paths by default #40552

mzji opened this issue Mar 15, 2017 · 31 comments
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mzji
Copy link

mzji commented Mar 15, 2017

According to #40374 , a part of private information is leaked by the path string living in the compiled binaries. To protect privacy and help debugging, I think we can let rustc to mangle the path. Here is my solution:

Basically, we can hide the 'insignificant' part of the path (which usually contains some private and/or unrelated info), leave 'significant' part untouched. Then what is the 'significant' part of a path? Here is an example: Assume that we have a project foobar which is in user's home directory (here I use a windows path, on *nix things work similarly):

C:\Users\username\Documents\foobar

In this case, the useful part is the crate name and the part after the crate name, i.e.

\foobar\lib.rs

Assuming we have a mod called 'somemod', then after mangling, the new path looks like:

[crate]\foobar\somemod\mod.rs

which not only saves the relatitionship information between sources files, but also protects the privacy of the user (since no more user name or aboslute path exists) !

The next question is how to process all paths under this rule. From what I know, all compiled code of a crate comes from 4 sources:

  • crates.io or mirrors of crates.io, or some independent 3rd party repo, in which the code is cached and lives in the cargo cache directory
  • remote git repositories
  • local filesystem
  • the crate itself (which may locate at any place)

We could specify different root names for these sources to indicate their origin:

  • For code from crates.io or mirrors, the root name is [crates.io]. Example:
    C:\Users\username\.cargo\registry\src\github.com-1ecc6299db9ec823\winapi-0.2.8\ will be mangled to [crate.io]\winapi-0.2.8\
  • For code from remote git repository, the root name is [username@git server name]. Example: https://github.com/rust-lang/rust will be manged to [[email protected]]/rust
  • For code from local filesystem, the root name is [local]. Example: D:\workspaces\foobarng\ will be mangled to [local]\foobarng\
  • For code from the crate itself, the root name is crate. Example: C:\Users\username\Documents\foobar\ will be mangled to [crate]\foobar\

And for helping debuggers, all paths in debugging information won't be modified. Thus, users still know where the debugging code is, and won't worry about leaking privacy (just need stripping out debugging information before packaging on *nix, or not distributing .pdb files on windows).

@codyps
Copy link
Contributor

codyps commented Mar 15, 2017

Related to #38322 & #39130 (which are for debuginfo). Likely makes sense to use the same mapping mechanism for filenames in panics too.

(edit: linking to #40492 as it is the PR for #40374)

@mzji
Copy link
Author

mzji commented Mar 15, 2017

We may need to record path mappings in a file if we also want to process paths in debug information.

@Mark-Simulacrum Mark-Simulacrum added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 23, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-feature-request Category: A feature request, i.e: not implemented / a PR. label Jul 27, 2017
@michaelwoerister
Copy link
Member

Related to #38322 & #39130 (which are for debuginfo). Likely makes sense to use the same mapping mechanism for filenames in panics too.

--remap-path-prefix will also remap panic messages.

@mzji
Copy link
Author

mzji commented Aug 10, 2018

Seems like we now have a working and stable solution. Closed.

@mzji mzji closed this as completed Aug 10, 2018
@Boscop
Copy link

Boscop commented Aug 11, 2018

So what's the current solution to not include the path in the exe (when building with cargo)?

@michaelwoerister
Copy link
Member

Try something like RUSTFLAGS=--remap-path-prefix=<your-src-dir>=src cargo build.

@jimmycuadra
Copy link
Contributor

Can we reopen this to track having rustc do this by default? You shouldn't have to know both that rustc does this and this obscure mechanism for changing it to protect privacy. It should just do it.

@michaelwoerister
Copy link
Member

I think this would need an RFC to come up with a solid solution that doesn't break debugging (which relies on these paths being contained in debuginfo). cc @rust-lang/core

@Boscop
Copy link

Boscop commented Sep 3, 2018

@michaelwoerister But it should be stripped automatically from --release builds!

@michaelwoerister
Copy link
Member

An RFC would need to cover both debug and release builds.

@ghost
Copy link

ghost commented Sep 15, 2020

Try something like RUSTFLAGS=--remap-path-prefix=<your-src-dir>=src cargo build.

@michaelwoerister this didnt work for me and a username and full path was still present. and yes i also used the --release flag. i also second @Boscop in that this should be automatic for release builds. why does the rfc need to include debug when we are talking about release specifically? @jimmycuadra is also right normal users shouldnt have to know this exists or be expected to manually specify the opts all the time since its so obscure.

if it helps you including user ids like this violates gdpr https://gdpr.eu/eu-gdpr-personal-data/ so this should be addressed by the rust team. in 2020 people care about privacy and this can be a put off like rust-lang/mdBook#847 where people actively worked away from the project due to the disrespect of user privacy

cc @sneak & @aral who might also have some words about this

@dns2utf8
Copy link
Contributor

To be honest, I expected that release binaries do not contain information like this. Would this be an option to cut the string in these cases?

@ghost
Copy link

ghost commented Sep 15, 2020

To be honest, I expected that release binaries do not contain information like this. Would this be an option to cut the string in these cases?

@dns2utf8

not sure if part of the message got dropped. what is the option you are suggesting? the provided RUSTFLAGS dont actually work. it seems the rust team thinks this is acceptable for release binaries for some reason. perhaps more attention on the issue may help given the push for privacy in 2020

unrelated but this also causes an issue with reproducible builds since the strings and usernames will differ from person to person

@sneak
Copy link

sneak commented Sep 16, 2020

It seems to me that two different people building two identical programs with identical build args on different (but same-architecture) systems should receive the same output binary regardless of the name in $USER or the string in $HOME, for all build types (but especially release).

I'm not going to weigh in on the privacy issue (I think that if you're privacy conscious, $USER should be anonymous or user or something already), just the principle of least astonishment: what I would expect, not knowing the tooling, is that the same thing built on different systems would result in the same output. Deviation from this would surprise me, given what I know about the generally excellent caliber of Rust stewardship (and the well-known gargantuan task of stripping out this unnecessarily nondeterministic stuff from other distros/packages in the pursuit of deterministic builds). Tool designers should probably not be throwing more rocks into their path.

This probably means stripping any mention of local environment, build time, and file paths before the root/prefix of the build.

@yan-zaretskiy
Copy link

I really enjoy using Rust for personal side projects. It made me a better C++ developer. I'm excited about stabilized const generics to be able to speed up my linear algebra code.
This issue though prevents me from recommending Rust for closed source development to my colleagues.

@michaelwoerister
Copy link
Member

I think this might be more of an issue for the @rust-lang/core team rather than the @rust-lang/compiler team to discuss.

@unpavedmop

This comment has been minimized.

@ssokolow
Copy link

@unpavedmop The Rust developers only have so much time and a large portion of the user base either don't care or work around it by building release artifacts in Docker containers or on a CI service, so they (rightly) conclude that, barring more contributors showing up, it's more important to work on things that don't have such a simple workaround.

(For example, the long-awaited min_const_generics feature that got declared stable in nightly on December 27th and is currently making its way to stable channel.)

@unpavedmop

This comment has been minimized.

@ssokolow

This comment has been minimized.

@chemsaf3
Copy link

came to report tthe same

Try something like RUSTFLAGS=--remap-path-prefix=<your-src-dir>=src cargo build.

this didnt work for me, user still in binary, and not flexible to use wildcard for src path

@michaelwoerister But it should be stripped automatically from --release builds!

+1

@alercah
Copy link
Contributor

alercah commented Mar 29, 2021

I was linked this issue from Reddit, and I got interested as personally having good privacy-preserving defaults is important to me. I spoke informally with some colleagues, and it sounds like:

  • This particular issue is fundamentally about the default behaviour of the compiler in absence of controlling flags. It's a request to change that behaviour.
  • If the --remap-path-prefix flag is not working, that needs to be a separate issue thread.
  • Because the exact scope and nature of a solution is unclear, an RFC is required. It would be for the compiler and/or Cargo teams to handle depending on what exactly is proposed.
  • It is worth engaging with the teams (especially cargo) on Zulip to figure out what they would like to see in such an RFC.

Some specific thoughts on what should be in such an RFC:

  • an enumeration of the different cases and how they should be handled (crates.io crates, path deps, git deps, workspace deps, OUT_DIR)
  • should we handle include! and #[path = ..] that refer to files outside the crate somehow? (this significantly complicates any solution)
  • what the debug/release behavior is and why

Personally, I think this is important and should be addressed quickly, but I'm not in a position at the moment to follow up and make this happen. I hope someone else can pick this up.

@ChrisDenton
Copy link
Member

ChrisDenton commented Mar 29, 2021

As suggested, I filed an issue about --remap-path-prefix after doing a bit of experimentation. If anyone can reproduce (or fail to reproduce) my results that'd be great.

Quick summary: It replaces CARGO_HOME path prefixes but there are still a lot of RUSTUP_HOME paths in the final binary.

Another issue not strictly to do with --remap-path-prefix is the path to the Windows debug file. This can be changed with a linker argument but Rust itself has no way to do this for you.

@chemsaf3
Copy link

chemsaf3 commented Apr 3, 2021

@rust-lang/core & @rust-lang/compiler any chance of official comment on this? seems here and in reddit there is significant user interested in it being fixed. workarounds dont work, and using ci is not a solution as same issue is present there, even if different user name

@alercah also suggest rust team input on this which is absent. i dont think its reasonable to force community to fix this as many wont have nessesary compiler knowledge to do so. some maybe even new to rust and want to see a fix but cant help due to skills lacking

@recatek
Copy link

recatek commented Apr 3, 2021

This is a significant footgun for those just learning Rust, without extensive technical backgrounds. Windows explicitly prompts you to use your real name as your username, and includes that as part of your path, so it's a reasonable expectation that newer users of Rust will have a username containing personal information. Rust tutorializes users to build on their own machine by opening a terminal window and typing "cargo build". Doing these two things as-directed will create an unsafe binary by default -- safe code maybe, but not a safe user. If Rust intends to be a welcome and inviting first-language, something core to the Rust book and community, expecting them to protect themselves by pouring over docs and github issues and rfcs to find --remap-path-prefix or even understand what that means isn't the responsible way to go about this.

@jyn514
Copy link
Member

jyn514 commented Apr 12, 2021

@recatek If you think symbols should be omitted by default, I'd recommend opening an RFC (see the instructions in https://github.com/rust-lang/rfcs/#readme). It should address at least the following drawbacks:

I'm not a member of the compiler or core teams, but I'd speculate the reason they haven't made an RFC is because they aren't stakeholders: their names are already public, so this feature doesn't help them. It would be great to have the RFC come from someone who plans to use this feature, not from a team member on behalf of someone else.

@chemsaf3
Copy link

@recatek If you think symbols should be omitted by default, I'd recommend opening an RFC (see the instructions in https://github.com/rust-lang/rfcs/#readme). It should address at least the following drawbacks:

* Debugging with gdb/lldb/whatever windows uses should still work - AFAIK, these all require absolute paths. It sounds like there are some ideas for doing this in [#40552 (comment)](https://github.com/rust-lang/rust/issues/40552#issuecomment-809012730).

* Should the paths be relative to the workspace root or to the crate root? If it's relative to the crate, this will be regression for people using IDEs, because "jump to source" will no longer work for panic messages: #47355. If it's relative to the workspace, cargo needs to support that somehow.

* How will the compiler know all the paths it should remap? Should it remap only the source directory? All of (the source directory, CARGO_HOME, and RUSTUP_HOME)? How will it know what those other values are - does it need cargo changes?

* All the other issues mentioned in [#40552 (comment)](https://github.com/rust-lang/rust/issues/40552#issuecomment-809012730).

I'm not a member of the compiler or core teams, but I'd speculate the reason they haven't made an RFC is because they aren't stakeholders: their names are already public, so this feature doesn't help them. It would be great to have the RFC come from someone who plans to use this feature, not from a team member on behalf of someone else.

you miss his point. how can you expect new rust people to know about rfc or arcane options that dont work right. someone new cant rfc since they cant include how to fix due to not knowing rust or compiler.

there is too much expectation here for new people to fix a compiler when they likely dont have knowledge or skill to do so. this should be default, or easier to find/use like -trimpath in go which is dead simple and actually work

@jyn514 jyn514 changed the title Protect privacy by mangling the path string of source files in the generated binaries Enable --remap-path-prefix for absolute paths by default Apr 13, 2021
@sanxiyn
Copy link
Member

sanxiyn commented Apr 13, 2021

@chemsaf3 trimpath equivalent is easy to implement, I can do that if there is interest. The problem is that trimpath can't be the default because it breaks debuggers.

This issue is really about changing the default. If you want trimpath, I think it should be a separate issue.

@chemsaf3
Copy link

chemsaf3 commented Apr 13, 2021

@chemsaf3 trimpath equivalent is easy to implement, I can do that if there is interest. The problem is that trimpath can't be the default because it breaks debuggers.

This issue is really about changing the default. If you want trimpath, I think it should be a separate issue.

@sanxiyn i think default for release build is prob ok. but either way yeah should exist and likely would fix this issue if done right. even if flag its still easier to find/user than the hacks that only part work now

@ssokolow
Copy link

you miss his point. how can you expect new rust people to know about rfc or arcane options that dont work right. someone new cant rfc since they cant include how to fix due to not knowing rust or compiler.

From this, I get the impression there's been a miscommunication.

An RFC is the formal way to request a change to Rust, so there's no reason new people should need to know about it, any more than with any other RFC got accepted and resulted in a change to the compiler.

@jyn514 is just saying that the request should be written by someone who knows what matters most to them, rather than a team member who's just guessing.

@jyn514
Copy link
Member

jyn514 commented Jun 16, 2023

I'm going to close this in favor of the tracking issue for the merged RFC: #111540

@jyn514 jyn514 closed this as not planned Won't fix, can't repro, duplicate, stale Jun 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests