Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase TypeId's hash from 64 bits to 128 bits. #608

Closed
1 of 3 tasks
thomcc opened this issue Apr 5, 2023 · 9 comments
Closed
1 of 3 tasks

Increase TypeId's hash from 64 bits to 128 bits. #608

thomcc opened this issue Apr 5, 2023 · 9 comments
Labels
major-change A proposal to make a major change to rustc major-change-accepted A major change proposal that was accepted T-compiler Add this label so rfcbot knows to poll the compiler team

Comments

@thomcc
Copy link
Member

thomcc commented Apr 5, 2023

Proposal

Increase the size of TypeId's hash from 64 bits to 128 bits.

With a 64 bit hash, the probability of collision is 1 in 2^32 (due to the birthday bound) -- 1 in roughly 4 billion. This is a number low enough that it seems very likely that some program in the wild has experienced a TypeId collision. My suspicion is that the fact that idiomatic Rust does not heavily use TypeId or Any (at least, in code paths which commonly occur -- it's much more frequently used in error handling) is an explanation for why it has never been detected in the wild.

A 128 bit hash increases the probability of collision to 1 in 2^64, which is considerably larger -- 1 in roughly 18 quintillion. This is high enough that its very likely that we could store all types of all compilations of all Rust programs to date without collisions.

Note that this approach does not remove the possibility of a collision, just makes it very unlikely. However, it also does not preclude solving this problem in a better way in the future. The goal is just to reduce the likelihood of issues that occur in practice from "could happen" to "almost certainly will not happen". This allows programs and libraries which depends on Any/TypeId in their designs to use it with more confidence.

(A preliminary/draft implementation is available at rust-lang/rust#109953)

P.S. Not sure if MCP is correct for this. It does seem like a t-compiler implementation detail, rather than something public-facing like an RFC.

Mentors or Reviewers

N/A

Process

The main points of the Major Change Process are as follows:

  • File an issue describing the proposal.
  • A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
    • Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
    • Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
  • Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.

You can read more about Major Change Proposals on forge.

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

@thomcc thomcc added major-change A proposal to make a major change to rustc T-compiler Add this label so rfcbot knows to poll the compiler team labels Apr 5, 2023
@rustbot
Copy link
Collaborator

rustbot commented Apr 5, 2023

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

cc @rust-lang/compiler @rust-lang/compiler-contributors

@rustbot rustbot added the to-announce Announce this issue on triage meeting label Apr 5, 2023
@WaffleLapkin
Copy link
Member

@rustbot second

@rustbot rustbot added the final-comment-period The FCP has started, most (if not all) team members are in agreement label Apr 5, 2023
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Apr 6, 2023
@apiraino
Copy link
Contributor

@rustbot label -final-comment-period +major-change-accepted

@rustbot rustbot added major-change-accepted A major change proposal that was accepted to-announce Announce this issue on triage meeting and removed final-comment-period The FCP has started, most (if not all) team members are in agreement labels Apr 17, 2023
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Apr 20, 2023
@pnkfelix
Copy link
Member

pnkfelix commented Jun 2, 2023

See rust-lang/rust#109953

this was accepted and is in the process of being implemented+landed

@briansmith
Copy link

A 128 bit hash increases the probability of collision to 1 in 2^64,

This is only true if the hash function is designed to be collision resistant. If the hash function is not collision-resistant then you can't infer anything about the probability of collision.

Note in particular that SipHash is NOT collision-resistant if the attacker knows the key.

@the8472
Copy link
Member

the8472 commented Dec 21, 2023

Note in particular that SipHash is NOT collision-resistant if the attacker knows the key.

That presupposes an attacker instead of natural collisions. Afaik malicious inputs are currently outside rustc's threat model. If that ever changes then this can be revisited.

@briansmith
Copy link

Note in particular that SipHash is NOT collision-resistant if the attacker knows the key.

That presupposes an attacker instead of natural collisions.

OK, first, do we agree that the argument about birthday bounds in the initial comment is not sound, at least?

The SipHash paper says "We comment that SipHash is not meant to be, and (obviously) is not, collision resistant." No qualifications regarding malicious vs. natural collisions. SipHash wasn't designed to be used as a substitute for equality checks of the full hashed value; it was designed for a very narrow purpose that isn't what TypeId uses it for (AFAICT). I cannot find any claim regarding the "natural" collision resistance of SipHash with a public key, nor do I know a way to distinguish "natural" from "malicious" inputs for collisions.

Afaik malicious inputs are currently outside rustc's threat model. If that ever changes then this can be revisited.

I do think malicious inputs are necessarily inside a realistic threat model for the Rust toolchain. rustc enforces the fix for CVE-2021-42574, for example. Also, Cargo has removed malicious-looking crates in https://blog.rust-lang.org/inside-rust/2023/09/01/crates-io-malware-postmortem.html. Even without prior history, it is easy to come up with scenarios where TypeId collisions could be detrimental.

In summary, we do not know much about the likelihood of TypeId collisions, before or after this change, nor do we have significant evidence that this is an improvement.

@the8472
Copy link
Member

the8472 commented Dec 22, 2023

Since this MCP is already approved and merged I'd recommend starting a new issue in the rust repo or a zulip thread if you want to have an extended discussion about this. A lot of people are subscribed to this repo and will get unnecessary notifications.

@briansmith
Copy link

OK, I'll take that to mean you agree that the original argument isn't sound. There is already rust-lang/rust#10389 about the unsoundness of TypeId so there's no need to file a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-change A proposal to make a major change to rustc major-change-accepted A major change proposal that was accepted T-compiler Add this label so rfcbot knows to poll the compiler team
Projects
None yet
Development

No branches or pull requests

7 participants