wallet2: prevent duplicate outs #8047
Conversation
I'm not sure this is even a bug. Properly random output selection will sometimes produce overlapping outputs, and actively avoiding them only creates one more statistical skew. Also, even if the same output is used twice, you can't know which one is a decoy even if you know one of them is a decoy. It's definitely not a bug. Edit: it's not much different from the case where the same output is used in two different transactions.
Being double or triple spent makes them distinct from other outputs, thereby reducing the uniformity and weakening the group signature. That makes these outputs known spends/decoys. Also, using distinct outputs in group signatures can increase the anonymity pool.
Absolutely not. It's only "N-1 of N are decoys, but unknown which ones". The same logic can be applied to multiple transactions using the same output. Using distinct outputs moves the output distribution away from truly random, because truly random can have duplicates!
Or do you mean double spent in a single decoy ring?
No, in different inputs of one transaction.
Yes. Just so we are clear, the knowledge gained is "N-1 of N are decoys and one is a definite spend, OR N of N are decoys". Please review the issue I linked: by including the same output multiple times, the anonymity is severely reduced. Of course with Monero it is not so harsh, but the same principle applies. The goal of decoy selection is to protect the privacy of Monero users, not to achieve perfect randomness. It is already skewed and not properly random by design.
I need to think about this more; I think you might be right, @SChernykh.
The difference is that one transaction is necessarily spent by one person, whereas different transactions can be spent by multiple people. Imagine I have a 3 IN transaction with ring size 3 and the same OUT appears in each IN. Then only 7 unique OUTs are referenced, and I am claiming ownership of 3 of the 7. In the case where duplicates are not allowed, I have improved deniability: 3 of 9 OUTs are claimed to be mine. In the case of three transactions, each spending one IN, there could be three different people making the transactions, each claiming ownership of 1 of 3 OUTs.
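For concreteness, here is a small, self-contained sketch (illustrative only, not wallet2 code) of the counting argument above: with 3 rings of size 3 and one output repeated in every ring, only 7 distinct outputs are referenced on-chain, so the signer is claiming ownership of 3 of 7 rather than 3 of 9.

```cpp
// Hypothetical illustration: count how many distinct outputs are referenced
// across all rings of a transaction when one output is reused in every ring.
#include <cstdint>
#include <iostream>
#include <set>
#include <vector>

int main()
{
    // Each inner vector is one ring, listing global output indices.
    std::vector<std::vector<uint64_t>> rings = {
        {10, 42, 77},   // ring for input 1
        {42, 55, 91},   // ring for input 2 (reuses output 42)
        {42, 13, 68},   // ring for input 3 (reuses output 42)
    };

    std::set<uint64_t> distinct;
    for (const auto &ring : rings)
        distinct.insert(ring.begin(), ring.end());

    std::cout << "rings: " << rings.size()
              << ", distinct outputs referenced: " << distinct.size() << '\n';
    // Prints: rings: 3, distinct outputs referenced: 7
    return 0;
}
```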
Even with this observation, it needs fixing only if it happens statistically more often in the same multi-input transaction than in different transactions.
I think the Monero Research Lab needs to look into it first.
I agree that this needs more study by MRL before we can give a green or red light to incorporating this commit into Monero. Here are my initial thoughts:
Some potential near-term action items are:
Ok great. Yes, more attention to the issue would be welcome. @Rucknium is exactly on point with their summation. Let me know if I can be of any further assistance.
Thanks for the mention, @Rucknium. This is an interesting question; I will check if and how it has occurred on-chain.
According to my preliminary results produced with this R code, @neptuneresearch's Monero SQL database, and @Gingeropolous's server:
This makes sense. We expect that as the number of inputs increases, the probability of a collision occurring also increases. Two or fewer inputs would be consistent with a typical user transaction, while a larger number of inputs would be more typical of a service like an exchange or a consolidation of mining pool payments. Whether or not these empirical results constitute a big deal is in the eye of the beholder. IMHO, at this point the low incidence of the issue on the Monero blockchain, plus the fact that the tracing vulnerability -- if it exists -- may not be severe, suggests that this is a back-burner issue for Monero for now. Aeon may be a different story, as I mentioned above. I lack the infrastructure to do a similar empirical analysis on the Aeon blockchain data, but if infrastructure could be set up, we could do it for Aeon as well. Among txs that have a single input, a collision has occurred a total of 4 (four) times in the entire history of Monero. I am told by @j-berman that such collisions have been "prohibited at consensus since hf v6" (see monero/src/cryptonote_core/cryptonote_core.cpp, lines 1277 to 1291 at 298c9a3).
Excellent work, @Rucknium. I'm confused why a bad thing happening rarely means it can be ignored when it can easily be prevented. According to the contributing guidelines for Aeon, all requests must come upstream from Monero. If this is a non-issue, or even just a tiny speck of an improvement, would you approve it to be merged?
We don't have to ignore it, but it seems to be minor in comparison to much bigger issues with the decoy selection algorithm. I outline some of those issues here. It may make sense to put greater priority on the more salient issues when thinking about how to allocate research labor hours. I understand that you have a fix ready, but that fix also has to be vetted, at least in terms of the code and probably also in terms of the statistical and conceptual issues. I'll defer to others on whether this change should be merged at this time. I can only speak from the research angle.
Ok, thanks for your input, that is understandable. I have been following your OSPEAD research closely and wish you the best of luck.
@@ -8444,13 +8445,17 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
      bool own_found = false;
      for (const auto &out: ring)
      {
        if (seen_outputs.count({amount, out}))
          seen_indices.emplace(out);
        continue;
This if statement seems unnecessary to me.
Assume you have 2 known rings to be used as inputs in a tx, and the first ring includes the real output from the second ring as a decoy. This if statement looks like it would prevent the real output from being added to the second ring, which would then cause the function to throw just outside the for loop because own_found won't get set to true. The user's likely recourse in that case is to remove the ring entirely and have their client construct a new one, which is worse than allowing duplicates.
It seems it's ok to let the duplicates slide here and remove the if statement altogether. The assumption with known rings is that the rings have already been seen by the world, i.e. you're not gaining anything tangible by removing duplicates from rings the world has already seen.
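A simplified illustration of that failure mode, with hypothetical names standing in for the wallet2 variables (this is a sketch of the described behavior, not the actual code path):

```cpp
// Illustrative only: if the duplicate check drops the ring member that is the
// real spend of this input, own_found never becomes true and the call throws.
#include <cstdint>
#include <set>
#include <stdexcept>
#include <vector>

void check_known_ring(const std::vector<uint64_t> &ring,
                      uint64_t real_global_output_index,
                      const std::set<uint64_t> &already_seen)
{
    bool own_found = false;
    for (uint64_t out : ring)
    {
        if (already_seen.count(out))
            continue;                       // the questioned duplicate skip
        if (out == real_global_output_index)
            own_found = true;
    }
    if (!own_found)
        throw std::runtime_error("known ring does not contain the real output");
}
```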
Adding to this, it seems the algorithm in general is still able to select duplicates because of this logic.
Assume you have 2 rings to be used as inputs in a tx, and the first ring includes the real output from the second ring as a decoy.
I'm not seeing what prevents this; it could use a deeper look.
My inclination is that it makes sense to not allow duplicates in order to increase the anon set. Allowing duplicates (very) marginally increases the chances you re-select your own output as a decoy in your own transaction relative to not allowing duplicates, and reduces the chances your output gets picked up as a decoy in someone else's transaction (since others are also marginally more likely to re-select their own outputs as decoys). Ideally you'd want to maximize the chances your output gets picked up as a decoy by other people, and hide among the largest crowd. I generally agree with @Rucknium that the incidence level isn't very significant, but it seems like a simple, sensible thing to do. It would also make sense to start doing this at a fork, because a tx that still includes duplicates would be a way to fingerprint an older wallet.
@@ -8484,6 +8489,7 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
      {
        num_found = 1;
        seen_indices.emplace(td.m_global_output_index);
        seen_outputs.emplace({amount, td.m_global_output_index});
Adding to this comment, I think you need to emplace the real outputs in seen_outputs before iterating to select decoys (much earlier on in get_outs); otherwise, if you have 2 rings, ring 1 would be able to include a decoy that is a real output from ring 2.
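A hedged sketch of that suggestion, using stand-in types and names rather than the real wallet2 ones: pre-seed the set of seen outputs with every real (amount, global output index) pair before any decoy selection runs.

```cpp
// Illustrative sketch only -- not the actual wallet2::get_outs code.
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Stand-in for wallet2::transfer_details; only the fields needed here.
struct TransferDetails
{
    uint64_t amount;
    uint64_t global_output_index;
};

using OutputKey = std::pair<uint64_t, uint64_t>;  // (amount, global output index)

// Record every real output up front, so decoy selection for ring 1 can never
// pick the real output belonging to ring 2 (or any other input).
std::set<OutputKey> seed_seen_outputs(const std::vector<TransferDetails> &selected)
{
    std::set<OutputKey> seen_outputs;
    for (const auto &td : selected)
        seen_outputs.emplace(td.amount, td.global_output_index);
    return seen_outputs;
}

// During decoy selection, a candidate would then be skipped with something like:
//   if (seen_outputs.count({amount, candidate_index})) continue;
```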
If you're going to do this, you should take care to still allow duplicates when there are so few outputs on the chain that the patch would otherwise prevent a tx from being made. For pre-rct outputs of uncommon value this may well be the case, and the patch without that special case would make those outputs unspendable.
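A minimal sketch of the kind of special case being suggested, under the assumption that the wallet knows how many outputs of the relevant amount exist on-chain (names are illustrative, not wallet2 API):

```cpp
#include <cstdint>

// Only enforce "no duplicate ring members across inputs" when the chain has
// enough outputs of this amount to fill every ring without repeats; otherwise
// fall back to allowing duplicates so rare pre-rct amounts remain spendable.
bool must_enforce_unique_outs(uint64_t outputs_available_for_amount,
                              uint64_t num_rings,
                              uint64_t ring_size)
{
    return outputs_available_for_amount >= num_rings * ring_size;
}
```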
@@ -8444,13 +8445,17 @@ void wallet2::get_outs(std::vector<std::vector<tools::wallet2::get_outs_entry>>
      bool own_found = false;
      for (const auto &out: ring)
      {
        if (seen_outputs.count({amount, out}))
          seen_indices.emplace(out);
        continue;
This makes the loop a noop. I assume that is not intended.
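For reference, a sketch of what the check presumably intends, reusing the names from the diff above. This is an assumption about intent rather than the PR author's confirmed fix, and the reviewers above question whether the check should exist at all.

```cpp
// Assumed intent: skip the rest of the loop body only for an already-seen
// output. Without the braces, the 'continue' runs unconditionally and skips
// the remaining processing for every ring member.
for (const auto &out: ring)
{
  if (seen_outputs.count({amount, out}))
  {
    seen_indices.emplace(out);
    continue;
  }
  // ... the existing per-output handling (own_found check, etc.) ...
}
```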
Seeing a bug in Aeon transactions where multiple inputs in one transaction use the same output. Sorry, I have not confirmed this bug exists in Monero yet, but I anticipate it does. Is this possibly the culprit?
aeon#253