wallet2: apply gamma distribution from chain tip when selecting decoys (#7807) #7821
@UkoeHB Made a small change for when the gamma spits out an output more recent than the unlock time. In that case, it now selects a random value between 0 and 120, rather than scaling the gamma down to between 0 and 120, because the scaling-down approach biases toward 60 and 120. I don't see why it should have a bias like that, and it seems all outputs should be equally likely to be picked from that initial 0 to 120 when the result falls into that else branch.
While working through the impact of tx uniformity in #7798, I recognized a potential area of concern as a result of this PR's impact on tx uniformity. I don't think it's a reason to hold this PR back, but figured it's worth sharing to better understand this risk. If you assume that 99% of clients update, and 1% do not, then the 1% non-updated clients may stick out. Assume that a user constructs a transaction with 20 rings and not a single one has a very early output in it. The chances of an updated client doing that are very low ( Such a circumstance seems extraordinarily rare. I can, however, see it negatively harming a user who is very late to update their wallet. But I cannot see it impacting other users who do not update. And considering that current users are being harmed today by the implications of this issue, it needs to be patched for those users immediately. |
Latest development
In monero-dev IRC, @luigi1111 suggested smoothing out the simulation of the decoy selection algorithm and plotting that, in order to get a better idea of how the patch would alter the algorithm. After graphing a smooth simulation of the decoy selection algorithm, I observed that my initial proposed fix alters the shape of the distribution such that it seems to perform marginally worse for outputs 15-100 blocks old:
I went back to the drawing board and applied many different combinations of fixes to arrive at what I believe is the safest, do-no-harm approach, that patches the issue at hand, while simultaneously providing sufficient protection for the earliest spents.
The current proposed patch
The justification for continuing to factor in outputs younger than the unlock time picked by the gamma is still that people who spent outputs very quickly back when the gamma was observed likely would still be spending outputs relatively quickly today. The justification for using a window of 50 to slot them in is that empirically it seems to perform well:
Math backing up the decision
As mentioned to me by Rucknium in a 1-on-1 IRC, as well as in Miller et al, the Kolmogorov-Smirnov test quantifies the distance between an observed distribution and an expected one. As such, it can be used to quantify how well the proposed patch would perform compared to the current decoy selection algorithm. The lower the distance given by the Kolmogorov-Smirnov statistic, the better the algorithm is at matching the observed distribution. If my math is correct (see below), the K-S statistic for the current decoy selection algorithm is 0.0167, while the K-S statistic of the current proposed fix is 0.0071. Thus, the patch is a material improvement over the current. Finally, the K-S statistic of the patch I initially proposed (dumping outputs in the first spendable block) is 0.0190. Thus, the fix I'm proposing now is a material improvement over the patch I initially proposed.
Math to recreate K-S Statistic
Download the following CSV, then run the following Python script:
Latest Development
In monero-dev IRC, @luigi1111 requested to see what using a window of 20 would look like, and @Gingeropolous requested to see simulations run over different epochs, as well as comparisons to the gamma distribution (ignoring block density).
After doing the above, I still lean towards the current window of 50 as best achieving the sanest, least-potential-for-harm approach that I described in IRC as follows:
As discussed in IRC, beyond this PR, we could continue with deeper research toward an even stronger solution that takes another crack at the assumptions laid out in Miller et al, and factors in observed chain data since then. Said research will take a fair amount of time to complete, and this PR's do-no-harm approach offers a solid patch until that research is finished. Also shoutout to @Rucknium who has some excellent ideas and is an applied statistician by trade who has offered to contribute in this area :)
Results simulating different windows, over different epochs
v14 (2210720 - 2413735)
First general takeaways from this chart
The current decoy selection algorithm (orange) appears to be under-selecting outputs relative to the observed output ages in rings (blue), as is apparent by the large triangular gap above the orange line to the blue. Basically, there are many more outputs observed on-chain than the current decoy selection algorithm would produce over that age range, which is visualized as the triangular gap. Here are 3 potential causes for this triangular gap:
After doing a fair amount of investigating, I lean strongly toward 3 for a host of reasons, and I can dive further into my reasoning if asked. For the sake of staying focused, I will hold off for now and continue with the assumption that 3 is most likely (that the decoy selection algorithm as is likely under-selects recent real outputs). Continuing with that assumption, it would make sense to try to arrive at a solution that bridges the gap between the current decoy selection algorithm and observed. Additionally, it is apparent that the gamma distribution (green line) is not perfectly suited, on its own, to serve as the target to fit to or as "the source of truth". Meaning that if we were to try and match the decoy selection algorithm identically to the gamma (which is what the wallet did before block density was factored in), then the algorithm would likely perform even worse by selecting even fewer outputs in the range in the chart (since green is a fair amount below orange, which is already below blue).
Comparing the different windows
The window of 20 seems to perform well between ages 11 and 30 (by "well", I mean it bridges the gap from current to observed on-chain data, which is defined as "well" because of the assumption above that the gap from current to observed is caused by the algorithm missing real outputs). However, around age 30, it shifts below the current decoy selection algorithm, which is also below observed on-chain. Thus, it starts to marginally under-select outputs, and so performs marginally worse than the current. This, to me, felt like potential for harm (it is why I initially chose 50 over 40, because you can see that 40 starts to move slightly below the orange line later on as well). The potential for harm does, however, seem very small.
Here are the Kolmogorov-Smirnov statistics for each distribution when comparing to observed on-chain data (recall, smaller means the distance to observed is lower, and therefore it performs "better" based on the above assumption that getting closer to observed is the desirable outcome):
Given the above K-S stats, it seems you can't really go wrong with a window between 10-50. Thus, I figure visual analysis combined with the K-S stat seems a prudent route to arrive at a window of 50.
v12 (1978433 - 2210000)
Seems to exhibit very similar properties to v14, and the same analysis above applies.
v11 (1788720 - 1978433)
In this interval, it appears clear that the gamma does apply best. There isn't really much to go off of here in way of deciding between the different windows. I believe this interval looks like this because the code to factor in block density was not released until July 17, 2019, or circa block 1881000. Further, this code wasn't released as part of a hard fork. So I don't think it makes much sense to use it as part of analysis to decide how to modify the current approach.
average_output_time shift from 2 to 1 (2383730 - 2413735)
As highlighted in #7798, the decoy selection algorithm is currently missing recent outputs at a step in the selection calculation by a factor of nearly 2x because of an integer truncation issue. Since block 2383730, Not much extra to add to this analysis in way of choosing between the windows, however.
Conclusion
I still believe a window of 50 offers the least-potential-for-harm approach, as it won't marginally start to under-select outputs between ages 30 - 100 (like a smaller window appears to do), and it will also select a decent-sized share of younger outputs, as appears to be desired.
My data + code to plot diagrams and calculate K-S statistics
Output age data for each epoch
Unzip the following csv's: v14 (2210720 - 2413735).zip
My code to produce the csv's is a bit messy, but happy to clean it up and share if desired. I heavily modified
Python to plot diagrams
Python to calculate K-S statistics
Edit: as requested by @Rucknium in monero-dev IRC, made the charts consistent probability densities with the same x- and y-axis cutoffs. I also noticed a small, inconsequential off-by-1 issue when pulling the normal gamma distribution data and fixed it. |
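For readers following the K-S comparisons in the comment above, here is a minimal sketch of how such a statistic can be computed from binned output ages. It is not the script referenced above. It assumes two equal-length lists of counts per age bin (one observed on-chain, one simulated), and the function name `ks_statistic` is made up for illustration.

```python
from itertools import accumulate


def ks_statistic(observed_counts, simulated_counts):
    """Largest gap between the two empirical CDFs built from per-bin counts."""
    if len(observed_counts) != len(simulated_counts):
        raise ValueError("both inputs must cover the same age bins")
    observed_total = sum(observed_counts)
    simulated_total = sum(simulated_counts)
    # Running sums turn per-bin counts into cumulative fractions.
    observed_cdf = [c / observed_total for c in accumulate(observed_counts)]
    simulated_cdf = [c / simulated_total for c in accumulate(simulated_counts)]
    return max(abs(o - s) for o, s in zip(observed_cdf, simulated_cdf))


# Toy counts (not chain data): a smaller result means a closer match.
print(ks_statistic([5, 3, 2], [2, 3, 5]))
```

Identical histograms give 0, and the value grows toward 1 as the two distributions move apart, which is why a lower statistic is read as a better fit to observed data.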
I agree with the analysis on the surface, and the conclusion that something in the 10-50 range is the "least wrong" of the current set. I have no particular objection to 50 as the choice. In any case, will give this some additional time for further percolation and input, if any. |
From a statistical perspective, I support the latest version. What is accomplished here is "thickening" the probability density function of the selection algorithm in the section closest to zero. This more closely mimics the observed distribution of mixins + real spends. However, in the near future it is crucial that we consider moving away from the current selection algorithm that is based on Moser et al. 2018. I have some ideas about how to accomplish this. |
Thank you!
- matches the paper by Miller et al to apply the gamma from chain tip, rather than after unlock time
- if the gamma produces an output more recent than the unlock time, the algo packs that output into one of the first 50 spendable blocks, respecting the block density factor
- select_outputs.gamma: decrease expected median for recent changes to algorithm (monero-project#7821) & re-attempt test with 10x larger sample size if the test fails on first try
- select_outputs.density: allow a wider deviation from chain data to selected data for larger blocks, and smaller deviation for smaller blocks (the allowed deviation is proportional to size now) + test some other sensible heuristics
- select_outputs.same_distribution: allow slightly larger average deviation from picks to chain data - still not perfect, but only deterministic tests can be perfect
Overview
When the wallet selects decoys using the gamma distribution, the expected distribution is supposed to be fit to the chain tip, but instead is being fit starting at the unlock time, 10 blocks behind the tip. The fix in this PR simply shifts the result of the gamma forward by the expected duration of the unlock time. If the gamma spits out an output younger than the unlock time, then the wallet places that output in a random block within the first 50 spendable blocks (while still factoring in block density). I assumed that outputs younger than the unlock time that the gamma suggests should still be factored into the distribution, because one would expect outputs selected by the gamma in this range to be spent soon after unlock: "Someone who spent an output after 1 block back when it was allowed likely would spend that output soon after it unlocks today."
The material negative impact of this general issue still appears to be contained to the very earliest spents, however deeper analysis would be needed to arrive at harder figures.
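The difference between the current behavior and the fix described above can be sketched in a few lines. This is a simplified illustration, not the wallet2 code: it measures ages in blocks from the chain tip, ignores block density entirely, and the gamma shape and scale values are assumptions chosen for illustration rather than figures stated in this PR.

```python
import math
import random

# Assumed, illustrative parameters (not taken from this PR's text).
GAMMA_SHAPE = 19.28
GAMMA_SCALE = 1 / 1.61
BLOCK_TIME_SECONDS = 120
UNLOCK_BLOCKS = 10
RECENT_SPEND_WINDOW_BLOCKS = 50


def gamma_age_blocks(rng):
    """Draw an output age in blocks; the gamma models the log of the age in seconds."""
    age_seconds = math.exp(rng.gammavariate(GAMMA_SHAPE, GAMMA_SCALE))
    return age_seconds / BLOCK_TIME_SECONDS


def sample_age_current(rng):
    """Current behavior: the gamma starts at the unlock boundary, 10 blocks behind the tip."""
    return UNLOCK_BLOCKS + gamma_age_blocks(rng)


def sample_age_fixed(rng):
    """Fixed behavior: the gamma starts at the chain tip."""
    age = gamma_age_blocks(rng)
    if age >= UNLOCK_BLOCKS:
        return age
    # Too young to be spendable: slot uniformly into the first 50 spendable blocks.
    return UNLOCK_BLOCKS + rng.uniform(0, RECENT_SPEND_WINDOW_BLOCKS)


rng = random.Random(7821)
fixed_ages = [sample_age_fixed(rng) for _ in range(10_000)]
print(min(fixed_ages) >= UNLOCK_BLOCKS)
```

Both samplers only ever return spendable ages; the fixed one simply stops pushing every draw 10 blocks further from the tip and redistributes the sub-unlock draws across the window.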
Reasoning behind gamma being fit to the chain tip
The decoy selection algo is applying the gamma distribution starting 10 blocks prior to the chain tip (because of the unlock time), but it appears the gamma distribution should be applied starting at the chain tip.
As mentioned in "The fix" of #7807, this was my first thought as to the issue, but I figured it may have been incorrect because the gamma suggests that there is a non-negligible chance of spending an output between 1 and 9 blocks old, which I assumed was implausible because of the unlock time, and so there must be some other explanation for the issue. But it seems the gamma distribution in the paper was taken from a time when the unlock time wasn't enforced by consensus. And as a talk done by isthmus demonstrated, some didn't follow this convention at the time. Therefore, there were some outputs on chain spent before 10 blocks that the gamma distribution factors in.
I reached out to one of the authors of the paper to sanity check, and the above does seem to be the case. See the final section where it fits the chain data to the gamma, specifically:
Analysis of the fix
See this comment for an explanation of how I arrived at using a window of 50 to place outputs spit out by the gamma more recent than the unlock time. It also provides analysis on the fix's impact.
The analysis below is outdated, but keeping it here for posterity so the flow of discussion in comments below makes more sense
Results of the fix
I used this code to simulate get_outs with the fix in this PR, and plotted against the current:
You can see in the above charts that the very earliest outputs are selected in higher frequency with the fix. As you move further right, the outputs are selected in marginally lower frequency. Then it hits a steady state of selecting roughly equivalent outputs.
The numbers explaining the above observation
The gamma's expected probability of a spent output between 0 to 10 blocks old is ~2.1%.
Between 1 to 10 blocks, it's still ~2.1% (since 0 to 1 block is negligible).
Between 10 to 11 blocks, it's ~0.3%.
Between 11 to 20 blocks, it's ~2.3%.
Between 20 to 30 blocks, it's ~2.2%.
With the fix, we would expect outputs less than 10 blocks old produced by the gamma to be spent in the first available block; thus, we should expect 2.1% + 0.3% of outputs to be spent between 10 to 11 blocks. However, the current algorithm suggests that between 10 to 11 blocks, close to 0% of outputs would be selected as decoys (before factoring in density). Thus, between 10 to 11 blocks, it would seem the current algorithm under-selects decoys by about 2.4%. This means that the outputs observed in this age range of 10 to 11 blocks are more likely to be real spents. In practice, however, thanks to the decoy selection algorithm factoring in density and therefore likely selecting some decoys in this age range, and considering we've only observed ~0.5% of outputs in this age range, it appears as though only a very small percentage of real spents would have been identifiable.
Between 11 to 20 blocks, the current algorithm is expected to match the gamma's 1 to 10 block range of 2.1%. And as noted, the gamma's expected probability of a spent output between 11 and 20 blocks is 2.3%. Thus, the current algorithm only slightly under-selects outputs in this age range, which means it is likely that nothing definitive can be gained from observing outputs in rings in this age range (ignoring the impact of block density).
Between 20 to 30 blocks, the current algorithm is expected to match the gamma's 10 to 20 block range of 0.3% + 2.3%. And as noted, for comparison, the gamma's expected is 2.2%. Therefore the current algorithm over-selects decoys in this interval, and thus, outputs spent in this range are overly protected, aka are even less likely to be deducible as real. The same goes for the range between 30 to 250 blocks.
Beyond 250 blocks, both current algorithm and fixed algorithm are roughly equivalent.
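The bucket-by-bucket comparison above is just a 10-block shift of the gamma's mass, which a short tally makes explicit. This sketch uses only the approximate percentages quoted above and applies to the initially proposed fix (sub-unlock outputs placed in the first spendable block); the dictionary names are made up for illustration, and block density is ignored.

```python
# Approximate gamma mass (percent) per age bucket, in blocks from the chain tip.
gamma_pct = {(0, 1): 0.0, (1, 10): 2.1, (10, 11): 0.3, (11, 20): 2.3, (20, 30): 2.2}

# Current algorithm: the gamma starts 10 blocks behind the tip, so each
# spendable bucket inherits the gamma mass from 10 blocks earlier.
current_pct = {
    (10, 11): gamma_pct[(0, 1)],
    (11, 20): gamma_pct[(1, 10)],
    (20, 30): gamma_pct[(10, 11)] + gamma_pct[(11, 20)],
}

# Initially proposed fix: the gamma starts at the tip, and all mass younger
# than the unlock time lands in the first spendable block.
fixed_pct = {
    (10, 11): gamma_pct[(0, 1)] + gamma_pct[(1, 10)] + gamma_pct[(10, 11)],
    (11, 20): gamma_pct[(11, 20)],
    (20, 30): gamma_pct[(20, 30)],
}

for bucket in current_pct:
    gap = current_pct[bucket] - fixed_pct[bucket]
    print(f"{bucket[0]}-{bucket[1]} blocks: current minus fixed = {gap:+.1f} points")
```

A negative gap means the current algorithm under-selects decoys in that bucket relative to the fix, and a positive gap means it over-selects.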
Conclusion
As first reported, the impact appears mostly contained to the very earliest spents that are spent right when they unlock, and were created in relatively smaller blocks than average. The initial maximum estimate of 1% of transactions affected seems corroborated by the above findings. A deeper analysis factoring in block density would be necessary to arrive at harder figures.
Appendix: impact on tx uniformity
As a result of the fix, approximately 1 in 5 rings (1 - [1 - (2.1% + 0.3%)]^10 ≈ 21% ≈ 1 in 5) are now expected to include at least 1 very early decoy. With this information, someone would likely be able to guess that a transaction is coming from a fixed wallet with a higher degree of certainty; however, I do not see a practical vulnerability that can stem from this knowledge.
Edit 1: small change to select unspendable output from first spendable block randomly
Edit 2: corrected charts for small change
Edit 3: updated for using a window of 50
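The 1-in-5 estimate in the tx uniformity appendix can be checked with a few lines of Python. This is a minimal sketch that assumes 10 decoys per ring (the exponent used in the appendix) and treats each decoy as an independent draw with a 2.4% chance of being very early.

```python
# Probability that a single decoy falls in the very early range (2.1% + 0.3%).
p_early = 0.021 + 0.003

# Assumption: 10 independently selected decoys per ring.
decoys_per_ring = 10

# Complement of "no decoy in the ring is very early".
p_ring_has_early = 1 - (1 - p_early) ** decoys_per_ring
print(f"{p_ring_has_early:.1%}")
```

The result lands between 21% and 22%, consistent with the roughly 1-in-5 figure.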