pummel/test-crypto-timing-safe-equal-benchmarks failing #34073
FWIW, this has been discussed in the private repo for a bit since this is potentially security-relevant.
I bisected, and it seems to be a result of a V8 update. Since we can't compile/test without most of the cherry-picks required for V8 updates, I was not able to identify a single commit. The first bad commit could be any of: a489288
Some work done by @sam-github revealed that adding the patch below makes the test pass even though there's no reason it should change execution, in theory. Sam commented:
Ah okay, I have no visibility of that. Given that the state of the daily master CI was discussed at the collaborator summit yesterday, I was surprised no issue was open for it.
Yeah, I think it’s fine to discuss it openly since the CI failures are available publicly as well.
We'd been talking about moving it to a public repo anyway because it does not seem to be security-relevant once people dug into it a bit. So this is probably as good a way for that to happen as any. Plus, I think the results are public anyway. (I'm not sure whether node-daily-master results are viewable by anyone with a GitHub account. I think they are.)
FWIW I can reproduce this on a Linux system inside IBM, where it's flaky:
Applying the patch from @sam-github doesn't appear to help. Running
Or this:
(there's lots of
We can try to rule V8 changes in or out. As far as I can tell, this was bisected to the V8 8.3 upgrade, which came from 8.1? Using the version numbers there, here's the git log between the two versions: https://chromium.googlesource.com/v8/v8/+log/8.1.307.31..8.3.110.9/?pretty=fuller&n=10000

One thing that would be helpful to know from someone who can reproduce this: why does the t-test fail? Is it that the first execution is slow and then the rest are very fast? That would support the 'something gets cached incorrectly' theory.
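For readers without the test open: it times `crypto.timingSafeEqual()` on equal and on unequal inputs and then applies a t-test to the two samples of timings, failing when the statistic exceeds a threshold. Below is a minimal sketch of that kind of check; the helper names, the exact statistic, and the threshold are illustrative assumptions, not code copied from the test.

```js
'use strict';
// Illustrative sketch only: helper names, the statistic, and the threshold
// are assumptions, not taken from the actual pummel test.
function mean(xs) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

function sampleVariance(xs) {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic for two independent samples of call timings.
function tValue(equalTimes, unequalTimes) {
  const se = Math.sqrt(
    sampleVariance(equalTimes) / equalTimes.length +
    sampleVariance(unequalTimes) / unequalTimes.length);
  return (mean(equalTimes) - mean(unequalTimes)) / se;
}

// The test's pass/fail decision is essentially: fail if the two timing
// distributions look statistically distinguishable.
// assert(Math.abs(tValue(equalTimes, unequalTimes)) < SOME_THRESHOLD);
```

So a failure means the two timing distributions looked distinguishable in that run, whether because of a genuine timing difference or because of noise that the test's outlier handling did not absorb.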
I dug in a bit using V8 canary builds, since they provide a good way to bisect the combination of Node.js and V8 changes. Fake git bisect log
That means the causing commit is one of https://chromium.googlesource.com/v8/v8/+log/8.2.220..8.2.222/?pretty=fuller&n=10000. v8/v8@a4c1408 is the only one that sticks out to me. Unfortunately, reverting that patch does not apply cleanly on top of our 8.3 upgrade PR, so I’m not able to confirm. However! This appears to be something that has been fixed in one of the later V8 versions: Fake git bisect log
IIUC, that means that the fix is one of https://chromium.googlesource.com/v8/v8/+log/8.4.134..8.4.157/?pretty=fuller&n=10000. Unfortunately, that’s a lot of commits, and none of them particularly stands out to me as related to this test or to any of the commits that may have caused this. (This is all based on repeated runs of the test with up to 30 iterations. That’s not a lot, so there’s always a chance that I went wrong here somewhere.)
@psmarshall My understanding is that the code in the test is supposed to remove outliers, so that’s probably not it.
The fix might have been v8/v8@8c68536, which is in the fix range @addaleax provided. That would suggest the issue is sensitive to inlining. You could try configuring v8 with
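(The specific V8 configuration suggested above is not preserved in this thread. As a hedged aside, one generic way to experiment with the inlining hypothesis is to run the test with TurboFan inlining disabled via V8's `--no-turbo-inlining` flag; the helper below is hypothetical, not something from the repository, and assumes it is run from the repository root.)

```js
'use strict';
// Hypothetical repro helper: run the pummel test with TurboFan inlining
// disabled and propagate its exit status.
const { spawnSync } = require('child_process');

const result = spawnSync(process.execPath, [
  '--no-turbo-inlining',
  'test/pummel/test-crypto-timing-safe-equal-benchmarks.js',
], { stdio: 'inherit' });

// status is null if the child was killed by a signal; treat that as failure.
process.exit(result.status === null ? 1 : result.status);
```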
@psmarshall Yes. My understanding is that v8/v8@8c68536 is an optimization, so even if it might do the right thing for us here, it still means that something about the test isn’t quite right and should be fixed on our side?
So … I’m still trying to understand this. The test is very carefully written to avoid the influence of JS engine modifications, so I still don’t fully understand why they might influence this test one way or another. One thing I’ve noticed is that this flakiness goes away when testing the binding function directly instead of the JS function, so one thing I will try is to move the type checking that we do into C++ and see if that helps. It’s probably a good idea anyway for this particular case (and imo for other cases as well, but that’s way out of scope here).
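To make the distinction concrete, here is a simplified sketch of the two call paths (the names and checks are illustrative; the real code lives under lib/ and src/): the public API is a JS wrapper that validates its arguments before calling into C++, and it is that wrapper that V8 optimizes and may inline, while calling the binding directly bypasses it.

```js
'use strict';
// Simplified sketch; bindingTimingSafeEqual is a stand-in for the
// constant-time comparison implemented in C++.
function bindingTimingSafeEqual(a, b) {
  let mismatch = 0;
  for (let i = 0; i < a.length; i++) mismatch |= a[i] ^ b[i];
  return mismatch === 0;
}

// The public JS wrapper: argument validation happens here, in JS, and this
// is the layer that V8 can optimize and inline. Testing the binding
// directly (the case where the flakiness reportedly disappears) skips it.
function timingSafeEqual(a, b) {
  if (!ArrayBuffer.isView(a) || !ArrayBuffer.isView(b))
    throw new TypeError('both arguments must be ArrayBufferView instances');
  if (a.byteLength !== b.byteLength)
    throw new RangeError('both arguments must have the same byte length');
  return bindingTimingSafeEqual(a, b);
}
```

Moving that validation into the C++ binding leaves the exported function as essentially a direct binding call, which is the approach the fix below (#34141) takes.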
This makes the function more robust against V8 inlining. Fixes: nodejs#34073
Running with The
Oh just saw your PR @addaleax - that JS wrapper before the C++ function would do it for sure |
This makes the function more robust against V8 inlining. Fixes: #34073 PR-URL: #34141 Reviewed-By: Richard Lau <[email protected]> Reviewed-By: Ujjwal Sharma <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Tobias Nießen <[email protected]> Reviewed-By: Rich Trott <[email protected]> Reviewed-By: Zeyu Yang <[email protected]>
https://ci.nodejs.org/job/node-test-commit-custom-suites-freestyle/15073/console started via https://ci.nodejs.org/job/node-daily-master/1949/
Daily master builds appear to have been consistently broken due to this test for at least as long as we have history (since 27 May; we only keep a month's worth of results).
Previous issues with this test: