Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align fs.ReadStream buffer pool writes to 8-byte boundary #24838

Closed
wants to merge 1 commit into from

Conversation

trxcllnt
Copy link
Contributor

@trxcllnt trxcllnt commented Dec 4, 2018

fs: align fs.ReadStream buffer pool writes to 8-byte boundary

Prevents alignment issues when people create a typed array from a chunk buffer, similar to 285d8c6.

Fixes #24817

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the fs Issues and PRs related to the fs subsystem / file system. label Dec 4, 2018
Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It might be infinitesimally nicer to move it into a function so you can simply write pool.used = align(pool.used), with align() being:

function align(n) {
  return (n + 7) & ~7;  // Align to 8 byte boundary.
}

@trxcllnt
Copy link
Contributor Author

trxcllnt commented Dec 4, 2018

👍 thanks @bnoordhuis, I moved the logic into a roundUpToMultipleOf8 function and updated the PR

@trxcllnt trxcllnt force-pushed the bugfix/24817 branch 2 times, most recently from 3c5ec58 to f530dc5 Compare December 5, 2018 05:30
thisPool.used = roundUpToMultipleOf8(alignedOffset);
} else if (toRead - bytesRead > kMinPoolSpace) {
alignedOffset = roundUpToMultipleOf8(start + bytesRead);
poolFragments.push(thisPool.slice(alignedOffset, start + toRead));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're only using alignedOffset inside the if/else arms, can you make it const and scope it to inside the arms?

Is start + toRead correct when alignedOffset may not be start + bytesRead? What I mean is, shouldn't start + toRead be rounded up too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rounding up is unsafe here, since toRead is clamped within the size of the pool on initialization. I had the same thought initially, but rounding it up caused unrelated tests to fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you agree it's slicing less than before? Note that slice() takes start and end arguments (inclusive), not start and length.

After thinking about this some more, I think that the rules/constraints should be:

  1. thisPool.used is always a multiple of 8 (the pool itself is already suitably aligned), and

  2. this particular slice() call should only slice multiples of 8 at offsets that are multiples of 8.

(1) implies (2) because (2) caches slices for reuse by (1).

(2) can be enforced by rounding start + toRead down.

Potential pitfall: rounding up or down can result in slices smaller than kMinPoolSpace. It should check that the slice's length >= kMinPoolSpace before pushing it into poolFragments.

Copy link
Contributor Author

@trxcllnt trxcllnt Dec 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is slicing less than before. By rounding up the start position, we advance 8 - n_remainder bytes past where we would have sliced. In this PR, pool.used (and thisPool.used) should always be a multiple of 8, which I believe addresses point (1).

With regards to (2), the byteLength of each slice as it is now isn't guaranteed to be a multiple of 8, though I think that's also a good idea. Performance-wise, it's best to align the toRead value to the processor's cache line size, but I left it as-is to keep the surface area of this PR to a minimum. I would be happy to add this though.

Good catch on the adjusted size causing the pool fragment size to dip below kMinPoolSpace, I'll update the PR with a fix here in a bit.

@trxcllnt trxcllnt force-pushed the bugfix/24817 branch 2 times, most recently from e13a20c to 0d29ef3 Compare December 5, 2018 23:40
Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else if (toRead - bytesRead > kMinPoolSpace)
poolFragments.push(thisPool.slice(start + bytesRead, start + toRead));
if (start + toRead === thisPool.used && thisPool === pool) {
const alignedOffset = thisPool.used + bytesRead - toRead;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiniest of nits: alignedOffset is a misnomer since it's not actually aligned (yet.)

const alignedOffset = thisPool.used + bytesRead - toRead;
thisPool.used = roundUpToMultipleOf8(alignedOffset);
} else {
const alignedEnd = (start + toRead) & ~7; // round down to multiple of 8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more micro-nit: comments should be capitalized and punctuated (and normally have two spaces before the // - I expect the linter will complain.)

@trxcllnt trxcllnt force-pushed the bugfix/24817 branch 2 times, most recently from 899e641 to 73d1ca2 Compare December 6, 2018 23:34
Copy link
Member

@bnoordhuis bnoordhuis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trxcllnt
Copy link
Contributor Author

@bnoordhuis what else needs to be done here? Are those three failing checks related to this change? I tried parsing through the console output, but can't see any indication of why it's failing.

@bnoordhuis
Copy link
Member

Let's run the CI again and see what happens: https://ci.nodejs.org/job/node-test-pull-request/19989/

@addaleax
Copy link
Member

@trxcllnt Do you think you could rebase this? Sorry for the merge conflicts!

@trxcllnt
Copy link
Contributor Author

@addaleax just resolved the conflict via the github UI, let me know if I should do anything further. Thanks!

@addaleax
Copy link
Member

@trxcllnt One problem is that merge conflicts break our CI and make landing changes harder – can you use git rebase rather than a merge commit?

(We can probably figure this out another way if not, but this would definitely make things easier)

@trxcllnt
Copy link
Contributor Author

@addaleax sure thing -- rebased from upstream/master and resolved the conflict locally. I'm always wary of resolving conflicts during a rebase, so let me know if it doesn't look right.

@trxcllnt
Copy link
Contributor Author

trxcllnt commented Mar 31, 2019

@bnoordhuis @addaleax anything left to do here? The node folks in the Apache Arrow project would appreciate if we could merge this soon 🙏 since we have a workaround in place that has to resort to copying chunks if the offset isn't aligned. It can be a nasty/hidden performance cliff for users if they're doing a lot of file I/O. Let me know if there's anything else I can do ❤️

@addaleax
Copy link
Member

@trxcllnt It looks like the rebase went wrong somehow? There’s 22000 added lines in this PR now…

Generally, there’s nothing in the way of landing it, and bumping the thread is the exact right thing to do :)

@trxcllnt trxcllnt force-pushed the bugfix/24817 branch 2 times, most recently from d6d9c3f to bdc2ba4 Compare April 1, 2019 00:26
@trxcllnt
Copy link
Contributor Author

trxcllnt commented Apr 1, 2019

@addaleax thanks for the heads up! should be fixed now :-)

@nodejs-github-bot
Copy link
Collaborator

Prevents alignment issues when creating a typed array from a buffer.

Fixes: nodejs#24817
@trxcllnt
Copy link
Contributor Author

trxcllnt commented May 1, 2019

@addaleax bump :-]

@nodejs-github-bot
Copy link
Collaborator

@addaleax
Copy link
Member

addaleax commented May 2, 2019

Landed in b884ceb 🎉 Thanks for the PR!

@addaleax addaleax closed this May 2, 2019
addaleax pushed a commit that referenced this pull request May 2, 2019
Prevents alignment issues when creating a typed array from a buffer.

Fixes: #24817

PR-URL: #24838
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
targos pushed a commit that referenced this pull request May 4, 2019
Prevents alignment issues when creating a typed array from a buffer.

Fixes: #24817

PR-URL: #24838
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Anna Henningsen <[email protected]>
@targos targos mentioned this pull request May 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unaligned writes to fs.ReadStream internal Buffer pool leads to misaligned reads
5 participants