Potential performance improvement #2

NickStrupat · 2018-04-13T00:59:25Z

Hi! Thanks for the awesome work here. I've been tinkering with it for the past couple of evenings.

I've been testing for potential performance improvements by switching from std::vector with your allocator to blocks of memory allocated on the stack (via alloca). I'm seeing roughly a 25% improvement for short strings, but roughly a 25% degradation for longer strings. Do you have any ideas why that might be?

Also, I haven't yet done a proper multi-threaded test. I suspect alloca will show a larger improvement there, since the threads would no longer need to sync on heap allocation through _mm_malloc. If I get some time next week I will try that, and I'd love to hear what you think of this approach.

The changes I made to achieve the stack allocation are here...
NickStrupat@58ee960


addaleax · 2018-04-14T16:02:46Z

I like the idea, but switching to alloca unconditionally might not be ideal, because it’s completely possible to bring things to a stack overflow using a single call that way…

What we do in Node.js core a lot is using a fixed-size stack-allocated buffer for small strings, and deferring to a larger, heap-allocated buffer for longer ones.
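
As a rough illustration of that pattern, here is a minimal sketch of a buffer that uses inline (stack) storage up to a fixed size and falls back to the heap beyond it. The name `StackOrHeapBuffer` and its interface are hypothetical, made up for this example; it is not the helper Node.js core uses, and it leaves out the alignment that `_mm_malloc` provides for SIMD loads.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper for illustration only; not part of this library
// or of Node.js. Holds up to N elements inline (on the stack when the
// object is a local variable) and allocates on the heap for more.
template <typename T, std::size_t N>
class StackOrHeapBuffer {
 public:
  explicit StackOrHeapBuffer(std::size_t count) : size_(count) {
    if (count > N) {
      // Too large for the inline storage: defer to the heap.
      heap_.reset(new T[count]);
      data_ = heap_.get();
    } else {
      // Small enough: no heap allocation at all.
      data_ = inline_;
    }
  }

  // data_ may point into this object, so copying is disabled.
  StackOrHeapBuffer(const StackOrHeapBuffer&) = delete;
  StackOrHeapBuffer& operator=(const StackOrHeapBuffer&) = delete;

  T* data() { return data_; }
  std::size_t size() const { return size_; }
  bool on_heap() const { return heap_ != nullptr; }

 private:
  T inline_[N];                // fixed-size inline storage
  std::unique_ptr<T[]> heap_;  // set only when count > N
  T* data_;
  std::size_t size_;
};
```

Unlike an unconditional alloca, the stack usage here is bounded by `N` regardless of the input length, so a single call with a very long string cannot overflow the stack.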

Also, this library is generally better at handling long strings – for short ones, vectorization doesn’t really pay off, and there’s the additional overhead of JS → C++ calls.
