Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potentially expensive trimming of Chunk bytes #1571

Closed
bbannier opened this issue Oct 11, 2023 · 0 comments · Fixed by #1572
Closed

Potentially expensive trimming of Chunk bytes #1571

bbannier opened this issue Oct 11, 2023 · 0 comments · Fixed by #1572
Assignees
Labels
Runtime Library Issues related to the HILTI or Spicy runtime libraries

Comments

@bbannier
Copy link
Member

bbannier commented Oct 11, 2023

A Chunk currently directly wraps containers holding its data. When calling Chunk::trim we directly trim these wrapped values,

memmove(a->second.data(), begin, a->first.Ref());
v.erase(v.begin(), v.begin() + static_cast<Vector::difference_type>((o - _offset).Ref()));

For the Zeek case we typically receive larger chunks and extract small bits from them, e.g., imagine a unit parsing a uint8[]. The current trim implementation makes sure that we free as much memory as possible (by removing each individual byte as soon as possible), but potentially incurs a performance overhead for that (e..g, std::vector::erase requires touching all bytes in the Chunk). The current implementation shows pathological performance for e.g., inputs generated by oss-fuzz.

We should consider using a cheaper implementation where Chunk::trim just moves the begin of a Chunk, but does not free the memory anymore.

@bbannier bbannier added the Runtime Library Issues related to the HILTI or Spicy runtime libraries label Oct 11, 2023
@bbannier bbannier self-assigned this Oct 11, 2023
bbannier added a commit that referenced this issue Oct 12, 2023
Trimming `Chunk`s (always from the left) causes a lot of internal work
with only limited benefit since we manage visbility with `stream::View`s
on top of `Chunk`s anyway.

This patch removes trimming inside `Chunk`s so now any trimming only
removes `Chunk`s from `Chain`s, but does not internally change
individual `Chunk`s anymore. This might lead to slighly increased memory
use, but callers usually have that data in memory anyway.

Closes #1571.
bbannier added a commit that referenced this issue Oct 12, 2023
Trimming `Chunk`s (always from the left) causes a lot of internal work
with only limited benefit since we manage visbility with `stream::View`s
on top of `Chunk`s anyway.

This patch removes trimming inside `Chunk`s so now any trimming only
removes `Chunk`s from `Chain`s, but does not internally change
individual `Chunk`s anymore. This might lead to slighly increased memory
use, but callers usually have that data in memory anyway.

Closes #1571.
bbannier added a commit that referenced this issue Oct 13, 2023
Trimming `Chunk`s (always from the left) causes a lot of internal work
with only limited benefit since we manage visibility with `stream::View`s
on top of `Chunk`s anyway.

This patch removes trimming inside `Chunk`s so now any trimming only
removes `Chunk`s from `Chain`s, but does not internally change
individual `Chunk`s anymore. This might lead to slightly increased memory
use, but callers usually have that data in memory anyway.

Closes #1571.

(cherry picked from commit 625a607)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Runtime Library Issues related to the HILTI or Spicy runtime libraries
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant