Fast opt-out of shared structures per object? #112

Open
somebee opened this issue May 31, 2023 · 3 comments

Comments

@somebee
Contributor

somebee commented May 31, 2023

We are packing tons of objects into a sequential stream. Most objects follow a fairly consistent structure {id,name,...keys}, but within these objects there are several objects that I suspect do not benefit much from using structure definitions, like {someId:1,someOtherId:2}, where many of them don't show up more than once among 10k+ objects.

I see the option shouldShareStructures, but from reading the source I don't think it fits our needs. I'm wondering if it might make sense to be able to pass in a function like shouldUseStructures(value), which can return false if we want to pack an object as a plain old regular object?

Our function would be as simple as (value)=>!!value.id. Any object without an id should skip all the logic testing for shared structures, key combinations and all that. I can definitely add it, but was wondering if you think it would make a difference at all? I will try to hardcode it in locally and do some very informal tests here :)
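For illustration, here is how the proposed predicate would classify a few values. This is only a sketch: `shouldUseStructures` is the hypothetical option name from this comment, and the sample objects are made up.

```javascript
// Hypothetical predicate from the proposal: only objects with a truthy `id`
// are treated as candidates for shared structures.
const shouldUseStructures = (value) => !!value.id;

// Made-up sample values, not taken from the real data set.
const samples = [
  { id: 1, name: 'first', tags: [] }, // consistent shape, has an id
  { someId: 1, someOtherId: 2 },      // one-off key combination, no id
  { id: 0, name: 'zero' },            // id present but falsy
];

const decisions = samples.map((value) => shouldUseStructures(value));
console.log(decisions); // [ true, false, false ]
```

Note that `!!value.id` also opts out objects whose `id` is `0` or an empty string. If that matters for the data, a check such as `'id' in value` would be a stricter alternative.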

@somebee
Contributor Author

somebee commented May 31, 2023

Fwiw, skipping structures for these objects reduced the (compressed) size of the full stream from 500kb to 470kb, and the uncompressed size by ~200kb. Unpacking performance seems about the same, but it's hard to say since it's so damn fast either way. Intuitively I would think that packing perf is faster as well, but I haven't made an isolated case to test it.

@somebee
Contributor Author

somebee commented May 31, 2023

Made an isolated test with the real-world data we have.

useRecords(fn) – 1.35mb – pack: 10.2498ms unpack: 9.3448ms
useRecords – 1.46mb – pack: 11.4979ms unpack: 15.3999ms

So, in our use case it actually makes quite a substantial difference. I'll submit a PR today where the only public-facing change is that you can supply useRecords as either a function or a boolean. If it is a function, it will essentially call useRecords(value) for each value, and opt out to writePlainObject if it returns false. If you set useRecords to true/false, no code paths will change, so there is no performance impact for any other cases.
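The dispatch described above could look roughly like the following toy model. It is not the library's source: `makeObjectWriter`, `writeRecord`, and the stand-in writers are hypothetical names for illustration, and only `useRecords` and `writePlainObject` come from this thread.

```javascript
// Toy model of the proposed option handling; not the library's actual code.
function makeObjectWriter(useRecords, writers) {
  const { writeRecord, writePlainObject } = writers;
  if (typeof useRecords === 'function') {
    // Function form: ask the predicate for every value being packed.
    return (value) =>
      useRecords(value) ? writeRecord(value) : writePlainObject(value);
  }
  // Boolean form: pick the writer once, so no per-value check is added.
  return useRecords ? writeRecord : writePlainObject;
}

// Stand-in writers that only label which path was taken.
const writers = {
  writeRecord: (value) => ['record', Object.keys(value)],
  writePlainObject: (value) => ['plain', Object.keys(value)],
};

const write = makeObjectWriter((value) => !!value.id, writers);
console.log(write({ id: 7, name: 'a' })[0]);          // record
console.log(write({ someId: 1, someOtherId: 2 })[0]); // plain
```

In this sketch the boolean case returns the existing writer directly, which mirrors the claim that passing true/false leaves the other code paths untouched.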

@kriszyp
Owner

kriszyp commented May 31, 2023

I assume it wouldn't be viable to use a Map for the objects that are... maps :) (without consistent structure).
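As a rough sketch of what the Map alternative would involve, the data could be pre-processed so that id-less plain objects become Map instances before packing. This assumes the packer writes Map instances as plain maps without a structure definition, which should be checked against the library's documentation. `mapifyLooseObjects` is a hypothetical helper, not part of any library.

```javascript
// Hypothetical pre-processing step: turn plain objects without a truthy `id`
// into Map instances, recursing through arrays and nested plain objects.
function mapifyLooseObjects(value) {
  if (Array.isArray(value)) return value.map(mapifyLooseObjects);
  if (value === null || typeof value !== 'object') return value;
  // Leave Dates, Maps, and other class instances untouched.
  if (Object.getPrototypeOf(value) !== Object.prototype) return value;

  const entries = Object.entries(value).map(
    ([key, child]) => [key, mapifyLooseObjects(child)]
  );
  return value.id ? Object.fromEntries(entries) : new Map(entries);
}

const input = { id: 1, name: 'a', refs: { someId: 1, someOtherId: 2 } };
const output = mapifyLooseObjects(input);
console.log(output.refs instanceof Map); // true
console.log(output.refs.get('someId'));  // 1
```

The cost of this approach is an extra copy of the data on the packing side, and consumers would receive Map instances rather than plain objects unless the unpacker is configured to convert them back.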


2 participants