Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

smallByteSize methods are very trappy in many classes -- should be changed or have warnings in javadocs #53

Closed
hossman opened this issue Apr 27, 2015 · 1 comment

Comments

@hossman
Copy link

hossman commented Apr 27, 2015

When integrating t-digest into Solr, the first patch contributed had code which looked like this...

ByteBuffer buf = ByteBuffer.allocate(tdigest.smallByteSize());
tdigest.asSmallBytes(buf);
res.add("percentiles", buf.array());

...the contributor who wrote that code did so based on the javadocs of ArrayDigest.smallByteSize, not realizing that under the covers hte internals of this method are:

  • allocate a ByteBuffer based on the result of byteSize()
  • do a complete serialization using asSmallBytes(ByteBuffer)
  • return the final position of the ByteBuffer

...meaning that ultimately at least 2x the smallByteSize of RAM was being allocated everytime, and the ArrayDigest was being serialized twice.

Methods like ArrayDigest.smallByteSize (AVLTreeDigest.smallByteSize, etc...) that have such a high internal cost should either be changed to just call byteSize() (since the overallocation the user will have to do is still better for space/time performance then doing that same overallocation internally plus a behind the scenes serialiation that is thrown away) or warn users about this heavy internal cost in the javadocs.

@tdunning
Copy link
Owner

I am opting to warn users away from these methods. I see no way to find out the size without doing the compression. There is probably a better API based on streams that would avoid this need. Or maybe over-allocation idioms are the rule of the day.

Either way, I am taking the easy way out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants