
RELEASE-NOTES.md


Release 3.2

The goal of release 3.2 is to publish an updated version of the code that reflects the large number of improvements made since the previous release.

There are a few bugs that will survive this release, most notably in the AVLTreeDigest. These have to do with large numbers of repeated data points and are not new bugs.

There is also a lot of work going on with serialization. I need to hear from people about what they are doing with serialization so that we can build some test cases to allow an appropriate migration strategy to future serialization.
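One kind of test case that could support such a migration is a round-trip check against a version-tagged byte layout. The following is only a minimal sketch using `java.nio.ByteBuffer` from the standard library. The class `VersionedRoundTrip`, the magic number, the version field, and the layout itself are all hypothetical and are not the format that MergingDigest or AVLTreeDigest actually writes.

```java
import java.nio.ByteBuffer;

public class VersionedRoundTrip {
    // Hypothetical header values for illustration; NOT the library's wire format.
    static final int MAGIC = 0x7D1635;
    static final int VERSION = 1;
    static final int HEADER_BYTES = 12;   // magic + version + centroid count
    static final int CENTROID_BYTES = 16; // one double mean + one long count

    /** Simple holder for the decoded centroids. */
    static final class Centroids {
        final double[] means;
        final long[] counts;

        Centroids(double[] means, long[] counts) {
            this.means = means;
            this.counts = counts;
        }
    }

    /** Writes a header followed by (mean, count) pairs. */
    static byte[] write(double[] means, long[] counts) {
        if (means.length != counts.length) {
            throw new IllegalArgumentException("means and counts differ in length");
        }
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + means.length * CENTROID_BYTES);
        buf.putInt(MAGIC).putInt(VERSION).putInt(means.length);
        for (int i = 0; i < means.length; i++) {
            // Counts are written as long so that large counts do not wrap around.
            buf.putDouble(means[i]).putLong(counts[i]);
        }
        return buf.array();
    }

    /** Reads the layout produced by write, rejecting unknown headers. */
    static Centroids read(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("unrecognized magic number");
        }
        int version = buf.getInt();
        if (version != VERSION) {
            // A future reader could dispatch on the version here instead of failing.
            throw new IllegalArgumentException("unsupported version " + version);
        }
        int n = buf.getInt();
        double[] means = new double[n];
        long[] counts = new long[n];
        for (int i = 0; i < n; i++) {
            means[i] = buf.getDouble();
            counts[i] = buf.getLong();
        }
        return new Centroids(means, counts);
    }
}
```

A test built on this idea would serialize a digest, read it back, and compare the centroids. Keeping the serialized bytes from older releases as fixtures would then show whether a newer reader still accepts them.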

The paper continues to be updated. The algorithmic descriptions are getting reasonably clear, but the speed and accuracy sections need a complete revamp with current implementations.

Bugs, fixed and known

Fixed

The following important issues are fixed in this release:

Issue #90 Serialization for MergingDigest

Issue #92 Serialization for AVLTreeDigest

Maybe fixed

This issue has substantial progress, but lacks a definitive test to determine whether it should be closed.

Issue #78 Stability under merging.

Pushed

The following issues are pushed beyond this release:

Issue #87 Future proof and extensible serialization

Issue #89 Bad handling for duplicate values in AVLTreeDigest

All fixed issues

Here is a complete list of issues resolved in this release:

Issue #55 Add time decay to t-digest

Issue #52 General factory method for "fromBytes"

Issue #90 Deserialization of MergingDigest BufferUnderflowException in 3.1

Issue #92 Error in AVLTreeDigest.fromBytes

Issue #93 high centroid frequency causes overflow - giving incorrect results

Issue #67 Release of version 3.2

Issue #81 AVLTreeDigest with a lot of datas : integer overflow

Issue #75 Adjusting the centroid threshold values to obtain better accuracy at interesting values

Issue #74 underlying distribution : powerlaw

Issue #72 Inverse quantile algorithm is non-contiguous

Issue #65 totalDigest add spark dataframe column / array

Issue #60 Getting IllegalArgumentException when adding digests

Issue #53 smallByteSize methods are very trappy in many classes -- should be changed or have warnings in javadocs

Issue #82 TDigest class does not implement Serializable interface in last release.

Issue #42 Histogram

Issue #40 Improved constraint on centroid sizes

Issue #37 Allow arbitrary scaling laws for centroid sizes

Issue #29 Test method testScaling() always adds values in ascending order

Issue #84 Remove deprecated kinds of t-digest

Issue #76 Add serializability

Issue #77 Question: Proof of bounds on merging digest size

Issue #71 Simple alternate algorithm using maxima, ranks and fixed cumulative weighting

Issue #61 Possible improvement to the speed of the algorithm

Issue #58 jdk8 doclint incompatibility

Issue #48 Build is unstable under some circumstances

Issue #63 Which TDigest do you recommend?

Issue #62 Very slow performance; what am I missing?

Issue #47 Make TDigest serializable

Issue #49 MergingDigest.centroids is wrong on an empty digest