Improve lazy performance of Data.Text.Lazy.inits
#572
Conversation
8ccae28 to d3c20b2
Is our test coverage good enough to generate lazy Text with multiple chunks?
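Regarding the coverage question, here is a minimal standalone sketch (not from the test suite; the helper name `multiChunk` is hypothetical) of how a lazy Text with several chunks can be constructed, so that `inits` must cross chunk boundaries:

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

-- Build a lazy Text from the given non-empty strings, one chunk each.
-- (TL.fromChunks drops empty chunks, so feed it non-empty strings.)
multiChunk :: [String] -> TL.Text
multiChunk = TL.fromChunks . map T.pack

main :: IO ()
main = do
  let t = multiChunk ["foo", "bar", "baz"]
  print (length (TL.toChunks t))     -- three separate chunks
  print (map TL.unpack (TL.inits t)) -- prefixes of "foobarbaz", crossing chunks
```

A QuickCheck generator along these lines (splitting an arbitrary string into random-sized chunks) would exercise the multi-chunk code paths.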
++ L.map (Chunk t) (inits' ts)
initsNE ts0 = Empty NE.:| inits' 0 ts0
  where
    inits' :: Int64 -- Number of previous chunks i
I think the number of chunks cannot exceed Int even on 32-bit machines.
This is a lazy structure, so it can, given enough time. For instance, initsNE (cycle "text").
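To make the point concrete, here is a small sketch (using the public Data.Text.Lazy API rather than initsNE directly) showing that an infinite lazy Text has unboundedly many chunks, yet inits stays productive:

```haskell
import qualified Data.Text.Lazy as TL

-- TL.cycle builds an infinite lazy Text, i.e. an unbounded chunk list.
-- inits is productive, so a finite prefix of the result can still be forced.
main :: IO ()
main = print (map TL.unpack (take 5 (TL.inits (TL.cycle (TL.pack "text")))))
```

Only the first few characters are ever demanded, so no infinite chunk count is actually materialized; the type question is about what a sufficiently patient consumer could, in principle, index.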
Well, if we want to be precise to that extent it should be Integer ;)
Yes, see text/tests/Tests/QuickCheckUtils.hs, line 98 in 240682e.
The previous implementation, itself based on an earlier version of `Data.List.inits`, inherited the flaw that accessing the i-th element took quadratic time O(i²). This now takes linear time O(i) as expected. The current version of `Data.List.inits` uses a banker's queue to obtain good performance when generating very long lists. For lazy text, consisting of a few big chunks, that benefit seems negligible. So I chose a simpler implementation.
(The Core was already good, but just to be safe.)
Thanks a ton!
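The simpler implementation chosen here can be illustrated with a list-level analogue (a sketch, not the PR's chunk-based code): keep a running count of how many elements precede the current position and rebuild each prefix with take, so forcing the i-th element costs O(i) instead of O(i²):

```haskell
-- List-level sketch of the counting approach. `go n` remembers that n
-- elements of the original input precede the current position, and rebuilds
-- the n-element prefix with `take n`. Forcing the i-th result is O(i),
-- and infinite inputs still work because each prefix is produced lazily.
initsLinear :: [a] -> [[a]]
initsLinear xs0 = [] : go 1 xs0
  where
    go _ []       = []
    go n (_:rest) = take n xs0 : go (n + 1) rest

main :: IO ()
main = do
  print (initsLinear "abc")                 -- ["","a","ab","abc"]
  print (take 3 (initsLinear (repeat 'x'))) -- ["","x","xx"]
```

In the PR itself the same idea operates on chunks rather than characters, with the count carried in the Int64 argument of inits'.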
Closes #562
Benchmarks included.
Quadratic growth before (the "2k" benchmarks take 4x the time of the "1k" benchmarks):
Linear growth after (the "2k" benchmarks take twice the time of the "1k" benchmarks):