Optimize isSpace functions #315
Merged
@@ -101,6 +101,9 @@ byteStringChunksData = map (S.pack . replicate (4 ) . fromIntegral) intData
oldByteStringChunksData :: [OldS.ByteString]
oldByteStringChunksData = map (OldS.pack . replicate (4 ) . fromIntegral) intData

{-# NOINLINE loremIpsum #-}
loremIpsum :: S.ByteString
loremIpsum = S8.pack "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\nSed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem eum fugiat quo voluptas nulla pariatur?\n"
FWIW, I would fold this across multiple lines. The version I used was:
paragraphs :: S.ByteString
paragraphs = S8.pack $
"Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor\n\
\incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis\n\
\nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.\n\
\Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu\n\
\fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in\n\
\culpa qui officia deserunt mollit anim id est laborum.\n\
\\n\
\Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium\n\
\doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore\n\
\veritatis et quasi architecto beatae vitae dicta sunt explicabo. Nemo enim\n\
\ipsam voluptatem quia voluptas sit aspernatur aut odit aut fugit, sed quia\n\
\consequuntur magni dolores eos qui ratione voluptatem sequi nesciunt. Neque\n\
\porro quisquam est, qui dolorem ipsum quia dolor sit amet, consectetur,\n\
\adipisci velit, sed quia non numquam eius modi tempora incidunt ut labore et\n\
\dolore magnam aliquam quaerat voluptatem. Ut enim ad minima veniam, quis\n\
\nostrum exercitationem ullam corporis suscipit laboriosam, nisi ut aliquid\n\
\ex ea commodi consequatur? Quis autem vel eum iure reprehenderit qui in ea\n\
\voluptate velit esse quam nihil molestiae consequatur, vel illum qui dolorem\n\
\eum fugiat quo voluptas nulla pariatur?"
Though one paragraph is likely sufficient...

-- benchmark wrappers
---------------------

@@ -397,6 +400,10 @@ main = do
]
]
, bgroup "sort" $ map (\s -> bench (S8.unpack s) $ nf S.sort s) sortInputs
, bgroup "words"
[ bench "lorem ipsum" $ nf S8.words loremIpsum
, bench "one huge word" $ nf S8.words byteStringData
]
, bgroup "folds"
[ bgroup "foldl'" $ map (\s -> bench (show $ S.length s) $
nf (S.foldl' (\acc x -> acc + fromIntegral x) (0 :: Int)) s) foldInputs

This condition discriminates 127 out of 256 possibilities. Could you please benchmark `w .&. 0x50 == 0`, which discriminates 192 values?
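The 192 figure can be checked by brute force over all byte values. The following is a minimal sketch, not code from the PR: the names `passesMask`, `rejectedBy` and `spaces` are made up for illustration, and the whitespace set is assumed to be the Latin-1 one (HT, LF, VT, FF, CR, SP, NBSP).

```haskell
import Data.Bits ((.&.))
import Data.Word (Word8)

-- The mask test from the comment above: a byte passes only when
-- bits 4 and 6 (mask 0x50) are both clear.
passesMask :: Word8 -> Bool
passesMask w = w .&. 0x50 == 0

-- Number of byte values a filter rejects, out of all 256 possibilities.
rejectedBy :: (Word8 -> Bool) -> Int
rejectedBy p = length (filter (not . p) [minBound .. maxBound])

-- Assumed target set: HT, LF, VT, FF, CR, SP and NBSP.
-- Every one of these must pass a pre-filter for it to be safe.
spaces :: [Word8]
spaces = [0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x20, 0xa0]
```

Here `rejectedBy passesMask` evaluates to 192 (two fixed bits leave 64 of 256 values), and `all passesMask spaces` holds, so under the assumed whitespace set the mask never discards a real space.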
With the `& 0x50` test, I get noticeably better results, which also outperform the proposed PR on all the test cases.
But I get even better results combining both filters:
I just reran the PR as-is, as a sanity check that nothing changed in the meantime, and I get:
Between the two filters only 33 candidate characters are left:
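A minimal sketch of how such a candidate set can be enumerated. The form of the second filter is an assumption: `w - 0x21 > 0x7e` is one range test that rejects exactly 127 byte values, which matches the count quoted earlier, but the thread does not spell it out. The names `maskFilter`, `rangeFilter`, `candidates`, `isSpaceSketch` and `agreesWithBase` are hypothetical.

```haskell
import Data.Bits ((.&.))
import Data.Char (chr, isSpace)
import Data.Word (Word8)

-- First filter (from the review): bits 4 and 6 must be clear.
maskFilter :: Word8 -> Bool
maskFilter w = w .&. 0x50 == 0

-- Second filter, ASSUMED form: Word8 subtraction wraps around, so this
-- rejects exactly the 127 values 0x21..0x9f and passes everything else.
rangeFilter :: Word8 -> Bool
rangeFilter w = w - 0x21 > 0x7e

-- Byte values that survive both filters and still need an exact check.
candidates :: [Word8]
candidates = filter (\w -> maskFilter w && rangeFilter w) [minBound .. maxBound]

-- Both filters first, then the exact test: SP, NBSP, or 0x09..0x0d.
isSpaceSketch :: Word8 -> Bool
isSpaceSketch w =
  maskFilter w && rangeFilter w
    && (w == 0x20 || w == 0xa0 || w - 0x09 < 5)

-- Cross-check against Data.Char.isSpace on all 256 Latin-1 code points.
agreesWithBase :: Bool
agreesWithBase =
  all (\w -> isSpaceSketch w == isSpace (chr (fromIntegral w)))
      [minBound .. maxBound]
```

Under that assumption `length candidates` is 33: the values 0x00..0x0f, 0x20 and 0xa0..0xaf. Seven of them are whitespace; the rest are low control characters and 0xa1..0xaf.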
Cool stuff.
What's more, other than whitespace, almost all are infrequent in text strings (rather than binary data):
@ethercrow I see you've switched to the implementation I was testing. Are you seeing similar benchmark improvements on your hardware?
Yes, the version with two filters is the fastest for me as well.