RFC: consider splitting out shortbytestring #444
Comments
@hasufell If you want to have a
Hmm, well I doubt that we'll ever have two boot packages that both expose To simplify the migration, maybe we can re-create the existing
What does this mean? I thought
Yes it does. What I mean is that
The module names don't have to be the same. I don't mind breaking backwards compat, but given the recent discussion on the CLC issue tracker I thought maybe try to not break it. Literally every package using Also, I'd want to avoid pulling in more core libraries into the equation for now, otherwise the scope goes out of hand and I will burn out from too many open PRs and discussions.
Right, thanks for clarifying. Do note that
Good!
@sjakobi it was me who suggested @hasufell that we can split One more thing to consider is that
All boot packages evolve at the same pace with GHC. You cannot deliver a new version sooner than it is bundled with a new GHC release. And you cannot expect to fix bugs by putting a new version onto Hackage and deprecating an older one, instead users must switch to a newer GHC. Essentially it means that you want a very high level of stability for any boot package. Another way around is to integrate all encoding-independent functions from @hasufell I'm really sorry, I'm very busy with
That sounds like a fairly gross simplification to me. At least with
Does this roughly amount to providing the same API that we have for
It roughly amounts to adopting https://hackage.haskell.org/package/shortbytestring-0.2.0.0/docs/Data-ByteString-Short.html into
Looks good. 👍
I don't see a clear verdict here. Do we split out or do we not? If not, what's the exact course of action?
@hasufell Let's merge (encoding-agnostic) https://hackage.haskell.org/package/shortbytestring-0.2.0.0/docs/Data-ByteString-Short.html into
I'd personally prefer a split, but the primary goal is to get more eyes on the codebase (including the UTF-16 variant), so that things like In that sense: whatever the strategy, I need reviews for the entire codebase. Also: merging only the encoding agnostic part won't save me the trouble of adding a new boot library, would it? And what if ShortByteString as a type becomes obsolete in the future? Will we drop Would a split not make it easier for bytestring to change its memory semantics, because it can simply drop the ShortByteString modules with relatively little effort and provide a simple migration guide for legacy codebases? But I don't have a strong opinion.
You have to add a new boot library anyway, but without a split there is less impact and churn. In fact with a split you'd likely end up with two new boot libraries, because The trouble is that if
Why would it become obsolete?
Are you talking about
I opened #471 ...good luck 😅
* Merge `shortbytestring` package back into `bytestring` wrt #444
* Fix build on ARM
  Reusing compareByteArrays and avoiding excessive pointer arithmetic.
* Speed up reverse by using byteSwap64 tricks
* Remove phase control from inlines
* Improve performance of elemIndex
* Use setByteArray in replicate
* Implement intercalate manually
* Annotate partial functions with HasCallStack
* Fix build on base < 4.12.0.0
* Add uncons/unsnoc
* Correct complexities
* Exclude reverse optimization path from ARM
  It seems to cause segfaults on armv7, suggesting there are issues with 'indexWord8ArrayAsWord64#'. All other platforms are fine and tests pass.
* Add benchmarks for ShortByteString
* Improve inlining
* Adjust haddock identifiers
* Get rid of writeCharArray#
* Haddock fixes
* Clean up tests
* Use -fexpose-all-unfoldings
* Improve reverse
* Cleanup 'reverse'
* Fix possible GC race with foreign imports
  For more information, see #471 (comment)
* Disable asserts in shortbytestring.c
* Remove redundant import
* Add documentation about partial functions
* Fold ShortByteString prop tests into ByteString
* Restore previous INLINEs
* Improve naming of bindings
* Consolidate error handling functions
* Remove trailing whitespace
* Fix uncons in documentation
* Rename indexWord64Array to indexWord8ArrayAsWord64
* Improve error message
* Clean up incorrect documentation
* Use div/mod instead of quot/rem
* Simplify branching in reverse
* Move asserts to Haskell
* Prefix C functions
* Fix return type of c_elem_index
* Fix documentation in unfoldrN
* Make unfoldrN more efficient
* Fix maintainer field
* Fix formatting
* Implement takeEnd, dropEnd and splitAt manually
* Fix some haddock identifiers
* Fix unfoldrN doc
* Add a primops bounds-checking job to CI
* Document and clean up createAndTrim
* Rename errorEmptyList to errorEmptySBS
* Improve documentation for findFromEndUntil
* Improve documentation and naming
* Optimize out quotRem
* Document compareByteArraysOff
* Simplify findIndexOrLength and findFromEndUntil
* Use c_count for count
* Simplify elemIndex
* Remove use of 'mempty'
* Make sure breakSubstring is inlined into isInfixOf
* Simplify stripSuffix and stripPrefix
* Fix redundant import warnings
* Improve 'take'
* Use existing bounds check in 'drop'
* Avoid 'create' when bytestring is empty
* Optimize filter
* Remove redundant INLINABLE
* Use shorter 'createAndTrim' in 'filter'
* Simplify 'take'
* Simplify 'drop'
* Better formatting
* Add comment to explain DNDEBUG
* Refactor elemIndex
* Optimize 'partition'
* Optimize hot loop in 'partition'
* Merge `shortbytestring` package back into `bytestring` wrt #444
* Fix build on ARM
  Reusing compareByteArrays and avoiding excessive pointer arithmetic.
* Speed up reverse by using byteSwap64 tricks
* Remove phase control from inlines
* Improve performance of elemIndex
* Use setByteArray in replicate
* Implement intercalate manually
* Annotate partial functions with HasCallStack
* Fix build on base < 4.12.0.0
* Add uncons/unsnoc
* Correct complexities
* Exclude reverse optimization path from ARM
  It seems to cause segfaults on armv7, suggesting there are issues with 'indexWord8ArrayAsWord64#'. All other platforms are fine and tests pass.
* Add benchmarks for ShortByteString
* Improve inlining
* Adjust haddock identifiers
* Get rid of writeCharArray#
* Haddock fixes
* Clean up tests
* Use -fexpose-all-unfoldings
* Improve reverse
* Cleanup 'reverse'
* Fix possible GC race with foreign imports
  For more information, see #471 (comment)
* Disable asserts in shortbytestring.c
* Remove redundant import
* Add documentation about partial functions
* Fold ShortByteString prop tests into ByteString
* Restore previous INLINEs
* Improve naming of bindings
* Consolidate error handling functions
* Remove trailing whitespace
* Fix uncons in documentation
* Rename indexWord64Array to indexWord8ArrayAsWord64
* Improve error message
* Clean up incorrect documentation
* Use div/mod instead of quot/rem
* Simplify branching in reverse
* Move asserts to Haskell
* Prefix C functions
* Fix return type of c_elem_index
* Fix documentation in unfoldrN
* Make unfoldrN more efficient
* Fix maintainer field
* Fix formatting
* Implement takeEnd, dropEnd and splitAt manually
* Fix some haddock identifiers
* Fix unfoldrN doc
* Add a primops bounds-checking job to CI
* Document and clean up createAndTrim
* Rename errorEmptyList to errorEmptySBS
* Improve documentation for findFromEndUntil
* Improve documentation and naming
* Optimize out quotRem
* Document compareByteArraysOff
* Simplify findIndexOrLength and findFromEndUntil
* Use c_count for count
* Simplify elemIndex
* Remove use of 'mempty'
* Make sure breakSubstring is inlined into isInfixOf
* Simplify stripSuffix and stripPrefix
* Fix redundant import warnings
* Improve 'take'
* Use existing bounds check in 'drop'
* Avoid 'create' when bytestring is empty
* Optimize filter
* Remove redundant INLINABLE
* Use shorter 'createAndTrim' in 'filter'
* Simplify 'take'
* Simplify 'drop'
* Better formatting
* Add comment to explain DNDEBUG
* Refactor elemIndex
* Optimize 'partition'
* Optimize hot loop in 'partition'

(cherry picked from commit 731caea)
@hasufell can we close this now?
Two more questions:
Do you mean a newtype of
Maybe put it into a new package
Well, then that would require a new boot library.
I would have preferred another package too.
Should this new package be based off the recently added
Where is this need for a new package coming from? We already have
@sjakobi you can always put more things in the same package. It's better to flip things around: If I can use More libraries frees us up to get more creative with the bootstrapping. E.g. we have a looming
You would not believe it ;) bytestring/Data/ByteString/Short/Internal.hs Lines 293 to 299 in 4e62154
Just to be clear: the decision to expand
That instance could move to
As part of my efforts of the abstract filepath proposal, I have added a lot of missing functions from the ShortByteString modules and published it as the shortbytestring package.
Since I'm now maintainer of `filepath` I will move the AFPP forward inside filepath via new modules (that will require patches to unix and Win32). I believe it makes sense to move all ShortByteString stuff into the new `shortbytestring` package, because:

For backwards compat, bytestring could/should re-export the current API, no more, no less. That also means shortbytestring can't depend on it and the functions `toShort` and `fromShort` will have to live in `bytestring` for the foreseeable future. This can be added to the module documentation to avoid confusion.

The code sharing between both packages is otherwise negligible, imo.

This also means that `shortbytestring` will have to become a boot package (and be properly reviewed/audited beforehand).

@Bodigrim @bgamari
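
For readers less familiar with the two conversion functions mentioned above, here is a minimal sketch of how `toShort` and `fromShort` are used today via `Data.ByteString.Short` in `bytestring`. The helper names `roundTrip` and `shortLength` are hypothetical and only for illustration; they are not part of either package, and this says nothing about how a split-out package would lay out its modules.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Short as SBS

-- Hypothetical helper: convert a strict ByteString to a ShortByteString
-- and back again. Each direction copies the payload, so this is O(n).
roundTrip :: B.ByteString -> B.ByteString
roundTrip = SBS.fromShort . SBS.toShort

-- Hypothetical helper: length in bytes, measured on the short
-- (unpinned) representation.
shortLength :: B.ByteString -> Int
shortLength = SBS.length . SBS.toShort
```

Since both conversions need to know about `ByteString` and `ShortByteString` at once, whichever package hosts them has to see both types, which seems to be the dependency constraint described above.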