This library is an attempt to clean up text processing in Idris, especially with respect to text encodings.
The idea is that a string is a sequence of Unicode scalar values; this sequence can be encoded in various ways
- a sequence of bytes
- UTF-8
- UTF-16
- etc.
- a
List
of code points- like
String
in Haskell
- like
- etc.
This library should provide a convenient interface to them.
MORE DOC TBD
Data.Text.Text
as a UTF-8 specialisation ofData.Text.EncodedString
Data.Text.CodePoint.CodePoint
Data.Text.Encoding.Encoding
This is mostly an API prototype and its implementation should certainly be improved in various ways, especially wrt. performance and error checking.
(A very non-exhaustive) list of things to do:
- Create
Text
-only specialisations of the current modules (Data.Text
,Lightyear.Text
) to aid elaboration by fixing the encoding to UTF-8. (The encodinge
inEncodedString e
is sometimes not possible to infer if it is left totally general.)