forked from rapidsai/cudf
Commit
De-DOS line-endings (rapidsai#14880)
These are the only two files in the repo (other than the sphinx make.bat files, which should have DOS line-endings) that use \r\n as the line-ending. Let's fix that.

Authors:
- Lawrence Mitchell (https://github.com/wence-)

Approvers:
- David Wendt (https://github.com/davidwendt)
- Nghia Truong (https://github.com/ttnghia)

URL: rapidsai#14880
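As an illustration of the kind of change described here, the following is a minimal sketch of detecting and normalizing `\r\n` line-endings on raw bytes. It is not the tool the author used (the commit does not say how the conversion was done); the helper names `has_crlf` and `to_lf` and the sample bytes are hypothetical.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the buffer contains at least one CRLF pair."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a bare LF; lone CR bytes are left untouched."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical sample standing in for a file saved with DOS line-endings.
sample = b"# Unicode Limitations\r\n\r\nThe strings column ...\r\n"
converted = to_lf(sample)
```

Working on bytes rather than decoded text avoids any implicit newline translation by the I/O layer, so the only bytes that change are the `\r` characters in front of each `\n`. That is consistent with a diff in which every line is reported as both removed and added while the visible text stays the same.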
Showing 2 changed files with 315 additions and 315 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,23 +1,23 @@ | ||
# Unicode Limitations | ||
|
||
The strings column currently supports only UTF-8 characters internally. | ||
For functions that require character testing (e.g. cudf::strings::all_characters_of_type()) or | ||
case conversion (e.g. cudf::strings::capitalize(), etc) only the 16-bit [Unicode 13.0](http://www.unicode.org/versions/Unicode13.0.0) | ||
character code-points (0-65535) values are supported. | ||
Case conversion and character testing on characters above code-point 65535 are not supported. | ||
|
||
Case conversions that are context-sensitive are not supported. Also, case conversions that result | ||
in multiple characters are not reversible. That is, adjacent individual characters will not be case converted | ||
to a single character. For example, converting character ß to upper case will result in the characters "SS". But converting "SS" to lower case will produce "ss". | ||
|
||
Strings case and type APIs: | ||
|
||
- cudf::strings::all_characters_of_type() | ||
- cudf::strings::to_upper() | ||
- cudf::strings::to_lower() | ||
- cudf::strings::capitalize() | ||
- cudf::strings::title() | ||
- cudf::strings::swapcase() | ||
|
||
Also, using regex patterns that use the shorthand character classes `\d \D \w \W \s \S` will include only appropriate characters with | ||
code-points between (0-65535). | ||
# Unicode Limitations | ||
|
||
The strings column currently supports only UTF-8 characters internally. | ||
For functions that require character testing (e.g. cudf::strings::all_characters_of_type()) or | ||
case conversion (e.g. cudf::strings::capitalize(), etc) only the 16-bit [Unicode 13.0](http://www.unicode.org/versions/Unicode13.0.0) | ||
character code-points (0-65535) values are supported. | ||
Case conversion and character testing on characters above code-point 65535 are not supported. | ||
|
||
Case conversions that are context-sensitive are not supported. Also, case conversions that result | ||
in multiple characters are not reversible. That is, adjacent individual characters will not be case converted | ||
to a single character. For example, converting character ß to upper case will result in the characters "SS". But converting "SS" to lower case will produce "ss". | ||
|
||
Strings case and type APIs: | ||
|
||
- cudf::strings::all_characters_of_type() | ||
- cudf::strings::to_upper() | ||
- cudf::strings::to_lower() | ||
- cudf::strings::capitalize() | ||
- cudf::strings::title() | ||
- cudf::strings::swapcase() | ||
|
||
Also, using regex patterns that use the shorthand character classes `\d \D \w \W \s \S` will include only appropriate characters with | ||
code-points between (0-65535). |
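The irreversibility described in the file above can be seen with any Unicode-aware case mapping. The sketch below uses Python's built-in `str` methods purely as an illustration; it does not call libcudf, and the `outside_bmp` helper and `BMP_MAX` constant are hypothetical names introduced here to flag characters whose code-points fall above 65535, the range the document says is unsupported for case conversion and character testing.

```python
BMP_MAX = 0xFFFF  # 65535, the highest code-point the document lists as supported


def outside_bmp(text):
    """Return the characters in `text` whose code-point is above BMP_MAX."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


# One character expands to two on upper-casing ...
upper = "\u00df".upper()      # U+00DF is the sharp s, "ß"
# ... and lower-casing maps each "S" separately, so "ß" is not recovered.
round_trip = upper.lower()

# U+10400 lies above 65535, so by the document's rule it would be out of range.
flagged = outside_bmp("a\U00010400b")
```

Here `upper` is `"SS"` and `round_trip` is `"ss"`, matching the example in the text, and `flagged` holds only the single character U+10400.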