Releases: gagolews/stringi
stringi_1.7.3
[BUGFIX] Fixed the previous patch of ICU55 causing a build failure on,
amongst others, CRAN's Solaris-based target.
stringi_1.7.2
- [BUGFIX] Workaround for a bug in tools::checkFF failing when NA_character_ is passed to .Call.
stringi_1.7.1
What Is New in stringi
1.7.1 (2021-07-14)
- [BACKWARD INCOMPATIBILITY] %s$% and %stri$% now use the new stri_sprintf function (see below) instead of base::sprintf.
- [BACKWARD INCOMPATIBILITY, NEW FEATURE] In stri_sub<- and stri_sub_all<-, providing a negative length no longer results in the corresponding input string being altered.
- [BACKWARD INCOMPATIBILITY, NEW FEATURE] In stri_sub and stri_sub_all, a negative length results in the corresponding output being NA or not extracted at all, depending on the setting of the new argument ignore_negative_length.
- [BACKWARD INCOMPATIBILITY, BUGFIX, NEW FEATURE] In stri_subset* and their replacement versions, pattern and value cannot be longer than str (but now they are recycled if necessary).
- [BACKWARD INCOMPATIBILITY, NEW FEATURE] stri_sub* now accept the from argument being a matrix like cbind(from, length=length). Unnamed columns or any other names are still interpreted as cbind(from, to). Also, the new argument use_matrix can be used to disable the special treatment of such matrices.
- [DOCUMENTATION] It has been clarified that the syntax of *_charclass (e.g., used in stri_trim*) differs slightly from regex character classes.
- [NEW FEATURE] #420: stri_sprintf (alias: stri_string_format) is a Unicode-aware replacement for and enhancement of the base sprintf: it adds a customised handling of NAs (on demand), computing field size based on code point width, outputting substrings of at most given width, variable width and precision (both at the same time), etc. Moreover, stri_printf can be used to display formatted strings conveniently.
- [NEW FEATURE] #153: stri_match_*_regex now extract capture group names.
- [NEW FEATURE] #25: stri_locate_*_regex now have a new argument, capture_groups, which allows for extracting positions of matches to parenthesised subexpressions.
- [NEW FEATURE] stri_locate_* now have a new argument, get_length, whose setting may result in generating from-length matrices (instead of from-to ones).
- [NEW FEATURE] #438: stri_trans_general now supports rule-based as well as reverse-direction transliteration.
- [NEW FEATURE] #434: stri_datetime_format and stri_datetime_parse are now vectorised also with respect to the format argument.
- [NEW FEATURE] stri_datetime_fstr has a new argument, ignore_special, which defaults to TRUE for backward compatibility.
- [NEW FEATURE] stri_datetime_format, stri_datetime_add, and stri_datetime_fields now call as.POSIXct more eagerly.
- [NEW FEATURE] stri_trim* now have a new argument, negate.
- [NEW FEATURE] stri_replace_rstr converts gsub-style replacement strings to stri_replace-style.
- [INTERNAL] stri_prepare_arg* have been refactored; buffer overruns in the exception handling subsystem are now avoided.
- [BUGFIX] A few functions (stri_length, stri_enc_toutf32, etc.) did not throw an exception on an invalid UTF-8 byte sequence (and merely issued a warning instead).
- [BUGFIX] stri_datetime_fstr did not honour NA_character_ and did not parse format strings such as "%Y%m%d" correctly. It has now been completely rewritten (in C).
- [BUGFIX] stri_wrap did not recognise the width of certain Unicode sequences correctly.
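
A few of the 1.7.1 changes above can be tried out with the short sketch below. It assumes stringi 1.7.1 or later is installed; the sample string and variable names are made up for illustration and do not come from the release notes.

```r
library(stringi)

x <- "spam, eggs, and bacon"  # illustrative input

# From-to matrix (unnamed columns): characters 7 through 10.
from_to <- stri_sub(x, cbind(7, 10))                    # "eggs"

# From-length matrix: the second column must be named "length".
from_len <- stri_sub(x, cbind(from = 7, length = 4))    # "eggs"

# get_length=TRUE makes stri_locate_* return a from-length matrix,
# which can then be fed back to stri_sub.
m <- stri_locate_first_regex(x, "e[a-z]+", get_length = TRUE)
located <- stri_sub(x, m)                               # "eggs"

# stri_sprintf pads fields and, on demand, prints NAs as a chosen string.
padded  <- stri_sprintf("%5s|", "abc")                  # "  abc|"
with_na <- stri_sprintf("%s", NA_character_, na_string = "<NA>")

# %s$% now dispatches to stri_sprintf rather than base::sprintf.
via_op <- "value=%d" %s$% 3L                            # "value=3"
```

Naming the second column length is what switches stri_sub to the from-length interpretation; passing use_matrix=FALSE disables the matrix handling altogether.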
stringi_1.6.2
- [BACKWARD INCOMPATIBILITY] In stri_enc_list(), simplify now defaults to TRUE.
- [NEW FEATURE] #425: The outputs of stri_enc_list(), stri_locale_list(), stri_timezone_list(), and stri_trans_list() are now sorted.
- [NEW FEATURE] #428: In stri_flatten, na_empty=NA now omits missing values.
- [BUILD TIME] #431: Pre-4.9.0 GCC has ::max_align_t, but not std::max_align_t; a (possible) workaround has been added, see the INSTALL file.
- [BUGFIX] #429: stri_width() misclassified the width of certain code points (including grave accent, Eszett, etc.); General category Sk (Symbol, modifier) is no longer of width 0, UCHAR_EAST_ASIAN_WIDTH of U_EA_AMBIGUOUS is no longer of width 2.
- [BUGFIX] #354: ALTREP CHARSXPs were not copied, and thus could have been garbage collected in the so-called meanwhile (with thanks to @jimhester).
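
As an illustration of the na_empty=NA setting mentioned above, the following sketch compares the three ways stri_flatten can treat missing values. It assumes stringi 1.6.2 or later; the input vector is invented for the example.

```r
library(stringi)

x <- c("a", NA, "b")  # illustrative input with one missing value

# Default (na_empty=FALSE): a single NA makes the whole result missing.
res_default <- stri_flatten(x, collapse = ",")

# na_empty=TRUE: missing values are treated as empty strings.
res_empty <- stri_flatten(x, collapse = ",", na_empty = TRUE)   # "a,,b"

# na_empty=NA: missing values are omitted altogether.
res_omit <- stri_flatten(x, collapse = ",", na_empty = NA)      # "a,b"
```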
stringi_1.6.1
What Is New in stringi
1.6.1 (2021-05-05)
- [GENERAL] #401: stringi is now bundled with ICU4C 69.1 (upgraded from 61.1), which is used on most Windows and OS X builds as well as on *nix systems not equipped with system ICU. However, if the C++11 support is disabled, stringi will be built against the battle-tested ICU4C 55.1. The update to ICU brings Unicode 13.0 and CLDR 39 support.
- [DOCUMENTATION] A draft version of a paper on stringi is now available at https://stringi.gagolewski.com/_static/vignette/stringi.pdf
- [GENERAL] stringi now requires R >= 3.1 (CXX_STD of CXX11 or CXX1X).
- [NEW FEATURE] #408: stri_trans_casefold() performs case folding; this is different from case mapping, which is locale-dependent. Folding makes two pieces of text that differ only in case identical. This can come in handy when comparing strings.
- [NEW FEATURE] #421: stri_rank() ranks strings in a character vector (e.g., for ordering data frames with regard to multiple criteria, the ranks can be passed to order(), see #219).
- [NEW FEATURE] #266: stri_width() now supports emojis.
- [NEW FEATURE] %s$% and %stri$% are now vectorised with respect to both arguments.
- [BUGFIX] stri_sort_key() now outputs bytes-encoded strings.
- [BUGFIX] #415: locale='' was not equivalent to locale=NULL in stri_opts_collator().
- [INTERNAL] #414: Use the LEVELS(x) macro instead of accessing (x)->sxpinfo.gp directly (@lukaszdaniel).
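
The case folding and ranking features above might be used along these lines. This is only a sketch assuming stringi 1.6.1 or later; the strings and the data frame are made up for illustration.

```r
library(stringi)

# Case folding: two spellings that differ only in case compare equal
# once folded (the German sharp s folds to "ss").
same <- stri_trans_casefold("STRASSE") == stri_trans_casefold("Stra\u00dfe")

# Ranking: tied strings share a rank, so the ranks can serve as the
# first of several keys passed to order().
df <- data.frame(name = c("b", "a", "b"), n = c(1, 2, 0))
ranks <- stri_rank(df$name)
ord <- order(ranks, df$n)   # sort by name, then by n within ties
sorted <- df[ord, ]
```

Using stri_rank() here rather than order() on the raw strings keeps the locale-aware collation of stringi while still letting base order() break ties on the second column.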
stringi_1.5.3
1.5.3 (2020-09-04) CRAN
- [NEW FEATURE] #400: %s$% and %stri$% are now binary operators that call base R's sprintf().
- [NEW FEATURE] #399: The %s*% and %stri*% operators can be used in addition to stri_dup(), for the very same purpose.
- [NEW FEATURE] #355: stri_opts_regex() now accepts the time_limit and stack_limit options so as to prevent malformed or malicious regexes from running for too long.
- [NEW FEATURE] #345: stri_startswith() and stri_endswith() are now equipped with the negate parameter.
- [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
- [DEPRECATION WARNING] #347: Any unknown option passed to stri_opts_fixed(), stri_opts_regex(), stri_opts_coll(), and stri_opts_brkiter() now generates a warning. In the future, the ... parameter will be removed, so that will be an error.
- [DEPRECATION WARNING] stri_duplicated()'s fromLast argument has been renamed from_last. fromLast is now its alias scheduled for removal in a future version of the package.
- [DEPRECATION WARNING] stri_enc_detect2() is scheduled for removal in a future version of the package. Use stri_enc_detect() or the more targeted stri_enc_isutf8(), stri_enc_isascii(), etc., instead.
- [DEPRECATION WARNING] stri_read_lines(), stri_write_lines(), stri_read_raw(): use the con argument instead of fname now. The argument fallback_encoding is scheduled for removal and is no longer used. stri_read_lines() does not support encoding="auto" anymore.
- [DEPRECATION WARNING] nparagraphs in stri_rand_lipsum() has been renamed n_paragraphs.
- [NEW FEATURE] #398: Alternative, British spelling of function parameters has been introduced, e.g., stri_opts_coll() now supports both normalization and normalisation.
- [NEW FEATURE] #393: stri_read_bin(), stri_read_lines(), and stri_write_lines() are no longer marked as draft API.
- [NEW FEATURE] #187: stri_read_bin(), stri_read_lines(), and stri_write_lines() now support connection objects as well.
- [NEW FEATURE] #386: New function stri_sort_key() for generating locale-dependent sort keys which can be ordered at the byte level and return an equivalent ordering to the original string (@DavisVaughan).
- [BUGFIX] #138: stri_encode() and stri_rand_strings() can now generate strings of much larger lengths.
- [BUGFIX] stri_wrap() did not honour indent correctly when use_width was TRUE.
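
Some of the additions listed for this release can be exercised as in the sketch below, which assumes stringi 1.5.3 or later. The input values are invented for the example.

```r
library(stringi)

# %s*% duplicates strings, just like stri_dup().
dup_op <- "ab" %s*% 3        # "ababab"
dup_fn <- stri_dup("ab", 3)  # "ababab"

# negate=TRUE inverts the result of the startswith/endswith tests.
not_prefixed <- stri_startswith_fixed(c("stringi", "stringr"), "stringi",
                                      negate = TRUE)    # FALSE TRUE

# from_last is the new name of stri_duplicated()'s fromLast argument.
dups <- stri_duplicated(c("a", "b", "a"), from_last = TRUE)  # TRUE FALSE FALSE
```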
stringi_1.5.2
1.5.2 (2020-09-01) CRAN
- [NEW FEATURE] #400: %s$% and %stri$% are now binary operators that call base R's sprintf().
- [NEW FEATURE] #399: The %s*% and %stri*% operators can be used in addition to stri_dup(), for the very same purpose.
- [NEW FEATURE] #355: stri_opts_regex() now accepts the time_limit and stack_limit options so as to prevent malformed or malicious regexes from running for too long.
- [NEW FEATURE] #345: stri_startswith() and stri_endswith() are now equipped with the negate parameter.
- [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
- [DEPRECATION WARNING] #347: Any unknown option passed to stri_opts_fixed(), stri_opts_regex(), stri_opts_coll(), and stri_opts_brkiter() now generates a warning. In the future, the ... parameter will be removed, so that will be an error.
- [DEPRECATION WARNING] stri_duplicated()'s fromLast argument has been renamed from_last. fromLast is now its alias scheduled for removal in a future version of the package.
- [DEPRECATION WARNING] stri_enc_detect2() is scheduled for removal in a future version of the package. Use stri_enc_detect() or the more targeted stri_enc_isutf8(), stri_enc_isascii(), etc., instead.
- [NEW FEATURE] #398: Alternative, British spelling of function parameters has been introduced, e.g., stri_opts_coll() now supports both normalization and normalisation.
- [NEW FEATURE] #393: stri_read_bin(), stri_read_lines(), and stri_write_lines() are no longer marked as draft API. stri_read_lines() does not support encoding="auto" anymore.
- [NEW FEATURE] #187: stri_read_bin(), stri_read_lines(), and stri_write_lines() now support connection objects as well.
- [NEW FEATURE] #386: New function stri_sort_key() for generating locale-dependent sort keys which can be ordered at the byte level and return an equivalent ordering to the original string (@DavisVaughan).
- [BUGFIX] #138: stri_encode() and stri_rand_strings() can now generate strings of much larger lengths.
- [BUGFIX] stri_wrap() did not honour indent correctly when use_width was TRUE.
stringi_1.5.1
1.5.1 (2020-08-31)
- [NEW FEATURE] #400: %s$% and %stri$% are now binary operators that call base R's sprintf().
- [NEW FEATURE] #399: The %s*% and %stri*% operators can be used in addition to stri_dup(), for the very same purpose.
- [NEW FEATURE] #355: stri_opts_regex() now accepts the time_limit and stack_limit options so as to prevent malformed or malicious regexes from running for too long.
- [NEW FEATURE] #345: stri_startswith() and stri_endswith() are now equipped with the negate parameter.
- [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
- [DEPRECATION WARNING] #347: Any unknown option passed to stri_opts_fixed(), stri_opts_regex(), stri_opts_coll(), and stri_opts_brkiter() now generates a warning. In the future, the ... parameter will be removed, so that will be an error.
- [DEPRECATION WARNING] stri_duplicated()'s fromLast argument has been renamed from_last. fromLast is now its alias scheduled for removal in a future version of the package.
- [DEPRECATION WARNING] stri_enc_detect2() is scheduled for removal in a future version of the package. Use stri_enc_detect() or the more targeted stri_enc_isutf8(), stri_enc_isascii(), etc., instead.
- [NEW FEATURE] #398: Alternative, British spelling of function parameters has been introduced, e.g., stri_opts_coll() now supports both normalization and normalisation.
- [NEW FEATURE] #393: stri_read_bin(), stri_read_lines(), and stri_write_lines() are no longer marked as draft API. stri_read_lines() does not support encoding="auto" anymore.
- [NEW FEATURE] #187: stri_read_bin(), stri_read_lines(), and stri_write_lines() now support connection objects as well.
- [NEW FEATURE] #386: New function stri_sort_key() for generating locale-dependent sort keys which can be ordered at the byte level and return an equivalent ordering to the original string (@DavisVaughan).
- [BUGFIX] #138: stri_encode() and stri_rand_strings() can now generate strings of much larger lengths.
- [BUGFIX] stri_wrap() did not honour indent correctly when use_width was TRUE.
stringi_1.4.6
stringi_1.4.5