Checking for missing values while performing calculations adds overhead. The overhead is particularly expensive when working with floats, where each element must be checked for NA_real_ and NaN. I've already tried to minimize this cost in the code.
R 3.5.0 (April 2018) introduced ALTREP. With ALTREP, an object may carry internal flags indicating whether it is known to be free of NAs. More specifically, you can ask an SEXP object x whether it has "no NAs" via the ALTREP functions INTEGER_NO_NA(x), LOGICAL_NO_NA(x), and REAL_NO_NA(x). These boolean functions return immediately with:
1 meaning "there are certainly no NAs", and
0 meaning "it is unknown whether or not there are NAs".
Important: Note to self, make sure that the above interpretations are correct. I'm a bit surprised there are not three possible return values here: (i) definitely no NAs, (ii) definitely NAs, and (iii) unknown.
In other words, if it is known that there are no NAs, we can skip checking for NAs while doing the calculations, which should be much faster. For example, when called with na.rm = TRUE, we can internally switch to the much faster na.rm = FALSE code path if ALTREP tells us there are no NAs.
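A minimal sketch of that dispatch, written in plain C so that it stands alone without the R headers. The function names and the known_no_na argument are hypothetical; in an R extension, known_no_na would be whatever REAL_NO_NA(x) returned for the input vector. This is an illustration of the idea, not the code in the branch.

```c
#include <math.h>    /* isnan */
#include <stddef.h>  /* size_t */

/* Slow path (na.rm = TRUE): test every element and skip missing values. */
static double sum_skip_na(const double *x, size_t n) {
  double total = 0.0;
  for (size_t i = 0; i < n; i++) {
    /* isnan() is true for both NA_real_ and NaN, since NA_real_ is a NaN. */
    if (!isnan(x[i])) total += x[i];
  }
  return total;
}

/* Fast path (na.rm = FALSE): no per-element test. */
static double sum_no_check(const double *x, size_t n) {
  double total = 0.0;
  for (size_t i = 0; i < n; i++) total += x[i];
  return total;
}

/* known_no_na is a hypothetical stand-in for REAL_NO_NA(x):
 * 1 = certainly no NAs, 0 = unknown. */
double sum_dispatch(const double *x, size_t n, int na_rm, int known_no_na) {
  /* The slow path is needed only when NAs must be removed
   * and we cannot rule out that some are present. */
  if (na_rm && !known_no_na) return sum_skip_na(x, n);
  return sum_no_check(x, n);
}
```

Because 0 only means "unknown", the slow path stays the default whenever na.rm = TRUE; the flag can only ever upgrade a call to the fast path, never change its result.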
In branch feature/ALTREP-NAs, I've started to explore these ALTREP functions. I've already made sure I can write the code such that it remains backward compatible with R (< 3.5.0). Thus far I've implemented it for the functions that I had already prepared with an internal hasNA flag, but the plan is to implement it wherever possible.
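One way the backward compatibility could be handled is a small preprocessor shim, sketched below. The macro name HAS_NO_NA_REAL is hypothetical and not taken from the branch; R_VERSION and R_Version() come from Rversion.h, and the sketch assumes that header and Rinternals.h are included beforehand when building against R.

```c
/* Hypothetical compatibility shim; HAS_NO_NA_REAL is an illustrative name.
 * Assumes <Rversion.h> and <Rinternals.h> were included earlier when
 * building against R. */
#if defined(R_VERSION)
#  if R_VERSION >= R_Version(3, 5, 0)
     /* R >= 3.5.0: ask ALTREP whether the vector is known to have no NAs. */
#    define HAS_NO_NA_REAL(x) REAL_NO_NA(x)
#  endif
#endif

#ifndef HAS_NO_NA_REAL
   /* Older R (or no R headers at all): always answer "unknown",
    * so callers fall back to checking for NAs themselves. */
#  define HAS_NO_NA_REAL(x) 0
#endif
```

On R (< 3.5.0) the macro always evaluates to 0, which matches the "unknown" answer, so calling code does not need its own version checks.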