PERFORMANCE: Check for NAs via ALTREP, if available #191

Open
HenrikBengtsson opened this issue Dec 2, 2020 · 0 comments

HenrikBengtsson commented Dec 2, 2020

Checking for missing values while performing calculations adds overhead. The overhead is particularly high when working with doubles, where we have to check for both NA_real_ and NaN. I've already tried to minimize this cost in the code.

ALTREP was introduced in R 3.5.0 (April 2018). An ALTREP object may carry internal flags indicating whether or not it is known to contain NAs. More specifically, you can ask a SEXP object x whether it has "no NAs" via the ALTREP functions INTEGER_NO_NA(x), LOGICAL_NO_NA(x), and REAL_NO_NA(x). These boolean functions return instantly with:

  • 1 meaning "there are certainly no NAs", and
  • 0 meaning "it is unknown whether or not there are NAs".

Important: Note to self, make sure that the above interpretations are correct. I'm a bit surprised there are not three possible return values here: (i) definitely no NAs, (ii) definitely NAs, and (iii) unknown.

In other words, if it is known that there are no NAs, we can skip the NA checks during the calculations, which should be much faster. For example, if we are called with na.rm = TRUE, we can internally switch to the much faster na.rm = FALSE code path when ALTREP tells us there are no NAs.
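
To illustrate the idea, here is a minimal standalone C sketch. It is not the package's actual code: sum_dbl() and its no_na argument are hypothetical names, and no_na merely stands in for the value that REAL_NO_NA(x) would return (1 = certainly no NAs, 0 = unknown; always 0 on R (< 3.5.0)).

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not from the package): sum of x[0..n-1].
 *   narm  : 1 = skip missing values (na.rm = TRUE), 0 = do not check
 *   no_na : stand-in for REAL_NO_NA(x); 1 = certainly no NAs, 0 = unknown */
double sum_dbl(const double *x, size_t n, int narm, int no_na) {
  double sum = 0.0;
  size_t i;

  /* With no NAs present, na.rm = TRUE and na.rm = FALSE give the same
   * result, so fall through to the loop without per-element checks. */
  if (no_na) narm = 0;

  if (narm) {
    for (i = 0; i < n; i++) {
      /* isnan() is true for both NaN and NA_real_ (NA_real_ is a NaN) */
      if (!isnan(x[i])) sum += x[i];
    }
  } else {
    for (i = 0; i < n; i++) sum += x[i];
  }

  return sum;
}
```

Since a return value of 0 only means "unknown", the sketch keeps the regular na.rm = TRUE loop in that case; only a 1 allows the checks to be dropped.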

In branch feature/ALTREP-NAs, I've started to explore these ALTREP functions. I've already made sure I can write the code such that it is backward compatible with R (< 3.5.0). Thus far, I've implemented it for the functions that I had already prepared with an internal hasNA flag, but the plan is to implement it wherever possible.
