Skip to content
/ oscar Public

CRAN R package - oscar: Optimal Subset CArdinality Regression models

Notifications You must be signed in to change notification settings

Syksy/oscar

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

OSCAR models with R

Optimal Subset CArdinality Regression (OSCAR) models with R

The R extension package oscar offers L0-pseudonorm based cardinality constrained regression models, with a user-friendly R interface tied to Fortran implemented optimizers via C. The R package comes with a suite of useful model building tools, such as visualizations, diagnostics, cross-validation, and bootstrapping functions.

Citation

If you use the oscar methodology/package, please use the following citation:

Halkola AS, Joki K, Mirtti T, Mäkelä MM, Aittokallio T, Laajala TD (2023). OSCAR: Optimal subset cardinality regression using the L0-pseudonorm with applications to prognostic modelling of prostate cancer. PLoS Comput Biol 19(3): e1010333. https://doi.org/10.1371/journal.pcbi.1010333

CRAN

The oscar package is available from the Central R Archive Network (CRAN), and can be installed with:

install.packages("oscar")

Canonical direct CRAN link:

https://CRAN.R-project.org/package=oscar

GitHub installation

While the CRAN installation may be most convenient for user installation, the latest modifications and package release come through GitHub. Therefore there may be a slight lag in delivering the latest release to CRAN, and the user may wish to install the latest GitHub-version by:

install.packages("devtools")
devtools::install_github("Syksy/oscar")

Brief examples

Below are brief examples on oscar usage. More elaborate examples can be found from within the package in the examples vignette.

# Load the oscar-package

library(oscar)

# Load example dataset (consists of ex_X, ex_Y, ex_K and ex_c) for
# Cox regression

data(ex)

# Test run, notice this uses all the data! Smaller test would be
# feasible

fit <- oscar::oscar(x = ex_X, y = ex_Y, k = ex_K, w = ex_c, family = "cox")

# Show model results and other attributes

fit

# Visualize fit as a function of allowed kits

oscar::oscar.visu(fit, y = c("target", "goodness"))

# Example of cross-validation based k-selection

cv <- oscar::oscar.cv(fit, fold = 5, seed = 123)  # 5-fold cross-validation, with fixed seed  

# Visualize the cross-validation curve (highest point in c-index in
# optimal generalizable k-value)

oscar::oscar.cv.visu(cv)

# Naive example using events in logistic regression (placeholder PoC,
# end-point ought to be time-dependent rather)

fit2 <- oscar::oscar(x = ex_X, y = ex_Y[, 2], k = ex_K, w = ex_c, family = "logistic")

oscar::oscar.visu(fit2, y = c("target", "goodness"))

# Example swiss fertility data

data(swiss)

fit_mse <- oscar::oscar(x = swiss[, -1], y = swiss[, 1], family = "gaussian")

fit_mse