MultSampSize is a Shiny app for calculating the sample size required for studies that use a latent variable model to analyse mixed outcome endpoints in the form of:
- Co-primary - requires an effect in all outcomes
- Multiple primary - requires an effect in at least one outcome
- Composite - requires an effect in some overall combination of the outcomes
The latent variable model employed for the mixed endpoints assumes that the discrete variables are manifestations of latent continuous variables. The observed continuous and latent continuous outcomes are assumed to follow a multivariate normal distribution. More details on sample size determination using these models are available at https://arxiv.org/abs/1912.05258.
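The latent variable assumption described above can be illustrated with a minimal simulation sketch. This is not the app's code: the function name `simulate_mixed_outcomes`, its parameters and the example values are all hypothetical, and only one continuous and one binary outcome are considered. The binary outcome is generated by thresholding a latent standard normal variable that is correlated with the continuous outcome.

```python
import random
from statistics import NormalDist


def simulate_mixed_outcomes(n, mean_cont, sd_cont, p_binary, rho, seed=0):
    """Draw n (continuous, binary) pairs from a bivariate normal latent model.

    Hypothetical helper for illustration only, not part of MultSampSize.
    """
    rng = random.Random(seed)
    # Threshold on the latent scale chosen so that P(latent > tau) = p_binary.
    tau = NormalDist().inv_cdf(1.0 - p_binary)
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # Second standard normal variable, correlated with z1 at level rho.
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        y_cont = mean_cont + sd_cont * z1   # observed continuous outcome
        y_bin = 1 if z2 > tau else 0        # observed binary outcome
        rows.append((y_cont, y_bin))
    return rows


sample = simulate_mixed_outcomes(n=20000, mean_cont=1.0, sd_cont=2.0,
                                 p_binary=0.3, rho=0.5)
prop = sum(b for _, b in sample) / len(sample)
print(f"observed proportion with binary outcome = 1: {prop:.3f}")
```

The observed proportion should land close to the `p_binary` value passed in, because the threshold is set from the latent normal distribution rather than applied to the binary data directly.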
The tutorial below provides step-by-step guidance for using the MultSampSize Shiny app. If further queries arise about the functionality of the app for specific applications, contact Martina McMenamin at [email protected].
To access the MultSampSize GUI, go to https://martinamcm.shinyapps.io/multsampsize/.
As the composite case requires pilot data to get the covariance components, an example dataset for a composite with one continuous and one binary component is available in the repository.
The underlying model assumed for analysis is the same for each of the endpoints considered; however, it is employed in different ways. The left sidebar on the landing page allows the user to choose the relevant endpoint from co-primary, multiple primary and composite endpoints as shown.
The functionality for co-primary and multiple primary endpoints is the same. The co-primary case is demonstrated here with an example.
The user should begin by choosing the number of continuous and binary outcomes that make up the endpoint, which can be one or two continuous and zero or one binary measures. Clicking the 'Generate Model' button shows a summary of the latent variable model used and the power function calculated. This is shown below for the case when the co-primary endpoint is made up of two continuous and one binary outcome.
Note that the model employed is the same for the co-primary and multiple primary endpoints; however, the power function differs.
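To make the difference between the two power functions concrete, the sketch below estimates both by simulation for two continuous outcomes. It is an illustration under stated assumptions, not the power function the app displays: variances are treated as known, the test statistics are z-statistics, and the multiple primary case uses a Bonferroni-adjusted critical value, which is an assumption made here. The name `simulated_power` and all example values are hypothetical.

```python
import random
from statistics import NormalDist


def simulated_power(n_per_arm, deltas, sigmas, rho, alpha=0.025,
                    n_sims=20000, seed=1):
    """Monte Carlo power for two continuous outcomes (hypothetical helper).

    Returns (co-primary power, multiple primary power).
    """
    rng = random.Random(seed)
    nd = NormalDist()
    # Expected z-statistic for each outcome in a two-arm comparison of means.
    means = [d / (s * (2.0 / n_per_arm) ** 0.5) for d, s in zip(deltas, sigmas)]
    crit_all = nd.inv_cdf(1.0 - alpha)        # co-primary: no adjustment
    crit_any = nd.inv_cdf(1.0 - alpha / 2.0)  # multiple primary: Bonferroni (assumed)
    hits_all = 0
    hits_any = 0
    for _ in range(n_sims):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        z1 = means[0] + e1
        z2 = means[1] + e2
        if z1 > crit_all and z2 > crit_all:   # effect required in all outcomes
            hits_all += 1
        if z1 > crit_any or z2 > crit_any:    # effect required in at least one
            hits_any += 1
    return hits_all / n_sims, hits_any / n_sims


co_power, mp_power = simulated_power(n_per_arm=50, deltas=(0.5, 0.5),
                                     sigmas=(1.0, 1.0), rho=0.3)
print(f"co-primary: {co_power:.2f}, multiple primary: {mp_power:.2f}")
```

For the same design, requiring an effect in every outcome gives lower power than requiring an effect in at least one, which is why the required sample size differs between the two endpoint types.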
After the structure of the endpoint has been selected, the user specifies the following parameters: δk, the risk difference; πTk and πCk, the probabilities that the observed binary outcome Yik equals 1 for patient i in the treatment and control arms respectively; σk, the standard deviation of outcome k; and ρkl, the correlation between outcomes k and l.
The alpha level, desired power and maximum number of subjects can be adjusted in the 'Sample Size Estimation' panel as shown below. The resulting power curve is displayed and is updated when these or the model parameters are adjusted. The number of subjects required per arm is stated below the plot, as shown.
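The way a per-arm sample size is read off a power curve can be sketched for the simplest case of a single continuous outcome with known standard deviation. This uses the standard normal-approximation power formula rather than the app's latent variable power function, and the function names `power_single_outcome` and `smallest_n` are hypothetical.

```python
from statistics import NormalDist


def power_single_outcome(n_per_arm, delta, sigma, alpha=0.025):
    """One-sided z-test power for a difference in means (known sigma)."""
    nd = NormalDist()
    expected_z = delta / (sigma * (2.0 / n_per_arm) ** 0.5)
    return nd.cdf(expected_z - nd.inv_cdf(1.0 - alpha))


def smallest_n(delta, sigma, alpha=0.025, target_power=0.8, max_n=1000):
    """Smallest per-arm n whose power reaches the target, or None if the
    target is not reached by max_n (the maximum number of subjects)."""
    for n in range(2, max_n + 1):
        if power_single_outcome(n, delta, sigma, alpha) >= target_power:
            return n
    return None


n_required = smallest_n(delta=0.5, sigma=1.0)
print(f"subjects required per arm: {n_required}")
```

Scanning upwards from a small n mirrors reading the curve from left to right until it crosses the power target; returning `None` corresponds to the curve never reaching the target within the chosen maximum.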
For composite responder endpoints, the user must select the number of continuous and binary components along with the dichotomisation threshold for each of the continuous measures.
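As a rough illustration of how a composite responder status is formed from its components, the sketch below dichotomises each continuous measure at its threshold and combines the result with the binary component. The direction of response (a value at or below the threshold counts as a response) and the function name `is_responder` are assumptions made for this example; the appropriate direction depends on the outcome being measured.

```python
def is_responder(continuous, thresholds, binary=None):
    """Return True if a patient responds on every component.

    continuous: observed continuous measures for one patient
    thresholds: dichotomisation threshold for each continuous measure
    binary:     observed binary component (1 = response), or None if absent
    Assumes a value at or below its threshold counts as a response.
    """
    if len(continuous) != len(thresholds):
        raise ValueError("one threshold is needed per continuous measure")
    meets_thresholds = all(y <= t for y, t in zip(continuous, thresholds))
    return meets_thresholds and (binary is None or binary == 1)


# Hypothetical patients with two continuous measures and one binary measure.
patients = [((2.1, 0.4), 1), ((5.3, 0.2), 1), ((1.8, 0.3), 0)]
thresholds = (3.0, 0.5)
status = [is_responder(cont, thresholds, b) for cont, b in patients]
print(status)
```

The standard binary approach analyses only this final responder indicator, whereas the latent variable approach retains the continuous measurements behind it.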
After selecting 'Get Model', a model summary, the treatment effect definition and the power function will be displayed in the 'Model Summary' panel.
The user must upload pilot data and click 'Obtain estimates' to provide parameter estimates using the panel as shown. A loading bar will be shown when the model is being fitted and a table with parameter estimates from both the latent approach and standard binary approach will be displayed when it is complete. An example dataset for the case of two continuous and one binary components is provided in the repository.
The uploaded data must have columns ordered as follows:
- Patient id
- Observed continuous outcomes
- Observed binary outcome (if any)
- Baseline measures for observed continuous measures
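Before uploading, the column order listed above can be checked with a short script. The sketch below is only a convenience check and is not part of the app; the function name `assign_column_roles` and the example column names are hypothetical, and it assumes exactly one baseline column per continuous outcome.

```python
def assign_column_roles(header, n_continuous, n_binary):
    """Map column names to roles based purely on their position.

    Assumes one baseline column for each continuous outcome.
    """
    expected = 1 + 2 * n_continuous + n_binary
    if len(header) != expected:
        raise ValueError(
            f"expected {expected} columns but found {len(header)}"
        )
    start_bin = 1 + n_continuous
    start_base = start_bin + n_binary
    return {
        "id": header[0],
        "continuous": header[1:start_bin],
        "binary": header[start_bin:start_base],
        "baseline": header[start_base:],
    }


# Hypothetical header for two continuous and one binary component.
roles = assign_column_roles(
    ["id", "y1", "y2", "resp", "y1_base", "y2_base"],
    n_continuous=2, n_binary=1,
)
print(roles)
```

Because the roles are inferred from position alone, a file with the right number of columns in the wrong order will pass this check, so the printed mapping should still be inspected by eye.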
The 'Sample Size Estimation' panel shows a plot of the power curve for both the latent variable model and the standard binary model, using the δ and σ estimates from the panel above. The one-sided significance level, power target and number of patients on the x-axis can be altered using the controls shown.
Note that the efficiency gains offered by the latent variable method depend on many factors, such as the dichotomisation thresholds, the probability of response in each arm and the correlation between components. To illustrate this, the power curve for the same data with different response thresholds is shown below.
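The sensitivity to the response threshold can be seen even in a simplified setting. The sketch below computes the power of a standard two-proportion z-test when a single normally distributed outcome is dichotomised at different thresholds. It covers only the standard binary approach, not the latent variable method, and the function name `binary_power_at_threshold`, the response direction (at or below the threshold) and the example values are all assumptions.

```python
from statistics import NormalDist


def binary_power_at_threshold(threshold, mean_t, mean_c, sd, n_per_arm,
                              alpha=0.025):
    """Power of a one-sided two-proportion z-test after dichotomising a
    normal outcome, assuming response means a value <= threshold."""
    nd = NormalDist()
    p_t = nd.cdf((threshold - mean_t) / sd)  # response probability, treatment
    p_c = nd.cdf((threshold - mean_c) / sd)  # response probability, control
    se = ((p_t * (1.0 - p_t) + p_c * (1.0 - p_c)) / n_per_arm) ** 0.5
    return nd.cdf((p_t - p_c) / se - nd.inv_cdf(1.0 - alpha))


# Hypothetical design: lower values are better, treatment shifts the mean by -0.5.
powers = {
    thr: binary_power_at_threshold(thr, mean_t=-0.5, mean_c=0.0, sd=1.0,
                                   n_per_arm=100)
    for thr in (-2.0, -0.25, 1.5)
}
for thr, pw in powers.items():
    print(f"threshold {thr:+.2f}: power {pw:.2f}")
```

Thresholds far into either tail leave few responders or few non-responders, so the binary analysis loses power there, and this is the setting in which retaining the continuous information has the most to offer.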