Parameters
philippa1812 edited this page Jul 9, 2024
Each workflow contains parameters which can be adjusted in a configuration file or in the graphical user interface of msiFlow. Every workflow directory contains a data subfolder with an example config.yaml.
Important note: One parameter common to all workflows is data, which defines the path to the input files. This parameter is only intended for the local version of msiFlow. In the Docker version, the data path is specified in the command that launches the graphical user interface or in the command that executes a workflow through the command-line interface. Therefore, do not change this parameter in the Docker version!
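
As a rough illustration of the note above, the following sketch merges user overrides into a set of default parameters while refusing to touch data when running in Docker. The function name `apply_overrides`, the `docker` flag, and the example defaults are hypothetical and are not part of msiFlow; the sketch only shows one way such a guard could look.

```python
def apply_overrides(defaults, overrides, docker=False):
    """Return a copy of `defaults` updated with `overrides`.

    Hypothetical helper (not part of msiFlow): in Docker mode the
    'data' key is protected, because the data path is given on the
    command line rather than in the configuration file.
    """
    if docker and "data" in overrides:
        raise ValueError("'data' is set by the launch command in the Docker version")
    unknown = set(overrides) - set(defaults)
    if unknown:
        # Reject misspelled parameter names instead of silently adding them.
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    merged = dict(defaults)
    merged.update(overrides)
    return merged


# Example defaults for illustration only.
defaults = {"data": "/path/to/input", "gauss_sigma": 1, "min_size": 10}
config = apply_overrides(defaults, {"gauss_sigma": 2}, docker=True)
```

The input dictionary is left unmodified, so the same defaults can be reused for several workflow runs.
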

Parameter | Type | Default Value | Description |
---|---|---|---|
threshold_algorithm | categorical | otsu | select a thresholding algorithm for segmentation from the following: otsu, yen, isodata, mean, minimum, triangle |
gauss_sigma | numeric | 1 | sigma of Gaussian filter (increase for more smoothing) |
min_size | numeric | 10 | min. object size in pixels in the segmentation |
img_channels_to_segment | string | 1 | comma-separated list of image channels to segment in the TIF stacks |

Parameter | Type | Default Value | Description |
---|---|---|---|
dot_size | numeric | 5 | dot size of scatter plots |
metric | categorical | cosine | select a UMAP distance metric from the following: euclidean, chebyshev, cosine, correlation |
n_neighbors | numeric | 3 | size of local neighborhood UMAP will look at |
min_dist | numeric | 0.0 | min. distance apart that points are allowed to be in the low-dim. UMAP representation |
use_model | boolean | False | set to True to use a pre-trained UMAP model |
min_samples | numeric | 30 | min. number of neighbors to a core point in HDBSCAN clustering |
min_cluster_size | numeric | 500 | min. size of an HDBSCAN cluster |

Parameter | Type | Default Value | Description |
---|---|---|---|
model | categorical | XGBoost | select a classification model from the following: XGBoost, LGBoost, AdaBoost, CatBoost, GBoost, RandomForest |
img_channels | string | '' | comma-separated list of image channels to classify in the TIF stacks (leave this string empty if image channels are provided as individual TIF files) |
multiclass | boolean | True | set to True to perform multi-class classification if you have multiple image channels/classes |
class_balancing_method | categorical | weights | if your classes are unevenly distributed, select a method to tackle class imbalance from the following: smote, undersample, oversample, weights; otherwise select standard to not use any class imbalance method |
num_top_feat | numeric | 10 | number of top features to plot from classification model |
save_ion_imgs | boolean | False | set to True to save the ion images of top features |
save_umap_imgs | boolean | True | set to True to save the UMAP images of top features (requires umap_data.csv in input folder) |
n_folds | numeric | 0 | number of folds for stratified k-fold cross-validation (time-consuming); set to 0 to skip cross-validation |
annotate | boolean | True | set to True to annotate molecules (requires annotation.csv in input folder) |

Parameter | Type | Default Value | Description |
---|---|---|---|
radius | numeric | 100 | radius of rolling-ball background subtraction |
sigma | numeric | 1 | sigma of Gaussian filter (increase for more smoothing) |
lower_perc | percentage | 0.0 | lower percentage of intensity values which should be suppressed for contrast enhancement |
upper_perc | percentage | 99.9 | upper percentage of intensity values which should be suppressed for contrast enhancement |
af_chan | numeric | 1 | image number/channel containing the autofluorescence image after image stack creation (generated in alphabetical order) |
mask_val_chan | numeric | 2 | image slice number of TIF stack containing the mask which is used for validating the registration result (requires a mask provided in input folder for MSI) |
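
To make the effect of lower_perc and upper_perc more concrete, here is a minimal sketch of percentile-based contrast stretching in plain Python. It assumes that the two parameters act as percentile clip bounds, which is a common way to implement contrast enhancement but is not confirmed by this page; the function names `percentile` and `stretch_contrast` are hypothetical and do not come from msiFlow.

```python
def percentile(sorted_vals, perc):
    """Linearly interpolated percentile (perc in 0..100) of a sorted list."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * perc / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def stretch_contrast(values, lower_perc=0.0, upper_perc=99.9):
    """Clip intensities to the given percentiles and rescale to 0..1.

    Assumption: lower_perc/upper_perc are percentile clip bounds.
    """
    ordered = sorted(values)
    lo = percentile(ordered, lower_perc)
    hi = percentile(ordered, upper_perc)
    if hi <= lo:
        # Flat image: nothing to stretch.
        return [0.0 for _ in values]
    return [min(max((v - lo) / (hi - lo), 0.0), 1.0) for v in values]


# A single very bright pixel no longer dominates the intensity range.
stretched = stretch_contrast([0, 10, 20, 30, 1000], lower_perc=0, upper_perc=75)
```

Lowering upper_perc clips more of the brightest pixels, which brightens the remaining image at the cost of saturating the clipped values.
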

Parameter | Type | Default Value | Description |
---|---|---|---|
matrix_removal | boolean | True | set to True to apply matrix removal |
peak_filtering | boolean | True | set to True to apply peak filtering |
norm | boolean | True | set to True to apply normalisation |
outlier_removal | boolean | False | set to True to apply outlier removal |
deisotoping | boolean | True | set to True to apply deisotoping |

Parameter | Type | Default Value | Description |
---|---|---|---|
snr | numeric | 3 | signal-to-noise threshold for peaks |
smooth | binary | 1 | set to 1 to enable Savitzky-Golay smoothing and set to 0 to disable Savitzky-Golay smoothing |
window | numeric | 11 | length of the Savitzky-Golay filter window |
order | numeric | 3 | order of the polynomial of Savitzky-Golay filter |

Parameter | Type | Default Value | Description |
---|---|---|---|
num_pixel_percentage | percentage | 100 | percentage of pixels to consider for building common m/z vector (decrease this value if you have low RAM) |
mz_resolution | numeric | 0.005 | bin size in Da of histogram which is used to build common m/z vector |
pixel_percentage | percentage | 3 | min. percentage of m/z to form common m/z vector |
max_shift | numeric | 0.01 | max. shift in Da to shift peaks |
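
The interplay of mz_resolution and pixel_percentage can be sketched as a simple histogram over all peaks. The example below is an illustration under stated assumptions, not msiFlow's implementation: it assumes that a bin enters the common m/z vector when at least pixel_percentage percent of the pixels contain a peak in that bin, and the function name `common_mz_vector` is hypothetical.

```python
from collections import Counter


def common_mz_vector(pixel_peaks, mz_resolution=0.005, pixel_percentage=3):
    """Build a common m/z vector from per-pixel peak lists.

    pixel_peaks: list of lists, one list of m/z values per pixel.
    Assumption: a bin is kept if at least `pixel_percentage` percent
    of all pixels have a peak falling into it.
    """
    n_pixels = len(pixel_peaks)
    if n_pixels == 0:
        return []
    counts = Counter()
    for peaks in pixel_peaks:
        # Use a set so each pixel counts at most once per bin.
        bins = {int(round(mz / mz_resolution)) for mz in peaks}
        counts.update(bins)
    min_pixels = n_pixels * pixel_percentage / 100.0
    return [round(b * mz_resolution, 6)
            for b in sorted(counts) if counts[b] >= min_pixels]


pixels = [[100.000, 200.010], [100.001, 300.0], [100.002], [150.0]]
common = common_mz_vector(pixels, mz_resolution=0.005, pixel_percentage=50)
```

A larger mz_resolution merges nearby peaks into the same bin, while a larger pixel_percentage discards m/z values that occur in only a few pixels.
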

Parameter | Type | Default Value | Description |
---|---|---|---|
clustering | boolean | True | set to True to use clustering for matrix/off-tissue identification |
dim_reduction | categorical | umap | select a method to reduce spectra from the following: umap, t-sne, pca |
n_components | numeric | 2 | number of components of dim. reduction |
metric | categorical | cosine | select a UMAP distance metric from the following: euclidean, chebyshev, cosine, correlation |
n_neighbors | numeric | 100 | size of local neighborhood UMAP will look at |
min_dist | numeric | 0.0 | min. distance apart that points are allowed to be in the low-dim. representation (UMAP) |
cluster_algorithm | categorical | hdbscan | select a cluster algorithm to cluster the low-dim. representation from the following: hierarchical, k-means, gaussian_mixture, hdbscan |
min_cluster_size | numeric | 100 | min. size of an HDBSCAN cluster |
min_samples | numeric | 500 | min. number of neighbors to a core point |
matrix_corr_thr | percentage | 0.7 | clusters whose Spearman correlation to the initial off-tissue cluster meets this threshold are combined into an extended matrix cluster |
pixel_perc_thr | percentage | 30 | pixel percentage threshold of clusters to extend off-tissue cluster |
matrix_postproc | boolean | True | set to True for post-processing of the matrix/off-tissue image |
pixel_removal | boolean | True | set to True to remove matrix/off-tissue pixels from each dataset |
matrix_subtraction | boolean | False | set to True to subtract the mean matrix spectrum from each pixel spectrum |
num_matrix_peaks | numeric | 0 | set number of top matrix peaks to remove |
matrix_peak_removal | boolean | False | set to True to remove matrix peaks |
matrix_mzs | list | '' | list of known matrix m/z values |

Parameter | Type | Default Value | Description |
---|---|---|---|
method | categorical | mfc | select a method from the following: mfc, sum, mean |

Parameter | Type | Default Value | Description |
---|---|---|---|
method | categorical | mfc | select a method from the following: mfc, sum, mean |

Parameter | Type | Default Value | Description |
---|---|---|---|
sum | categorical | max | select a method to summarize the spatial coherence over all samples from the following: min, mean, max |
thr | numeric | 500 | filter peaks based on this defined spatial coherence threshold |

Parameter | Type | Default Value | Description |
---|---|---|---|
n_neighbors | numeric | 10 | size of local neighborhood UMAP will look at |
min_cluster_size | numeric | 5000 | min. size of an HDBSCAN cluster |
min_samples | numeric | 1000 | min. number of neighbors to a core point |
cluster_thr | percentage | 70 | cluster pixel percentage which must be covered by one sample to be considered a sample-specific cluster (SSC) |
sample_thr | percentage | 70 | sample pixel percentage which must be covered by SSC pixels to be considered a sample outlier |
remove_scc | boolean | True | set to True to remove SSC pixels |
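
The two thresholds cluster_thr and sample_thr describe a two-step decision, which the following sketch spells out in plain Python. It is an illustration rather than msiFlow's code: the function name `find_sample_outliers` is hypothetical, every label is treated as a regular cluster (no special handling of noise labels), and it assumes that only pixels in SSCs dominated by a sample count towards that sample's SSC coverage.

```python
from collections import Counter, defaultdict


def find_sample_outliers(sample_ids, cluster_labels, cluster_thr=70, sample_thr=70):
    """Return (ssc_owner, outliers) for per-pixel sample IDs and cluster labels.

    ssc_owner maps each sample-specific cluster (SSC) to its dominating sample.
    outliers lists samples whose pixels are mostly covered by their own SSCs.
    """
    # Step 1: count how many pixels each sample contributes to each cluster.
    per_cluster = defaultdict(Counter)
    for sample, cluster in zip(sample_ids, cluster_labels):
        per_cluster[cluster][sample] += 1

    # A cluster is an SSC if one sample covers >= cluster_thr % of its pixels.
    ssc_owner = {}
    for cluster, counts in per_cluster.items():
        sample, n = counts.most_common(1)[0]
        if 100.0 * n / sum(counts.values()) >= cluster_thr:
            ssc_owner[cluster] = sample

    # Step 2: a sample is an outlier if >= sample_thr % of its pixels are SSC pixels.
    sample_sizes = Counter(sample_ids)
    ssc_pixels = Counter()
    for sample, cluster in zip(sample_ids, cluster_labels):
        if ssc_owner.get(cluster) == sample:
            ssc_pixels[sample] += 1
    outliers = sorted(s for s, total in sample_sizes.items()
                      if 100.0 * ssc_pixels[s] / total >= sample_thr)
    return ssc_owner, outliers


# Sample A: 8 pixels in cluster 0, 2 in cluster 1; samples B and C: all in cluster 1.
samples = ["A"] * 10 + ["B"] * 10 + ["C"] * 10
labels = [0] * 8 + [1] * 2 + [1] * 20
ssc, outlier_samples = find_sample_outliers(samples, labels)
```

In this toy input, cluster 0 consists only of pixels from sample A, and those pixels make up 80 % of sample A, so A is flagged with the default thresholds.
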

Parameter | Type | Default Value | Description |
---|---|---|---|
tolerance | numeric | 0.01 | the tolerance used to match isotopic peaks |
min_isotopes | numeric | 2 | min. number of isotopic peaks |
max_isotopes | numeric | 6 | max. number of isotopic peaks |
openMS | boolean | True | set to True to use openMS routine |
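
How tolerance, min_isotopes and max_isotopes interact can be illustrated with a deliberately simplified deisotoping sketch. It is neither the openMS routine nor msiFlow's own implementation: it assumes singly charged ions, uses an isotope spacing of about 1.003355 Da (the 13C/12C mass difference), assumes the tolerance is given in Da, and ignores peak intensities, which a real deisotoping routine would normally check. The function names are hypothetical.

```python
# Approximate 13C - 12C mass difference in Da; singly charged ions assumed.
ISOTOPE_SPACING = 1.003355


def find_isotope_chains(mzs, tolerance=0.01, min_isotopes=2, max_isotopes=6):
    """Greedily group peaks that are spaced by ISOTOPE_SPACING within `tolerance`."""
    peaks = sorted(mzs)
    used = set()
    chains = []
    for i in range(len(peaks)):
        if i in used:
            continue
        chain = [i]
        while len(chain) < max_isotopes:
            target = peaks[chain[-1]] + ISOTOPE_SPACING
            nxt = next((j for j in range(chain[-1] + 1, len(peaks))
                        if j not in used and abs(peaks[j] - target) <= tolerance),
                       None)
            if nxt is None:
                break
            chain.append(nxt)
        # Only accept chains with enough isotopic peaks.
        if len(chain) >= min_isotopes:
            used.update(chain)
            chains.append([peaks[j] for j in chain])
    return chains


def deisotope(mzs, **kwargs):
    """Keep monoisotopic and unmatched peaks; drop the higher isotopic peaks."""
    chains = find_isotope_chains(mzs, **kwargs)
    isotope_peaks = {mz for chain in chains for mz in chain[1:]}
    return [mz for mz in sorted(mzs) if mz not in isotope_peaks]


peak_list = [500.000, 501.003, 502.007, 600.0, 750.5]
chains = find_isotope_chains(peak_list)
kept = deisotope(peak_list)
```

Peaks without a matching isotope pattern are kept here; whether they should be kept or discarded is a design decision that this page does not specify.
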

Parameter | Type | Default Value | Description |
---|---|---|---|
multi_sample | boolean | False | set to True to perform multi-sample segmentation, set to False to perform segmentation on each sample individually |
dot_size | numeric | 1 | dot size of scatter plots |
dim_reduction method | categorical | umap | select a method for dimensionality reduction from the following: pca, t-sne, umap, or set to '' to not perform dimensionality reduction |
n_components | numeric | 2 | number of components to reduce the data |
metric | categorical | cosine | select a UMAP distance metric from the following: euclidean, chebyshev, cosine, correlation |
n_neighbors | numeric | 15 | size of local neighborhood UMAP will look at |
min_dist | numeric | 0.0 | min. distance apart that points are allowed to be in the low-dim. UMAP representation |
clustering method | categorical | k-means | select a clustering algorithm from the following: hierarchical, k-means, gaussian_mixture, hdbscan, SA, where SA performs spatial k-means (only applicable in single-sample segmentation) |
n_clusters | numeric | 3 | number of clusters (applicable for k-means, gaussian_mixture, hierarchical) |
min_cluster_size | numeric | 100 | min. size of an HDBSCAN cluster |
min_samples | numeric | 5 | min. number of neighbors to a core point in HDBSCAN clustering |

Parameter | Type | Default Value | Description |
---|---|---|---|
annotate | boolean | True | set to True to annotate molecules (requires annotation.csv in input folder) |
method | categorical | mean | select a method from the following to summarize the spectra in defined regions: mean, median |
fold_change_thr | numeric | 1.0 | absolute log2 fold change threshold to define regulated molecules (value between 0.0 and 1.0) |
infected_grp | string | UPEC | name of the infected group (must be provided in the file names) |
control_grp | string | control | name of the control group (must be provided in the file names) |
save_ion_imgs | boolean | True | set to True to save ion images of regulated molecules |
row_norm | boolean | True | set to True to perform row normalization for the heatmap |
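
The role of fold_change_thr can be illustrated with a short sketch that compares summarized intensities of the infected and control groups. It assumes that the per-group values have already been computed (for example with the mean or median method above) and that a molecule counts as regulated when its absolute log2 fold change reaches the threshold; the function name `regulated_molecules` and the example intensities are hypothetical.

```python
import math


def regulated_molecules(infected, control, fold_change_thr=1.0):
    """Return {m/z: log2 fold change} for molecules passing the threshold.

    infected, control: dicts mapping m/z to the summarized intensity of
    each group. Non-positive intensities are skipped because the log2
    ratio is undefined for them.
    """
    result = {}
    for mz, infected_val in infected.items():
        control_val = control.get(mz)
        if control_val is None or infected_val <= 0 or control_val <= 0:
            continue
        log2_fc = math.log2(infected_val / control_val)
        if abs(log2_fc) >= fold_change_thr:
            result[mz] = log2_fc
    return result


# Illustrative values only.
infected_means = {100.1: 8.0, 200.2: 3.0, 300.3: 1.0}
control_means = {100.1: 2.0, 200.2: 2.0, 300.3: 4.0}
regulated = regulated_molecules(infected_means, control_means, fold_change_thr=1.0)
```

With a threshold of 1.0, a molecule must be at least twice as abundant, or at most half as abundant, in the infected group to be reported.
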