mfd_hab_classification

This repo contains the code for classifying the MFD Ontology habitats in the Microflora Danica project. It reproduces the results from "Section IV: Convergence of supervised and unsupervised habitat descriptors" of the MFD manuscripts.

Please download the input data from the Zenodo repo into the /data folder and amend the /config/hab_class.yaml file as indicated.

The script is meant to run on a SLURM system and requires mamba and conda; all other required packages are installed via mamba. The partition names and the resource requirements might need to be adjusted in config/config.yaml and in the .rule files in scripts/scripts_python/rules/. The script also tries to detect its own location automatically. If this fails, please amend the file scripts/scripts_bash/hab_class.sh as indicated there.
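Since the run depends on SLURM, mamba and conda being available, it can help to check for them before submitting. The helper below is a minimal sketch and is not part of this repository: the function name check_tools is hypothetical, and it only tests whether each command is on the PATH, not whether the versions are suitable.

```shell
# Hypothetical pre-flight helper (not shipped with this repo).
# Prints every command that cannot be found on the PATH to stderr
# and returns 1 if at least one is missing, 0 otherwise.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing or pasting the function into a shell on the cluster login node, `check_tools sbatch mamba conda` should return success before the job is submitted.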

The results from the paper can be reproduced by running:

sbatch scripts/scripts_bash/hab_class.sh

The results will be collected in the /analysis folder.

Figures

The figures related to the habitat classification are generated by the following scripts:

  • '/scripts/scripts_R/tree_pr_auc_classes.Rmd' generates Figure 3;
  • '/scripts/scripts_R/FN_analysis.Rmd' generates Supplementary Note Figure 1b.

The inputs to these figure-generating scripts are some of the files generated by the hab_class.sh run. The location where the scripts look for them is controlled by the "data.path" variable.