-
Notifications
You must be signed in to change notification settings - Fork 1
/
bacteria_processing_1.Rmd
55 lines (41 loc) · 1.76 KB
/
bacteria_processing_1.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
---
title: "bacteria_processing_1"
author: "Ilya"
date: "1/28/2019"
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
####install and load required packages
```{r packages, echo=FALSE}
source("packages_bacteria1.R")
```
###1. Get bacteria-caused dx in GIDEON
###2. Match dx to pathogen spp. names
###3. Match spp. names (from GIDEON zdx and GMPD) to traits (in GMPD & other)
###4. Compile master list of bacteria spp & traits
###5. Feature construction with bacterial traits
###6. Data visualization: summary “state of knowledge” on bacteria causing disease in mammals or humans
###7. Use traits to predict transmissibility and human disease outcomes
###2. Match dx to pathogen spp. names
####Include non-zoonotic pathogenic bacteria from GMPD
####get zoonotic and non-zoonotic pathogens from GMPD and assign taxonomic ID only for those that are species-level
select only bacteria using info in G_taxonomy
```{r R_name2taxid_GMPD}
source("R_name2taxid_GMPD.R")
# source(paste0(wd,"/R_name2taxid_GMPD.R"))
# ../CODE/DATA_PROCESSING/
```
####Remove records that are NA for tax_id. classify pathogens in GMPD to order, family, genus, species (official species name may be different from entered name). Make G_classified.Rdata and save
```{r R_classify_bacteria_observed_GMPD}
source("R_classify_bacteria_observed_GMPD.R")
```
####label pathogens as zoonotic, human only, or wild only. output: bacteria_pathogenic_mammals. Input: G_classified, df_all.Rdata
```{r R_human_zoonotic_non_zoonotic}
source("R_human_zoonotic_non_zoonotic1.R")
```
####check that all pathogenic bacteria are present in list of all bacteria. Output: bacteria_species_out2.Rdata (put this on dropbox)
```{r}
source("R_graph_bacteria_order_pathogen_status2.R")
```