From c4c717a865817d800d5f4cc69549f312439f7c9f Mon Sep 17 00:00:00 2001
From: January Weiner
Date: Thu, 19 Sep 2024 08:35:18 +0200
Subject: [PATCH] updated l3

---
 Lectures/lecture_03.html | 35 ++++++++++++++++++++++++++++++++---
 Lectures/lecture_03.rmd  | 25 ++++++++++++++++++++++++-
 2 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/Lectures/lecture_03.html b/Lectures/lecture_03.html
index 0c55950..5c35b52 100644
--- a/Lectures/lecture_03.html
+++ b/Lectures/lecture_03.html
@@ -9,7 +9,7 @@
-
+
@@ -3270,7 +3270,7 @@

-

2024-09-18

+

2024-09-19

@@ -3342,6 +3342,13 @@

Note: there are also “base R” functions read.table, read.csv, read.delim (there is no function for reading XLS[X] files in base R). The tidyverse functions above are preferable.

+

Reading data

+ +
    +
  • For reading text files (csv, tsv etc.), use the readr package. This package is loaded automatically when you load the tidyverse package: library(tidyverse). Then, use the functions read_csv, read_tsv etc.
  • +
  • For reading Excel files, use the readxl package: library(readxl). Then, use the function read_excel.
  • +
+
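
As a minimal sketch of the two bullet points above (not part of the lecture itself): the CSV file is written to a temporary location so the example is self-contained, and `my_data.xlsx` is a hypothetical file name that is only read if it exists.

```r
library(readr)   # also attached by library(tidyverse)
library(readxl)  # has to be loaded explicitly

# Write a tiny CSV file to a temporary location (made-up data)
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c("id,arm,age", "1,FLUAD,34", "2,PLACEBO,41"), tmp_csv)

# read_csv returns a tibble; column types are guessed from the contents
dat <- read_csv(tmp_csv, show_col_types = FALSE)
print(dat)

# Hypothetical Excel file; the guard keeps the sketch runnable without it
xlsx_file <- "my_data.xlsx"
if (file.exists(xlsx_file)) {
  xl <- read_excel(xlsx_file, sheet = 1)  # sheet can be a number or a name
  print(xl)
}
```

`read_tsv` is called the same way as `read_csv`, for tab-separated files.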

Where are your files - absolute vs relative paths

    @@ -3497,7 +3504,9 @@

    (we use the back ticks because the column name contains a space)

    -

table() for constructing contingency tables

+

table() for overview

+ +

When used with one argument, table shows how many times each value occurs:

table(myiris$Species)
@@ -3505,6 +3514,26 @@

##     setosa     Setosa versicolor Versicolor  virginica  Virginica
##         45          5         42          8         46          4
+

table() for constructing contingency tables

+ +

When used with two arguments, table constructs a contingency table:

+ +
library(readxl)
+meta_data <- read_excel("../Datasets/meta_data_botched.xlsx")
+table(meta_data$PLACEBO, meta_data$ARM)
+ +
##      
+##        A A . Agrip. AGRIPPAL control  F Fl. FLUAD  P PLACEBO
+##   0    1   1      3       34       0  2   1    35  0       0
+##   1    0   0      0        0       4  0   0     0  1      33
+##   no   0   0      0        2       0  0   0     1  0       0
+##   No   0   0      0        0       0  0   0     1  0       0
+##   NO   0   0      0        1       0  0   0     0  0       0
+##   Yes  0   0      0        0       0  0   0     0  0       1
+##   YES  0   0      0        0       0  0   0     0  0       1
+ +

This can tell us if there are any inconsistencies in the data.
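
One possible way to act on such a table, sketched with made-up data that only mimics the label variants shown above. The mapping of `0`/`1` to `no`/`yes` is an assumption made for illustration, and the column name `PLACEBO_clean` is hypothetical.

```r
# Toy data with inconsistent labels (made up, not the real meta data)
meta <- data.frame(
  PLACEBO = c("0", "1", "no", "No", "NO", "Yes", "YES", "0"),
  ARM     = c("FLUAD", "PLACEBO", "AGRIPPAL", "FLUAD",
              "AGRIPPAL", "PLACEBO", "PLACEBO", "Fl."),
  stringsAsFactors = FALSE
)

# Harmonise PLACEBO: lower-case first, then map to "no" / "yes"
# (assumption: 0 means "no" and 1 means "yes")
p <- tolower(meta$PLACEBO)
meta$PLACEBO_clean <- ifelse(p %in% c("0", "no"), "no",
                      ifelse(p %in% c("1", "yes"), "yes", NA))

# Tabulate again; useNA = "ifany" reveals values that could not be mapped
print(table(meta$PLACEBO_clean, meta$ARM, useNA = "ifany"))
```

The ARM column would need the same kind of treatment; deciding which abbreviation belongs to which arm requires knowledge of the study, so it is left out here.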

+

Diagnosing problems

diff --git a/Lectures/lecture_03.rmd b/Lectures/lecture_03.rmd
index 0a260ba..9d8be39 100644
--- a/Lectures/lecture_03.rmd
+++ b/Lectures/lecture_03.rmd
@@ -62,6 +62,14 @@ Note: there are also "base R" functions `read.table`, `read.csv`,
 `read.delim` (there is no function for reading XLS[X] files in base R). The
 tidyverse functions above are preferable.
 
+## Reading data
+
+ * For reading text files (csv, tsv etc.), use the `readr` package. This
+   package is loaded automatically when you load the `tidyverse` package:
+   `library(tidyverse)`. Then, use the functions `read_csv`, `read_tsv` etc.
+ * For reading Excel files, use the `readxl` package: `library(readxl)`.
+   Then, use the function `read_excel`.
+
 ## Where are your files - absolute vs relative paths
 
  * absolute paths start at the root directory, e.g.
@@ -208,13 +216,28 @@ summary(myiris$`Sepal Length`)
 
 (we use the back ticks because the column name contains a space)
 
-## table() for constructing contingency tables
+## `table()` for overview
 
+When used with one argument, `table` shows how many times each value
+occurs:
 ```{r eval=TRUE,results="markdown"}
 table(myiris$Species)
 ```
 
 
+## `table()` for constructing contingency tables
+
+When used with two arguments, `table` constructs a contingency table:
+
+```{r eval=TRUE,results="markdown"}
+library(readxl)
+meta_data <- read_excel("../Datasets/meta_data_botched.xlsx")
+table(meta_data$PLACEBO, meta_data$ARM)
+```
+
+This can tell us if there are any inconsistencies in the data.
+
+
 ## Diagnosing problems
 
  * The colorDF package provides a function called `summary_colorDF` which