updated l3

bihealth · Sep 19, 2024 · c4c717a
1 parent 9185989

Showing 2 changed files with 56 additions and 4 deletions.
diff --git a/Lectures/lecture_03.html b/Lectures/lecture_03.html
@@ -9,7 +9,7 @@
 
 
 
-  <meta name="date" content="2024-09-18" />
+  <meta name="date" content="2024-09-19" />
 
   <meta name="viewport" content="width=device-width, initial-scale=1">
   <meta name="apple-mobile-web-app-capable" content="yes">
@@ -3270,7 +3270,7 @@
       <h1 data-config-title><!-- populated from slide_config.json --></h1>
 
       <p data-config-presenter><!-- populated from slide_config.json --></p>
-            <p style="margin-top: 6px; margin-left: -2px;">2024-09-18</p>
+            <p style="margin-top: 6px; margin-left: -2px;">2024-09-19</p>
           </hgroup>
   </slide>
 
@@ -3342,6 +3342,13 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
 
 <p>Note: there are also &ldquo;base R&rdquo; functions <code>read.table</code>, <code>read.csv</code>, <code>read.delim</code> (there is no function for reading XLS[X] files in base R). The tidyverse functions above are preferable.</p>
 
+</article></slide><slide class=""><hgroup><h2>Reading data</h2></hgroup><article  id="reading-data-1" class="smaller ">
+
+<ul>
+<li>For reading text files (csv, tsv etc.), use the <code>readr</code> package. This package is loaded automatically when you load the <code>tidyverse</code> package: <code>library(tidyverse)</code>. Then, use the functions <code>read_csv</code>, <code>read_tsv</code> etc.</li>
+<li>For reading Excel files, use the <code>readxl</code> package: <code>library(readxl)</code>. Then, use the function <code>read_excel</code>.</li>
+</ul>
+
 </article></slide><slide class=""><hgroup><h2>Where are your files - absolute vs relative paths</h2></hgroup><article  id="where-are-your-files---absolute-vs-relative-paths" class="smaller ">
 
 <ul>
@@ -3497,14 +3504,36 @@ <h1 data-config-title><!-- populated from slide_config.json --></h1>
 
 <p>(we use backticks because the column name contains a space)</p>
 
-</article></slide><slide class=""><hgroup><h2>table() for constructing contingency tables</h2></hgroup><article  id="table-for-constructing-contingency-tables" class="smaller ">
+</article></slide><slide class=""><hgroup><h2><code>table()</code> for overview</h2></hgroup><article  id="table-for-overview" class="smaller ">
+
+<p>When used with one argument, <code>table</code> shows how many times each value occurs:</p>
 
 <pre class = 'prettyprint lang-r'>table(myiris$Species)</pre>
 
 <pre >## 
 ##     setosa     Setosa versicolor Versicolor  virginica  Virginica 
 ##         45          5         42          8         46          4</pre>
 
+</article></slide><slide class=""><hgroup><h2><code>table()</code> for constructing contingency tables</h2></hgroup><article  id="table-for-constructing-contingency-tables" class="smaller ">
+
+<p>When used with two arguments, <code>table</code> constructs a contingency table:</p>
+
+<pre class = 'prettyprint lang-r'>library(readxl)
+meta_data &lt;- read_excel(&quot;../Datasets/meta_data_botched.xlsx&quot;)
+table(meta_data$PLACEBO, meta_data$ARM)</pre>
+
+<pre >##      
+##        A A . Agrip. AGRIPPAL control  F Fl. FLUAD  P PLACEBO
+##   0    1   1      3       34       0  2   1    35  0       0
+##   1    0   0      0        0       4  0   0     0  1      33
+##   no   0   0      0        2       0  0   0     1  0       0
+##   No   0   0      0        0       0  0   0     1  0       0
+##   NO   0   0      0        1       0  0   0     0  0       0
+##   Yes  0   0      0        0       0  0   0     0  0       1
+##   YES  0   0      0        0       0  0   0     0  0       1</pre>
+
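+<p>This reveals inconsistencies in the data: here, the arm labels come in several variants (for example <code>Agrip.</code> and <code>AGRIPPAL</code>, or <code>Fl.</code> and <code>FLUAD</code>), and the <code>PLACEBO</code> column mixes <code>0</code>/<code>1</code> with <code>no</code>/<code>No</code>/<code>NO</code> and <code>Yes</code>/<code>YES</code>.</p>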
+
 </article></slide><slide class=""><hgroup><h2>Diagnosing problems</h2></hgroup><article  id="diagnosing-problems-4" class="smaller ">
 
 <ul>

diff --git a/Lectures/lecture_03.rmd b/Lectures/lecture_03.rmd
@@ -62,6 +62,14 @@ Note: there are also "base R" functions `read.table`, `read.csv`,
 `read.delim` (there is no function for reading XLS[X] files in base R). The
 tidyverse functions above are preferable.
 
+## Reading data
+
+ * For reading text files (csv, tsv etc.), use the `readr` package. This
+   package is loaded automatically when you load the `tidyverse` package:
+   `library(tidyverse)`. Then, use the functions `read_csv`, `read_tsv` etc.
+ * For reading Excel files, use the `readxl` package: `library(readxl)`.
+   Then, use the function `read_excel`.
+
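A minimal sketch of the `readr` step described above, assuming the `readr` package is installed. The file name and its contents are made up for illustration; the file is written to a temporary location so that the example runs on its own.

```r
library(readr)  # also loaded by library(tidyverse)

# Hypothetical example data, written to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,arm", "1,FLUAD", "2,PLACEBO"), tmp)

# read_csv() returns a tibble; show_col_types = FALSE
# suppresses the message about guessed column types
example_data <- read_csv(tmp, show_col_types = FALSE)
example_data
```

`read_tsv` is called the same way for tab-separated files, and `read_excel` from `readxl` takes the path to an existing Excel file in the same position.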
 ## Where are your files - absolute vs relative paths
 
  * absolute paths start at the root directory, e.g.  
@@ -208,13 +216,28 @@ summary(myiris$`Sepal Length`)
 
 (we use backticks because the column name contains a space)
 
-## table() for constructing contingency tables
+## `table()` for overview
 
+When used with one argument, `table` shows how many times each value
+occurs:
 
 ```{r eval=TRUE,results="markdown"}
 table(myiris$Species)
 ```
 
+## `table()` for constructing contingency tables
+
+When used with two arguments, `table` constructs a contingency table:
+
+```{r eval=TRUE,results="markdown"}
+library(readxl)
+meta_data <- read_excel("../Datasets/meta_data_botched.xlsx")
+table(meta_data$PLACEBO, meta_data$ARM)
+```
+
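+This reveals inconsistencies in the data: here, the arm labels come in
+several variants (for example `Agrip.` and `AGRIPPAL`, or `Fl.` and
+`FLUAD`), and the `PLACEBO` column mixes `0`/`1` with `no`/`No`/`NO` and
+`Yes`/`YES`.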
+
+
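To see how a two-argument `table` exposes inconsistent labels, here is a small sketch on made-up data. The data frame `toy` and its column names are hypothetical and are not part of the course data set.

```r
# Hypothetical toy data with deliberately inconsistent labels
toy <- data.frame(
  placebo = c("0", "0", "1", "no", "YES"),
  arm     = c("FLUAD", "Fl.", "PLACEBO", "FLUAD", "PLACEBO")
)

# Rows come from the first argument, columns from the second
tab <- table(toy$placebo, toy$arm)
tab

# Every distinct spelling gets its own row or column,
# so variants of the same label are easy to spot
rownames(tab)
colnames(tab)
```

Any label that appears only a few times next to a much more frequent, similar-looking one is a good candidate for a data entry inconsistency.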
 ## Diagnosing problems
 
  * The colorDF package provides a function called `summary_colorDF` which