-
Notifications
You must be signed in to change notification settings - Fork 1
/
Class 3.html
249 lines (210 loc) · 19.8 KB
/
Class 3.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
<!DOCTYPE html>
<!--
Google HTML5 slide template
Authors: Luke Mah?? (code)
Marcin Wichary (code and design)
Dominic Mazzoni (browser compatibility)
Charles Chen (ChromeVox support)
URL: http://code.google.com/p/html5slides/
-->
<html>
<head>
<title>Presentation</title>
<meta charset='utf-8'>
<!-- LOAD HTML5SLIDES CSS FILES AND JS -->
<link rel="stylesheet" href="html5slides/styles.css">
<script src='html5slides/slides.js'></script>
<!-- LOAD HIGHLIGHTER CSS FILES AND JS -->
<link rel="stylesheet" href="highlight/acid.css">
<!-- LOAD MATHJAX JS -->
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
<!-- LOAD CUSTOM CSS AND JS -->
</head>
<body style='display: none'>
<section class='slides layout-regular template-default'>
<article>
<h1>Class 3</h1>
</article>
<article>
<h1>Review</h1>
</article>
<article>
<ul>
<li>Preparing data in csv format (excel)</li>
<li>Importing data set from csv.</li>
<li>Calling variables</li>
<li>Using <strong>attach()</strong> function</li>
<li>Telling apart <strong>numeric variables</strong> and <strong>factor variable</strong></li>
<li>Using <strong>str()</strong> function</li>
<li>Using <strong>mean(), median(), sd(), range(), min(), max()</strong> functions</li>
<li>Storing result in a new variable</li>
<li>Using <strong>summary()</strong> function</li>
<li>Getting subset of the data: <strong>subset()</strong> function</li>
</ul>
</article>
<article>
<h1>1. More on factors</h1>
</article>
<article>
<h3>Converting to and from factor</h3>
<ul>
<li>Whis is the distinction between numeric and factor variable important?</li>
</ul>
<pre class="knitr"><div class="source"><span class="symbol">cscore</span> <span class="assignement"><-</span> <span class="functioncall">read.csv</span><span class="keyword">(</span><span class="string">"Class.score.csv"</span><span class="keyword">)</span>
<span class="functioncall">par</span><span class="keyword">(</span><span class="argument">mfrow</span><span class="argument">=</span><span class="functioncall">c</span><span class="keyword">(</span><span class="number">1</span><span class="keyword">,</span><span class="number">2</span><span class="keyword">)</span><span class="keyword">)</span>
<span class="functioncall">plot</span><span class="keyword">(</span><span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">class</span><span class="keyword">,</span> <span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">score</span><span class="keyword">)</span>
<span class="functioncall">plot</span><span class="keyword">(</span><span class="functioncall">as.factor</span><span class="keyword">(</span><span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">class</span><span class="keyword">)</span><span class="keyword">,</span> <span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">score</span><span class="keyword">)</span>
</div><img src="figure/unnamed-chunk-1.png" class="plot" />
</pre>
</article>
<article>
<pre class="knitr"><div class="source"><span class="functioncall">round</span><span class="keyword">(</span><span class="functioncall">coef</span><span class="keyword">(</span><span class="functioncall">summary</span><span class="keyword">(</span><span class="functioncall">lm</span><span class="keyword">(</span><span class="symbol">score</span><span class="keyword">~</span><span class="symbol">class</span><span class="keyword">,</span> <span class="argument">data</span><span class="argument">=</span><span class="symbol">cscore</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span> <span class="number">3</span><span class="keyword">)</span>
</div><div class="output">## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 74.613 3.125 23.88 0.000
## class -0.015 0.504 -0.03 0.976
</div><div class="source"><span class="functioncall">round</span><span class="keyword">(</span><span class="functioncall">coef</span><span class="keyword">(</span><span class="functioncall">summary</span><span class="keyword">(</span><span class="functioncall">lm</span><span class="keyword">(</span><span class="symbol">score</span><span class="keyword">~</span><span class="functioncall">as.factor</span><span class="keyword">(</span><span class="symbol">class</span><span class="keyword">)</span><span class="keyword">,</span> <span class="argument">data</span><span class="argument">=</span><span class="symbol">cscore</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span> <span class="number">3</span><span class="keyword">)</span>
</div><div class="output">## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 70.4 4.447 15.830 0.000
## as.factor(class)2 -1.4 6.290 -0.223 0.824
## as.factor(class)3 9.8 6.290 1.558 0.123
## as.factor(class)4 10.2 6.290 1.622 0.108
## as.factor(class)5 11.9 6.290 1.892 0.062
## as.factor(class)6 -4.0 6.290 -0.636 0.526
## as.factor(class)7 7.1 6.290 1.129 0.262
## as.factor(class)8 -0.2 6.290 -0.032 0.975
## as.factor(class)9 4.1 6.290 0.652 0.516
## as.factor(class)10 3.8 6.290 0.604 0.547
</div></pre>
</article>
<article>
<ul>
<li>How can we make factors?</li>
</ul>
<pre class="knitr"><div class="source"><span class="comment"># We have this character variable names 'ses'</span>
<span class="symbol">ses</span> <span class="assignement"><-</span> <span class="functioncall">c</span><span class="keyword">(</span><span class="string">"low"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"high"</span><span class="keyword">,</span> <span class="string">"high"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"middle"</span><span class="keyword">,</span> <span class="string">"low"</span><span class="keyword">,</span> <span class="string">"high"</span><span class="keyword">)</span>
<span class="symbol">ses.factor</span><span class="assignement"><-</span><span class="functioncall">factor</span><span class="keyword">(</span><span class="symbol">ses</span><span class="keyword">)</span>
<span class="symbol">ses.factor</span>
</div><div class="output">## [1] low middle low low low low middle low middle middle
## [11] middle middle middle high high low middle middle low high
## Levels: high low middle
</div></pre>
</article>
<article>
<ul>
<li>Why is the factor level "high, low, middle" not "low, middle, high"?</li>
</ul>
<pre class="knitr"><div class="source"><span class="comment"># ordered factor</span>
<span class="symbol">ses.ordered</span><span class="assignement"><-</span><span class="functioncall">ordered</span><span class="keyword">(</span><span class="symbol">ses</span><span class="keyword">,</span> <span class="argument">levels</span><span class="argument">=</span><span class="functioncall">c</span><span class="keyword">(</span><span class="string">'low'</span><span class="keyword">,</span> <span class="string">'middle'</span><span class="keyword">,</span> <span class="string">'high'</span><span class="keyword">)</span><span class="keyword">)</span>
<span class="symbol">ses.ordered</span>
</div><div class="output">## [1] low middle low low low low middle low middle middle
## [11] middle middle middle high high low middle middle low high
## Levels: low < middle < high
</div><div class="source">
<span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">class</span><span class="assignement"><-</span><span class="functioncall">ordered</span><span class="keyword">(</span><span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">class</span><span class="keyword">,</span> <span class="argument">levels</span><span class="argument">=</span><span class="number">1</span><span class="keyword">:</span><span class="number">10</span><span class="keyword">)</span>
<span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">class</span>
</div><div class="output">## [1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3
## [24] 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6
## [47] 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9
## [70] 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2
## [93] 3 4 5 6 7 8 9 10
## Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10
</div></pre>
</article>
<article>
<ul>
<li>How can we treat a factor variable as a numeric variable?</li>
</ul>
<pre class="knitr"><div class="source"><span class="functioncall">as.numeric</span><span class="keyword">(</span><span class="symbol">cscore</span><span class="keyword">$</span><span class="symbol">class</span><span class="keyword">)</span>
</div><div class="output">## [1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3
## [24] 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6
## [47] 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9
## [70] 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2
## [93] 3 4 5 6 7 8 9 10
</div></pre>
</article>
<article>
<h1>2. Correlation</h1>
</article>
<article>
<ul>
<li>What is correlation?
: How much of the total variation is explained by co-variation?</li>
</ul>
<pre class="knitr"><div class="source"><span class="symbol">a</span><span class="assignement"><-</span><span class="number">1</span><span class="keyword">:</span><span class="number">100</span>
<span class="symbol">b</span><span class="assignement"><-</span><span class="symbol">a</span><span class="keyword">+</span><span class="functioncall">rnorm</span><span class="keyword">(</span><span class="number">100</span><span class="keyword">,</span> <span class="number">0</span><span class="keyword">,</span> <span class="functioncall">sqrt</span><span class="keyword">(</span><span class="symbol">a</span><span class="keyword">)</span><span class="keyword">)</span>
<span class="symbol">sample</span><span class="assignement"><-</span><span class="functioncall">data.frame</span><span class="keyword">(</span><span class="symbol">a</span><span class="keyword">,</span><span class="symbol">b</span><span class="keyword">)</span>
<span class="functioncall">plot</span><span class="keyword">(</span><span class="symbol">sample</span><span class="keyword">)</span>
</div><img src="figure/unnamed-chunk-6.png" class="plot" />
</pre>
</article>
<article>
<pre class="knitr"><div class="source"><span class="symbol">c</span><span class="assignement"><-</span><span class="functioncall">cov</span><span class="keyword">(</span><span class="symbol">sample</span><span class="keyword">)</span>
<span class="symbol">c</span><span class="keyword">[</span><span class="number">1</span><span class="keyword">,</span><span class="number">2</span><span class="keyword">]</span><span class="keyword">/</span><span class="keyword">(</span><span class="functioncall">sqrt</span><span class="keyword">(</span><span class="symbol">c</span><span class="keyword">[</span><span class="number">1</span><span class="keyword">,</span><span class="number">1</span><span class="keyword">]</span><span class="keyword">)</span><span class="keyword">*</span><span class="functioncall">sqrt</span><span class="keyword">(</span><span class="symbol">c</span><span class="keyword">[</span><span class="number">2</span><span class="keyword">,</span><span class="number">2</span><span class="keyword">]</span><span class="keyword">)</span><span class="keyword">)</span>
</div><div class="output">## [1] 0.9765
</div><div class="source"><span class="functioncall">cor</span><span class="keyword">(</span><span class="symbol">sample</span><span class="keyword">)</span>
</div><div class="output">## a b
## a 1.0000 0.9765
## b 0.9765 1.0000
</div></pre>
</article>
<article>
<ul>
<li>Correlation among multiple variables</li>
</ul>
<pre class="knitr"><div class="source"><span class="functioncall">require</span><span class="keyword">(</span><span class="symbol">ISwR</span><span class="keyword">)</span>
<span class="functioncall">data</span><span class="keyword">(</span><span class="symbol">cystfibr</span><span class="keyword">)</span>
<span class="functioncall">require</span><span class="keyword">(</span><span class="symbol">ellipse</span><span class="keyword">)</span>
<span class="functioncall">plotcorr</span><span class="keyword">(</span><span class="functioncall">cor</span><span class="keyword">(</span><span class="symbol">cystfibr</span><span class="keyword">)</span><span class="keyword">)</span>
</div><img src="figure/unnamed-chunk-9.png" class="plot" />
</pre>
</article>
<article>
<pre class="knitr"><div class="source"><span class="functioncall">pairs</span><span class="keyword">(</span><span class="symbol">cystfibr</span><span class="keyword">,</span> <span class="argument">lower.panel</span><span class="argument">=</span><span class="symbol">panel.smooth</span><span class="keyword">,</span> <span class="argument">upper.panel</span><span class="argument">=</span><span class="symbol">panel.cor</span><span class="keyword">)</span>
</div><img src="figure/unnamed-chunk-10.png" class="plot" />
</pre>
</article>
<article>
<h1>3. Simple Linear Regression</h1>
</article>
<article>
<ul>
<li>Independent variable & Dependent variable</li>
<li><p>Explanatory variable or Predictor variable</p></li>
<li><p>How can we <strong>explain</strong> or <strong>predict</strong> dependent variables from <strong>independent variables</strong>?</p></li>
</ul>
</article>
<article>
<ul>
<li><strong>Formula notation</strong>
<ul>
<li>independent variable ~ dependent variable(1) + dependent variable(2)....</li>
</ul></li>
</ul>
</article>
<article>
<pre class="knitr"><div class="source"><span class="functioncall">plot</span><span class="keyword">(</span><span class="symbol">sample</span><span class="keyword">)</span>
</div><img src="figure/unnamed-chunk-11.png" class="plot" />
</pre>
</article>
<article>
<pre class="knitr"><div class="source"><span class="symbol">model</span><span class="assignement"><-</span><span class="functioncall">lm</span><span class="keyword">(</span><span class="symbol">b</span><span class="keyword">~</span><span class="symbol">a</span><span class="keyword">,</span> <span class="argument">data</span><span class="argument">=</span><span class="symbol">sample</span><span class="keyword">)</span>
<span class="functioncall">round</span><span class="keyword">(</span><span class="functioncall">coef</span><span class="keyword">(</span><span class="functioncall">summary</span><span class="keyword">(</span><span class="symbol">model</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span> <span class="number">3</span><span class="keyword">)</span>
</div><div class="output">## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.623 1.291 0.482 0.631
## a 0.996 0.022 44.882 0.000
</div><div class="source"><span class="functioncall">plot</span><span class="keyword">(</span><span class="functioncall">density</span><span class="keyword">(</span><span class="symbol">b</span><span class="keyword">-</span><span class="functioncall">mean</span><span class="keyword">(</span><span class="symbol">b</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span> <span class="argument">ylim</span><span class="argument">=</span><span class="functioncall">c</span><span class="keyword">(</span><span class="number">0</span><span class="keyword">,</span><span class="number">0.07</span><span class="keyword">)</span><span class="keyword">,</span> <span class="argument">main</span><span class="argument">=</span><span class="string">'Density of b and residuals'</span><span class="keyword">)</span>
<span class="functioncall">lines</span><span class="keyword">(</span><span class="functioncall">density</span><span class="keyword">(</span><span class="functioncall">resid</span><span class="keyword">(</span><span class="symbol">model</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span> <span class="argument">col</span><span class="argument">=</span><span class="string">'red'</span><span class="keyword">)</span>
<span class="functioncall">text</span><span class="keyword">(</span><span class="number">50</span><span class="keyword">,</span> <span class="number">0.04</span><span class="keyword">,</span> <span class="functioncall">paste</span><span class="keyword">(</span><span class="string">'var='</span><span class="keyword">,</span> <span class="functioncall">round</span><span class="keyword">(</span><span class="functioncall">var</span><span class="keyword">(</span><span class="symbol">b</span><span class="keyword">-</span><span class="functioncall">mean</span><span class="keyword">(</span><span class="symbol">b</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span><span class="number">0</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">)</span>
<span class="functioncall">text</span><span class="keyword">(</span><span class="number">50</span><span class="keyword">,</span> <span class="number">0.035</span><span class="keyword">,</span> <span class="functioncall">paste</span><span class="keyword">(</span><span class="string">'var='</span><span class="keyword">,</span> <span class="functioncall">round</span><span class="keyword">(</span><span class="functioncall">var</span><span class="keyword">(</span><span class="functioncall">resid</span><span class="keyword">(</span><span class="symbol">model</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span><span class="number">0</span><span class="keyword">)</span><span class="keyword">)</span><span class="keyword">,</span> <span class="argument">col</span><span class="argument">=</span><span class="string">'red'</span><span class="keyword">)</span>
</div><img src="figure/unnamed-chunk-12.png" class="plot" />
</pre>
</article>
</section>
</body>
</html>