-
Notifications
You must be signed in to change notification settings - Fork 14
/
ch7.Rmd
158 lines (119 loc) · 4.36 KB
/
ch7.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
Nonlinear Models
========================================================
Here we explore the use of nonlinear models using some tools in R
```{r}
require(ISLR)
attach(Wage)
```
Polynomials
------------
First we will use polynomials, and focus on a single predictor age:
```{r}
fit=lm(wage~poly(age,4),data=Wage)
summary(fit)
```
The `poly()` function generates a basis of *orthogonal polynomials*.
Lets make a plot of the fitted function, along with the standard errors of the fit.
```{r fig.width=7, fig.height=6}
agelims=range(age)
age.grid=seq(from=agelims[1],to=agelims[2])
preds=predict(fit,newdata=list(age=age.grid),se=TRUE)
se.bands=cbind(preds$fit+2*preds$se,preds$fit-2*preds$se)
plot(age,wage,col="darkgrey")
lines(age.grid,preds$fit,lwd=2,col="blue")
matlines(age.grid,se.bands,col="blue",lty=2)
```
There are other more direct ways of doing this in R. For example
```{r}
fita=lm(wage~age+I(age^2)+I(age^3)+I(age^4),data=Wage)
summary(fita)
```
Here `I()` is a *wrapper* function; we need it because `age^2` means something to the formula language,
while `I(age^2)` is protected.
The coefficients are different to those we got before! However, the fits are the same:
```{r}
plot(fitted(fit),fitted(fita))
```
By using orthogonal polynomials in this simple way, it turns out that we can separately test
for each coefficient. So if we look at the summary again, we can see that the linear, quadratic
and cubic terms are significant, but not the quartic.
```{r}
summary(fit)
```
This only works with linear regression, and if there is a single predictor. In general we would use `anova()`
as this next example demonstrates.
```{r}
fita=lm(wage~education,data=Wage)
fitb=lm(wage~education+age,data=Wage)
fitc=lm(wage~education+poly(age,2),data=Wage)
fitd=lm(wage~education+poly(age,3),data=Wage)
anova(fita,fitb,fitc,fitd)
```
### Polynomial logistic regression
Now we fit a logistic regression model to a binary response variable,
constructed from `wage`. We code the big earners (`>250K`) as 1, else 0.
```{r}
fit=glm(I(wage>250) ~ poly(age,3), data=Wage, family=binomial)
summary(fit)
preds=predict(fit,list(age=age.grid),se=T)
se.bands=preds$fit + cbind(fit=0,lower=-2*preds$se,upper=2*preds$se)
se.bands[1:5,]
```
We have done the computations on the logit scale. To transform we need to apply the inverse logit
mapping
$$p=\frac{e^\eta}{1+e^\eta}.$$
(Here we have used the ability of MarkDown to interpret TeX expressions.)
We can do this simultaneously for all three columns of `se.bands`:
```{r}
prob.bands=exp(se.bands)/(1+exp(se.bands))
matplot(age.grid,prob.bands,col="blue",lwd=c(2,1,1),lty=c(1,2,2),type="l",ylim=c(0,.1))
points(jitter(age),I(wage>250)/10,pch="|",cex=.5)
```
Splines
-------
Splines are more flexible than polynomials, but the idea is rather similar.
Here we will explore cubic splines.
```{r}
require(splines)
fit=lm(wage~bs(age,knots=c(25,40,60)),data=Wage)
plot(age,wage,col="darkgrey")
lines(age.grid,predict(fit,list(age=age.grid)),col="darkgreen",lwd=2)
abline(v=c(25,40,60),lty=2,col="darkgreen")
```
The smoothing splines does not require knot selection, but it does have a smoothing parameter,
which can conveniently be specified via the effective degrees of freedom or `df`.
```{r}
fit=smooth.spline(age,wage,df=16)
lines(fit,col="red",lwd=2)
```
Or we can use LOO cross-validation to select the smoothing parameter for us automatically:
```{r}
fit=smooth.spline(age,wage,cv=TRUE)
lines(fit,col="purple",lwd=2)
fit
```
Generalized Additive Models
---------------------------
So far we have focused on fitting models with mostly single nonlinear terms.
The `gam` package makes it easier to work with multiple nonlinear terms. In addition
it knows how to plot these functions and their standard errors.
```{r fig.width=10, fig.height=5}
require(gam)
gam1=gam(wage~s(age,df=4)+s(year,df=4)+education,data=Wage)
par(mfrow=c(1,3))
plot(gam1,se=T)
gam2=gam(I(wage>250)~s(age,df=4)+s(year,df=4)+education,data=Wage,family=binomial)
plot(gam2)
```
Lets see if we need a nonlinear terms for year
```{r}
gam2a=gam(I(wage>250)~s(age,df=4)+year+education,data=Wage,family=binomial)
anova(gam2a,gam2,test="Chisq")
```
One nice feature of the `gam` package is that it knows how to plot the functions nicely,
even for models fit by `lm` and `glm`.
```{r fig.width=10, fig.height=5}
par(mfrow=c(1,3))
lm1=lm(wage~ns(age,df=4)+ns(year,df=4)+education,data=Wage)
plot.gam(lm1,se=T)
```