
Stanford-MachineLearning

Programming exercises for Stanford's Machine Learning course on Coursera, taught by Andrew Ng.

MATLAB

  • The heart of MATLAB is the matrix.
  • The default numeric data type is double.
  • Anonymous functions (lambdas): g = arrayfun(@(x) 1/(1+exp(-x)), z);.
  • Indexing starts from 1, and X(1, :) (the first row) is different from X(1) (the first element, by column-major linear indexing).
  • A(:) unrolls a matrix into a column vector.
  • theta'*theta (inner product, a scalar) is different from theta*theta' (outer product, a matrix); for a sum of squares, sum(theta .^ 2) avoids the ambiguity.
  • dbquit exits debug mode.
  • X(2:end, :): use end to refer to the last index when slicing.
  • Cell arrays are indexed with braces: A{1}.
  • ~ skips a return value: [U, S, ~] = svd(Sigma).
  • The order of matrix multiplications depends on whether each data point is stored as a column vector or a row vector.
  • For loop over a range with a step size: for epsilon = min(pval):stepsize:max(pval) (several of these idioms appear together in the snippet after this list).
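
A few of these idioms in one place (a minimal sketch; X and z here are just example data, not taken from any exercise):

```matlab
X = magic(4);                              % example 4x4 matrix
z = [-1; 0; 1];                            % example column vector
g = arrayfun(@(x) 1/(1 + exp(-x)), z);     % element-wise sigmoid via an anonymous function
firstRow = X(1, :);                        % first row (not the same as X(1), the first element)
v = X(:);                                  % unroll the matrix into a column vector (column-major)
tail = X(2:end, :);                        % every row except the first, using end
[U, S, ~] = svd(cov(X));                   % ~ discards the third return value
for epsilon = 0:0.25:1                     % loop over a range with a step size
    % ...
end
```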

Index

Linear Regression with Multiple Variables

  1. Cost function for one var
  2. Gradient descent for one var
  3. Feature normalization
  4. Cost function for multi-var
  5. Gradient descent for multi-var
  6. Normal Equations
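
A minimal vectorized sketch of items 4-6 (the multi-variable cost, a gradient-descent loop, and the normal equations), assuming X is the m x (n+1) design matrix with a leading column of ones; the signature follows the exercise file:

```matlab
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
  % Batch gradient descent on the least-squares cost J(theta)
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    theta = theta - (alpha / m) * (X' * (X * theta - y));        % simultaneous update of all theta_j
    J_history(iter) = (1 / (2 * m)) * sum((X * theta - y) .^ 2); % record the cost
  end
end
```

The closed-form alternative (normal equations) is theta = pinv(X' * X) * X' * y;.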



Regularization - Logistic Regression

  1. Sigmoid function
  2. Cost function for logistic regression (LR)
  3. Gradient descent for LR
  4. Predict function (hypothesis)
  5. Cost function for regularized LR
  6. Gradient descent for regularized LR




![](http://latex.codecogs.com/gif.latex?\\frac{\\partial J(\theta)}{\partial \theta_j} = \Bigg(\frac{1}{m}\sum_{i=1}^m{\big(h_\theta(x^{(i)})-y^{(i)}\big)x_j^{(i)}}\Bigg)+\frac{\lambda}{m}\theta_j)
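
A sketch of the regularized cost and gradient that the formula above describes, following the course convention that theta_0 (stored in theta(1)) is not regularized:

```matlab
function [J, grad] = costFunctionReg(theta, X, y, lambda)
  % Regularized logistic-regression cost and gradient
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));                          % sigmoid hypothesis h_theta(x)
  J = (1 / m) * (-y' * log(h) - (1 - y)' * log(1 - h)) ...
      + (lambda / (2 * m)) * sum(theta(2:end) .^ 2);       % penalize theta_1..theta_n only
  grad = (1 / m) * (X' * (h - y));                         % unregularized gradient
  grad(2:end) = grad(2:end) + (lambda / m) * theta(2:end); % add lambda/m * theta_j for j >= 1
end
```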

Neural Networks: Representation

  1. Regularized Logistic Regression
  2. One-vs-all classifier training
  3. One-vs-all classifier prediction
  4. Neural Network predict function


![](http://latex.codecogs.com/gif.latex?\\frac{\\partial J(\theta)}{\partial \theta} = \frac{1}{m}X^T\big(h_\theta(X)-y\big)+\frac{\lambda}{m}\theta)
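
For items 2-3 (one-vs-all), a minimal prediction sketch looks like this, assuming all_theta stores one row of parameters per class:

```matlab
function p = predictOneVsAll(all_theta, X)
  % Predict the label whose one-vs-all classifier gives the highest score
  m = size(X, 1);
  X = [ones(m, 1) X];          % prepend the bias column
  scores = X * all_theta';     % m x K matrix of hypotheses, one column per class
  [~, p] = max(scores, [], 2); % column index of the largest score = predicted class
end
```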

Neural Networks: Learning

  1. Feedforward and cost function
  2. Regularized cost function
  3. Sigmoid gradient
  4. Neural Net gradient function (Backpropagation)
  5. Regularized gradient


![](http://latex.codecogs.com/gif.latex?J(\\theta)=\\frac{1}{m}\\sum_{i=1}^{m}\\sum_{k=1}^K{{\\big[-y_k^{(i)}\\log{(h_\\theta(x^{(i)}))_k}-(1-y_k^{(i)})\\log{(1-h_\\theta(x^{(i)}))_k}\\big]}}+\\frac{\\lambda}{2m}\\sum_{l}{\\sum_{j\\in (l+1)}{\sum_{k\in l}{(\Theta_{j,k}^{(l)})^2}}})

![](http://latex.codecogs.com/gif.latex?\\delta^{(l)}= (\Theta^{(l)})^T\delta^{(l+1)}\circ g'(z^{(l)}))

![](http://latex.codecogs.com/gif.latex?\\frac{\\partial}{\\partial \Theta_{ij}^{(l)}}J(\Theta)=D_{ij}^{(l)}=\frac{1}{m}\Delta_{ij}^{(l)}+\frac{\lambda}{m}\Theta_{ij}^{(l)})
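
A compact sketch of forward propagation followed by the backpropagation formulas above, for the 3-layer network of the exercise. Theta1, Theta2, X, the one-hot label matrix Y, and lambda are assumed to already exist in the workspace with the shapes used in the course:

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));
sigmoidGradient = @(z) sigmoid(z) .* (1 - sigmoid(z));        % g'(z)

m  = size(X, 1);
a1 = [ones(m, 1) X];                 % input activations with bias
z2 = a1 * Theta1';
a2 = [ones(m, 1) sigmoid(z2)];       % hidden activations with bias
a3 = sigmoid(a2 * Theta2');          % output activations (hypothesis)

delta3 = a3 - Y;                                              % output-layer error
delta2 = (delta3 * Theta2(:, 2:end)) .* sigmoidGradient(z2);  % hidden-layer error, bias column dropped

Theta1_grad = (delta2' * a1) / m;    % D^(1) = Delta^(1) / m
Theta2_grad = (delta3' * a2) / m;    % D^(2) = Delta^(2) / m
Theta1_grad(:, 2:end) = Theta1_grad(:, 2:end) + (lambda / m) * Theta1(:, 2:end);  % + (lambda/m) * Theta
Theta2_grad(:, 2:end) = Theta2_grad(:, 2:end) + (lambda / m) * Theta2(:, 2:end);
```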

Regularized Linear Regression and Bias/Variance

  1. Regularized LR, cost function (review)
  2. Regularized LR, gradient (review)
  3. Learning Curve - Bias-Variance trade-off
  4. Polynomial feature mapping
  5. Cross-validation curve (selecting lambda)

![](http://latex.codecogs.com/gif.latex?h_\\theta(x)=\\theta_0+\\theta_1 x_1+...+\theta_p x_p), where x_i = normalize(x .^ i)
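
A sketch of item 4 combined with feature normalization, matching the hypothesis above (x is the original single feature as a column vector, p the polynomial degree; the function name is only illustrative):

```matlab
function [X_poly, mu, sigma] = polyFeaturesNormalized(x, p)
  % Map a single feature to its first p powers, then normalize each column
  X_poly = zeros(numel(x), p);
  for j = 1:p
    X_poly(:, j) = x(:) .^ j;                                    % j-th power of the feature
  end
  mu = mean(X_poly);                                             % per-column mean
  sigma = std(X_poly);                                           % per-column standard deviation
  X_poly = bsxfun(@rdivide, bsxfun(@minus, X_poly, mu), sigma);  % (x - mu) ./ sigma, column-wise
end
```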

Support Vector Machines

  1. Gaussian Kernel
  2. Parameters (C, sigma)
  3. Email preprocessing
  4. Email feature extraction

![](http://latex.codecogs.com/gif.latex?\\operatornamewithlimits{min}_\\theta C\sum_{i=1}^{m}{\big[y^{(i)}cost_1{(\theta^Tx^{(i)})}+(1-y^{(i)})cost_0{(\theta^Tx^{(i)})}\big]}+\frac{1}{2}\sum_{j=1}^n{\theta_j^2})
![](http://latex.codecogs.com/gif.latex?K_{gaussian}(x^{(i)}, x^{(j)})=\exp{\Bigg(-\frac{||x^{(i)}-x^{(j)}||^2}{2\sigma^2}\Bigg)})
![](http://latex.codecogs.com/gif.latex?\\operatornamewithlimits{min}_\\theta C\sum_{i=1}^{m}{\big[y^{(i)}cost_1{(\theta^Tf^{(i)})}+(1-y^{(i)})cost_0{(\theta^Tf^{(i)})}\big]}+\frac{1}{2}\sum_{j=1}^n{\theta_j^2})
![](http://latex.codecogs.com/gif.latex?f_k^{(i)} = K(x^{(i)}, l^{(k)}))
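
A sketch of the Gaussian kernel above; in the exercise it is passed to the provided svmTrain as a function handle, roughly svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma)):

```matlab
function sim = gaussianKernel(x1, x2, sigma)
  % Similarity between two points under an RBF kernel with bandwidth sigma
  diff = x1(:) - x2(:);                          % force both inputs to column vectors
  sim = exp(-(diff' * diff) / (2 * sigma ^ 2));
end
```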

K-Means Clustering and PCA

  1. Find closest centroids
  2. Compute centroid means
  3. PCA
  4. Project data
  5. Recover data

![](http://latex.codecogs.com/gif.latex?c^{(i)}:= \operatornamewithlimits{argmin}_{j} ||x^{(i)}-\mu_j||^2)
![](http://latex.codecogs.com/gif.latex?\\mu_k:=\\frac{1}{C_k}\\sum_{i \in C_k}{x^{(i)}})
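
Sketches of the two update steps above: the cluster assignment (argmin over squared distances) and the centroid update (mean over the points assigned to cluster k):

```matlab
function idx = findClosestCentroids(X, centroids)
  % For each row of X, return the index of the nearest centroid
  m = size(X, 1);
  idx = zeros(m, 1);
  for i = 1:m
    dists = sum(bsxfun(@minus, centroids, X(i, :)) .^ 2, 2);  % squared distance to every centroid
    [~, idx(i)] = min(dists);                                 % argmin_j ||x^(i) - mu_j||^2
  end
end

function centroids = computeCentroids(X, idx, K)
  % Recompute each centroid as the mean of the points assigned to it
  n = size(X, 2);
  centroids = zeros(K, n);
  for k = 1:K
    centroids(k, :) = mean(X(idx == k, :), 1);  % mean over cluster k, i.e. divide by |C_k|
  end
end
```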

Anomaly Detection and Recommender Systems

  1. Estimate Gaussian parameters
  2. Select threshold
  3. Collaborative Filtering cost
  4. Collaborative Filtering gradient
  5. Regularized cost
  6. Gradient with regularization



![](http://latex.codecogs.com/gif.latex?\\frac{\\partial J}{\partial x_k^{(i)}}=\sum_{j:r(i,j)=1}{\big((\theta^{(j)})^Tx^{(i)}-y^{(i,j)}\big)\theta_k^{(j)}}+\lambda x_k^{(i)})
![](http://latex.codecogs.com/gif.latex?\\frac{\\partial J}{\partial \theta_k^{(j)}}=\sum_{i:r(i,j)=1}{\big((\theta^{(j)})^Tx^{(i)}-y^{(i,j)}\big)x_k^{(i)}}+\lambda \theta_k^{(j)})
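
A vectorized sketch of the regularized collaborative-filtering cost and the two gradients above. X (movie features), Theta (user parameters), Y (ratings), the binary indicator R, and lambda are assumed to be in the workspace with the shapes used in the exercise:

```matlab
E = (X * Theta' - Y) .* R;                 % prediction errors, counted only where r(i,j) = 1
J = (1 / 2) * sum(sum(E .^ 2)) ...
    + (lambda / 2) * (sum(sum(Theta .^ 2)) + sum(sum(X .^ 2)));  % regularized cost
X_grad     = E * Theta + lambda * X;       % dJ/dx_k^(i), one row per movie
Theta_grad = E' * X + lambda * Theta;      % dJ/dtheta_k^(j), one row per user
```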
