# pca.py
import numpy as np
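def PCA(data):
    '''
    PCA Principal Component Analysis

    Input:
        data      - Data numpy array. Each row vector of data is a data point.
    Output (returned in this order):
        eigvalue  - The eigenvalues of the PCA eigen-problem, sorted in
                    descending order.
        eigvector - Each column is an embedding function; for a new
                    data point (row vector) x, y = x @ eigvector
                    will be the embedding result of x.
    '''
    # Hint: you may need to normalize the data before applying PCA
    # begin answer
    # Center each feature (column) at zero mean. np.cov also centers
    # internally, so this step only makes the normalization explicit.
    data = data - np.mean(data, axis=0, keepdims=True)
    # np.cov expects one variable per row, hence the transpose.
    C = np.cov(data.T)
    # C is symmetric, so eigh applies and returns real eigenvalues
    # with orthonormal eigenvectors.
    eig_value, eig_vec = np.linalg.eigh(C)
    # Sort the eigenpairs by descending eigenvalue.
    idx = np.argsort(eig_value)[::-1]
    return eig_value[idx], eig_vec[:, idx]
    # end answer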
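

# Illustrative sketch only, not part of the original file: the docstring says
# a row vector x is embedded as y = x*eigvector. The helper below shows one way
# that projection might look. The names `project` and `k` are hypothetical, and
# centering with the data's own column means is an assumption.

```python
import numpy as np


def project(data, eigvector, k):
    """Embed each row of `data` onto the first k columns of `eigvector`.

    Hypothetical helper. `eigvector` is assumed to hold one principal
    direction per column, sorted by descending eigenvalue.
    """
    # Center each feature so the projection matches the covariance used by PCA.
    centered = data - np.mean(data, axis=0, keepdims=True)
    # Matrix product: (n_samples, n_features) @ (n_features, k).
    return centered @ eigvector[:, :k]
```

# With eigenvectors of the covariance matrix, the projected columns should be
# uncorrelated, and their variances should equal the leading eigenvalues.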