The XXX Winter School of Astrophysics on Big Data in Astronomy:
This repository contains the slides from the lectures I gave during the school, along with Python tutorials. The topics I covered in the school:
- Introduction to Unsupervised Learning: The main differences between supervised and unsupervised learning algorithms, and the basic anatomy of unsupervised learning algorithms.
- Clustering Algorithms: K-means, hierarchical clustering, and Gaussian mixture models.
- Decision Trees and Random Forests: decision trees and their advantages and disadvantages, ensemble methods, random forest, probabilistic random forest, unsupervised random forest for distance assignment.
- Dimensionality Reduction Algorithms: principal component analysis (PCA), independent component analysis (ICA), non-negative matrix factorization (NNMF), t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection (UMAP), auto-encoders, self-organizing maps (SOM), and PINK. In particular, we discussed methods for interpreting the low-dimensional output of such algorithms.
- Outlier Detection Algorithms: anomaly detection with supervised learning algorithms, isolation forests, and unsupervised random forests for distance estimation.
- Neural Networks and Deep Learning tutorials by M. Huertas Company:
- Unsupervised Random Forest and distance assignment by D. Baron:
- Probabilistic Random Forest by I. Reis:
- Anaconda Python 2.7 version:
- The tutorials are provided as Jupyter notebooks. They use the following packages: numpy, scipy, scikit-learn, and matplotlib.
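
To give a flavor of the unsupervised random forest distance listed among the topics, here is a minimal sketch using only numpy and scikit-learn. It is not the code from the tutorial notebooks. It assumes a simplified variant of the idea: synthetic data is built by shuffling each feature independently, a random forest is trained to separate real from synthetic objects, and the distance between two real objects is one minus the fraction of trees in which they land in the same leaf. The function name `unsupervised_rf_distance`, its parameters, and the toy data are hypothetical and chosen for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def unsupervised_rf_distance(X, n_trees=100, seed=0):
    """Simplified unsupervised random forest distance (illustrative sketch)."""
    rng = np.random.RandomState(seed)

    # Synthetic data: shuffle every feature independently, which keeps the
    # marginal distributions but destroys the correlations between features.
    X_syn = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )

    # Label real objects 1 and synthetic objects 0, then train a classifier.
    X_all = np.vstack([X, X_syn])
    y_all = np.concatenate([np.ones(len(X)), np.zeros(len(X_syn))])
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # Leaf index reached by each real object in each tree:
    # array of shape (n_samples, n_trees).
    leaves = forest.apply(X)

    # Similarity = fraction of trees in which two objects share a leaf.
    # Assumption: all trees are counted, regardless of the leaf's class.
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    similarity = same_leaf.mean(axis=2)
    return 1.0 - similarity


# Toy data (hypothetical): two Gaussian blobs with three features each.
toy_rng = np.random.RandomState(1)
X = np.vstack([
    toy_rng.normal(0.0, 1.0, size=(30, 3)),
    toy_rng.normal(5.0, 1.0, size=(30, 3)),
])
D = unsupervised_rf_distance(X)
print(D.shape)
```

The resulting matrix `D` is symmetric with zeros on the diagonal, so it can be passed to any method that accepts a precomputed distance matrix, for example for outlier ranking or for a low-dimensional embedding. The pairwise comparison stores an array of size n_samples × n_samples × n_trees, so this sketch is only suitable for small samples.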