GitHub Repository for the CAS Applied Data Science

Below is an overview of the modules that make up this CAS program.

Module 1: Data Acquisition and Management

  • Learn to understand different data sources and types and how to design data management models and plans

Module 2: Statistical Inference for Data Science

  • Gain a basic understanding of descriptive statistics and of the statistical models used for analysis
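
As a small illustration of the descriptive statistics mentioned above, here is a minimal sketch that uses only Python's standard `statistics` module. The sample values and the variable names are made up for illustration and are not taken from the course material.

```python
import statistics

# Hypothetical sample of measurements (illustrative values only).
sample = [2.1, 2.5, 2.8, 3.0, 3.4, 3.9, 4.2]

# Measures of central tendency.
mean = statistics.mean(sample)
median = statistics.median(sample)

# Measures of spread: sample standard deviation and quartile cut points.
stdev = statistics.stdev(sample)
quartiles = statistics.quantiles(sample, n=4)

print(f"mean={mean:.3f} median={median:.3f} stdev={stdev:.3f}")
print("quartile cut points:", quartiles)
```

Comparing the mean with the median is a quick first check for skew in a sample before fitting any statistical model.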

Module 3: Data Analysis and Machine Learning

  • Overview of machine learning pipelines and their implementation with scikit-learn
  • Regression and classification: linear models and logistic regression
  • Decision trees & random forest models
  • Principal component analysis (PCA) and non-linear embeddings (t-SNE and UMAP)
  • Clustering with K-means and Gaussian mixtures
  • Artificial neural networks as general fitters; fully connected networks used to classify the Fashion-MNIST dataset
  • Scikit-learn and clustering maps, Q&A
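
To give a feel for several of the topics listed above, the following is a minimal sketch of a scikit-learn workflow: a classification pipeline with logistic regression, followed by PCA and K-means clustering. The choice of the Iris dataset, the split ratio, and all parameter values are assumptions made for illustration; they are not the course's actual exercises.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset (assumed here purely for illustration).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised pipeline: scale the features, then fit logistic regression.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Unsupervised part: project onto 2 principal components, then cluster.
embed = make_pipeline(StandardScaler(), PCA(n_components=2))
X_2d = embed.fit_transform(X)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

print(f"test accuracy: {accuracy:.2f}")
print("2D embedding shape:", X_2d.shape)
print("cluster sizes:", sorted(int((labels == k).sum()) for k in range(3)))
```

Wrapping the scaler and the model in one pipeline means the scaling parameters are learned from the training split only, which avoids leaking information from the test split.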

Module 4: Ethics and Best Practices

  • Create a GitHub repository for your CAS material and projects
  • Document the repository and its subfolders with README files

Module 5: Peer Consulting and Selected Readings

  • Peer knowledge exchange and consultation groups
  • Discussion and collaboration with peers on key concepts and practical applications

Module 6: Deep Learning

  • TensorFlow for deep learning applications