
🎓 Penny's path to a free self-taught education in Data Science!


bradleygrant/penny-data-science



Hi Penny dear!

You wanted an email, but this is easier. The Internet already did most of the work for me.

So, without further ado,

Welcome to Penny Data Science

A hub where you can git a look at all the courses you need

By the way, did you know you can (and should) edit this? Get your GitHub account set up and you can edit this page as you make decisions and take courses.

Contents

Goals

Simply put, what do you want to accomplish? We should spend a little bit of time trying to figure this out.

But don't feel like you have to get EVERYTHING figured out. All paths forward lead through the same waypoints. You'll need linear algebra, probability, statistics, and Python or R programming. As long as you stay in this domain, those are the next places you need to go. So if you want to make a little bit of progress, then just pick one or two and do 'em.

Finishing a bachelor's degree?

If you want to pick up degree work again next semester or in the summer, then we'll both feel a lot better if you've already done some pre-study so you don't feel overwhelmed.

Options include:

Going to grad school?

Well, you'll need to get into a grad school.

Here are a few grad school options that should appeal to you: (TODO)

  • Colorado State University -- Master of Applied Statistics (here, and online, and awesome!) | Program Page | GitHub Page (TODO)
  • Georgia Tech -- Online Master of Science in Analytics (cheap, and online via edX! and awesome!) | Program Page | edX Page with more info | Subreddit
  • Michigan State
  • Northwestern
  • probably some other ones

To get into a grad school, you'll need to take (at a minimum) the following courses:

  • Undergrad linear algebra (required by CSU and Georgia Tech)
  • Undergrad statistics (required by CSU and Georgia Tech)
  • Python programming (required by Georgia Tech, which specifically recommends CS1301)

Just want to go get a data analysis job?

Then what you actually need is a data science bootcamp. (TODO)

But before that, you should take a data science bootcamp prep course. (TODO)

I want to learn all these things, and I don't want to spend even a single penny.

That's the entire point of the Open Source Society University! OSSU is a path to a free, self-taught education in Computer Science, Data Science, or a couple of other tracks. All you have to do is take the free resources and courses shown in this section down here. Sounds easy, right? Well.... yes, but you do have to take them.

Progress So Far

You've come a long way, baby. Here's what you already have DONE and out of the way:

  • Pre-calculus
  • Calculus 1
  • Calculus 2
  • Calculus 3 / multivariate
  • Differential Equations
  • Discrete Math
  • Computer Programming (Java)
  • Computer Science 1: Control Flow and Objects
  • Computer Science 2: Data Structures
  • abunchofotherwisepointlessphysicsprerequisites erm I mean
  • Physics 1
  • Physics 2

Current Classes

Things you can take right now!

| Courses | Duration | Start Date |
| --- | --- | --- |
| LAFF: Linear Algebra - Foundations to Frontiers | 15 weeks | August 1, 2019 (now) |
| MITx 6.00.1x: Introduction to Computer Science and Programming Using Python | 9 weeks | August 28, 2019 |
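To see roughly when each current class would wrap up, the start dates and durations in the table above can be added together. This is only a sketch: it assumes each course runs for exactly its listed number of weeks with no breaks, and the `end_date` helper and the short course labels are made up for illustration.

```python
from datetime import date, timedelta


def end_date(start: date, weeks: int) -> date:
    """Return the date that falls `weeks` full weeks after `start`."""
    return start + timedelta(weeks=weeks)


# Start dates and durations copied from the Current Classes table.
# The short labels are just for printing.
current_classes = [
    ("LAFF", date(2019, 8, 1), 15),
    ("Intro to CS and Programming Using Python", date(2019, 8, 28), 9),
]

for name, start, weeks in current_classes:
    finish = end_date(start, weeks)
    print(f"{name}: finishes around {finish:%B %d, %Y}")
```

Under those assumptions, LAFF would finish around November 14, 2019 and the Python course around October 30, 2019.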

Future Classes

Things you can take in future months without any further study.

| Courses | Duration | Start Date |
| --- | --- | --- |
| MITx 6.431x: Probability - The Science of Uncertainty and Data | 16 weeks | January 27, 2019 |

Things you can take in future sessions if you take some of these other classes now.

| Courses | Duration | Start Date |
| --- | --- | --- |
| MITx 6.00.2x: Introduction to Computational Thinking and Data Science | 9 weeks | October 16, 2019 |

The section below is adapted from the curriculum of the Open Source Society University.

📊 Path to a free self-taught education in Data Science!

Open Source Society University - Data Science

OSSU Program Contents

About

This is a solid path for those of you who want to complete a Data Science course on your own time, for free, with courses from the best universities in the world.

In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.

Are you ready to get started?

Curriculum


Linear Algebra

| Courses | Duration | Effort |
| --- | --- | --- |
| Linear Algebra - Foundations to Frontiers | 15 weeks | 8 hours/week |
| Applications of Linear Algebra Part 1 | 5 weeks | 4 hours/week |
| Applications of Linear Algebra Part 2 | 4 weeks | 5 hours/week |
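Multiplying duration by weekly effort gives a rough idea of how much time a section takes. The sketch below does that for the Linear Algebra table above; the `total_hours` helper is hypothetical, and the figures are only as good as the listed estimates.

```python
# (course, weeks, hours per week), copied from the Linear Algebra table.
linear_algebra = [
    ("Linear Algebra - Foundations to Frontiers", 15, 8),
    ("Applications of Linear Algebra Part 1", 5, 4),
    ("Applications of Linear Algebra Part 2", 4, 5),
]


def total_hours(courses):
    """Sum weeks * hours-per-week over every course in the list."""
    return sum(weeks * hours for _name, weeks, hours in courses)


print(total_hours(linear_algebra))  # 15*8 + 5*4 + 4*5 = 160
```

The same calculation works for any section whose duration and effort are both listed as single numbers; sections that give a range (such as 6-10 hours/week) or a dash would need a low and a high estimate instead.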

Single Variable Calculus

| Done? | Courses | Duration | Effort |
| --- | --- | --- | --- |
| ✔️ | Calculus 1A: Differentiation | 13 weeks | 6-10 hours/week |
| ✔️ | Calculus 1B: Integration | 13 weeks | 5-10 hours/week |
| ✔️ | Calculus 1C: Coordinate Systems & Infinite Series | 13 weeks | 6-10 hours/week |

Multivariable Calculus

| Done? | Courses | Duration | Effort |
| --- | --- | --- | --- |
| ✔️ | MIT OCW Multivariable Calculus | 15 weeks | 8 hours/week |

Python

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Computer Science and Programming Using Python | 9 weeks | 15 hours/week |
| Introduction to Computational Thinking and Data Science | 9 weeks | 15 hours/week |
| Introduction to Python for Data Science | 6 weeks | 2-4 hours/week |
| Programming with Python for Data Science | 6 weeks | 3-4 hours/week |

Probability and Statistics

| Courses | Duration | Effort |
| --- | --- | --- |
| Probability - The Science of Uncertainty and Data | 16 weeks | 12 hours/week |
| Statistical Reasoning | - weeks | - hours/week |
| Introduction to Statistics: Descriptive Statistics | 5 weeks | - hours/week |
| Introduction to Statistics: Probability | 5 weeks | - hours/week |
| Introduction to Statistics: Inference | 5 weeks | - hours/week |

Introduction to Data Science

| Courses | Duration | Effort |
| --- | --- | --- |
| Introduction to Data Science | 8 weeks | 10-12 hours/week |
| Data Science - CS109 from Harvard | 12 weeks | 5-6 hours/week |
| The Analytics Edge | 12 weeks | 10-15 hours/week |

Machine Learning

| Courses | Duration | Effort |
| --- | --- | --- |
| Learning From Data (Introductory Machine Learning) [caltech] | 10 weeks | 10-20 hours/week |
| Statistical Learning | - weeks | 3 hours/week |
| Stanford's Machine Learning Course | - weeks | 8-12 hours/week |

Project

Complete Kaggle's Getting Started and Playground Competitions

Convex Optimization

| Courses | Duration | Effort |
| --- | --- | --- |
| Convex Optimization | 9 weeks | 10 hours/week |

Data Wrangling

| Courses | Duration | Effort |
| --- | --- | --- |
| Data Wrangling with MongoDB | 8 weeks | 10 hours/week |

Big Data

| Courses | Duration | Effort |
| --- | --- | --- |
| Intro to Hadoop and MapReduce | 4 weeks | 6 hours/week |
| Deploying a Hadoop Cluster | 3 weeks | 6 hours/week |

Database

| Courses | Duration | Effort |
| --- | --- | --- |
| Stanford's Database course | - weeks | 8-12 hours/week |

Natural Language Processing

| Courses | Duration | Effort |
| --- | --- | --- |
| Deep Learning for Natural Language Processing | - weeks | - hours/week |

Deep Learning

| Courses | Duration | Effort |
| --- | --- | --- |
| Deep Learning | 12 weeks | 8-12 hours/week |

Capstone Project

  • Participate in a Kaggle competition
  • List other ideas

How to use this guide

Order of the classes

This guide is meant to be followed linearly. What does this mean? That you should complete one course at a time, in order.

The courses are already in the order that you should complete them. Just start in the Linear Algebra section and after finishing the first course, start the next one.

If the course isn't open, do it anyway with the resources from the previous class.
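The "one course at a time, in order" rule can be sketched as a small lookup: walk the curriculum from the top and stop at the first course that isn't done yet. The `next_course` function and the shortened curriculum list below are illustrative assumptions, not part of the OSSU guide.

```python
def next_course(curriculum, completed):
    """Return the first course in `curriculum` that is not in `completed`.

    Returns None when every course has been completed.
    """
    for course in curriculum:
        if course not in completed:
            return course
    return None


# A shortened curriculum, in the order the sections appear in this guide.
curriculum = [
    "Linear Algebra - Foundations to Frontiers",
    "Applications of Linear Algebra Part 1",
    "Applications of Linear Algebra Part 2",
    "Calculus 1A: Differentiation",
]

# Courses already marked as done.
completed = {"Calculus 1A: Differentiation"}

print(next_course(curriculum, completed))
```

With only the calculus course marked done, the lookup returns the first Linear Algebra course. Once that is added to `completed`, it moves on to Applications of Linear Algebra Part 1.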

Should I take all courses?

Yes! The intention is to complete all the courses listed here!

Which programming languages should I use?

Python and R are heavily used in the data science community and our courses teach you both, but...

The important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) you wish.

Prerequisite

The only things that you need to know are how to use Git and GitHub. Here are some resources to learn about them:

Note: Just pick one of the courses below to learn the basics. You will learn a lot more once you get started!

Change Log
