I realized that attempting the same problem multiple times is vital to preparation. It is not an exercise in coding but an exercise in mental visualization. Solving the problems again at regular intervals makes you familiar with the data structures and helps you visualize them better. Once you are comfortable visualizing them, you will be able to manipulate them.
- Machine Learning
  - Supervised
    - Classification
    - Regression
  - Unsupervised
    - Clustering
  - Reinforcement
  - Dimensionality Reduction
    - Principal component analysis
- Deep Learning
- Data Mining
- Probability and Statistics
- Linear Algebra and Differential Calculus
- Programming in Python, R, SAS, Java
- Databases – MySQL
- Big Data Analytics
- Data Visualization
  - Tableau
  - Power BI
  - Matplotlib, ggplot, seaborn
- Data Analysis
  - Feature Engineering
  - Data Wrangling
  - Exploratory Data Analysis
- Deployment
  - AWS, Azure, Google Cloud Platform (GCP), Flask, Django
- Housing price prediction problem
- Iris Dataset - identify the flower based on sepal length and sepal width.
- Try reverse-engineering existing use cases
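As a rough sketch of the Iris project idea above, a k-nearest-neighbours vote on sepal length and sepal width can be written with the standard library alone. The sample rows, the `SAMPLES` name, and the `predict_species` helper are all assumptions made for illustration; the values are hand-typed in the style of the Iris data, not loaded from the real dataset, and a real project would more likely use a library classifier.

```python
from collections import Counter
import math

# Hypothetical training rows: (sepal length cm, sepal width cm) -> species.
# Illustrative values only, not loaded from the real Iris dataset.
SAMPLES = [
    ((5.1, 3.5), "setosa"),
    ((4.9, 3.0), "setosa"),
    ((5.0, 3.6), "setosa"),
    ((6.4, 3.2), "versicolor"),
    ((5.5, 2.3), "versicolor"),
    ((5.7, 2.8), "versicolor"),
    ((7.1, 3.0), "virginica"),
    ((7.6, 3.0), "virginica"),
    ((7.3, 2.9), "virginica"),
]


def predict_species(sepal_length, sepal_width, k=3):
    """Return the majority species among the k closest sample rows."""
    query = (sepal_length, sepal_width)
    # Sort the samples by Euclidean distance to the query point.
    by_distance = sorted(SAMPLES, key=lambda row: math.dist(row[0], query))
    # Count the labels of the k nearest rows and take the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(predict_species(5.0, 3.4))
print(predict_species(5.6, 2.6))
```

Writing the distance and voting steps by hand is slow compared with calling a library, but it makes the data structure behind the classifier easy to picture.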
- PyCharm
- Jupyter
- Spyder
- VS Code
- RStudio (for R programming)
- Beautiful Soup
- Scrapy
- urllib
- Kaggle
- Learn programming in either Python or R:
  - Prefer Python for jobs.
  - Prefer R for scientific and academic purposes.
  - Python is a bit more versatile.
- Have a very basic understanding of statistics.
- Build basic foundational knowledge of the core data science subjects.
- Doing real-world projects is the most effective way to grasp this field.
- Learn just enough programming and statistics to explore your own projects.
- Pick up this knowledge through very introductory online courses, such as the micro-courses on Kaggle or 365 Data Science.
- Kaggle is the best place to start: it has a large number of datasets, along with analyses of the projects built on them.
- Kaggle is a public forum where people submit their analyses of shared datasets.
- There you can read the code of established data scientists.
- From this you can see what packages they used, the way they explore the data, and the different ways they optimize the algorithms they use.
- Follow along with a few more of these advanced notebooks.
- Then I would recommend starting on your own basic projects.
- I made a video about the three beginner projects that I recommend.
- Split your time about 50/50 between working on your own projects and reading other people's code.
- Of the new things that I saw in these more advanced workbooks to the code that
- Learning this way, you'll see when different people use different algorithms and packages.
- I recommend compiling a list of all the different things you see. Go through the source code of each of them and try to grasp how they're constructed.
- Frankly, if you can understand the source code for an algorithm, you functionally understand the math behind it.
- It is still good to supplement this with some actual theory from Wikipedia or math textbooks, but that gives you only the theory behind the logic that is actually implemented.
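One low-effort way to start reading source code, as suggested above, is Python's built-in `inspect` module. This is only a sketch: the `show_source` helper is a hypothetical name, and `statistics.median` is picked purely as an example of a small function written in Python. Functions implemented in C (such as `len`) have no Python source to show, so the helper returns `None` for them.

```python
import inspect
import statistics


def show_source(func):
    """Return the Python source of func, or None if it is unavailable
    (for example, for functions implemented in C)."""
    try:
        return inspect.getsource(func)
    except (OSError, TypeError):
        # OSError: the source file cannot be found.
        # TypeError: the object is a built-in with no Python source.
        return None


source = show_source(statistics.median)
if source is not None:
    print(source)
```

The same call works on functions from third-party packages that are written in Python, which makes it a quick way to work through the list of packages and algorithms you have compiled.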
References: