Add AI onboarding directory to python directory (#13)
Parent: 51c930c; commit: c09b4ec
Showing 2 changed files with 79 additions and 0 deletions.
@@ -0,0 +1,35 @@
# Machine Learning

This machine learning onboarding consists of a set of free courses and competitions on [Kaggle](https://www.kaggle.com/), which requires registration (also free).

## Week 1
1. [Pandas](https://www.kaggle.com/learn/pandas)
   - Optional, only for people with no experience with Pandas
2. [Intro to Machine Learning](https://www.kaggle.com/learn/intro-to-machine-learning)
   - Explains the basics of machine learning and shows how to build and evaluate ML models
   - Focuses on regression models, namely decision trees and then random forests
3. [Intermediate Machine Learning](https://www.kaggle.com/learn/intermediate-machine-learning)
   - Explains basic data preparation
   - Expands on model evaluation
   - Also introduces gradient boosting
4. [Data Cleaning](https://www.kaggle.com/learn/data-cleaning)
   - Continues presenting the data preparation techniques introduced in the previous course
5. [Feature Engineering](https://www.kaggle.com/learn/feature-engineering)
   - Describes techniques to identify important features or potential new features
   - Suggests entering a competition at the end, which is a good way to apply the knowledge gained from the previous courses
6. [House Prices - Advanced Regression Techniques](https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/overview)
   - Competition suggested at the end of the previous course
   - Useful as an exercise for consolidating the knowledge from all the previous courses
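
The build-and-evaluate loop that the Week 1 courses teach can be sketched in a few lines. The example below is only an illustration under stated assumptions: it is not taken from the Kaggle courses (which use scikit-learn), the data is made up, and `fit_stump`, `predict` and `mean_absolute_error` are hypothetical helper names. It fits a one-split decision tree (a "stump") on a training set and measures the mean absolute error on held-out validation data.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree (a "stump") on a single feature.

    Tries each midpoint between consecutive distinct feature values as a
    threshold and keeps the one with the lowest squared error on the
    training data.
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict(stump, x):
    """Predict with the mean of whichever side of the split x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical toy data: house area (m^2) -> price (thousands).
train_x = [50, 60, 70, 120, 130, 140]
train_y = [100, 110, 120, 300, 310, 320]
# Held-out validation data, never seen during fitting.
valid_x = [55, 135]
valid_y = [105, 315]

stump = fit_stump(train_x, train_y)
valid_mae = mean_absolute_error(valid_y, [predict(stump, x) for x in valid_x])
print(f"split at {stump[0]}, validation MAE = {valid_mae}")
```

On this toy data the stump splits at an area of 95 and the validation MAE is 5. A random forest averages many deeper trees of this kind, each fitted on a resampled version of the training data.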

## Week 2
1. Continue working on your solution to the [House Prices - Advanced Regression Techniques](https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/overview) competition

## Week 3
1. [Intro to Deep Learning](https://www.kaggle.com/learn/intro-to-deep-learning)
   - Shows how to build, evaluate and tune neural networks
   - In addition to regression, explains how to use neural networks to solve classification problems
2. Enter one of the following competitions suggested in the previous course:
   - [Petals to the Metal - Flower Classification on TPU](https://www.kaggle.com/c/tpu-getting-started)
   - [I’m Something of a Painter Myself](https://www.kaggle.com/c/gan-getting-started)
   - [Natural Language Processing with Disaster Tweets](https://www.kaggle.com/c/nlp-getting-started)
   - [Contradictory, My Dear Watson](https://www.kaggle.com/c/contradictory-my-dear-watson)
@@ -0,0 +1,44 @@
# Large Language Models

## Large Language Models (LLMs)

### Relevant Topics
- Pre-training
- Fine-tuning

### Resources
| Type | Title | Comments |
| ----------- | ----------- | ----------- |
| Course | [Introduction to Large Language Models](https://www.cloudskillsboost.google/course_templates/539) | <ul><li>8-hour free course. Requires enrollment</li><li>Begins with a very good 15-minute video describing LLMs and their main concepts</li><li>Includes a collection of useful pages of information about LLMs</li><li>Has a mini quiz about LLMs at the end</li></ul> |
| Video | [A Hackers' Guide to Language Models](https://www.youtube.com/watch?v=jkrNMKz9pWU) | <ul><li>1h30 YouTube video</li><li>Describes LLMs and important concepts (e.g., tokenization, training, fine-tuning)</li><li>Explains some actual models (e.g., GPT-4) and their limitations</li></ul> |
| Course | [Training & Fine-Tuning LLMs for Production](https://learn.activeloop.ai/courses/llms) | <ul><li>Hands-on course for training, fine-tuning and adapting LLMs to specific tasks</li><li>Running all of the course’s examples costs around $100, although doing so is not required to complete the course</li></ul> |
| Video (Playlist) | [Training & Fine-Tuning LLMs Course](https://www.youtube.com/playlist?list=PLD80i8An1OEGqqXeNZ5w0IBmeZcxpZEYL) | <ul><li>Series of 4 YouTube videos of about 1 hour each</li><li>They explain the basics of LLMs and important concepts (e.g., evaluation, data, training and fine-tuning) in a practical way</li><li>Skip video #4, which has audio issues; video #5 is a re-upload with fixed audio</li></ul> |

## Retrieval Augmented Generation (RAG)

### Relevant Topics
- Embeddings
- Vector Databases
- Document Retrieval
- LangChain

### Resources
| Type | Title | Comments |
| ----------- | ----------- | ----------- |
| Course | [LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain/) | <ul><li>Series of 8 videos</li><li>Each video (except for the first and last) is accompanied by a notebook, and you are encouraged to explore it, testing different prompts to produce different outputs</li><li>The course was partially created and taught by the creator of LangChain</li></ul> |
| Course | [LangChain Chat with Your Data](https://learn.deeplearning.ai/langchain-chat-with-your-data) | <ul><li>Series of 8 videos</li><li>Follows on from the first course</li><li>Thoroughly explains how to implement the RAG pipeline using LangChain, offering several different approaches for each of the steps</li><li>Explains how to produce an end-to-end chatbot that can answer questions about a certain dataset</li></ul> |
| Video | [Vector Embeddings for Beginners](https://www.youtube.com/watch?v=PR7xz5vQKGg) | <ul><li>35-minute video</li><li>Covers vector embeddings, vector databases and LangChain</li></ul> |
| Video | [What is Retrieval-Augmented Generation](https://www.youtube.com/watch?v=T-D1OfcDW1M) | <ul><li>6-minute video by IBM</li><li>Describes some of the challenges presented by LLMs</li><li>Describes what RAG is and how it solves those problems</li></ul> |
| Video | [Chatbots with RAG: LangChain Full Walkthrough](https://www.youtube.com/watch?v=LhnCsygAvzY) | <ul><li>35-minute video</li><li>Good explanation of the RAG pipeline</li><li>Explains how to build a RAG chatbot with code</li><li>Requires API keys for OpenAI and Pinecone</li></ul> |
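
The retrieval step of a RAG pipeline can be illustrated without any external service. The sketch below is a simplification under stated assumptions: word counts stand in for learned embeddings, a Python list stands in for a vector database, and `embed`, `retrieve` and `build_prompt` are hypothetical names, not LangChain APIs. It ranks documents by cosine similarity to the question and places the best match into the prompt that would be sent to an LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy "embedding": a bag-of-words count vector (stand-in for a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(q, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, context_docs):
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical mini knowledge base (a list instead of a vector database).
documents = [
    "LangChain is a framework for building applications on top of language models.",
    "A vector database stores embeddings and supports similarity search.",
    "Pandas is a Python library for working with tabular data.",
]

question = "What does a vector database store?"
top_docs = retrieve(question, documents, k=1)
prompt = build_prompt(question, top_docs)
print(prompt)
```

In a real pipeline the prompt would then be passed to an LLM, and the embedding and storage steps would be handled by an embedding model and a vector store such as those covered in the resources above.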

## PandasAI

### Relevant Topics
- Smart Data Frames & Datalakes

### Resources
| Type | Title | Comments |
| ----------- | ----------- | ----------- |
| Video | [PandasAI - Talk to Your Data](https://www.youtube.com/watch?v=mQmRi2QTebM) | <ul><li>27-minute video, presented by the creator of PandasAI</li><li>Includes an explanation of how it works</li><li>Shows several examples of PandasAI’s features</li></ul> |
| Webpage | [An Introduction to Pandas AI](https://www.datacamp.com/blog/an-introduction-to-pandas-ai) | <ul><li>Short introduction to PandasAI with examples</li><li>Covers basic aspects such as setting up PandasAI and prompting a dataframe for basic answers and charts</li></ul> |