Week 5 : ML Project

Why do so many projects fail?

  • ML is still research - you shouldn’t aim for a 100% success rate
  • But many are doomed to fail:
    • Technically infeasible or poorly scoped
    • Never make the leap to production
    • Unclear success criteria
    • Poor team management

Module Overview

  • Lifecycle
  • Prioritizing Projects
  • Archetypes
  • Metrics
    • The single number you choose to optimize
  • Baselines
    • To check whether the model is performing well


1. Lifecycle

  • How to think about all of the activities in an ML project



What else do you need to know?

  • Understand state of the art in your domain
    • Understand what’s possible
    • Know what to try next
  • Most promising research areas


2. Prioritizing Projects

  • Assessing the feasibility and impact of your projects

Key Points

A) High-impact ML problems

  • Friction in your product
  • Complex parts of your pipeline
  • Places where cheap prediction is valuable
  • What else are people doing?

B) The cost of an ML project is driven by data availability. Also consider accuracy requirements and the intrinsic difficulty of the problem.


General framework for prioritizing
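
A rough sketch of how impact and feasibility could be weighed against each other when ranking candidate projects. The `Project` class, the 1-5 scoring scale, the example project names, and the product rule are all illustrative assumptions, not something taken from these notes.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    impact: int       # hypothetical scale: 1 (low value) to 5 (high value)
    feasibility: int  # hypothetical scale: 1 (hard / costly) to 5 (easy / cheap)


def prioritize(projects):
    """Rank projects so that high-impact, high-feasibility ones come first."""
    return sorted(projects, key=lambda p: p.impact * p.feasibility, reverse=True)


candidates = [
    Project("fully automated support agent", impact=5, feasibility=2),
    Project("spam filter for signups", impact=3, feasibility=5),
    Project("search ranking tweak", impact=4, feasibility=4),
]

ranked = prioritize(candidates)
for p in ranked:
    print(p.name, p.impact * p.feasibility)
```

The product is only one possible scoring rule; it penalizes projects that score very low on either axis.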



Mental models for high-impact ML projects

  • Where can you take advantage of cheap prediction?
  • Where is there friction in your product?
  • Where can you automate complicated manual processes?
  • What are other people doing?

What does ML make economically feasible?


Assessing feasibility of ML projects


What is still hard in ML?

  • Unsupervised learning
  • Reinforcement learning

→ Both are showing promise in limited domains where tons of data and compute are available


How to run an ML feasibility assessment

  1. Are you sure you need ML at all?

  2. Put in the work up-front to define success criteria with all of the stakeholders

  3. Consider the ethics of using ML

  4. Do a literature review

  5. Try to rapidly build a labeled benchmark dataset

  6. Build a “minimal” viable product (e.g., manual rules)

  7. Are you sure you need ML at all?



3. Archetypes

  • The main categories of ML projects, and the implications for project management




4. Metrics

  • How to pick a single number to optimize

Key points

A. The real world is messy; you usually care about lots of metrics

B. However, ML systems work best when optimizing a single number

C. As a result, you need to pick a formula for combining metrics

D. This formula can and will change


How to combine metrics

  • Simple average / weighted average
  • Threshold n-1 metrics, evaluate the nth
  • More complex / domain-specific formula
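
A minimal sketch of the first two combination strategies above, assuming metrics are stored in a plain dict and that higher is better for each one. The function names, weights, thresholds, and model scores are made up for illustration.

```python
def weighted_average(metrics, weights):
    """Combine metrics into one number using a weighted average."""
    total_weight = sum(weights.values())
    return sum(metrics[name] * w for name, w in weights.items()) / total_weight


def thresholded_score(metrics, thresholds, target):
    """Threshold n-1 metrics, evaluate the nth.

    Returns the target metric if every thresholded metric meets its minimum,
    otherwise negative infinity so the model ranks last.
    """
    for name, minimum in thresholds.items():
        if metrics[name] < minimum:
            return float("-inf")
    return metrics[target]


# Hypothetical evaluation results for two models.
model_a = {"precision": 0.90, "recall": 0.60}
model_b = {"precision": 0.80, "recall": 0.75}

equal_weights = {"precision": 1.0, "recall": 1.0}
avg_a = weighted_average(model_a, equal_weights)
avg_b = weighted_average(model_b, equal_weights)

# Require recall of at least 0.70, then compare models on precision.
thr_a = thresholded_score(model_a, {"recall": 0.70}, target="precision")
thr_b = thresholded_score(model_b, {"recall": 0.70}, target="precision")
```

In this toy data, model A has the higher precision, but it misses the recall threshold, so the thresholded score prefers model B.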


5. Baselines

  • How to know if your model is performing well

Key points

A. Baselines give you a lower bound on expected model performance

B. The tighter the lower bound, the more useful the baseline (e.g., published results, carefully tuned pipelines, and human baselines are better)
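
As an illustration of a very loose lower bound, here is a sketch of a majority-class baseline for a classification task. The labels and helper names are hypothetical; a real project would compare this against the tighter baselines mentioned above.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda _example: most_common_label


def accuracy(predict, examples, labels):
    """Fraction of examples for which the prediction matches the label."""
    correct = sum(predict(x) == y for x, y in zip(examples, labels))
    return correct / len(labels)


# Hypothetical, imbalanced training labels.
train_labels = ["ham"] * 8 + ["spam"] * 2
baseline = majority_class_baseline(train_labels)

# Hypothetical held-out set; the baseline ignores the inputs entirely.
test_examples = ["msg1", "msg2", "msg3", "msg4"]
test_labels = ["ham", "ham", "ham", "spam"]
baseline_accuracy = accuracy(baseline, test_examples, test_labels)
```

A trained model that cannot beat `baseline_accuracy` on the same held-out set is not yet adding value.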


Where to look for baselines




Conclusion


Where to go to learn more

  • Andrew Ng’s “Machine Learning Yearning”
  • Andrej Karpathy’s “Software 2.0”
  • Agrawal’s “The Economics of AI”
  • Chip Huyen’s “Introduction to Machine Learning Systems Design”
  • Apple’s “Human Interface Guidelines for Machine Learning”
  • Google’s “Rules of Machine Learning”


Reference