# Overview {.unnumbered}

This collection of workshops provides an introduction to machine learning. The
collection has two standalone parts:

* **Overview of Machine Learning** (one 2-hour session): This is a
non-technical workshop that emphasizes building vocabulary and gaining an
intuitive understanding of machine learning concepts and methods. Start here
if you're new to machine learning and want to get a sense of what it's about
and whether it's relevant to you. There's no code in this workshop and only a
little (high-school level) math. This workshop is also good preparation for
the Machine Learning in R series.

  ::: {.callout-note title="Learning Goals" collapse="true"}
  After completing this workshop, learners should be able to:

  * Define the following terms: observation, feature, machine learning,
    supervised learning, unsupervised learning, regression, classification,
    clustering, training set, validation set, test set, cross-validation,
    overfitting, underfitting, model bias, model variance, bias-variance
    tradeoff, ensemble model.
  * Explain the difference between supervised and unsupervised learning.
  * Explain the difference between regression and classification.
  * List and briefly describe popular machine learning methods.
  * Give an example of an ensemble model.
  * Explain what cross-validation is used for and give an overview of the
    procedure.
  * Assess whether machine learning might be helpful for a given research
    problem and, if so, which methods.
  :::

  ::: {.callout-important}
  [This slide deck][overview-slides] is the only material for this workshop.
  :::

* **Machine Learning in R** (two 2-hour sessions): This is a hands-on,
technical introduction to using machine learning methods in R. The two
sessions cover, with examples, supervised learning (emphasis on
classification), model evaluation, unsupervised learning (emphasis on
clustering), and dimension reduction. The sessions also provide advice for
navigating R's fractured machine learning package landscape. Intermediate
familiarity with R programming (equivalent to completing DataLab's [R Basics
workshop series][r-basics]) is required.

  ::: {.callout-note title="Learning Goals" collapse="true"}
  After completing this series, learners should be able to:

  * Build and train a classification model on their data.
  * Use cross-validation to estimate accuracy and tune hyperparameters for
    classification models.
  * Identify strategies to improve results from classification models.
  * Explain the tradeoffs between popular clustering algorithms.
  * Run a clustering algorithm on their data.
  :::

[overview-slides]: https://docs.google.com/presentation/d/1hQQVCuGFL5nYssFba1ZK_J-HxGiiRreV9_wx03NlPM4/edit?usp=sharing
[r-basics]: https://ucdavisdatalab.github.io/workshop_r_basics/