Better broken link detection #166

Merged 3 commits on Dec 16, 2021

190 changes: 96 additions & 94 deletions _episodes/01-r-plotting.md

Large diffs are not rendered by default.

52 changes: 27 additions & 25 deletions _episodes/04-r-data-analysis.md
@@ -25,7 +25,7 @@ keypoints:



### Contents {#contents}
### Contents

1. [Getting started](#getting-started)
- [Loading in the data](#loading-in-the-data)
@@ -39,9 +39,9 @@ keypoints:
3. [Cleaning up data](#cleaning-up-data)
4. [Joining data frames](#joining-data-frames)
5. [Analyzing combined data](#analyzing-combined-data)
6. [Putting it all together](#putting-it-all-together)
6. [Finishing with Git and GitHub](#Finishing-with-Git-and-GitHub)

# Getting Started {#getting-started}
# Getting Started

First, navigate to the un-reports directory however you'd like and open `un-report.Rproj`.
This should open the un-report R project in RStudio.
@@ -51,7 +51,7 @@ Yesterday we spent a lot of time making plots in R using the ggplot2 package. Vi

First, we will create a new RScript file for our work. Open RStudio. Choose "File" \> "New File" \> "RScript". We will save this file as `un_data_analysis.R`

### Loading in the data {#loading-in-the-data}
### Loading in the data

We will start by importing the complete gapminder dataset that we used yesterday into our fresh new R session. Yesterday we did this using "point-and-click" commands. Today let's type them into the console ourselves: `gapminder_data <- read_csv("data/gapminder_data.csv")`

@@ -76,7 +76,7 @@ library(tidyverse)


~~~
── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.3.1 ──
── Attaching packages ──────────────────────────────────────────────────────────────── tidyverse 1.3.1 ──
~~~
{: .output}

@@ -86,14 +86,14 @@ library(tidyverse)
✔ ggplot2 3.3.5 ✔ purrr 0.3.4
✔ tibble 3.1.6 ✔ dplyr 1.0.7
✔ tidyr 1.1.4 ✔ stringr 1.4.0
✔ readr 2.1.0 ✔ forcats 0.5.1
✔ readr 2.1.1 ✔ forcats 0.5.1
~~~
{: .output}



~~~
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
── Conflicts ─────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
~~~
@@ -121,7 +121,7 @@ Rows: 1704 Columns: 6


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): country, continent
dbl (4): year, pop, lifeExp, gdpPercap
@@ -754,7 +754,7 @@ Notice here that we tell `pivot_wider()` which columns to pull the names we wish

Before we move on to more data cleaning, let's create the final gapminder dataframe we will be working with for the rest of the lesson!

> ## final Americas 2007 gapminder dataset
> ## Final Americas 2007 gapminder dataset
> Read in the `gapminder_data.csv` file, filter out the year 2007 and the continent "Americas." Then drop the `year` and `continent` columns from the dataframe. Then save the new dataframe into a variable called `gapminder_data_2007`.
>
> > ## Solution:
@@ -776,7 +776,7 @@ Before we move on to more data cleaning, let's create the final gapminder datafr
> >
> >
> > ~~~
> > ── Column specification ────────────────────────────────────────────────────────────────────────
> > ── Column specification ─────────────────────────────────────────────────────────────────────────────────
> > Delimiter: ","
> > chr (2): country, continent
> > dbl (4): year, pop, lifeExp, gdpPercap
@@ -836,7 +836,7 @@ git push
git status
```

# Cleaning up data {#cleaning-up-data}
# Cleaning up data

[*Back to top*](#contents)

@@ -872,7 +872,7 @@ Rows: 2133 Columns: 7


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (7): T24, CO2 emission estimates, ...3, ...4, ...5, ...6, ...7
~~~
@@ -933,7 +933,7 @@ Rows: 2132 Columns: 7


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (4): ...2, Series, Footnotes, Source
dbl (3): Region/Country/Area, Year, Value
@@ -971,7 +971,8 @@ dbl (3): Region/Country/Area, Year, Value

Now we get a similar Warning message as before, but the outputted table looks better.

> **Warnings and Errors: **It's important to differentiate between Warnings and Errors in R. A warning tells us, "you might want to know about this issue, but R still did what you asked". An error tells us, "there's something wrong with your code or your data and R didn't do what you asked". You need to fix any errors that arise. Warnings, are probably best to resolve or at least understand why they are coming up.
> ## Warnings and Errors
> It's important to differentiate between Warnings and Errors in R. A warning tells us, "you might want to know about this issue, but R still did what you asked". An error tells us, "there's something wrong with your code or your data and R didn't do what you asked". You need to fix any errors that arise. Warnings, are probably best to resolve or at least understand why they are coming up.
{.callout}

We can resolve this warning by telling `read_csv()` what the column names should be with the `col_names` argument, where we give it the column names we want within the `c()` function, separated by commas. If we do this, then we need to set `skip` to 2 to also skip the column headings. Let's also save this dataframe to `co2_emissions_dirty` so that we don't have to read it in every time we want to clean it even more.
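
The combination of `col_names` and `skip` described above can be sketched as follows. This is a minimal illustration, not the lesson's own code: the inline text, its three-column layout, and the name `co2_demo` are assumptions standing in for the real seven-column UN file.

```r
library(readr)

# Made-up stand-in for the raw UN file: a title row, the original header row,
# then two data rows. The real file has seven columns; three are used here.
raw_text <- I("T24,CO2 emission estimates,
Region/Country/Area,,Year
8,Albania,1975
8,Albania,1985
")

co2_demo <- read_csv(
  raw_text,
  skip = 2,                                    # drop the title row and the original header row
  col_names = c("region", "country", "year"),  # supply our own column names
  show_col_types = FALSE                       # silence the column specification message
)

print(co2_demo)
```

Because the names are supplied by hand, `skip` has to cover the original header row as well as the title row; otherwise the old headings would be read in as a row of data.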
@@ -993,7 +994,7 @@ Rows: 2132 Columns: 7


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (4): country, series, footnotes, source
dbl (3): region, year, value
@@ -1065,7 +1066,7 @@ co2_emissions_dirty
>
>
> ~~~
> ── Column specification ────────────────────────────────────────────────────────────────────────
> ── Column specification ─────────────────────────────────────────────────────────────────────────────────
> Delimiter: ","
> chr (4): ...2, Series, Footnotes, Source
> dbl (3): Region/Country/Area, Year, Value
@@ -1116,7 +1117,7 @@ co2_emissions_dirty
>
>
> ~~~
> ── Column specification ────────────────────────────────────────────────────────────────────────
> ── Column specification ─────────────────────────────────────────────────────────────────────────────────
> Delimiter: ","
> chr (4): ...2, Series, Footnotes, Source
> dbl (3): Region/Country/Area, Year, Value
@@ -1155,7 +1156,7 @@ co2_emissions_dirty

We previously saw how we can subset columns from a data frame using the select function. There are a lot of columns with extraneous information in this dataset, let's subset out the columns we are interested in.

> ## reviewing selecting columns
> ## Reviewing selecting columns
> Select the country, year, series, and value columns from our dataset.
>
> > ## Solution:
@@ -1289,7 +1290,7 @@ Excellent! The last step before we can join this data frame is to get the most d
{: .solution}


> ##
> ## Filtering rows and removing columns
> Filter out data from 2005 and then drop the year column. (Since we will have only data from one year, it is now irrelevant.)
>
> > ## Solution:
@@ -1345,7 +1346,7 @@ co2_emissions <- co2_emissions_dirty %>%
> **Looking at your data:** You can get a look at your data-cleaning hard work by navigating to the **Environment** tab in RStudio and clicking the table icon next to the variable name. Notice when we do this, RStudio automatically runs the `View()` command. We've made a lot of progress!
{.callout}

# Joining data frames {#joining-data-frames}
# Joining data frames

[*Back to top*](#contents)

@@ -1370,7 +1371,7 @@ Rows: 1704 Columns: 6


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): country, continent
dbl (4): year, pop, lifeExp, gdpPercap
@@ -1527,7 +1528,7 @@ Rows: 2132 Columns: 7


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (4): country, series, footnotes, source
dbl (3): region, year, value
@@ -1593,7 +1594,7 @@ Rows: 1704 Columns: 6


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): country, continent
dbl (4): year, pop, lifeExp, gdpPercap
@@ -1635,7 +1636,7 @@ Rows: 1704 Columns: 6


~~~
── Column specification ────────────────────────────────────────────────────────────────────────
── Column specification ─────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): country, continent
dbl (4): year, pop, lifeExp, gdpPercap
@@ -1717,7 +1718,7 @@ write_csv(gapminder_co2, "data/gapminder_co2.csv")

Great - Now we can move on to the analysis!

# Analyzing combined data {#analyzing-combined-data}
# Analyzing combined data

[*Back to top*](#contents)

@@ -1850,6 +1851,7 @@ We see that although Canada, the United States, and Mexico account for close to


## Finishing with Git and GitHub

Awesome work! Let's make sure it doesn't go to waste. Time to add, commit, and push our changes to GitHub again - do you remember how?

> ## changing directories
16 changes: 8 additions & 8 deletions _episodes/05-r-markdown.md
@@ -28,7 +28,7 @@ keypoints:


### Contents
1. [What is R Markdown and why use it?](#why-use-r-markdown?)
1. [What is R Markdown and why use it?](#why-use-r-markdown)
1. [Creating a reports directory](#creating-a-reports-directory)
1. [Creating an R Markdown file](#creating-an-r-markdown-file)
1. [Basic components of R Markdown](#basic-components-of-r-markdown)
@@ -45,7 +45,7 @@ Recall that our goal is to generate a report to the United Nations on how a cou
> How do you usually share data analyses with your collaborators? Many people share them through a Word or PDF document, a spreadsheet, slides, a graphic, etc.
{: .discussion}

## What is R Markdown and why use it?
## What is R Markdown and why use it? {##why-use-r-markdown}
_[Back to top](#contents)_

In R Markdown, you can incorporate ordinary text (ex. experimental methods, analysis and discussion of results) alongside code and figures! (Some people write entire manuscripts in R Markdown.) This is useful for writing reproducible reports and publications, sharing work with collaborators, writing up homework, and keeping a bioinformatics notebook. Because the code is embedded in the document, the tables and figures are *reproducible*. Anyone can run the code and get the same results. If you find an error or want to add more to the report, you can just re-run the document and you'll have updated tables and figures! This concept of combining text and code is called "literate programming". To do this we use R Markdown, which combines Markdown (renders plain text) with R. You can output an HTML, PDF, or Word document that you can share with others. In fact, this webpage is an example of a rendered R Markdown file!
@@ -417,7 +417,7 @@ Only one of the people in your pair is going to create the R Markdown file. The

**For the person who is going to collaborate with the host of the R Markdown file:**

If you don't already have your partner's GitHub repo cloned from the git/GitHub lesson, clone their repo to your Desktop under the name `USERNAME-un-report`. If you don't remember how to do this, you can review the [git lesson](_episodes/03-intro-git-github.md).
If you don't already have your partner's GitHub repo cloned from the git/GitHub lesson, clone their repo to your Desktop under the name `USERNAME-un-report`. If you don't remember how to do this, you can review the [git lesson]({{ page.root }}/03-intro-git-github).

The way you will collaborate with each other is as follows:
1. For each exercise, both people will be thinking about how to answer the question, but only one person will be writing the code.
@@ -458,7 +458,7 @@ First we're going to start out with a few questions about the gapminder dataset.
>
>
> ~~~
> ── Column specification ────────────────────────────────────────────────────────────────────────
> ── Column specification ─────────────────────────────────────────────────────────────────────────────────
> Delimiter: ","
> chr (2): country, continent
> dbl (4): year, pop, lifeExp, gdpPercap
@@ -528,7 +528,7 @@ _[Back to top](#contents)_
{: .solution}


##### Bonus questions: come back to these if you have time at the end
#### Bonus questions: come back to these if you have time at the end
_[Back to top](#contents)_

[5] In the plot above, the years look kind of messy. Can you rotate the x axis text 90 degrees so that the years are more readable? Feel free to search the internet if you don't know how to do this!
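
One possible way to approach the axis rotation asked about in [5] is through `theme()` in ggplot2. The sketch below is only an illustration under stated assumptions: the `toy` data frame and its values are made up, and the lesson's actual plot (built from the gapminder data) is not reproduced here.

```r
library(ggplot2)

# Made-up toy data; the lesson's real plot uses the gapminder dataset.
toy <- data.frame(
  year = factor(c(1952, 1957, 1962, 1967)),
  lifeExp = c(68.8, 70.0, 71.3, 72.1)
)

p <- ggplot(toy, aes(x = year, y = lifeExp)) +
  geom_point() +
  # Rotate the x axis tick labels 90 degrees and align them with the ticks.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

print(p)
```

The `vjust` and `hjust` values are optional; they only keep the rotated labels lined up with their tick marks.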
@@ -752,7 +752,7 @@ _[Back to top](#contents)_
>
>
> ~~~
> ── Column specification ────────────────────────────────────────────────────────────────────────
> ── Column specification ─────────────────────────────────────────────────────────────────────────────────
> Delimiter: ","
> chr (4): ...2, Series, Footnotes, Source
> dbl (3): Region/Country/Area, Year, Value
@@ -851,7 +851,7 @@ Now we're going to work with the CO2 and R&D datasets together.

Unfortunately, we don't have the exact same dates for all of them.

[7] First, read in the CO2 dataset. You can use the code from the [R for data analysis]({{ page.root }}/04-r-data-analysis.md) lesson to clean the CO2 data.
[7] First, read in the CO2 dataset. You can use the code from the [R for data analysis]({{ page.root }}/04-r-data-analysis) lesson to clean the CO2 data.

> ## Solution
>
@@ -876,7 +876,7 @@ Unfortunately, we don't have the exact same dates for all of them.
>
>
> ~~~
> ── Column specification ────────────────────────────────────────────────────────────────────────
> ── Column specification ─────────────────────────────────────────────────────────────────────────────────
> Delimiter: ","
> chr (4): country, series, footnotes, source
> dbl (3): region, year, value
2 changes: 1 addition & 1 deletion _episodes/06-conclusion.md
@@ -103,7 +103,7 @@ In the following, we list some strategies and resources we find useful. As you m

* [Getting started with R Markdown Online Tutorial](https://rmarkdown.rstudio.com/lesson-1.html)
* [R Markdown Cheat Sheet](https://github.com/rstudio/cheatsheets/blob/main/rmarkdown-2.0.pdf)
* [R Markdown Reference Guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)
* [R Markdown Reference Guide](http://www.utstat.toronto.edu/reid/sta2201s/rmarkdown-reference.pdf)

### Free learning platforms available at U-M
