Improved README

JBGruber · Jul 13, 2021 · c0377b7
1 parent 7cd995f
Showing 4 changed files with 59 additions and 28 deletions.
diff --git a/.Rbuildignore b/.Rbuildignore
@@ -3,3 +3,4 @@
 ^README\.Rmd$
 /tests/local-files
 ^\.github$
+^codecov\.yml$
diff --git a/README.Rmd b/README.Rmd
@@ -11,14 +11,28 @@ knitr::opts_chunk$set(
   fig.path = "man/figures/README-",
   out.width = "100%"
 )
+
+knit_print.tbl_df = function(x, ...) {
+  res = paste(c("", "", knitr::kable(x)), collapse = "\n")
+  knitr::asis_output(res)
+}
+
+registerS3method(
+  "knit_print", "tbl_df", knit_print.tbl_df,
+  envir = asNamespace("knitr")
+)
 ```
 
 # paperboy
 
 <!-- badges: start -->
 [![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
+[![R-CMD-check](https://github.com/JBGruber/paperboy/workflows/R-CMD-check/badge.svg)](https://github.com/JBGruber/paperboy/actions)
+[![Codecov test coverage](https://codecov.io/gh/JBGruber/paperboy/branch/main/graph/badge.svg)](https://codecov.io/gh/JBGruber/paperboy?branch=main)
 <!-- badges: end -->
 
+[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/JohannesBGruber.svg?style=social&label=Follow%20%40JohannesBGruber)](https://twitter.com/JohannesBGruber)
+
 The philosophy of `paperboy` is that the package is a comprehensive collection of webscraping scripts for news media sites.
 Many data scientist and researchers write their own code when they have to retrieve news media content from websites.
 At the end of research projects, this code is often collecting digital dust on researchers hard drives instead of being made public for others to use.
@@ -50,7 +64,7 @@ Notice, that the function had no problem reading the link, even though it was sh
 `paperboy` is an unfinished and even highly experimental package at the moment.
 You will therefore often encounter this warning:
 
-```{r nomethod}
+```{r nomethod, results="hide"}
 deliver(url = "google.com")
 ```
 
@@ -71,19 +85,19 @@ tibble::tribble(
 ```
 
 Since some outlets will give you additional information, the `misc` column was included so these can be retained.
-If you have a scaper you want to contribute, look in the list below if it already exists.
+If you have a scraper you want to contribute, look in the list below if it already exists.
 If it does not yet exist, you can become a co-author of this package by adding it via a pull request.
 
-# Available Scrapers
+## Available Scrapers
 
 ```{r available, echo=FALSE}
 tibble::tribble(
-  ~domain,                ~status, ~author,
-  "theguardian.com",      "Broken",  "Johannes B. Gruber",
-  "huffingtonpost.co.uk", "Broken",  "Johannes B. Gruber",
-  "buzzfeed.com",         "Broken",  "Johannes B. Gruber",
-  "forbes.com",           "Broken",  "Johannes B. Gruber",
-)
+  ~domain,                ~status,   ~author,              ~note,
+  "theguardian.com",      "Broken",  "Johannes B. Gruber", "[#1](https://github.com/JBGruber/paperboy/issues/1)",
+  "huffingtonpost.co.uk", "Broken",  "Johannes B. Gruber", "[#1](https://github.com/JBGruber/paperboy/issues/1)",
+  "buzzfeed.com",         "Broken",  "Johannes B. Gruber", "[#1](https://github.com/JBGruber/paperboy/issues/1)",
+  "forbes.com",           "Broken",  "Johannes B. Gruber", "[#1](https://github.com/JBGruber/paperboy/issues/1)",
+) 
 ```
 
 - **Gold**: Runs without any issues

diff --git a/README.md b/README.md
@@ -7,8 +7,13 @@
 
 [![Lifecycle:
 experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)
+[![R-CMD-check](https://github.com/JBGruber/paperboy/workflows/R-CMD-check/badge.svg)](https://github.com/JBGruber/paperboy/actions)
+[![Codecov test
+coverage](https://codecov.io/gh/JBGruber/paperboy/branch/main/graph/badge.svg)](https://codecov.io/gh/JBGruber/paperboy?branch=main)
 <!-- badges: end -->
 
+[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/JohannesBGruber.svg?style=social&label=Follow%20%40JohannesBGruber)](https://twitter.com/JohannesBGruber)
+
 The philosophy of `paperboy` is that the package is a comprehensive
 collection of webscraping scripts for news media sites. Many data
 scientist and researchers write their own code when they have to
@@ -39,12 +44,12 @@ links to a media article to the main function, `deliver`:
 library(paperboy)
 df <- deliver("https://tinyurl.com/386e98k5")
 df
-#> # A tibble: 1 x 8
-#>   url   expanded_url domain datetime headline author text  misc            
-#>   <lgl> <lgl>        <lgl>  <lgl>    <lgl>    <lgl>  <lgl> <list>          
-#> 1 NA    NA           NA     NA       NA       NA     NA    <tibble [1 × 1]>
 ```
 
+| url                            | expanded\_url                                                                     | domain              | status | datetime | headline | author | text | misc |
+|:-------------------------------|:----------------------------------------------------------------------------------|:--------------------|:-------|:---------|:---------|:-------|:-----|:-----|
+| <https://tinyurl.com/386e98k5> | <https://www.theguardian.com/tv-and-radio/2021/jul/12/should-marge-divorce-homer> | www.theguardian.com | NA     | NA       | NA       | NA     | NA   | 200  |
+
 The returned `data.frame` contains important meta information about the
 news items and their full text. Notice, that the function had no problem
 reading the link, even though it was shortened. `paperboy` is an
@@ -55,7 +60,6 @@ therefore often encounter this warning:
 deliver(url = "google.com")
 #> Warning in deliver.default(u, ...): No method for www.google.com yet. Url
 #> ignored.
-#> # A tibble: 0 x 0
 ```
 
 If you enter a vector of multiple URLs, the unsupported ones will be
@@ -67,27 +71,25 @@ column will be different from `200` and contain `NA`s.
 
 Every webscraper should retrieve a `tibble` with the following format:
 
-    #> # A tibble: 2 x 9
-    #>   url     expanded_url domain  status  datetime  headline author text  misc     
-    #>   <chr>   <chr>        <chr>   <chr>   <chr>     <chr>    <chr>  <chr> <chr>    
-    #> 1 charac… character    charac… integer as.POSIX… charact… chara… char… list     
-    #> 2 the or… the full url the do… http s… publicat… the hea… the a… the … all othe…
+| url                                 | expanded\_url | domain     | status           | datetime             | headline     | author     | text          | misc                                                                      |
+|:------------------------------------|:--------------|:-----------|:-----------------|:---------------------|:-------------|:-----------|:--------------|:--------------------------------------------------------------------------|
+| character                           | character     | character  | integer          | as.POSIXct           | character    | character  | character     | list                                                                      |
+| the original url fed to the scraper | the full url  | the domain | http status code | publication datetime | the headline | the author | the full text | all other information that can be consistently found on a specific outlet |
 
 Since some outlets will give you additional information, the `misc`
-column was included so these can be retained. If you have a scaper you
+column was included so these can be retained. If you have a scraper you
 want to contribute, look in the list below if it already exists. If it
 does not yet exist, you can become a co-author of this package by adding
 it via a pull request.
 
-# Available Scrapers
+## Available Scrapers
 
-    #> # A tibble: 4 x 3
-    #>   domain               status author            
-    #>   <chr>                <chr>  <chr>             
-    #> 1 theguardian.com      Broken Johannes B. Gruber
-    #> 2 huffingtonpost.co.uk Broken Johannes B. Gruber
-    #> 3 buzzfeed.com         Broken Johannes B. Gruber
-    #> 4 forbes.com           Broken Johannes B. Gruber
+| domain               | status | author             | note                                                 |
+|:---------------------|:-------|:-------------------|:-----------------------------------------------------|
+| theguardian.com      | Broken | Johannes B. Gruber | [\#1](https://github.com/JBGruber/paperboy/issues/1) |
+| huffingtonpost.co.uk | Broken | Johannes B. Gruber | [\#1](https://github.com/JBGruber/paperboy/issues/1) |
+| buzzfeed.com         | Broken | Johannes B. Gruber | [\#1](https://github.com/JBGruber/paperboy/issues/1) |
+| forbes.com           | Broken | Johannes B. Gruber | [\#1](https://github.com/JBGruber/paperboy/issues/1) |
 
 -   **Gold**: Runs without any issues
 -   **Silver**: Runs with some issues

diff --git a/codecov.yml b/codecov.yml
@@ -0,0 +1,14 @@
+comment: false
+
+coverage:
+  status:
+    project:
+      default:
+        target: auto
+        threshold: 1%
+        informational: true
+    patch:
+      default:
+        target: auto
+        threshold: 1%
+        informational: true