-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
196 lines (138 loc) · 4.87 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# kewr
<!-- badges: start -->
[![R build status](https://github.com/barnabywalker/kewr/workflows/R-CMD-check/badge.svg)](https://github.com/barnabywalker/kewr/actions)
<!-- badges: end -->
An R package to access data from RGB Kew's APIs.
## Overview
kewr is meant to make accessing data from one of RGB Kew easier and to provide a consistent interface their public APIs.
This package should cover:
- [x] [World Checklist of Vascular Plants](https://wcvp.science.kew.org/)
- [x] [Plants of the World Online](http://powo.science.kew.org/)
- [x] [International Plant Names Index](https://www.ipni.org/)
- [x] [Kew Names Matching Service](http://namematch.science.kew.org/)
- [x] [Kew's Tree of Life](https://treeoflife.kew.org)
- [x] [Kew Reconciliation Service](http://data1.kew.org/reconciliation/about/IpniName)
New sources will be added as they come up.
## Installation
kewr is not on CRAN yet but you can install the latest development version from GitHub:
``` r
# install.packages("devtools")
devtools::install_github("barnabywalker/kewr")
```
## Usage
Functions in this package all start with a prefix specifying what action you want to perform and a suffix referring to the resource.
Four of the resources (POWO, WCVP, IPNI, and ToL) are databases storing flora, taxonomic, nomenclatural, or genetic information. These three resources all have a `search_*` and `lookup_*`.
### Retrieving records
The `lookup_` functions can be used to retrieve a particular record by its unique IPNI ID:
``` r
lookup_powo("320035-2")
lookup_wcvp("320035-2")
lookup_ipni("320035-2")
```
IPNI contains records for authors and publications, which can also be retrieved using the `lookup_ipni` function:
``` r
lookup_ipni("20885-1", type="author")
lookup_ipni("987-2", type="publication")
```
The ToL uses its own ID system. These IDs can be found by first searching the database.
``` r
lookup_tol("2717")
```
### Searching databases
All four of these databases can be searched as well:
``` r
search_powo("Poa annua")
search_wcvp("Poa annua")
search_ipni("Poa annua")
search_tol("Poa annua")
```
And all, except the ToL, use filters and keywords for more advanced searches:
``` r
search_powo(list(genus="Poa", distribution="Madagascar"),
filters=c("accepted", "species"))
search_wcvp(list(genus="Poa"), filters=c("accepted", "species"))
search_ipni(list(genus="Poa", published=1920),
filters=c("species"))
```
The number of search results returned are determined by the `limit` keyword:
```r
search_powo(list(genus="Poa"), limit=20)
search_wcvp(list(genus="Poa"), limit=20)
search_ipni(list(genus="Poa"), limit=20)
search_tol("Poa", limit=20)
```
The next page for a set of search results can be requested using the `request_next` function:
```r
results <- search_powo(list(genus="Poa"))
request_next(results)
```
### Loading data from ToL
Tree and gene data can be loaded directly from ToL into R.
For instance, you can load the whole Tree of Life.
``` r
load_tol()
```
Or a gene tree for a particular gene.
``` r
gene_info <- lookup_tol("51", type="gene")
load_tol(gene_info$tree_file_url)
```
Or a FASTA file for a specimen.
``` r
specimen_info <- lookup_tol("1296")
load_tol(specimen_info$fasta_file_url)
```
### Downloading from the ToL
The corresponding files can also be downloaded for use later or in other programmes.
``` r
specimen_info <- lookup_tol("1296")
download_tol(specimen_info$fasta_file_url)
```
### Downloading the WCVP
The whole of WCVP can be download to a directory using:
``` r
download_wcvp()
```
### Matching names
The KNMS resource is only used for matching names to records in POWO/WCVP:
```r
match_knms(c("Poa annua", "Magnolia grandifolia", "Bulbophyllum sp."))
```
Single names can also be matched to IPNI using the KRS resources.
``` r
match_krs("Poa annua")
```
KRS is slower for matching many names, as a request needs to be made for each one.
But it has the advantage of allowing more complex matching:
```r
match_krs(list(genus="Solanum", species="sanchez-vegae", author="S.Knapp"))
```
### Tidying results
Each function in this package returns an object that stores the original response as well as the content of the response parsed into a list. This is to give the user as much flexibility as possible and to make debugging things a bit easier.
But this can be hard to use, so all the results objects can be tidied as a `tibble`:
``` r
results <- search_powo("Poa annua")
tidy(results)
```
## Citing
You can get information about how to cite `kewr` by using:
```r
citation("kewr")
```
You can also get the citation to use for each data service using the different results objects:
```
r <- search_wcvp("Poa")
kew_citation(r)
```