Provides access, download and archiving tools for Copernicus marine datasets using the R language. This package was developed primarily around the daily ocean physics forecast, but with little or no modification it can work with other products.
In 2023/2024 Copernicus migrated to a new service model; learn more here. This migration introduced the Copernicus Marine Toolbox, which provides a Python API and a command-line interface (CLI). This package leverages the latter.
The toolbox is under active development, so if you are having trouble (as we sometimes do!) try reinstalling. We have some notes here.
Copernicus serves many data resources, so finding what you want can be a challenge. Check out the new Marine Data Store, and check out the listing here.
Like data offerings from OBPG, Copernicus strives to provide consistent dataset identifiers that are easily decoded programmatically (and, with practice, by eye). To download programmatically you must have the datasetID in hand. Learn more about Copernicus nomenclature rules here.
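As a rough, unofficial illustration of that nomenclature (consult the rules linked above for the authoritative definitions), the datasetID used later in this README decomposes like this when split on underscores.

id = "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m"
strsplit(id, "_")[[1]]
# "cmems"     producer (Copernicus Marine)
# "mod"       model output
# "glo"       global domain
# "phy-cur"   physics, currents
# "anfc"      analysis and forecast
# "0.083deg"  grid resolution
# "P1D-m"     one-day period, mean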
The Copernicus Marine Toolbox command-line application, copernicusmarine, provides two primary methods for downloading data: get and subset. get is not well documented, but subset does what it implies: subsetting resources by variable, spatial bounding box and time. This package only supports subset.
remotes::install_github("BigelowLab/twinkle")
remotes::install_github("BigelowLab/copernicus")
You can preconfigure a credentials file (required) and a path definition file (optional) to streamline accessing and storing data.
You must have credentials to access Copernicus holdings - if you don’t have them now please request access here.
Once you have them, you need to run the copernicusmarine application once to store them.
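One way to do this, assuming the toolbox is installed and on your PATH, is to call its login command, either from a terminal or from R; you will be prompted for your username and password.

# Store your Copernicus Marine credentials with the toolbox CLI
# (assumes the copernicusmarine executable is on your PATH)
system2("copernicusmarine", "login")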
If you plan to use our directory-driven database storage system then you should set the root path for the data directory. You can always change or override it, but, like the credentials, storing this path in a hidden file will ease subsequent use of the functions. We don't actually run this in the README, but you can copy and paste it into R. Again, replace the path with one suiting your own situation.
set_root_path("/the/path/to/copernicus/data")
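Once the root path is stored, copernicus_path() (used later in this README) builds paths beneath it; for example, the following composes the path to a "gom" subdirectory without creating anything on disk.

# Compose a path under the stored root (nothing is created on disk)
copernicus_path("gom")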
To fetch data we'll focus on the ocean physics daily forecast, which serves daily mean sea surface currents. We'll define a date range and a bounding box that covers the Gulf of Maine (gom).
suppressPackageStartupMessages({
library(stars)
library(copernicus)
library(dplyr)
})
dataset_id = "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m"
vars = c("uo","vo")
bb = c(xmin = -72, xmax = -63, ymin = 39, ymax = 46)
depth = c(0, 10) # just the top 10 meters
time = c(0, 9) + Sys.Date() # today through nine days ahead (a 10-day window)
x = fetch_copernicus(dataset_id = dataset_id,
vars = vars,
bb = bb,
time = time,
cleanup = FALSE,
verbose = TRUE)
## copernicusmarine subset -i cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m --log-level QUIET -v uo -v vo -x -72.00 -X -63.00 -y 39.00 -Y 46.00 -z 0.00 -Z 10.00 -t "2024-11-15" -T "2024-11-24" --overwrite -f output.nc -o .
x
## stars object with 4 dimensions and 2 attributes
## attribute(s):
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## uo -0.6483049 -0.10751992 -0.03664098 -0.02019374 0.012626727 1.486507 216320
## vo -0.5238913 -0.09847574 -0.04307169 -0.02811467 0.004602717 1.044959 216320
## dimension(s):
## from to offset delta refsys
## x 1 109 -72.04 0.08333 WGS 84
## y 1 85 46.04 -0.08333 WGS 84
## depth 1 8 NA NA NA
## time 1 10 2024-11-15 UTC 1 days POSIXct
## values x/y
## x NULL [x]
## y NULL [y]
## depth [0.494025,1.541375),...,[9.572997,11.21643)
## time NULL
We can plot a subset of these using base graphics and the [ function. stars objects are indexed first by the attribute (variable) followed by the dimensions. In this case the index order is [attribute, x, y, depth, time], or [attribute, x, y, time] for single-depth objects.
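For example, to extract just the first attribute (uo) at the first depth and first time, keeping all of x and y:

# first attribute (uo), all x, all y, first depth, first time
x[1, , , 1, 1]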
itime = 1
date = format(st_get_dimension_values(x, "time")[itime], "%Y-%m-%d")
plot(x[1, , , , itime], key.lab = paste("uo for each depth on", date))
If you want to plot all of the times for a given depth you can use slice.
idepth = 1
depth = round(st_get_dimension_values(x, 'depth')[idepth],2)
plot(slice(x, "depth", idepth),
key.lab = paste0("uo at depth ", depth, "m"))
You can download and archive data using the database functionality provided in this package. There are a number of ways to manage suites of data; this is just one fairly lightweight method.
Here, we store data in a directory tree that starts with region at its root. Within the region we divide by year and monthday. Within each monthday directory there are one or more files uniquely named to completely identify the datasetid, time, depth, period, variable and treatment. Each file contains one raster for one variable at one depth and one time.
Here is an example of a file name that follows this pattern.
datasetid__time_depth_period_variable_treatment.tif
gom/2023/1205/cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m__2023-12-05T000000_0.49_day_uo_raw.tif
Here you can see that the datasetid and the rest of the identifiers are separated by a double underscore to aid in programmatic parsing. Time includes the hour in case we ever want to download the 6-hour datasets. The depth is rounded to three significant digits by default, but that is configurable. The treatment, raw, in this case means the values are as downloaded; however, if you ever wanted to roll your own running mean (say an 8-day rolling mean) or some other statistic, this naming system provides the flexibility you will need.
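As a minimal sketch (not a package function), a file name like the one above can be split back into its parts using the double underscore; in practice the same information is stored for you in the database table shown below.

# Sketch only: split a file name back into its identifiers
f = "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m__2023-12-05T000000_0.49_day_uo_raw.tif"
parts = strsplit(sub("\\.tif$", "", basename(f)), "__", fixed = TRUE)[[1]]
id   = parts[1]                                      # datasetid
rest = strsplit(parts[2], "_", fixed = TRUE)[[1]]    # time, depth, period, variable, treatment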
NOTE Don’t forget to set your root data path.
First we define an output path for the Gulf of Maine data. The path isn't created until data is written to it. Then we simply call archive_copernicus() to automatically write individual GeoTIFF files into a database structure. Note that we supply an identifier that records the provenance of the data. We receive, in turn, a table that serves as a database.
path = copernicus_path("gom")
db = archive_copernicus(x, path = path, id = 'cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m') |>
dplyr::glimpse()
## Rows: 160
## Columns: 7
## $ id <chr> "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m", "cmems_mod_glo_…
## $ date <date> 2024-11-15, 2024-11-15, 2024-11-16, 2024-11-16, 2024-11-17,…
## $ time <chr> "000000", "000000", "000000", "000000", "000000", "000000", …
## $ depth <chr> "0.494", "0.494", "0.494", "0.494", "0.494", "0.494", "0.494…
## $ period <chr> "day", "day", "day", "day", "day", "day", "day", "day", "day…
## $ variable <chr> "uo", "vo", "uo", "vo", "uo", "vo", "uo", "vo", "uo", "vo", …
## $ treatment <chr> "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw…
Since this is the first time you have downloaded and archived data, be sure to save the database.
write_database(db, path)
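On later downloads you will likely want to add new records to the existing database rather than replace it; a simple approach (assuming write_database() writes whatever table it is given) is to bind the old and new tables and drop duplicates. Here x_new stands in for a freshly fetched stars object.

# Append newly archived records to an existing database (sketch)
old_db = copernicus::read_database(path)
new_db = archive_copernicus(x_new, path = path,
                            id = "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m")
dplyr::bind_rows(old_db, new_db) |>
  dplyr::distinct() |>
  write_database(path)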
The database is very lightweight and easy to filter for just the records you might need. Note that depth is a character data type; this gives you the flexibility to define depth as 'surface' or '50-75' or something like that.
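For example, to keep only the raw eastward velocity (uo) records at the shallowest depth, filter on the character columns directly:

dplyr::filter(db, variable == "uo", depth == "0.494", treatment == "raw")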
Let’s walk through reading the database, filtering it for a subset, reading the files and finally displaying.
db <- copernicus::read_database(path) |>
dplyr::filter(dplyr::between(date, Sys.Date(), Sys.Date()+1)) |>
dplyr::glimpse()
## Rows: 32
## Columns: 7
## $ id <chr> "cmems_mod_glo_phy-cur_anfc_0.083deg_P1D-m", "cmems_mod_glo_…
## $ date <date> 2024-11-15, 2024-11-15, 2024-11-16, 2024-11-16, 2024-11-15,…
## $ time <chr> "000000", "000000", "000000", "000000", "000000", "000000", …
## $ depth <chr> "0.494", "0.494", "0.494", "0.494", "1.541", "1.541", "1.541…
## $ period <chr> "day", "day", "day", "day", "day", "day", "day", "day", "day…
## $ variable <chr> "uo", "vo", "uo", "vo", "uo", "vo", "uo", "vo", "uo", "vo", …
## $ treatment <chr> "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw", "raw…
Now we can read in the files.
s = read_copernicus(db, path)
s
## stars object with 4 dimensions and 2 attributes
## attribute(s):
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## uo -0.6483049 -0.1503338 -0.05121171 -0.03224251 0.013144923 1.4865069 43264
## vo -0.4451101 -0.1224391 -0.05912938 -0.04963893 -0.007916083 0.9898032 43264
## dimension(s):
## from to offset delta refsys point
## x 1 109 -72.04 0.08333 WGS 84 FALSE
## y 1 85 46.04 -0.08333 WGS 84 FALSE
## depth 1 8 NA NA NA NA
## time 1 2 2024-11-15 UTC 1 days POSIXct NA
## values x/y
## x NULL [x]
## y NULL [y]
## depth [0.494,1.541),...,[9.573,11.216)
## time NULL