-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Leverage on giscoR #264
Leverage on giscoR #264
Conversation
Excellent work, thank you very much! I tested this version and everything seems to be working smoothly. I was left thinking about issue #240 since now there's a difference between datasets read from the cache and datasets downloaded from GISCO. Like this: Downloaded from GISCO by using resolution "20":
Read from cache by using resolution "60":
So the difference being mainly in the position of the "geo" column. I think having "geometry" column as the last column of the "sf" "data.frame" object would be desirable. Another question is whether it's actually needed, but then again someone needed it in the first place since it was added there. Does @muuankarski have any input on this? |
I wasn’t aware of #240, let me have a look |
Hi @pitkant After 3a6e48f the output of the spatial objects are consistent in the number, name and position of the columns. If I check for all years: years <- c('2003','2006','2010','2013','2016','2021')
allyears <- lapply(years, function(x){
tb <- suppressMessages(eurostat::get_eurostat_geospatial(country = "LU",
nuts_level = 0, resolution = 20,
update_cache = TRUE))
# Prepare summary
df <- data.frame(column = names(tb),
ncol = seq_len(ncol(tb)))
df <- tidyr::pivot_wider(df, names_from = column, values_from = ncol)
df$year <- x
df
})
dplyr::bind_rows(allyears) |>
dplyr::relocate(year, .before = 1) |>
knitr::kable(caption = "Number indicates the position of the column in the object")
Number indicates the position of the column in the object Created on 2023-07-31 with reprex v2.0.2 All the objects have the same columns in the same position. Note that, as mentioned in #240, for some years (i.e., from 2003 to 2013) there are columns that are not provided (i.e., eurostat::get_eurostat_geospatial(year = 2010) |>
dplyr::glimpse()
#> Extracting data using giscoR package, please report issues on https://github.com/rOpenGov/giscoR/issues
#> Rows: 1,920
#> Columns: 12
#> $ id <chr> "AT111", "AT112", "AT113", "AT121", "AT122", "AT123", "AT12…
#> $ LEVL_CODE <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,…
#> $ NUTS_ID <chr> "AT111", "AT112", "AT113", "AT121", "AT122", "AT123", "AT12…
#> $ CNTR_CODE <chr> "AT", "AT", "AT", "AT", "AT", "AT", "AT", "AT", "AT", "AT",…
#> $ NAME_LATN <chr> "Mittelburgenland", "Nordburgenland", "Südburgenland", "Mos…
#> $ NUTS_NAME <chr> "Mittelburgenland", "Nordburgenland", "Südburgenland", "Mos…
#> $ MOUNT_TYPE <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ URBN_TYPE <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ COAST_TYPE <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
#> $ FID <chr> "AT111", "AT112", "AT113", "AT121", "AT122", "AT123", "AT12…
#> $ geo <chr> "AT111", "AT112", "AT113", "AT121", "AT122", "AT123", "AT12…
#> $ geometry <MULTIPOLYGON [°]> MULTIPOLYGON (((16.648 47.4..., MULTIPOLYGON (… Created on 2023-07-31 with reprex v2.0.2 |
Thank you for the fixes! The output looks nice and consistent now. I'm still not sure about adding duplicated columns in the dataset for supposedly easier joins, especially if geo and FID are both duplicates, but I guess it doesn't add too much overhead either. I added a note about geo column having the "Questioning" status |
Thanks @pitkant , I consider this PR ready now. |
R/get_eurostat_geospatial.R
Outdated
if (!requireNamespace("sf")) { | ||
message("'sf' package is required for geospatial functionalities") | ||
return(invisible()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this PR was already merged I noticed a logical problem that originated here:
!requireNamespace("sf")
check is not run if use_local == TRUE
This causes problems later on in rows 265:267 (see another comment)
R/get_eurostat_geospatial.R
Outdated
sf_col <- attr(shp, "sf_column") | ||
# Remove geometry | ||
shp <- sf::st_drop_geometry(shp) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this PR was already merged I noticed a logical problem that originated here:
!requireNamespace("sf")
check is not run if use_local == TRUE
If sf is not installed then sf::st_drop_geometry(shp)
can't be run. Something like shp$geometry <- NULL
could do the same in this situation, I guess?
Hi @antagomir @pitkant
This PR is related with #230 and #263
Basically now
get_eurostat_geospatial
is a wrapper ofgiscoR::gisco_get_nuts
. I skimmed the function as much as I can so issues on that function would be addressed now in giscoR, reducing the burden of maintenance by eurostat package.I also reviewed the dependencies removing a bunch of them, updated the vignettes and the actions and deleted unused files in
docs/vignettes/revdep
.I checked the package and revdeps and all it’s fine. Summary of changes:
get_eurostat_geospatial()
now leverages ongiscoR::gisco_get_nuts()
fordownloading geospatial data:
"spdf"
output class soft-deprecated, it would return asf
object with a message.make_valid
parameter soft-deprecated....
to the function so additional parametes can be passed togiscoR::gisco_get_nuts()
.eurostat_geodata_60_2016
updated.