Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using DBIConnection with dplyr causes problems when using SQLite functions #3093

Closed
shntnu opened this issue Sep 10, 2017 · 4 comments
Closed

Comments

@shntnu
Copy link

shntnu commented Sep 10, 2017

dplyr docs mention that the modern best practice is to use tbl() directly on a DBIConnection, but that seems to be causing problems when using SQLite extension functions like log. Why?

Approach 1 does not work. This is the recommended approach as mentioned in
dplyr docs:

df <- data.frame(x = 10)

db1 <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

dplyr::copy_to(db1, df)

df1 <- dplyr::tbl(db1, "df")

df1 %>% dplyr::mutate(y = log(x)) %>% dplyr::collect()
#> Error in rsqlite_send_query(conn@ptr, statement) : no such function: log

Approach 2 works. This is the deprecated approach as mentioned in dplyr docs:

df <- data.frame(x = 10)

db2 <- dplyr::src_sqlite(":memory:", create = TRUE)

df2 <- dplyr::copy_to(db2, df)

df2 %>% dplyr::mutate(y = log(x)) %>% dplyr::collect()
#> # A tibble: 1 x 2
#>       x        y
#>   <dbl>    <dbl>
#> 1    10 2.302585

Here is the environment

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6
> packageVersion("dplyr")
[1] ‘0.7.2’
> packageVersion("DBI")
[1] ‘0.7’
> packageVersion("dbplyr")
[1] ‘1.1.0’
@JohnMount
Copy link

JohnMount commented Sep 10, 2017

I think your documentation reading is accurate, and I also see the same problem when re-running your example. So I think there is definitely an issue.

I think directly calling RSQLite::initExtension(db1) may be a work-around:

suppressPackageStartupMessages(library("dplyr"))
df <- data.frame(x = 10)
db1 <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
# print(src_sqlite)
RSQLite::initExtension(db1)
df1 <- dplyr::copy_to(db1, df)

df1 %>% dplyr::mutate(y = log(x)) 
#> # Source:   lazy query [?? x 2]
#> # Database: sqlite 3.19.3 [:memory:]
#>       x        y
#>   <dbl>    <dbl>
#> 1    10 2.302585

packageVersion("dplyr")
#> [1] '0.7.2.9000'
packageVersion("dbplyr")
#> [1] '1.1.0.9000'
packageVersion("DBI")
#> [1] '0.7'

I got the work-around by calling print(src_sqlite). I have seen mysterious warning messages about extensions being or not-being loaded when working with other dbplyr data sources, but haven't been able to capture a reproducible example.

@shntnu
Copy link
Author

shntnu commented Sep 10, 2017

Thanks! Indeed RSQLite::initExtension(db1) does the trick, at least for now.

@JohnMount
Copy link

@shntnu , You are very welcome. Finding this may save me some debugging. I think we are probably on the same page that we don't trust work-arounds to age well.

@hadley
Copy link
Member

hadley commented Oct 23, 2017

Duplicate of #3150, and moved to r-dbi/RSQLite#236.

@hadley hadley closed this as completed Oct 23, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Jun 7, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants