segfault cause 'memory not mapped' #322

Closed
amcox opened this issue Mar 14, 2014 · 15 comments

@amcox

amcox commented Mar 14, 2014

I am getting a segfault when trying to use dplyr to analyze some survey data.

Here is a minimal set of code to reproduce the error:

library(reshape2)
library(dplyr)

df <- data.frame(teacher = c('foo'),
  explains.different = c(4),
  explains.clearly = c(4),
  materials.place = c(2),
  kids.comfortable.sharing = c(3),
  respects.opinion = c(3),
  have.a.say = c(3),
  understand.mistakes = c(1),
  notes.on.work = c(4),
  other.classes = c(4),
  teacher.cares = c(4),
  pays.attention = c(4),
  bothering.me = c(4),
  stays.busy = c(3),
  students.respectful = c(4),
  classmates.behave = c(3),
  cultural.materials = c(4),
  good.job = c(NA)
)

id.cols <- c("teacher")
d <- melt(df, id.vars=id.cols, variable.name="question", value.name="response")

teacher.means <- d %.%
  group_by(question) %.%
  summarize(
    response.mean = mean(response, na.rm=T)
  ) %.%
  arrange(response.mean)

I get the following error message:

 *** caught segfault ***
address 0x60000eabfed8, cause 'memory not mapped'

Traceback:
 1: .Call("dplyr_arrange_impl", PACKAGE = "dplyr", data, args, dots)
 2: arrange_impl(.data, dots(...), environment())
 3: arrange.tbl_df(`__prev`, response.mean)
 4: arrange(`__prev`, response.mean)
 5: eval(expr, envir, enclos)
 6: eval(new_call, e)
 7: chain_q(list(substitute(x), substitute(y)), env = parent.frame())
 8: d %.% group_by(question) %.% summarize(response.mean = mean(response,     na.rm = T)) %.% arrange(response.mean)

I'm happy to help test anything needed.

@amcox
Author

amcox commented Mar 16, 2014

I installed 0.1.3 and it does not appear to fix the issue.

@kevinushey
Contributor

I can't replicate this with latest dplyr + latest Rcpp. Can you post your sessionInfo()? For me:

> sessionInfo()
R Under development (unstable) (2014-03-07 r65143)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.1.3       Rcpp_0.11.1       reshape2_1.3.0.99 devtools_1.4.1.99

loaded via a namespace (and not attached):
 [1] assertthat_0.1 digest_0.6.4   evaluate_0.5.1 formatR_0.10   httr_0.2      
 [6] knitr_1.5.15   memoise_0.1    parallel_3.1.0 plyr_1.8       RCurl_1.95-4.1
[11] stringr_0.6.2  tcltk_3.1.0    tools_3.1.0    whisker_0.3-2 

@amcox
Author

amcox commented Mar 16, 2014

This is from right before the dplyr call.

> sessionInfo()
R version 3.0.3 (2014-03-06)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.1.3    reshape2_1.2.2

loaded via a namespace (and not attached):
[1] assertthat_0.1 plyr_1.8.1     Rcpp_0.11.1    stringr_0.6.2  tools_3.0.3   

@kevinushey
Contributor

Is your version of R + all packages from CRAN? Or did you install one or more packages from source? It's possible that you're running into old vs. new compiler issues. Do you know whether you're compiling with llvm-g++4.2 or clang?

One thing you could try is reinstalling both Rcpp and dplyr from source. If you have up-to-date Apple command line tools installed, try creating a file in your home directory, ~/.R/Makevars, and placing in it:

CC=clang
CXX=clang++
PKG_CFLAGS=-g -O2
PKG_CXXFLAGS=-g -O2 -stdlib=libc++

and then, in R, running

install.packages("Rcpp", type="source")
install.packages("dplyr", type="source")

Note that if you go this route, you may have to reinstall any package depending on Rcpp.

If the problem is still reproducible, then I am stumped :)
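
For the compiler question above, one way to see which compilers an R installation is configured to use is to query `R CMD config` from within R. This is only a sketch: it reports the configured values for `CC` and `CXX` and says nothing about which compiler built an already-installed binary package.

```r
# Ask the running R installation which compilers it is configured to use
# when building packages from source. `R CMD config` is a standard R command;
# CC and CXX are the same variable names used in the Makevars example above.
r_bin <- file.path(R.home("bin"), "R")
cc  <- system2(r_bin, c("CMD", "config", "CC"),  stdout = TRUE)
cxx <- system2(r_bin, c("CMD", "config", "CXX"), stdout = TRUE)
cat("C compiler:  ", cc,  "\n")
cat("C++ compiler:", cxx, "\n")
```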

@kevinushey
Contributor

Ah, you're running Snow Leopard? That's a pretty old OS now and I don't think Apple distributes clang for it -- you might have to discard my previous advice on the Makevars.

@yanlinlin82

Hi Kevin,

I reproduced this crash on 64-bit Linux too. I tried to minimize the
crashing code and found that it first fails at an odd threshold: 17 rows in the data.frame.

$ R --vanilla

R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

filter, lag

The following objects are masked from ‘package:base’:

intersect, setdiff, setequal, union

sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.utf8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.utf8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.1.3

loaded via a namespace (and not attached):
[1] assertthat_0.1 Rcpp_0.11.1 tools_3.0.2

for (i in 2:17) {
  data.frame(a = 1:i, b = c(1:(i-1), NA)) %.% group_by(a) %.%
    summarize(b = mean(b, na.rm = TRUE)) %.% arrange(b)
  cat(i, " - OK\n")
}
2 - OK
3 - OK
4 - OK
5 - OK
6 - OK
7 - OK
8 - OK
9 - OK
10 - OK
11 - OK
12 - OK
13 - OK
14 - OK
15 - OK
16 - OK

*** caught segfault ***
address 0x1960a1d8, cause 'memory not mapped'

Traceback:
1: .Call("dplyr_arrange_impl", PACKAGE = "dplyr", data, args, dots)
2: arrange_impl(.data, dots(...), environment())
3: arrange.tbl_df(`__prev`, b)
4: arrange(`__prev`, b)
5: eval(expr, envir, enclos)
6: eval(new_call, e)
7: chain_q(list(substitute(x), substitute(y)), env = parent.frame())
8: data.frame(a = 1:i, b = c(1:(i - 1), NA)) %.% group_by(a) %.%
summarize(b = mean(b, na.rm = TRUE)) %.% arrange(b)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 3
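
One thing the two reproductions in this thread have in common is that the column passed to arrange() contains a NaN: a group whose only value is NA has nothing left to average once na.rm = TRUE drops it. Whether that NaN is what triggers the crash is not confirmed anywhere in this thread, so treat the following as an observation about the input, not a diagnosis. A minimal base R sketch (no dplyr; the three question names are borrowed from the original report, the rest is illustrative):

```r
# Base R only. A cut-down version of the melted survey data: the
# "good.job" question has a single response, and it is NA.
d <- data.frame(
  question = c("explains.clearly", "stays.busy", "good.job"),
  response = c(4, 3, NA)
)

# One mean per question with NAs dropped, mirroring the summarize() step.
group_means <- sapply(split(d$response, d$question), mean, na.rm = TRUE)

# The all-NA group ends up as the mean of an empty vector, which is NaN.
print(group_means)
print(is.nan(group_means[["good.job"]]))

# Base R's order() places NaN last by default and sorts this input fine.
sorted_means <- group_means[order(group_means)]
print(names(sorted_means))
```

So the input that reaches arrange() is a numeric column with one NaN in it, which base R sorts without complaint.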


@kevinushey
Contributor

Strange, thanks. I can reproduce on a Linux VM (13.04 32bit Ubuntu, gcc 4.7.3), but not on my Mac with Apple clang.

@kevinushey
Contributor

Some output from gdb:

Program received signal SIGSEGV, Segmentation fault.
0xb5e2f2b7 in dplyr::OrderVectorVisitorImpl<14, true, Rcpp::Vector<14, Rcpp::PreserveStorage> >::equal (
    this=0x8a1cee8, i=16, j=148895952) at ../inst/include/dplyr/OrderVisitorImpl.h:19
19              return compare::equal_or_both_na( vec[i], vec[j] ) ;

(gdb) frame 1
#1  0x00007ffff1cda52d in dplyr::OrderVisitors_Compare::operator() (
    this=0x7fffffff8250, i=16, j=33688448)
    at ../inst/include/dplyr/Order.h:50
50                  if( ! obj.visitors[k]->equal(i,j) )

Something terrible has happened to the value of j:

(gdb) print i
$2 = 16
(gdb) print j
$3 = 148895952

@amcox
Author

amcox commented Mar 16, 2014

I'm running Mac OS 10.9.2, which does make the "10.8" a little weird. As far as I remember, I installed all my packages and the base R from CRAN. In fact I just downloaded the updated R version from the website a few days ago and installed that.

Trying the Makevars file method now.

@amcox
Author

amcox commented Mar 16, 2014

The instructions you gave earlier about installing from source with the Makevars file set to clang do solve the problem for me. Thanks for the help.

I'll leave the issue open since it appears that other people have been able to reproduce it, but I'm happy to close it if that's the custom.

@kevinushey
Contributor

Glad to hear it helped! You should leave it open since the bug is reproducible on Linux (for me and @yanlinlin82, anyhow)

@hadley hadley added the bug label Mar 17, 2014
@hadley hadley added this to the v0.2 milestone Mar 17, 2014
@romainfrancois
Member

@kevinushey and @yanlinlin82, can you still reproduce this?

@yanlinlin82

Yes, so far, without having changed anything. Which version should I update to, and how should I update the package?

I tried to update 'dplyr' with 'devtools' but failed like this:

devtools::install_github("dplyr")
Installing github repo dplyr/master from hadley
Downloading dplyr.zip from https://github.com/hadley/dplyr/archive/master.zip
Installing package from /tmp/RtmpFZsVAf/dplyr.zip
arguments 'minimized' and 'invisible' are for Windows only
Installing dplyr
'/usr/lib64/R/bin/R' --vanilla CMD build
'/tmp/RtmpFZsVAf/devtools73d22c5008ba/dplyr-master' --no-manual
--no-resave-data

  • checking for file '/tmp/RtmpFZsVAf/devtools73d22c5008ba/dplyr-master/DESCRIPTION' ... OK
  • preparing 'dplyr':
  • checking DESCRIPTION meta-information ... OK
  • cleaning src
  • installing the package to build vignettes
  • creating vignettes ... ERROR

Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

filter, lag

The following objects are masked from 'package:base':

intersect, setdiff, setequal, union

geom_smooth: method="auto" and size of largest group is >=1000, so using gam with formula: y ~ s(x, bs = "cs"). Use 'method = x' to change the smoothing method.
Warning: Removed 1 rows containing missing values (stat_smooth).
Warning: Removed 1 rows containing missing values (geom_point).
Quitting from lines 6-9 (window-functions.Rmd)
Error: processing vignette 'window-functions.Rmd' failed with diagnostics:
there is no package called 'Lahman'
Execution halted
Error: Command failed (1)

@hadley
Member

hadley commented Mar 20, 2014

Try devtools::install_github("hadley/dplyr", build_vignettes = FALSE)

@yanlinlin82

Thanks. It works now. No crash on my 64-bit Linux any more, with either my
code or @amcox's.


@hadley hadley closed this as completed Mar 20, 2014
@lock lock bot locked as resolved and limited conversation to collaborators Jun 10, 2018