read_sas() failure examples on compressed data #263

Closed
ajdamico opened this issue Jan 15, 2017 · 5 comments

Comments

@ajdamico
Contributor

hi, not sure if files like these should be supported? i didn't find anything in the haven or ReadStat docs saying compressed .sas7bdat files are not allowed. thanks

tf <- tempfile()
download.file( "http://www2.census.gov/programs-surveys/ahs/2002/AHS_2002_Value_Label_Package.zip" , tf , mode = 'wb' )
z <- unzip( tf )
sas_file <- grep( "\\.sas7bdat$" , z , value = TRUE )
haven::read_sas( sas_file )
# sas7bdat::read.sas7bdat(sas_file)
# Error in sas7bdat::read.sas7bdat(sas_file) : 
  # file contains compressed data


tf <- tempfile()
download.file( "http://www2.census.gov/programs-surveys/ahs/2004/AHS_2004_Value_Labels_Package.zip" , tf , mode = 'wb' )
z <- unzip( tf )
sas_file <- grep( "\\.sas7bdat$" , z , value = TRUE )
haven::read_sas( sas_file )
# sas7bdat::read.sas7bdat(sas_file)
# Error in sas7bdat::read.sas7bdat(sas_file) : 
  # file contains compressed data
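
To compare readers on the same file without the script stopping at the first error, something like the following might help. This is only a sketch: `try_reader` is a hypothetical helper, not part of haven or sas7bdat, and it assumes nothing beyond base R's `tryCatch`.

```r
# Hypothetical helper (not from haven or sas7bdat): call a reader function on a
# path and return either the data or the error message instead of aborting.
try_reader <- function(reader, path) {
  tryCatch(
    list(ok = TRUE, result = reader(path), message = NA_character_),
    error = function(e) {
      # Keep the condition message so failures can be compared across readers
      list(ok = FALSE, result = NULL, message = conditionMessage(e))
    }
  )
}

# Usage with the files above (assumes `sas_file` exists and the packages are
# installed; left commented out because it needs a download):
# outcome <- try_reader(haven::read_sas, sas_file)
# outcome$ok
# outcome$message
```

Passing the reader in as a function keeps the helper independent of any one package, so the same call works for `haven::read_sas` and `sas7bdat::read.sas7bdat`.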
@BioStatMatt
Contributor

BioStatMatt commented Jan 15, 2017 via email

@ajdamico
Contributor Author

thanks, but it also fails for these files

devtools::install_github("biostatmatt/sas7bdat.parso")
tf <- tempfile()
download.file( "http://www2.census.gov/programs-surveys/ahs/2002/AHS_2002_Value_Label_Package.zip" , tf , mode = 'wb' )
z <- unzip( tf , exdir = tempdir() )
sas_file <- grep( "\\.sas7bdat$" , z , value = TRUE )
sas7bdat.parso::read.sas7bdat.parso( sas_file )
#Error in .jcall(sps, "S", "s7b2csv", s7bfile, csvfile) : 
#  java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

@BioStatMatt
Contributor

BioStatMatt commented Jan 15, 2017 via email

@rogerjdeangelis

I want to thank Evan and Matt for opening up SAS datasets for R programmers.

Matt, your documentation on SAS datasets has opened my eyes, and haven has added speed and flexibility.

The code below reads and writes compressed SAS datasets with free
commercial software in combination with open source R (haven) or Python.
Unfortunately, WPS does not create a SAS dataset within the R script.

It would be wonderful if sas7bdat or haven provided the missing piece,
a capability to write SAS datasets.
The free edition of WPS does not limit the size of the SAS dataset
coming from R.

utl_compressed_r.zip

Converting CSV text numbers to double floats is very expensive, so
building a binary SAS dataset has major advantages. I would initially
support only 8-byte floats and, of course, character variables.

@hadley
Member

hadley commented Jan 25, 2017

Follow progress at WizardMac/ReadStat#21 - this will be added to haven soon after it's added to ReadStat.

@hadley hadley closed this as completed Jan 25, 2017
@lock lock bot locked and limited conversation to collaborators Jun 26, 2018
4 participants