fread "improperly" parses MS Excel Mac-style .csv #4186

Open
MPagel opened this issue Jan 17, 2020 · 0 comments

The root cause is really a bug in Microsoft Excel, but it manifests as a "bug" in fread.
It is related to issues #2542, #2248, and #1183.

# Summary of Issue
When a file is saved as "CSV (Macintosh)" in MS Excel (Office 2016) on a Windows 10 PC, Excel writes \r after every line EXCEPT the final line, which ends with \r\n. fread then reads the whole file as a single line, so the fields of every row end up as columns of one row (5 columns in the example below).
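To check whether a given file has this mixed line-ending pattern, the raw bytes can be counted directly. The following is a minimal sketch in base R; `count_line_endings` is a hypothetical helper, not part of data.table, and the test file is written by hand on the assumption that its bytes match what is described above.

```r
# Write a file with the byte pattern described in this issue (an assumption):
# a lone \r between lines and \r\n after the final line.
tmp <- tempfile(fileext = ".csv")
writeBin(charToRaw("Col1,Col2\r1,data1\r2,data2\r3,data3\r\n"), tmp)

# Hypothetical helper: count lone CR, CRLF pairs, and lone LF in a file.
count_line_endings <- function(path) {
  b <- readBin(path, what = "raw", n = file.size(path))
  is_cr <- b == as.raw(0x0d)
  is_lf <- b == as.raw(0x0a)
  # A CR belongs to a CRLF pair when the byte right after it is LF.
  next_is_lf <- c(is_lf[-1], FALSE)
  crlf <- sum(is_cr & next_is_lf)
  c(cr_only = sum(is_cr) - crlf,
    crlf    = crlf,
    lf_only = sum(is_lf) - crlf)
}

counts <- count_line_endings(tmp)
print(counts)
```

A file with a nonzero `cr_only` count and exactly one `crlf` at the end matches the pattern reported here.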

# Minimal reproducible example

You may be able to skip many of these file-creation steps. The critical step is taking a well-formed csv or xls file and saving it as "CSV (Macintosh)".

Create the following in a notepad application and save as input.csv

Col1,Col2
1,data1
2,data2
3,data3

Open input.csv in Excel (2016 used...but likely applies to other versions)
File->Save As->Save as type: Excel Workbook (*.xlsx)->intermed.xlsx
Close Excel

Open intermed.xlsx in Excel
File->Save As->Save as type: CSV (Macintosh) (*.csv)->final.csv
Close Excel

In R...

fread("final.csv")

3,data3ata.table (0 rows and 5 cols): Col1,Col2

fredT<-fread("final.csv",header=T)
dim(fredT)

[1] 0 5

fredF <-fread("final.csv",header=F)
dim(fredF)

[1] 1 5

fredF

     V1      V2       V3       V4    V5
1: Col1 Col2\r1 data1\r2 data2\r3 data3
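One possible workaround, until the file is re-exported or fread handles this case, is to normalize the line endings before parsing. The sketch below is not a fix in fread itself: `normalize_newlines` is a hypothetical helper, the input bytes are written by hand to mimic final.csv, and the file is assumed to be in an ASCII-compatible encoding with no embedded NUL bytes. The cleaned text is passed to fread through its `text` argument.

```r
# Hypothetical helper: read a file and rewrite \r\n and lone \r as \n.
normalize_newlines <- function(path) {
  raw_bytes <- readBin(path, what = "raw", n = file.size(path))
  txt <- rawToChar(raw_bytes)
  # Collapse CRLF first so it does not turn into two newlines.
  txt <- gsub("\r\n", "\n", txt, fixed = TRUE)
  gsub("\r", "\n", txt, fixed = TRUE)
}

# Stand-in for final.csv, assuming the bytes described in this issue.
tmp <- tempfile(fileext = ".csv")
writeBin(charToRaw("Col1,Col2\r1,data1\r2,data2\r3,data3\r\n"), tmp)

fixed <- normalize_newlines(tmp)

# Parse the cleaned text if data.table is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::fread(text = fixed)
  print(dim(dt))
}
```

This reads the whole file into memory, so it is only a reasonable stopgap for files of modest size.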

# Output of sessionInfo()
sessioninfo::platform_info()

setting  value
version  R version 3.5.0 (2018-04-23)
os       Windows 8.1 x64
system   x86_64, mingw32
ui       RStudio
language (EN)
collate  English_United States.1252
ctype    English_United States.1252
tz       America/Los_Angeles
date     2020-01-17

packageVersion("data.table")

[1] ‘1.12.2’

MPagel changed the title from `fread messed up by MS Excel mac .csv` to `fread "improperly" parses MS Excel "mac .csv` on Jan 22, 2020
MPagel changed the title from `fread "improperly" parses MS Excel "mac .csv` to `fread "improperly" parses MS Excel Mac-style .csv` on Jan 22, 2020