Automatic detection of dec=',' in Europe #2431
It would be enough, if |
+1 for auto- |
pls. |
|
It sounds like a different issue than the one here |
Any sample files for this issue? |
fread("a;b\n1,5;2,5", sep=";", dec=",") |
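For reference, a minimal sketch of what the toy call above does when `dec` is set by hand. It only assumes that data.table is installed; what happens when `dec` is left at its default may differ between data.table versions, so it is not shown here.

```r
library(data.table)

# Toy input from the comment above: ';' separates fields, ',' is the decimal mark.
csv_text <- "a;b\n1,5;2,5"

# Setting dec = "," explicitly makes fread parse both fields as numbers.
comma_read <- fread(text = csv_text, sep = ";", dec = ",")
str(comma_read)
```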
A real-world example would be better 😄 I saw a 🇫🇷 government website using |
AFAIR this is how Excel produces CSV files in France and Poland. |
@cderv @dmpe @thohan88 @GabijaSakalyte @Boyoron @IndreSakalauskaite @Amygdalae @AndriusJasinevicius @dvaitkus @Katazyna-Stankevic @labutytegreta @rasainsodaite @ievajuozapaityte @gertrudam @RPrakapaite @Grazvile @bugampo @raugulis @Ignnn @iurbon @EvitaJ @rutele13 @jstonkus @pociuteagne @Kaamile @LinaAnu @zyginta @evelina11101 @silvimi @Auguste11 @1075353 @Andrealek @esadausk @vaiiva @supermenas @ramintares @viktorija-romovaite @egle-lele @RokasStat @ema-malinauskaite @1611003 @zyle1 @tokotrienoliai @DanasKl @danielius-mockus @Gabriele-gif @domasrupkus @GerdaSkin @emyliuxe @GegznaV @clarkdk @s-fleck Sorry for the wide ping. I have a PR addressing this issue in #4482 -- it would be great if anyone could provide some "real world" sample CSVs rather than testing on my toy examples. Thanks in advance! |
@MichaelChirico, here are some examples of data: data.zip. Inside the ZIP:
|
Wouldn't it be more logical to automatically choose |
AFAIR Excel uses |
Thanks a bunch for the data sets Vilmantas.
|
Remember not to try to handle every possible input. For example
It sounds like fread would need to skip the first 6 lines, read the seventh, skip the 8th, and read the rest. Handling that is doable, but it imposes maintenance overhead, can introduce new bugs, etc. It might be better to provide a more general interface where skip can be a vector, so users need to understand what is wrong with their files, and then just |
Yea I don't think there's an automatic way on this one that's not a fragile house of cards to support. |
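Until something like a vector-valued skip exists, the manual route discussed above (drop the unwanted lines yourself, then parse the rest) can be sketched as follows. The file content and the line positions in `drop` are made up for illustration; in practice `raw_lines` would come from `readLines()` on the real file.

```r
library(data.table)

# Hypothetical file layout: 6 junk lines, header on line 7,
# a units row on line 8, then the data.
raw_lines <- c(
  paste("metadata line", 1:6),
  "a;b",
  "(kg);(m)",
  "1,5;2,5",
  "3,25;4,75"
)

# Line numbers to discard; the user decides these after inspecting the file.
drop <- c(1:6, 8)
kept <- raw_lines[-drop]

# Parse the remaining lines, setting sep and dec explicitly.
dt <- fread(text = kept, sep = ";", dec = ",")
```

This keeps the knowledge of what is wrong with the file in user code rather than in fread itself.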
However, if we return to the original issue When could one expect these changes to be on CRAN? |
@GegznaV probably not very soon. Note that we provide Windows binaries, so Rtools/compilation is not needed. If you are on R 3.6 you can just install.packages("data.table", repos="https://Rdatatable.gitlab.io/data.table", type="win.binary") If you are on another version you can try install.packages("https://rdatatable.gitlab.io/data.table/bin/windows/contrib/3.6/data.table_1.12.9.zip", repos=NULL) Note that those 3.6 binaries will soon move to 4.0. |
OK, can I expect the new version of data.table on CRAN somewhere around mid-August (before the new school year in September)? Or are your dates even further out? |
We don't have any fixed release dates. The next version on CRAN might end up being just a patch release without new features like this. |
I'm not sure what `base` and `readr` do in this regard, but currently in `fread`, `dec=='.'` by default and needs manually setting to `','` in Europe for numerics with comma as the decimal separator. It could instead be detected automatically like `sep` already is. Please +1 this issue if you'd like this.

Further, `dec` could be automatically detected per-column for files where some numeric columns use `','` and other columns use `'.'`. But does anyone need that?
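To make the per-column idea concrete, here is a rough base R sketch of one possible heuristic. `guess_dec` is a hypothetical helper, not part of data.table and not a description of how fread works: it counts fields that look like a number with a comma versus a dot as the decimal mark. It would be fooled by thousands separators such as `1,234`, so treat it as an illustration only.

```r
# Hypothetical helper: guess the decimal mark for one column of raw text fields.
guess_dec <- function(fields) {
  fields <- fields[!is.na(fields) & nzchar(fields)]
  n_comma <- sum(grepl("^[+-]?[0-9]+,[0-9]+$", fields))
  n_dot   <- sum(grepl("^[+-]?[0-9]+\\.[0-9]+$", fields))
  # Fall back to "." when there is no evidence for ",".
  if (n_comma > n_dot) "," else "."
}

# Read every field as text first, so nothing is converted prematurely.
raw <- read.table(
  text = "a;b;c\n1,5;2.5;x\n3,25;4.75;y",
  sep = ";", header = TRUE, colClasses = "character"
)

# One guess per column: "a" uses ",", "b" uses ".", "c" is not numeric.
dec_by_col <- vapply(raw, guess_dec, character(1))
print(dec_by_col)
```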