apply coerces to matrix, inane design decision #21
Comments
From a thread discussing why PHP has a left-associative ternary operator for inconceivable reasons. Given that the response to raising this issue on the R forums was 'this is correct behaviour', I guess we shouldn't complain about anything. There are no bugs.
library(plyr)
a = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE)
b = c('a', 'b', 'c', 'de', 'f', 'g')
c = c(1, 2, 3, 4, 5, 6)
d = c(0, 0, 0, 0, 0, 1)
wtf = data.frame(a, b, c, d)
foo.huh <- function(row) {
  if (row['a'] == T) { return('we win') }
  if (row['c'] < 5) { return('hooray') }
  if (row['d'] == 1) { return('a thing') }
  return('huh?')
}
plyr::adply(wtf, 1, .fun = foo.huh, .expand = TRUE, .id = NULL)
#>       a  b c d     V1
#> 1  TRUE  a 1 0 we win
#> 2 FALSE  b 2 0 hooray
#> 3  TRUE  c 3 0 we win
#> 4 FALSE de 4 0 hooray
#> 5  TRUE  f 5 0 we win
#> 6  TRUE  g 6 1 we win
Created on 2018-07-06 by the reprex package (v0.2.0).
I got the same result in R as ifly6 did, which is the one offered as the "more correct" result in Python (and then also offered via a plyr construction by Eluvias).
This whole rant seems to ignore the fact that
And the correct way to create a range that iterates over a sequence like rownames(df) is:
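The snippet that followed this comment did not survive here, so the following is not the commenter's code. As a minimal sketch, assuming the usual base R idiom was meant, `seq_len()` and `seq_along()` give an empty sequence for empty input, where `1:nrow(df)` counts down from 1 to 0. The data frame `df` below is hypothetical and exists only to show the edge case.

```r
# Hypothetical zero-row data frame, used only to illustrate the edge case.
df <- data.frame(x = numeric(0))

length(1:nrow(df))         # 1:0 counts down, so the sequence has two elements
length(seq_len(nrow(df)))  # seq_len(0) is empty

# A loop over seq_len() simply never runs when there are no rows.
visited <- 0
for (i in seq_len(nrow(df))) {
  visited <- visited + 1
}

# seq_along() does the same job for an existing vector such as rownames(df).
idx <- seq_along(rownames(df))
```

With `seq_len(nrow(df))` the loop body is skipped for an empty data frame, so no special-casing of the zero-row case is needed.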
Let's be honest. Apply is just broken for data frames. Defending it by saying that the user just doesn't understand the language, that the language is just fine, and the function is functioning correctly is like saying that your toolbox of misshapen tools where the hammer is just the curved end on both sides is 'just fine'.
The 'correct' way to do this in R apparently is just to write out a for loop. Fortunately for you, you can't just make a for loop iterate over rows, like `for row in df.iterrows()` in Pandas; you have to explicitly index them. And fortunately for you, you can't just make a range like `1:nrow(df)` (also, who made the stupid choice to call it `nrow` when `nrows` makes more sense, there being more than one row...) because if `nrow(df) == 0` then it returns a sequence (1, 0), which breaks your code when you try and run that. R is just built for robustness!
But if you're doing lots of manipulation with lists, so you're familiar with `sapply`, you can probably fix that issue by using `apply` with the proper functions, right? Wrong.
You get this.
Because R inexplicably decides that the best way to deal with data frames is to turn them all into data matrices first. So, here, the `a` column turns into `' TRUE'` and `'FALSE'`. Silently. Fantastic behaviour.
But in a reasonable and sensibly constructed system like Pandas, you can run the exact same thing, like this:
And get reasonable answers like these that follow. Look what is possible when you don't make stupid design decisions!
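The code blocks from the original post are missing here, so the following is not the author's example. It is a small sketch of the coercion being complained about, reusing the column values from the reprex in this thread: `apply` converts the data frame with `as.matrix` first, and because a character column is present the whole matrix becomes character. The variable names (`col_classes`, `class_seen_by_apply`, `class_seen_by_loop`) are made up for illustration. The exact strings the logical column turns into may differ between R versions, so only the classes are inspected.

```r
# Same columns as the reprex data frame in this thread.
wtf <- data.frame(
  a = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE),
  b = c("a", "b", "c", "de", "f", "g"),
  c = c(1, 2, 3, 4, 5, 6),
  d = c(0, 0, 0, 0, 0, 1)
)

# Column classes before any coercion: `a` is logical here.
col_classes <- sapply(wtf, class)

# apply() operates on as.matrix(wtf); with a non-numeric column present,
# every cell of that matrix is a character string.
m <- as.matrix(wtf)
class_seen_by_apply <- apply(wtf, 1, function(row) class(row))

# Indexing rows explicitly keeps each column's own type.
class_seen_by_loop <- character(nrow(wtf))
for (i in seq_len(nrow(wtf))) {
  row <- wtf[i, ]                      # a one-row data frame
  class_seen_by_loop[i] <- class(row$a)
}
```

This is also why the `plyr::adply` version in the comments behaves as expected: each row reaches the function as a one-row data frame and not as a character vector.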