added multinomial, ordinal and firth logistic regression #179
Awesome :-)
Looks fine, but has some rough edges. Namely:
- AIC/BIC is missing for the new models, is that intentional?
- Adding a term to the null model in multinomial model crashes.
- Adding a weight to any analysis crashes.
- Adding a term to the null model in firth regression crashes (but with a different error).
See attached .jasp file for reproducible examples
glm-bugs.jasp.zip
Otherwise the code looks ok, I am just a little confused about your .getWeightVariable(), which does not seem necessary to me (and especially the combination with eval()). What is the reason for doing this? It looks a little over-engineered to me.
R/glmCommonFunctions.R
Outdated
  nullModel <- stats::glm(nf,
                          family = familyLink,
                          data = dataset,
-                         weights = get(options$weights))
+                         weights = eval(.getWeightVariable(options$weights)))
Why is this necessary? Could you just pass the vector of values from the dataset?
Hi Simon, thanks for the quick review!
- Yes, the implementation has some rough edges. I haven't been able to load the module into JASP (see screenshot for error, any idea what happened?), so it wasn't easy for me to test it thoroughly. I'll work on the issues you helped find (thx!).
- The reason for using .getWeightVariable() is to make the code cleaner. It used to be two if conditions (e.g. evaluating whether options[["weight"]] is equal to ""), so there was quite some repetition of code (especially after adding the three new logistic regression variants).
- Passing the vector of values from the dataset would lead to an error. The default (no weight is added) is an empty string, which glm does not accept. If a weight variable is used, it has to be added to the glm function as a variable (instead of a string), hence eval(call(get("variable name"))). I tried to simplify the implementation, but this seems to be the only one that works.
> The reason for using .getWeightVariable() is to make the code cleaner
Yes that's fine, I was just wondering about the implementation.
> Passing the vector of values from the dataset would lead to an error. The default (no weight is added) is an empty string, which glm does not accept. If a weight variable is used, it has to be added to the glm function as a variable (instead of a string), hence eval(call(get("variable name"))).
Yes, but how about something like
weights <- if(options[["weights"]] == "") NULL else dataset[[options[["weights"]]]]
...
nullModel <- stats::glm(..., weights = weights)
That wouldn't work? :)
> I haven't been able to load the module into JASP (see screenshot for error, any idea what happened?), so it wasn't easy for me to test it thoroughly.
Hm, apart from the usual "check whether you use your personal GitHub PAT", "click Clear installed modules and packages and try again", or "which JASP version are you using?", I'm not sure. If nothing helps, perhaps it's something related to https://github.com/jasp-stats/INTERNAL-jasp/issues/2151 and needs to be fixed.
> weights <- if(options[["weights"]] == "") NULL else dataset[[options[["weights"]]]]
> ...
> nullModel <- stats::glm(..., weights = weights)
Unfortunately, this didn't work. When weights isn't NULL, the glm function would somehow look for a variable called weights and, of course, this variable doesn't exist. I tried some variations, same problem.
I think this probably has to do with how R is run behind JASP, because in a normal R session, I can use your suggested approach perfectly fine.
Hmm okay, I see the issue.
It seems to be due to the following lines in stats::glm:
mf <- match.call(expand.dots = FALSE)
m <- match(c("formula", "data", "subset", "weights", "na.action",
"etastart", "mustart", "offset"), names(mf), 0L)
mf <- mf[c(1L, m)]
mf$drop.unused.levels <- TRUE
mf[[1L]] <- quote(stats::model.frame)
mf <- eval(mf, parent.frame())
Here, model.frame() fails to find the appropriate object in the parent.frame().
What is kind of weird is that stats::lm does the same thing, but supplying weights as a numeric vector works in our linear regression without issues:
jaspRegression/R/regressionlinear.R
Line 842 in fcb6de3
fit <- stats::lm(formula, data = dataset, weights = weights, x = TRUE)
@vandenman do you have an idea what could be going wrong here? I guess the workaround by @fqixiang works fine, but it irks me that it does not just work.
I'm not sure. These match.call constructions that end with eval(mf, parent.frame()) seem a bit unreliable to me. BAS has a similar problem where an object cannot be found, but only when you pass the formula as an object and not as a literal (see merliseclyde/BAS#56).
Just fixed the issue in BAS (finally!!!) and the problem/solution is discussed nicely in https://stackoverflow.com/questions/61164404/call-to-weight-in-lm-within-function-doesnt-evaluate-properly/61164660#61164660?newreg=04a2b71c6da04693a5b172a54a4a43b0
The problem is that BAS, lm and glm assume that the weights are in the same environment as the formula.
Here is a simple fix with lm that is now in BAS on GitHub (not CRAN):
```r
data(UScrime, package = "MASS")
UScrime <- UScrime[, 1:5]

mylm = function(object) {
  # evaluate the stored formula, then rebind it to this function's environment
  modelform = as.formula(eval(object$call$formula, parent.frame()))
  environment(modelform) = environment()
  data = eval(object$call$data)
  weights = eval(object$call$weights)
  object = lm(formula = modelform,
              data = data,
              weights = weights)
  return(object)
}

# formula passed as a literal
crime.lm1 <- lm(formula = M ~ So + Ed + Po1 + Po2, data = UScrime)
tmp1 = mylm(crime.lm1)

# broken before: formula passed as an object
form = M ~ So + Ed + Po1 + Po2
crime.lm2 <- lm(formula = form, data = UScrime)
tmp = mylm(crime.lm2)
testthat::expect_equal(coef(tmp), coef(tmp1))
```
The key lines in the function are modelform = as.formula(eval(object$call$formula, parent.frame())) and environment(modelform) = environment(), which change the environment for the formula, data, and weights to the current environment.
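For readers who want to see the same mechanism with glm outside JASP, here is a minimal sketch. It assumes a hypothetical helper makeFormula() standing in for a formula builder such as .createGLMFormula(), and a hypothetical local variable caseWeights; neither name comes from the PR. The point is only that the weights expression is looked up in the data and then in environment(formula), not in the function that calls glm.

```r
# Hypothetical stand-in for a formula builder: the formula created here
# carries this helper's environment, not the caller's.
makeFormula <- function() stats::as.formula("mpg ~ wt")

fitWithoutRebinding <- function(data) {
  f <- makeFormula()
  caseWeights <- seq_len(nrow(data))  # local to this function only
  # fails: caseWeights is not visible from environment(f)
  stats::glm(f, data = data, weights = caseWeights)
}

fitWithRebinding <- function(data) {
  f <- makeFormula()
  environment(f) <- environment()     # formula now sees this function's locals
  caseWeights <- seq_len(nrow(data))
  stats::glm(f, data = data, weights = caseWeights)
}

# Capture the error message instead of stopping the script
failure <- tryCatch({
  fitWithoutRebinding(mtcars)
  NULL
}, error = function(e) conditionMessage(e))

fitted <- fitWithRebinding(mtcars)
```

Note that if the local variable were literally named weights, the failed lookup could fall through to the stats::weights function instead of raising "not found", so the error message may differ from the one in this sketch.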
Force-pushed from eeb4d56 to a042ab8.
The unit tests succeeded for Ubuntu but not for Windows or macOS. Is this expected?
Hopefully that will go away once we merge #185. If you rebase, it should work now.
Force-pushed from a042ab8 to 5b78934.
Looks great! The issues are fixed, just two suggestions and we can merge it
R/glmCommonFunctions.R
Outdated
ff <- .createGLMFormula(options, nullModel = FALSE)
nf <- .createGLMFormula(options, nullModel = TRUE)
So if we do it like this, it should work without the workaround:
- ff <- .createGLMFormula(options, nullModel = FALSE)
- nf <- .createGLMFormula(options, nullModel = TRUE)
+ # make sure that the formulas get the current environment (https://github.com/jasp-stats/jaspRegression/pull/179#discussion_r1017412764)
+ ff <- .createGLMFormula(options, nullModel = FALSE)
+ environment(ff) <- environment()
+ nf <- .createGLMFormula(options, nullModel = TRUE)
+ environment(nf) <- environment()
+ weights <- dataset[[options[["weights"]]]]
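The suggested pattern can be tried in a plain R session with the sketch below. The helper createFormula() and the option names dependent and covariates are hypothetical stand-ins (the real .createGLMFormula() and the JASP options list are not shown in this thread), and mtcars replaces the JASP dataset. It relies on the fact that indexing a data frame with an empty string via [[ returns NULL, so an empty weights option simply means "no weights".

```r
# Hypothetical stand-in for .createGLMFormula(): builds a formula from option strings.
createFormula <- function(options) {
  rhs <- paste(options[["covariates"]], collapse = " + ")
  stats::as.formula(paste(options[["dependent"]], "~", rhs))
}

fitModel <- function(dataset, options) {
  ff <- createFormula(options)
  # rebind the formula so glm can see the local `weights` below
  environment(ff) <- environment()
  # dataset[[""]] is NULL, so no selected weight variable yields weights = NULL
  weights <- dataset[[options[["weights"]]]]
  stats::glm(ff, family = stats::gaussian(), data = dataset, weights = weights)
}

# Assumed option lists, for illustration only
optionsNoWeights <- list(dependent = "mpg", covariates = c("wt", "hp"), weights = "")
optionsWeighted  <- list(dependent = "mpg", covariates = c("wt", "hp"), weights = "cyl")

fitPlain    <- fitModel(mtcars, optionsNoWeights)
fitWeighted <- fitModel(mtcars, optionsWeighted)
```

With this arrangement the eval(.getWeightVariable(...)) workaround is not needed in the sketch, because both the formula and the weights vector live in the same function environment.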
thanks. changed it.
R/glmCommonFunctions.R
Outdated
  weights = eval(.getWeightVariable(options$weights)))
}

if (options$family == "other") {
this should probably be else if
I think if is fine in this case.
Is there a preference for using else if?
For options$family? It could also be an else.
Right now you're doing
if (options$family != "other") {
... # code
}
if (options$family == "other") {
... # code
}
but if the two code paths are mutually exclusive then this is more clear:
if (options$family != "other") {
... # code
} else { # family == "other"
... # code
}
In the original code it's not clear that the two code paths are mutually exclusive (since the first one may change the value of options$family).
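To make the concern concrete, here is a small sketch with hypothetical functions (not code from the PR). The first branch reassigns family, which is an assumption made purely for illustration, and that is enough for two consecutive if statements to both run, while if/else cannot.

```r
# Two separate ifs: the second test sees whatever the first branch left behind.
runTwoIfs <- function(family) {
  visited <- character(0)
  if (family != "other") {
    visited <- c(visited, "standard")
    family <- "other"  # hypothetical reassignment inside the first branch
  }
  if (family == "other") {
    visited <- c(visited, "other")
  }
  visited
}

# if/else: the condition is tested once, so exactly one branch runs.
runIfElse <- function(family) {
  visited <- character(0)
  if (family != "other") {
    visited <- c(visited, "standard")
    family <- "other"  # same reassignment, but it cannot trigger the else branch
  } else {
    visited <- c(visited, "other")
  }
  visited
}

twoIfsResult <- runTwoIfs("binomial")
ifElseResult <- runIfElse("binomial")
```

Here twoIfsResult records both branches and ifElseResult records only the first, which is why the if/else form documents the mutual exclusivity directly.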
true, I'll fix that. thx
Looks good now!
The test failures on ubuntu are not related to this PR.
fixed jasp-stats/jasp-issues#202
fixed jasp-stats/jasp-issues#784
fixed jasp-stats/jasp-issues#1345