Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

termnames #299

Merged
merged 25 commits into from
Sep 6, 2023
Merged

termnames #299

merged 25 commits into from
Sep 6, 2023

Conversation

palday
Copy link
Member

@palday palday commented Sep 2, 2023

src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
test/statsmodel.jl Show resolved Hide resolved
test/statsmodel.jl Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
test/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/contrasts.jl Outdated Show resolved Hide resolved
src/contrasts.jl Outdated Show resolved Hide resolved
src/contrasts.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
test/statsmodel.jl Outdated Show resolved Hide resolved
test/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
palday and others added 3 commits September 5, 2023 10:28
src/contrasts.jl Outdated Show resolved Hide resolved
Co-authored-by: Milan Bouchet-Valat <[email protected]>
Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get why this functionality is needed vs. coefnames (one string per "variable"), and I think StatsModels is a good home for it, but I'm not sure termnames is quite right. some things that are terms never get named (tupleterms, matrixterms, formulaterm), some things that contain terms forward to their children while others do not (intercept term).

What about something like variablenames? I think that better captures what folks mean when they want this, without introducing confusion about what is and is not an AbstractTerm in the statsmodels system...

src/statsmodel.jl Outdated Show resolved Hide resolved
Comment on lines 161 to 174
termnames(::InterceptTerm{H}) where {H} = H ? "(Intercept)" : nothing
termnames(t::ContinuousTerm) = string(t.sym)
termnames(t::CategoricalTerm) = string(t.sym)
termnames(t::Term) = string(t.sym)
termnames(t::ConstantTerm) = string(t.n)
termnames(t::FunctionTerm) = string(t.exorig)
# termnames(TupleTerm)) always returns a vector, even if it's just one element, e.g.,
# termnames((term(:a),))
termnames(ts::TupleTerm) = mapreduce(termnames, vcat, ts; init=String[])
# termnames(MatrixTerm)) always returns a vector, even if it's just one element, e.g.,
# termnames(MatrixTerm(term(:a)))
termnames(t::MatrixTerm) = mapreduce(termnames, vcat, t.terms; init=String[])
termnames(t::InteractionTerm) =
only(kron_insideout((args...) -> join(args, " & "), vectorize.(termnames.(t.terms))...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these all shoudl be in terms.jl I think, with other term API functions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 158 to 159
Return value is either a `String`, an iterable of `String`s or nothing if there
no associated name (e.g. `termnames(InterceptTerm{false}())`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an interesting edge case. there is an associated name for the term, even if there's no coefficient that corresponds in the model.

src/statsmodel.jl Outdated Show resolved Hide resolved
src/statsmodel.jl Outdated Show resolved Hide resolved
@palday
Copy link
Member Author

palday commented Sep 5, 2023

@nalimilan I’m not opposed to the variablenames bit but I still think we should deprecate the internal termnames in favor of coefnames because it also didn’t refer to terms but rather coefs.

Thoughts on using variablenames instead of termnames? I'm pretty neutral and @kleinschmidt has a point.

@palday
Copy link
Member Author

palday commented Sep 5, 2023

@kleinschmidt GH won't let me reply to the comment so....

I'm a bit worried by teh inconsistency with coefnames here, where an empty vector is used. Also, vcat with nothing will add an explicit nothing into the result. but perhaps that's intentional?

Yeah, I don't think that most users should call these methods directly, and String[] plays nicer with vcat, which is called if applying this to a collection of terms / formula. I'll update when we get Milan's thoughts on the other bit of bikeshed. 😄

Co-authored-by: Dave Kleinschmidt <[email protected]>
@nalimilan
Copy link
Member

nalimilan commented Sep 5, 2023

variablenames could mean something different: in R, all.vars(f) returns the names of variables which are used in a formula, which allows doing e.g. subset(df, all.vars(f)) to keep only useful variables. termnames sounds closer to what we want but I would have to double-check the formula machinery to have a definitive opinion. MatrixTerm and TupleTerm are kind of meta-terms wrapping other terms, right? So it makes sense that they don't have names themselves.

@palday
Copy link
Member Author

palday commented Sep 5, 2023

I'm open to other names -- it's easy enough to change variablenames to anything else. The hard part was termnames to variablenames because that wasn't a universal replacement.

src/deprecated.jl Outdated Show resolved Hide resolved
Co-authored-by: Alex Arslan <[email protected]>
@palday
Copy link
Member Author

palday commented Sep 6, 2023

@kleinschmidt @nalimilan we could also go with the hybrid function name termvarnames. Or make this a completely internal method _variablenames because we're really only using it for gvif. We can always make it public later.

@palday
Copy link
Member Author

palday commented Sep 6, 2023

okay, it seems like @kleinschmidt is now onboard with termnames. Great!

@kleinschmidt kleinschmidt self-requested a review September 6, 2023 21:20
Copy link
Member

@kleinschmidt kleinschmidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like docs are not picking up the termnames crossref: https://github.com/JuliaStats/StatsModels.jl/actions/runs/6102408363/job/16561033364?pr=299#step:5:279

fix that and then LGTM.

@palday palday merged commit 1632bff into master Sep 6, 2023
6 of 7 checks passed
@palday palday deleted the pa/termnames branch September 6, 2023 21:34
@nalimilan
Copy link
Member

Looks like you haven't noticed my last comment at #299 (comment). Not a big deal, but maybe still worth adjusting?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants