-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve estimation table #379
Conversation
Three open questions remain (for now):
|
Codecov Report
@@ Coverage Diff @@
## main #379 +/- ##
==========================================
+ Coverage 93.24% 93.26% +0.01%
==========================================
Files 247 247
Lines 17987 18000 +13
==========================================
+ Hits 16772 16787 +15
+ Misses 1215 1213 -2
... and 1 file with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
Thanks! While you are at it, can you please turn off the performance warnings, which are irrelevant for tables:
|
Thanks. Please make sure that when you call a test While the bug we saw today seems fixed in that function, it has no effect on for _, r in footer.iterrows():
r = _center_align_integers_and_non_numeric_strings(r) The (correct) result on the RHS is assigned to I tried to fix with for idx, row in footer.iterrows():
footer[idx] = _center_align_integers_and_non_numeric_strings(row) but this gives an error TypeError: incompatible index of inserted column with frame index in my application. |
Reminder: At some point, we should make sure that columns consisting of integer values only are right aligned (instead of center-aligned) |
@janosg, I just merged main and updated the first comment in this PR. The remaining comments are:
My suggestion: leave as is
If I am not mistaken, this comments was brought forward by Hans-Martin. I suggest opening an issue and not implementing it here as thinking about comparisons over full columns will most likely need a deeper thought. |
Given that the docstring reads:
I would be very surprised if (under the default!) suddenly all integers showed up as floats. Or do you mean by interaction "columns that contain both integers and floats" ? And by footer the number of observations or so? In that case I would not have a strong opinion, we don't really support tables containing all kinds of different numbers yet, anyhow.
I guess I do not quite understand why we can easily center-align columns but not right-align them? |
No, I really mean that integers in any of the model outputs are shown as floats. (I guess usually those coefficients are just not integers). But this can easily be changed. def _get_models_multiindex():
df = pd.DataFrame(
data={}, columns=["value", "ci_lower", "ci_upper", "p_value"]
)
df.loc[0] = [0.23123, 0, 1, 1]
df.loc[1] = [1, 0, 1, 1]
df.loc[2] = [2, 0, 1, 1]
df.index = pd.MultiIndex.from_tuples(
[("p_1", "v_1"), ("p_1", "v_2"), ("p_2", "v_2")]
)
info = {"n_obs": 400}
mod1 = {"params": df, "info": info, "name": "m1"}
mod2 = {"params": df, "info": info, "name": "m2"}
models = [mod1, mod2]
return models
models = _get_models_multiindex()
out = et.estimation_table(
models, return_type="render_inputs", confidence_intervals=True
)
out["body"] leads to
Have I misunderstood your comment that you would like to check the whole column whether it only consists of integers? This is something we do not do at the moment. We currently do not center-align whole columns, but only check and align single cells. Switching from center-align to right-align for those specific cells wouldn't be a problem. But I am not sure whether this really is what we want in columns in which we have floats in the main part of the body and then n_observations at the end. |
Ah, sorry, seems like I was mistaken in my assumptions. When talking about integer cells, I was thinking about columns with (Pandas) dtype object, which may contain integer cells:
Seems like you are thinking about the case where we have all-float columns, some of which may happen to look like integers. I agree these cases should not be treated in any special way. |
Apologies, I had not looked at the code and did not realise that this is fully internal, i.e., it cannot be controlled by the user. This is something that should change eventually (and we may prepare for it in the code by making two functions |
Perfect. I opened #451. Janos, if you agree, I think we can merge. |
With "We may prepare for it in the code ..." I meant "make the preparations now" 😆 In the issue, that line will be misleading to future developers, I believe. |
Alright, that makes sense. I deleted it in the issue. I think it isn`t worth right now to split up the function before we know exactly which changes we are planning to make. The failing tests seem to be unrelated to this PR. |
Hi @ChristianZimpelmann, thanks a lot for finishing this up. A few questions:
Is this still the case? I don't think there ever was an intentional choice to have it and it is just a bug and we should fix it before merging.
Is the workflow for this documented, ideally with an example? I think this is a very important feature and it should be documented before we merge.
Is this done? Should not take long and should be done before we merge.
Is the bug mentioned in this comment by HM fixed?
While I agree that the default visualization of integer only columns should change (see #451), I am always hesitant to add yet another argument to give more control to the user. The whole philosophy of Sorry that I ask for more changes, but I think we need to make sure that no bug that was uncovered during this PR slips through the review. |
I think to really improve Do we have a volunteer to set this up / contribute examples? I think it should be a mix of real world examples and hand crafted examples that contain elements that do not look good in the current version. Once we have that, I would be more motivated to spend a day on cleaning up the code and making sure everything looks nice. @timmens could you look into the failing tests on Monday? I agree that they are not related to the current PR but it looks like they fall in your area of expertise. |
I guess I tend to forget because so far, I have only used it in projects where core developers of this feature were involved and things ended up being fixed in PRs like this one :-) Would also be great to document how that would work in practice. (not that I checked the website right now...) |
Thanks!
No, this was fixed in this PR.
I am afraid it is not. Would you document it in a docstring or the Notebook?
I don't see where it is turned off. But I don't get any performance warning when running
Yes!
I could make a first draft of it next week. |
The documentation should be here. Nobody reads the docstring. The Performance warnings are probably only triggered when you have params with a MultiIndex. But it would not hurt to ignore these warnings preemptively for the entire estimation table function. Thanks a lot for the PR and for setting up the repo! @hmgaudecker I would say it's documented here but I am very open to any suggestions to improve this documentation. |
S+R on that page: fist->first, exmample
I'd expand this with:
There should be some heading above the imports on that page, the way it is rendered it seems like they are somehow part of the introduction. |
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
I now updated the how-to notebook and added in particular the feature which was added in this PR. I also implemented all suggestions by @hmgaudecker (see below). So far, I haven't renamed the how-to though. I don't have a strong opinion or a good idea what could be more "catchy". I will work on the repo of example tables later today.
|
show_index_names=show_index_names, | ||
show_col_names=show_col_names, | ||
escape_special_characters=escape_special_characters, | ||
with warnings.catch_warnings(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it would not hurt to ignore these warnings preemptively for the entire estimation table function.
@janosg, is this the implementation you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have renamed the current estimation_table
to _estimation_table
and defined a new estimation_table
that calls _estimation_table
in this catch warnings context. Otherwise we get a lot of indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good idea! We should do that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented it using a decorator in the last commit.
Do you have any more suggestions? Otherwise, we can merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my side all good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again, @ChristianZimpelmann! I think we can merge!
One test is still (or again) failing.. unrelated to changes in this PR as far as I understand. Should I just ignore it? |
We are working on these random failures. Yes, it's unrelated. I'll merge. |
"Controls": ["Yes", "No"])
to the footer before callingrender_latex
.Closes #373.