Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving Model> Fit model>Two variable & One variable dialogs #7930

Open
wants to merge 56 commits into
base: master
Choose a base branch
from

Conversation

Vitalis95
Copy link
Contributor

@Vitalis95 Vitalis95 commented Oct 25, 2022

Fixes (partially) #7634
This work is still in progress
@rdstern @N-thony @lloyddewit I have added the Bayesian method to Two Variable Fit model dialog.
The function bayes_inference currently supports two level factor, the response variable (Second Variable in our case) can be either categorical (2 level factor) or numeric , the explanatory variable(First Variable in our case) is always categorical.
I am using nc data to test,
When the Second Variable is numeric i.e gained and first variable habit, the sample statistic should be mean
When the Second Variable is categorical i.e lowbirthweight and first variable habit, the sample statistic should be proportion, then specify which of its levels is the success in the y argument (Second Variable) i.e low - the level to do inference on,

@rdstern , the number of Monte Carlo drawn when the type of inference is credible interval, is more than 10,000,R-Instat output window cannot display all of them including the summaries at the end, I have set it to print a maximum of 1000, the reason for this is that by default the global options of R cut off outputs after the 1000th entry -it will cut off at 1000th entry and display summaries at the end. Alternatively, if you are not happy with this, we then not save the model so that it just display the summaries

@N-thony
Copy link
Collaborator

N-thony commented Oct 25, 2022

@derekagorhom can you peer review this?

@rdstern
Copy link
Collaborator

rdstern commented Oct 27, 2022

@Vitalis95 it is great you are working on this issue.

But I couldn't find what you have done.
image

Here is the current dialogue - from your pull request. I had expected to find another button, called Bayesian, at the top. But alternatively I suppose you could add the Bayesian options here (And label the control as Method: rather than Test:. That could also perhaps go under the general option at the top, if you can give a model formula for the Bayesian?

I tried with the nc data from the statsr package. Can you also detail which data you used,, and you are also welcome to include a figure in the pull request, if you wish, so I know what to expect.

@Vitalis95
Copy link
Contributor Author

Vitalis95 commented Oct 28, 2022

@lloyddewit , When @rdstern pulls this branch he can't see the changes, what might be the problem? Thanks

@lloyddewit
Copy link
Contributor

@lloyddewit , When @rdstern pulls this branch he can't see the changes, what might be the problem? Thanks

@Vitalis95 I pulled the branch and can see the changes in VisualStudio. However when I compile and run the branch, the dialogs look identical to the screenshots in issue #7634.

Do the dialogs look identical to you too? If there are changes that should be visible in this PR, then please could you share a screenshot so that I can check if I can see these changes also?

Thanks

@rdstern
Copy link
Collaborator

rdstern commented Oct 31, 2022

@Vitalis95 I have looked at the new one and found the Bayes in the one variable dialogue, see below. I still can't find it in the 2-variable?
Anyway in the 1 variable I really like this as a start:
image

a) For now can you hide the control for one-sided/two sided. I haven't bothered with it for the other options, because I don't think the idea is that important.
b) Notice I have Village there, which is sensible for proportions. But it is giving the success for a lavel of the yields and I don't see how to change that.
c) You have Type of Inference, with Hypothesis Test or Credible Interval. Could you get rid of that control and have just the Hypothesis Test here. Then also include Bayes in the Estimate button and that's where you have Credible Interval.
d) Also the control above has Mean and Proportion. Instead could you have Bayes: Mean and Bayes: Proportion in the list of tests here (and also in Estimate.
e) And in that list could they go just below the first line in the list. So that's a new section, between Z and Bartel.
f) And if it is easy - at the same time could you look at the existing proportion and binomial. I couldn't find the success possibility on the dialogue, though it is possible with the command.

@rdstern
Copy link
Collaborator

rdstern commented Oct 31, 2022

@Vitalis95 one important - and larger area - is that we will need a button labelled either Options or Prior. That to allow us to give informative priors. I am happy that the default is non-informative. You may want to wait with this, or indicate for now, but putting the button there, but not enabling it for now.

@Vitalis95
Copy link
Contributor Author

@rdstern , have a look at it, it's for One Variable Fit models
I am using nc data to test, currently the function bayes_inference only supports two level factor. I have added the proportion -for categories and the mean for numeric to the test and the estimate
I have not saved the object, from Bayesian analysis book. and also R examples for the function, they seem to run the function without saving.
This is the output for Bayes:Mean when not saved:
image

This the output when saved:
image

Without saving is more informative than when saved. What is your take on that?

@Vitalis95
Copy link
Contributor Author

@rdstern , I have added to Two variable Fit models:
Bayes:Mean: for numeric variable as response
Bayes:Proportion: for two level factor as the response
Have a look at it. Thanks

@rdstern
Copy link
Collaborator

rdstern commented Nov 5, 2022

@Vitalis95 this is exciting. You have made very good progress.
I have tried with the survey data, and got it from the Intro guide data sets. In this way the variety variable comes in as a character (and not a factor) variable.
I then used the filter to make a subset, omitting the level New, so variety has just 2 levels. And made it a factor.

a) This gave me a serious error that needs correction, though it is nothing to do with the Bayesian. I get the same problem with the released version. If you try the dialogue with the survey and with the yield by variety, with the variety as a character variable, i.e. not a factor. Then it crashes with the nasty error. I wonder if character-types should be made into factors, or just not allowed. Should just numeric or factor be allowed. I assume we should allow logical or date, so not sure of the simplest check to do.

b) So then I moved to the subset with variety at 2 levels. There the proportion etc seems to work well. I wonder why it can't all work with more than 2 levels. It gives the choice of which is to be a success, and (with more than 2 levels), then all the others together are against this one. That's what (I think) is possible in the ordinary proportion functions, so why not for the Bayesian?

c) This does need the Prior button to be added in each case, to give a sub-dialogue to be able to choose the prior distribution. I am not sure quite what it will look like. The non-informative prior is what you seem to have here.

d) I asked you to get rid of the alternative hypothesis checkbox. I am still not sure I want it for the frequentist options, but (sorry) please put it back here. At the same time there would be a real jackpot if better hypotheses could be included. I don't know if this is possible, so start for now with what you had before. But might it be possible to have Ho = p < 0.3 v H1 = p>= 0.3? That would be a game-changer, but I don't know if it is possible. I hope @jkmusyoka can also help with discussions on these improvements.

e) I like the option to choose between simulation and theoretical, but couldn't see the difference in the output. I don't want 1000 lines of output, but it would be good to have something different in the output.

e) Trivial changes - for one sample, in the list of tests could you add a horizontal line under the 2 bayes tests and estimates as well as above.

@Vitalis95
Copy link
Contributor Author

@rdstern I have added to the Distribution Bayesian Simple linear regression in Two variable fit model dialog. I am still investigating on the Bayes plots and I will do it as separate issue

@Vitalis95
Copy link
Contributor Author

@rdstern , I noticed Negative Binomial in 2 variable fit model dialog had a bug. It's now working fine

@Vitalis95
Copy link
Contributor Author

@rdstern , I have added sub dialog Priors which has prior sample size, s.d, degrees of freedom and hypothesis testing.
The function currently supports only two sided hypothesis tests, also it supports only two level factor.

@rdstern
Copy link
Collaborator

rdstern commented Nov 17, 2022

@Vitalis95 I get the developer error when loading the 2 variable example:
image
a) In the new Bayes Mean can you replace it, so it is a little section in the pull-down, by itself. It has horizontal lines round it, and is between the two other groups.
b) Can you move the Bayes controls down so the Prior button is first in the list.
c) Can you change the Hypothesis up-down for H0 into a similar sort like the existing control for the confidence level. So it is limited to positive and less than 1, but you can type into it. The down arrow gives possible values, with 0.1, 0.2, 0.5,0.8,0.9.
d) The control for H1 is either absent, or (preferably) present but you can't type into it. It automatically changes to be the complement.
e) I assume the estimation and hypothesis testing priors are alternatives? Is that correct? If so, then perhaps there should be 2 alternative sub-dialogues, or perhaps (preferably) there could be 2 tabs, with Confidence Interval and Hypothesis Test and you go to the respective tab, depending on whether you are doing a test, or not.
f) I wasn't sure how to use the estimation prior. Perhaps you could give some examples. I was hoping that non-informative would be easy to give and could be the default?

@Vitalis95
Copy link
Contributor Author

@lloyddewit ,I have made the changes. Thanks

lloyddewit
lloyddewit previously approved these changes Dec 1, 2022
lloyddewit
lloyddewit previously approved these changes Dec 2, 2022
@lloyddewit
Copy link
Contributor

@rdstern if you can test/approve, then we can merge, thanks
(to reduce regression risk, I will merge after the build, I hope that's OK)

@rdstern
Copy link
Collaborator

rdstern commented Dec 2, 2022

@lloyddewit I will check and support that you wait until after the build. The Model > Two Variable - in its current form is one of the dialogues that has regressed in the current merged version. I don't want Patrick's work to be further complicated.

@rdstern
Copy link
Collaborator

rdstern commented Dec 6, 2022

@Vitalis95 I looked, but couldn't find the Bayesian options anymore? This problem may be at my end, because I had a similar problem with another pull request. So, I'll try again later - perhaps after restarting!

@rdstern
Copy link
Collaborator

rdstern commented Dec 6, 2022

@Vitalis95 now it is different - very different - and there are build errors. But I think that is progress, because I earlier had a similar problem with a pull request from @N-thony where the change didn't seem to be working on my running.

@Vitalis95
Copy link
Contributor Author

@rdstern , I have also build errors after updating this branch, I don't know if it's because of the changes made by @Patowhiz.

@Patowhiz
Copy link
Contributor

Patowhiz commented Dec 6, 2022

@Vitalis95 yes. Sorry I have been working on the model dialogs generally.

@Patowhiz
Copy link
Contributor

Patowhiz commented Dec 6, 2022

You can now work on the 1 and 2 variable dialogs. Just merge the changes from the master and add the new functionalities.

@Vitalis95
Copy link
Contributor Author

@rdstern ,have a look it now.

@rdstern
Copy link
Collaborator

rdstern commented Dec 29, 2022

@Vitalis95 please direct me on what options to test here. Is it now both one variable and 2 variables. I am starting with the survey data, unless you suggest better. And I'll get a factor variable with just 2 levels there.

@Patowhiz
Copy link
Contributor

Patowhiz commented Jan 5, 2023

@rdstern this reminds me that we need to find a way for these dialogs to distinguish between. 1 variable models and 2 variable models in the selector. At the moment. I noticed that if have both types of models, they all display in either of the dialogs. Some 2 variable models operations don't really apply to 1 variable models. This will also help alleviate any user confusion.

@N-thony
Copy link
Collaborator

N-thony commented Jul 26, 2023

@Vitalis95 what should be done here?

@N-thony
Copy link
Collaborator

N-thony commented Dec 18, 2023

@Vitalis95 what should be done here?

@Vitalis95 can you provide an update on the current status of this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants