-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Random Split dialog #7519
Added Random Split dialog #7519
Conversation
updating branch
Dlgrandomsplit
@derekagorhom please wait for @lilyclements confirmation before continuing with this pull request. And make sure you fully understand the reasons for the suggested changes, before implementing them. |
@derekagorhom for (a) do you know the R code to achieve this? From #7227 If "Sample" is selected then
If "Time Series" is selected then
This creates an object, which we can call For "save training data", run For "save testing data", run @rdstern to: j) The E.g. for
for
|
Thank you for the explanation @lilyclements, I will get to working on the Pull request. |
@derekagorhom any progress? |
@lilyclements i have made the changes, you can review this now |
instat/dlgRandomSplit.vb
Outdated
@@ -59,7 +59,7 @@ Public Class dlgRandomSplit | |||
ucrChkStratifyingFactor.AddToLinkedControls({ucrNudPool, ucrReceiverRanSplit}, {True}, bNewLinkedAddRemoveParameter:=True, bNewLinkedHideIfParameterMissing:=True, bNewLinkedUpdateFunction:=True, bNewLinkedChangeToDefaultState:=True, objNewDefaultState:=0.1) | |||
|
|||
ucrNudLag.SetParameter(New RParameter("lag", 3)) | |||
ucrNudLag.SetMinMax(0, 100) | |||
ucrNudLag.SetMinMax(Integer.MinValue, Integer.MaxValue) | |||
ucrNudLag.Increment = 0.5 | |||
ucrNudLag.DecimalPlaces = 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lag should not have any decimal places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good, I have two follow up points
|
@lilyclements i have made the changes now, can you have a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@derekagorhom this is working a lot better. But OK is still not enabled all the times that it should be.
For example, what is the reason for OK to not be enabled here:
We want to say if time series is checked and if lag is checked then ...
At the moment, I find that OK is only enabled if time series and lag are checked.
Do you see the difference there?
instat/dlgRandomSplit.vb
Outdated
End If | ||
Else | ||
ucrBase.OKEnabled(False) | ||
End If | ||
Else | ||
ucrBase.OKEnabled(False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this FALSE
?
If rdoSample
is checked, we do not need ucrChkStratifyingFactor
to be checked to press OK
There's a similar case for rdoTimeSeries
with ucrChkLag
instat/dlgRandomSplit.vb
Outdated
If rdoSample.Checked Then | ||
If ucrChkStratifyingFactor.Checked Then | ||
If Not ucrReceiverRanSplit.IsEmpty AndAlso Not ucrNudBreaks.IsEmpty AndAlso Not ucrNudPool.IsEmpty Then | ||
If ucrSaveTrainingData.IsComplete AndAlso ucrSaveTestingData.IsComplete AndAlso Not ucrNudFraction.IsEmpty Then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These ucrs (the two saves and the fraction nud) are not controlled by the ucrChkStratifyingFactor
.
So, whether ucrChkStratifyingFactor
is checked or not should not affect the state of these three controls. Does that make sense?
So in the If ucrChkStratifyingFactor.Checked [...] End If
portion, we only want to contain ucrs that are controlled by ucrChkStratifyingFactor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because these three controls affect the whole dialog, then to avoid repeating code, perhaps it makes sense to bring them to the "exterior" of the if-statement.
If ucrSaveTrainingData.IsComplete AndAlso ucrSaveTestingData.IsComplete AndAlso Not ucrNudFraction.IsEmpty Then
< If statements for when rdos are checked >
Else
ucrBase.OKEnabled(False)
End If
Then we can build it up from there. So, next we can add in our if statement for rdoSample and rdoTimeSeries:
If ucrSaveTrainingData.IsComplete AndAlso ucrSaveTestingData.IsComplete AndAlso Not ucrNudFraction.IsEmpty Then
If rdoSample.Checked Then
'<if statements for when rdo sample is checked >
Else ' this here is if rdoTimeSeries.Checked
'<if statements for when rdo time series is checked >
End If
Else
ucrBase.OKEnabled(False)
End If
Step 3: What if rdoSample is checked then? From looking at the dialog and the controls, there is a checkbox for "Stratifying variable". This brings up three controls. So we want to add in a statement that now says, if ucrChkStratifyingVariable.Checked Then < new controls cannot be empty for OK to be enabled >
Perfect though - since you have already done this on lines 162-163. So we can just add that in :)
Then we just need to add in what if ucrChkStratifyingFactor isnt checked? Then OK should work fine.
If ucrSaveTrainingData.IsComplete AndAlso ucrSaveTestingData.IsComplete AndAlso Not ucrNudFraction.IsEmpty Then
If rdoSample.Checked Then
If ucrChkStratifyingFactor.Checked Then
If Not ucrReceiverRanSplit.IsEmpty AndAlso Not ucrNudBreaks.IsEmpty AndAlso Not ucrNudPool.IsEmpty Then
ucrBase.OKEnabled(True)
Else
ucrBase.OKEnabled(False)
End If
Else ' And what if ucrChkStratifyingFactor isnt checked? Then OK should work fine.
ucrBase.OKEnabled(True)
End If
Else ' this here is if rdoTimeSeries.Checked
'<if statements for when rdo time series is checked >
End If
Else
ucrBase.OKEnabled(False)
End If
Then we just do the same for rdoTimeSeries.Checked.
Does this make sense?
@lilyclements i have made some changes to the testOK, can you have a look |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great and works great. Really nice job!
@rdstern @lloyddewit over to you to review!
instat/dlgRandomSplit.vb
Outdated
If ucrChkStratifyingFactor.Checked Then | ||
If Not ucrReceiverRanSplit.IsEmpty AndAlso Not ucrNudBreaks.IsEmpty AndAlso Not ucrNudPool.IsEmpty Then | ||
If ucrSaveTrainingData.IsComplete AndAlso ucrSaveTestingData.IsComplete AndAlso Not ucrNudFraction.IsEmpty Then | ||
If Not ucrSaveTrainingData.IsComplete Or Not ucrSaveTestingData.IsComplete Or ucrNudFraction.IsEmpty Then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of interest, what's the difference between or
and orElse
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lilyclements
Or
requires both expression to be evaluated while OrElse
would still work even if only one expression is evaluated. so in this case if ucrSaveTrainingData
(or ucrsaveTestingData/UcrNudfraction
) is empty then the TestOK
will be disabled regardless of if the other controls are true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense - thanks :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or requires both expression to be evaluated while OrElse would still work even if only one expression is evaluated. so in this case if ucrSaveTrainingData (or ucrsaveTestingData/UcrNudfraction) is empty then the TestOK will be disabled regardless of if the other controls are true.
@derekagorhom @lilyclements I'm not sure I completely agree with this. Both Or
and OrElse
provide a logical 'Or' operation and will always return the same logical result.
The only difference is that OrElse
is a short-circuiting operator, Or
is not. This means that with OrElse
if the left-hand side is true then VB will not bother to evaluate the right-hand side (because the 'Or' condition has already been met). In contrast Or
will always evaluate both sides. There may be rare conditions when you want to do this, e.g. if you want to force 2 functions to be called:
If myFunction1() Or myFunction2() Then
But this could be confusing for other developers so better to call functions explicitly above the 'If'.
OrElse
is useful for avoiding exceptions, e.g.:
If myObject Is Nothing OrElse myObject.name = 'Derrick' Then
If we used Or
above, and myObject
was nothing, then it would raise an exception.
In summary, OrElse
is more efficient and may be safer. Therefore best to always use OrElse
(apart from very rare circumstances).
Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@derekagorhom Thank you, this looks good.
Please could you read my comment and if you agree, then change the Or
s to OrElse
? It's not actually essential but it makes the code more consistent with the rest of R-Instat.
If you disagree, then that's also fine, I'm happy to discuss.
Thanks
instat/dlgRandomSplit.vb
Outdated
If ucrChkStratifyingFactor.Checked Then | ||
If Not ucrReceiverRanSplit.IsEmpty AndAlso Not ucrNudBreaks.IsEmpty AndAlso Not ucrNudPool.IsEmpty Then | ||
If ucrSaveTrainingData.IsComplete AndAlso ucrSaveTestingData.IsComplete AndAlso Not ucrNudFraction.IsEmpty Then | ||
If Not ucrSaveTrainingData.IsComplete Or Not ucrSaveTestingData.IsComplete Or ucrNudFraction.IsEmpty Then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or requires both expression to be evaluated while OrElse would still work even if only one expression is evaluated. so in this case if ucrSaveTrainingData (or ucrsaveTestingData/UcrNudfraction) is empty then the TestOK will be disabled regardless of if the other controls are true.
@derekagorhom @lilyclements I'm not sure I completely agree with this. Both Or
and OrElse
provide a logical 'Or' operation and will always return the same logical result.
The only difference is that OrElse
is a short-circuiting operator, Or
is not. This means that with OrElse
if the left-hand side is true then VB will not bother to evaluate the right-hand side (because the 'Or' condition has already been met). In contrast Or
will always evaluate both sides. There may be rare conditions when you want to do this, e.g. if you want to force 2 functions to be called:
If myFunction1() Or myFunction2() Then
But this could be confusing for other developers so better to call functions explicitly above the 'If'.
OrElse
is useful for avoiding exceptions, e.g.:
If myObject Is Nothing OrElse myObject.name = 'Derrick' Then
If we used Or
above, and myObject
was nothing, then it would raise an exception.
In summary, OrElse
is more efficient and may be safer. Therefore best to always use OrElse
(apart from very rare circumstances).
Does this make sense?
@lloyddewit i have made the changes.... OrElse works fine also. thank you for the suggestion |
@rdstern Please could you test? thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be working. @derekagorhom there needs to be a possibly seperate pull request to add the rsplit package to R-Instat.
As a detail I found it odd that there was the usual data selector, when there was no receiver - unless you included the stratified option. But I found it was working well - at least for the ordinary option. I didn't check the time series. I tried with the hh data from the MICS survey and it seems to be working fine. So I am approving.
Fixes #7227
This PR replaces PR #7272
This dialogue can be found in Prepare>Data Reshape > Random Split dialogue.
This is ready for review.
I have made the recommended changes requested by @lilyclements except
no.4
, since that needs @rdstern input/approval.