Added more options to rename multiple columns #7208

Collaborator

@N-thony N-thony commented Feb 8, 2022

Fixes (partially) #6877
This is still a work in progress. @shadrackkibet I have added the grid under the multiple option, as suggested by @rdstern in the corresponding issue. At this stage, I would like your help on the backend side. I'm not sure which R code or package to use for renaming multiple variables through the grid.

@shadrackkibet
Collaborator

@N-thony thanks for the discussion. Leave this to me for now. I will alert you when the R code is ready.

@N-thony
Collaborator Author

N-thony commented Feb 10, 2022

@shadrackkibet sounds good. Thanks.

Comment on lines 145 to 149
If Not ucrReceiverColumns.IsEmpty() Then
    ucrBase.OKEnabled(True)
Else
    ucrBase.OKEnabled(False)
End If
Collaborator

The default is to rename all columns, so if the receiver is empty we use the whole data frame. We therefore do not need to disable OK when the multiple receiver is empty.

@shadrackkibet
Collaborator

@N-thony I have now added the R code that will allow the renaming of multiple columns one by one via a grid. I have also added an option to edit variable labels.

  • There are three type options: single, multiple and rename_with. These correspond to the top radio buttons.
  • I also fixed a bug under rename_with. The column metadata should now update correctly to reflect the new names.
  • You will notice that I added two arguments to the function, new_column_names_df and new_labels_df, which must be two-column data frames. new_column_names_df contains the new column names and the rows (indexes) that changed. new_labels_df contains the new variable labels and the respective rows (indexes) that changed.

You need to construct these data frames from the grid and pass them into the R function. Below is an illustration of how this should look from the dialog:

data_book$rename_column_in_data(data_name="efc", 
                                type="multiple",
                                new_column_names_df = data.frame(cols = c("c12HOUR","e16SEX","e42DEP"),
                                                                 index = c(1,3,5)),
                                new_labels_df = data.frame(labels = c("spouse, child,sibling etc","Independent, slightly dependent, moderately dependent"),
                                                           index = c(2,5)))
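
For readers unfamiliar with the two-data-frame convention, here is a minimal base R sketch of what renaming by positional index could look like. The helper `apply_renames_by_index()`, the toy data, and the use of a `"label"` attribute are assumptions made for illustration only; this is not the actual `rename_column_in_data` implementation.

```r
# Hypothetical helper: NOT the R-Instat implementation, only an illustration
apply_renames_by_index <- function(data, new_column_names_df = NULL, new_labels_df = NULL) {
  if (!is.null(new_column_names_df)) {
    # 'index' holds the column positions, 'cols' the replacement names
    names(data)[new_column_names_df$index] <- as.character(new_column_names_df$cols)
  }
  if (!is.null(new_labels_df)) {
    for (i in seq_len(nrow(new_labels_df))) {
      # Assumption: variable labels are stored in a "label" attribute
      attr(data[[new_labels_df$index[i]]], "label") <- as.character(new_labels_df$labels[i])
    }
  }
  data
}

# Toy data frame standing in for a real data set
toy <- data.frame(a = 1:2, b = 3:4, c = 5:6)
renamed <- apply_renames_by_index(
  toy,
  new_column_names_df = data.frame(cols = c("first", "third"), index = c(1, 3)),
  new_labels_df = data.frame(labels = "Second column", index = 2)
)
names(renamed)
```

Note that the two data frames are independent: a column can receive a new label without a new name, and vice versa.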

@N-thony
Collaborator Author

N-thony commented Feb 11, 2022

@shadrackkibet thanks, I will go ahead and link it with the grid.

@N-thony
Collaborator Author

N-thony commented Feb 11, 2022

@rdstern I have now linked the R code provided by Shadrack with the grid. I ran some tests and it seems fine. Could you test it too? Thanks.

Collaborator

@rdstern rdstern left a comment

@N-thony so far I have just looked at the multiple. Not yet the rename_with.
a) Looks fine on manual changes.
b) When I paste from Excel they appear to change, but it doesn't register.
c) When I press Ok and have made a change, but not clicked elsewhere, then the last change doesn't register.
d) When I click to include labels it goes blank and freezes. (And, once it is working, please test copying the old names and pasting them into the variable labels.)

@N-thony
Collaborator Author

N-thony commented Feb 11, 2022

@rdstern thanks for your comment. During our discussion we agreed to get the multiple option working well before moving to the rename_with option. Thanks for the reminder about the pasting feature on the grid. I have made some changes in response to your comments above. I haven't yet finished testing properly to check that it is stable, but you may wish to test it too at this stage.

Collaborator

@rdstern rdstern left a comment

@N-thony that's progress!
a) Paste is working, but only for the first cell that is pasted.
b) Adding labels is now sensible. I only checked pasting, and that behaves the same as for the names, so it just pastes the first label.

@N-thony
Collaborator Author

N-thony commented Feb 14, 2022

@rdstern I'm still looking into this. Meanwhile, @shadrackkibet I found that passing the columns and indexes through the function's parameters is a bit tricky. Using the survey dataset, I selected the first three columns (Village, field and size), then in my dialogue I renamed both the name and the label of Village, whose index is 1, and clicked OK. All runs fine. When I go back, I expect to see only two columns, field and size, since the new Village name is not in the defined filter. I then rename the field column in the grid, but it renames the new Village column again, because the index of field in the grid is now 1. I can demonstrate this if you have a minute for a quick call.
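
The index mismatch described above can be reproduced in a few lines of base R. This is only a sketch under assumptions: the toy data frame, the `visible` vector standing in for the filtered grid, and the name-based lookup with `match()` are hypothetical and are not taken from the R-Instat code.

```r
# Toy stand-in for the survey data; column names are illustrative
survey <- data.frame(Village = 1, field = 2, size = 3)

# First rename: grid row 1 is Village, so index 1 is correct
names(survey)[1] <- "village_new"

# Second visit: the grid now lists only the columns still in the filter
visible <- c("field", "size")
grid_index <- 1  # 'field' is row 1 of the grid, not column 1 of the data

# Using the grid row directly would target the wrong column
wrong_target <- names(survey)[grid_index]

# Translating the grid row into a position in the full data frame avoids this
data_index <- match(visible[grid_index], names(survey))
names(survey)[data_index] <- "field_new"
```

One possible design choice is therefore to pass positions relative to the full data frame (or the old names themselves) rather than grid row numbers.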

rdstern previously approved these changes Mar 9, 2022
Collaborator

@rdstern rdstern left a comment

@N-thony looks fine to me. I am approving, so it can be merged. The next item is the abbreviate function, which I hope will be easy to add?

@shadrackkibet
Collaborator

@N-thony abbreviate can be another radio button. Implement it as follows:

data_book$rename_column_in_data(data_name="survey", type="rename_with", .fn=abbreviate)

If a control (Nud/input) is needed for minlength (R default = 4), that or any other parameter can be added as follows:

data_book$rename_column_in_data(data_name="survey", type="rename_with", .fn=abbreviate, minlength = 4)
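
Outside the dialog, the effect of abbreviate with minlength can be previewed in base R. This sketch assumes nothing about how `rename_column_in_data` forwards its arguments; it simply applies `abbreviate()` to the column names of the built-in `iris` data, and the variable names are illustrative.

```r
# Preview abbreviated column names on the built-in iris data
old_names <- names(iris)
new_names <- abbreviate(old_names, minlength = 4)

# abbreviate() returns a character vector named by the original strings,
# so drop those names before assigning the result as column names
renamed_iris <- iris
names(renamed_iris) <- unname(new_names)
```

Printing `new_names` shows the old and new names side by side, which is a quick way to check a chosen minlength before applying it.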

@rdstern
Collaborator

rdstern commented Mar 10, 2022

@shadrackkibet great to have this, and it is a really good "quick win" for R-Instat. Once that is added and merged, I suggest that the last feature in renaming (which overlaps with all your select stuff, i.e. your select is the find, and then we add replace!) becomes far less urgent, and could wait till later in the year. Certainly until you have everything you need in the Select stuff. So, I can write that as a new issue. (Maybe we don't even need the find, and it just works on a defined select!)

@shadrackkibet
Collaborator

This should be straightforward to implement. In a nutshell, this will simply have a receiver to take in a select object then we just rename those columns in the select object.

@rdstern
Collaborator

rdstern commented Mar 10, 2022

Exactly!

@N-thony
Collaborator Author

N-thony commented Mar 10, 2022

@rdstern @shadrackkibet I have added this.

dctRowsNewNameChanged.Clear()
dctRowsNewLabelChanged.Clear()

ucrNudAbbreviate.SetText(4)
Collaborator

If you have the control passing the parameter then the control will read from the R code.

Collaborator Author

This is resolved

Private clsDefaultRFunction As New RFunction
Private clsNewColNameDataframeFunction As New RFunction
Private clsNewLabelDataframeFunction As New RFunction
Private clsDummyFunction As New RFunction
Collaborator

Try using conditions. Then you wouldn't need a dummy function. Happy to discuss.

Collaborator Author

clsDummyFunction concerns ucrChkIncludeLabels. It will be improved when the appropriate ucrControl for this grid is implemented.

Comment on lines 207 to 228
Private Function GetValuesAsVector(dctValues As Dictionary(Of Integer, String)) As String
    Dim strValue As String = ""
    Dim i As Integer
    strValue = strValue & "c("
    For Each iRow As Integer In dctValues.Keys
        If i > 0 Then
            strValue = strValue & ","
        End If
        strValue = strValue & Chr(34) & dctValues(iRow) & Chr(34)
        i = i + 1
    Next
    strValue = strValue & ")"
    Return strValue
End Function

Private Sub ValidateNamesFromDictionary(iColIndex As Integer)
    If iColIndex = 1 Then
        For Each value In dctRowsNewNameChanged.Values
            If Not CheckNames(value, iColIndex) Then
                MsgBox("The column name must not be a numeric or contains space or french accent or be a boolean e.g TRUE, FALSE, T, F.")
                bCurrentCell = False
                Exit For
Collaborator

We need to be moving this kind of code into a user control. I think we have similar code in climatic data entry. I suggest you add a TODO comment.
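
As an aside, the naming rule quoted in the message box above (not numeric, no spaces, no accented characters, not a boolean such as TRUE, FALSE, T or F) can be sketched in R. The function `is_valid_column_name()` is hypothetical and is not the `CheckNames` function from the dialog; the exact set of allowed characters (ASCII letters, digits, dots and underscores) is an assumption.

```r
# Hypothetical validator mirroring the rule in the message box;
# it is NOT the CheckNames function used by the dialog
is_valid_column_name <- function(x) {
  reserved <- c("TRUE", "FALSE", "T", "F")
  # Assumption: an ASCII letter first, then ASCII letters, digits, dots or underscores
  ascii_ok <- grepl("^[A-Za-z][A-Za-z0-9._]*$", x, perl = TRUE)
  ascii_ok & !(x %in% reserved)
}

is_valid_column_name(c("village", "my col", "12", "TRUE"))
```

Keeping the rule in one small function of this kind is one way such validation could be shared between dialogs.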

instat/dlgName.vb Outdated
Collaborator

@rdstern rdstern left a comment

@N-thony I think the abbreviate works fine now. But your function has now (behind the scenes) changed the janitor function for clean names. I think it also applies the default width from the abbreviate function.
At least, the result of clean names with snake works fine in version 0.7.4 and is now different with your current function. I was testing with the iris data set.

rdstern previously approved these changes Mar 10, 2022
Collaborator

@rdstern rdstern left a comment

Great - this is now an excellent new feature. Hope @shadrackkibet or @lloyddewit can approve, so it can be merged

@lloyddewit
Contributor

@N-thony I think 3 comments from @shadrackkibet are still open. Please could you fix these (without setting them to resolved) and then ask @shadrackkibet to check. If he's happy, then he can resolve these remaining comments and approve.

@shadrackkibet thanks for reviewing this PR, I don't think I need to review also, but if you'd like me to review, then let me know, thanks

@N-thony
Collaborator Author

N-thony commented Mar 15, 2022

@lloyddewit I think the three comments were already solved.

@lloyddewit
Contributor

@N-thony I looked at the code and couldn't see where the comments are solved. Maybe I'm missing something. For each comment, please could you explain where/how the comment is solved?
thanks

@lloyddewit
Contributor

@N-thony thank you for the update
@shadrackkibet if you think that all your comments have been resolved, then please could you approve?
Thanks

@shadrackkibet shadrackkibet changed the title from "Added options to the Prepare > Data Frame > Rename Column dialogue" to "Added more options to rename multiple columns" Mar 15, 2022
Collaborator

@shadrackkibet shadrackkibet left a comment

@rdstern please approve then we can merge.

@shadrackkibet shadrackkibet merged commit bc99ca3 into IDEMSInternational:master Mar 16, 2022
6 participants