Better handling of missing values in GenerateReport #16

Closed
djhurio opened this issue Mar 4, 2016 · 6 comments
Labels: status: cran, status: develop, status: master, type: enhancement

Comments

djhurio commented Mar 4, 2016

The example looks very nice. I am trying to run it on my own data, but I am getting the following error:

label: correlation_continuous
Quitting from lines 51-52 (report.rmd) 
Error in seq.default(from = best$lmin, to = best$lmax, by = best$lstep) : 
  'from' must be of length 1

boxuancui (Owner) commented

Could you run the PlotMissing function first and check whether certain features are mostly NA? If so, that could be the reason.

As a quick fix, I would remove those features and run GenerateReport again.

I plan to add missing value scanning before plotting. Please confirm that this is the actual cause, and I will use this issue to track the enhancement.
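
A minimal sketch of the quick fix described above, in base R. The helper name `drop_sparse_columns` and the 50% threshold are illustrative assumptions and are not part of DataExplorer:

```r
# Hypothetical helper (not part of DataExplorer): drop columns whose share of
# missing values exceeds a threshold.
drop_sparse_columns <- function(data, max_na_rate = 0.5) {
  na_rate <- colMeans(is.na(data))  # fraction of NA values per column
  data[, na_rate <= max_na_rate, drop = FALSE]
}

# Toy data: column "b" is 75% NA and is dropped.
toy <- data.frame(a = 1:4, b = c(NA, NA, NA, 1), c = c("x", NA, "y", "z"))
cleaned <- drop_sparse_columns(toy)
names(cleaned)
```

The cleaned data frame could then be passed to GenerateReport in place of the original data.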

djhurio (Author) commented Mar 7, 2016

Yes, I confirm this. Removing variables with an NA rate of more than 50% removed the first error. But now I am stuck on the next error:

label: correlation_discrete
Quitting from lines 63-64 (report.rmd) 
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

boxuancui (Owner) commented

It is probably because you have some problematic discrete features too. Could you update the package to the latest develop branch? I have pushed some bug fixes and your issues should be addressed. Please let me know otherwise.

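# Install devtools if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
# Install the develop branch of DataExplorer from GitHub
devtools::install_github("boxuancui/DataExplorer", ref = "develop")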
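
For context, R raises the contrasts error quoted in the previous comment when a factor used to build a model matrix has fewer than two levels. The sketch below shows one way to spot such features before generating a report. The helper name `find_single_level` is a hypothetical illustration, not a DataExplorer function:

```r
# Hypothetical helper: names of non-numeric columns that have fewer than two
# distinct non-missing values.
find_single_level <- function(data) {
  is_discrete <- !vapply(data, is.numeric, logical(1))
  n_levels <- vapply(data, function(x) length(unique(x[!is.na(x)])), integer(1))
  names(data)[is_discrete & n_levels < 2]
}

# Toy data: "group" has a single level, "kind" has two, "value" is numeric.
toy <- data.frame(
  group = c("a", "a", "a"),
  kind = c("x", "y", NA),
  value = c(1, 2, 3),
  stringsAsFactors = TRUE
)
single_level <- find_single_level(toy)
single_level
```

Columns flagged this way could be removed in the same manner as the mostly-NA features.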

boxuancui added the "type: bug" label Mar 7, 2016
boxuancui self-assigned this Mar 7, 2016
boxuancui added the "type: enhancement" label Mar 7, 2016
boxuancui changed the title from "Error in seq.default" to "Better handling of missing values in GenerateReport" Mar 7, 2016
boxuancui added this to the 0.2.6 milestone Mar 9, 2016
boxuancui removed the "type: bug" label Mar 10, 2016
djhurio (Author) commented Mar 14, 2016

I have installed the development version. I am not getting errors any more. The report is generated with a warning:

Warning message:
In writeLines(if (encoding == "") res else native_encode(res, to = encoding),  :
  invalid char string in output conversion

And the report is unreadable.

boxuancui (Owner) commented

I believe it is due to non-ASCII characters in the data. I have created #19 to address this. For now, the encoding is inherited from the default rmarkdown settings.
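
To check whether non-ASCII characters are present, something like the following base R sketch could be used. The helper name `find_non_ascii` is hypothetical, and the sketch assumes the text can be represented as UTF-8:

```r
# Hypothetical helper: names of character or factor columns that contain
# non-ASCII text.
find_non_ascii <- function(data) {
  has_non_ascii <- vapply(data, function(x) {
    if (!is.character(x) && !is.factor(x)) return(FALSE)
    text <- enc2utf8(as.character(x))
    # iconv() returns NA for strings it cannot convert to ASCII
    converted <- iconv(text, from = "UTF-8", to = "ASCII")
    any(is.na(converted) & !is.na(text))
  }, logical(1))
  names(data)[has_non_ascii]
}

# Toy data: "city" contains a non-ASCII letter, "n" is numeric.
toy <- data.frame(city = c("Riga", "R\u012bga"), n = 1:2, stringsAsFactors = FALSE)
non_ascii_cols <- find_non_ascii(toy)
non_ascii_cols
```

This only locates the affected columns; how the report should encode them is the subject of #19.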

boxuancui (Owner) commented

I will close this ticket, since it is about the missing values bug.

boxuancui reopened this Mar 14, 2016
boxuancui added the "status: master" label Mar 14, 2016
boxuancui added the "status: cran", "status: develop", and "status: master" labels and removed the "status: master" label May 8, 2016