Visualization feedback #55

Open
ksedivyhaley opened this issue Jan 29, 2020 · 4 comments

Comments

@ksedivyhaley

  • Table "Cumulative Distribution of Numeric Variables" is confusing and unnecessary. Consider including it as a figure instead; this could replace the violin plot.
  • A violin plot is not the best choice for data with long tails, and clearer y-axis labelling (with units) is required.
  • For "distributions of the categorical variables,"
    • Browser, OS etc should be given proper names rather than numbers.
    • Months should be in chronological order
  • Corrplot is overall good, but
    • Would be easier to follow if on one page (smaller image size)
    • Set diagonals to a different colour to avoid dominating the visualization
  • Good use of faceting!
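
One way to show a cumulative distribution as a figure rather than a table is an empirical CDF (ECDF) step plot. The following is a minimal Python sketch, not the project's code: the function names `ecdf` and `plot_ecdf` and their `label` and `unit` arguments are hypothetical, and the plotting step assumes matplotlib is installed.

```python
def ecdf(values):
    """Return the sorted values and, for each position i, the proportion (i + 1) / n."""
    xs = sorted(values)
    n = len(xs)
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


def plot_ecdf(values, label, unit):
    """Draw the ECDF as a step plot; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt  # assumption: matplotlib is available

    xs, ys = ecdf(values)
    fig, ax = plt.subplots()
    ax.step(xs, ys, where="post")
    ax.set_xlabel(f"{label} ({unit})")  # name the variable and its unit
    ax.set_ylabel("Cumulative proportion")
    return fig
```

An ECDF shows long tails without the smoothing a violin plot applies, so it could cover both of the first two points.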
@aromatic-toast
Collaborator

  1. This has already been removed in the latest commit.
  2. Will plot the variables on a log scale to see if this fixes the violin plot.
  3. The data does not come with labels for the Browser and OS variables, so naming these is impossible.
  4. Will reorder the month factor levels to put them in chronological order.
  5. We decided to remove the correlation plot. We had a GitHub issue on the day the assignment was due and the code to resize the correlation matrix was lost.
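
The month reordering in point 4 could be sketched as below. This is an illustration in Python under assumptions, not the project's code: it assumes the month labels are English names or abbreviations that can be identified by their first three letters, and the names `month_key` and `chronological` are hypothetical.

```python
# Three-letter prefixes of the English month names, in calendar order.
_MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
           "jul", "aug", "sep", "oct", "nov", "dec"]


def month_key(label):
    """Sort key: calendar position looked up from the label's first three letters."""
    return _MONTHS.index(label.strip()[:3].lower())


def chronological(labels):
    """Return the distinct month labels in calendar order."""
    return sorted(set(labels), key=month_key)
```

The resulting list can then be used as the explicit category order when plotting, for example as the `categories` argument of `pandas.Categorical` with `ordered=True`.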

@ksedivyhaley
Author

  1. 👍
  2. Log scale is worth a try, as long as the scale is clearly labelled.
  3. You mean you have no way of determining which number corresponds with each Browser and OS?
  4. 👍
  5. Too bad! If it's a choice between keeping a big correlation plot and removing it, I would keep the big plot.
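
On point 2, one way to keep a log-scaled axis clearly labelled is to transform the data but print tick labels in the original units. A minimal Python sketch, assuming non-negative values that may include zeros (hence log10(1 + x) rather than a plain log); `log1p10` and `labelled_ticks` are hypothetical helper names:

```python
import math


def log1p10(x):
    """Transform a non-negative value with log10(1 + x); 0 maps to 0."""
    return math.log10(1 + x)


def labelled_ticks(max_value):
    """Return (position, label) pairs: positions on the transformed axis,
    labels in the original units, at 0 and at each power of ten up to max_value."""
    ticks = [(0.0, "0")]
    power = 0
    while 10 ** power <= max_value:
        value = 10 ** power
        ticks.append((log1p10(value), str(value)))
        power += 1
    return ticks
```

With matplotlib, the positions and labels could be passed to `ax.set_yticks` and `ax.set_yticklabels`, with the axis title naming both the unit and the transformation.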

@aromatic-toast
Collaborator

Yes, the data does not come with labels for the Browser and OS variables. It is just numbers, so we have no way of knowing what these numbers map onto. There is minimal metadata on the UCI website.

@aromatic-toast
Collaborator

We removed the correlation plot because it didn't have an impact on the downstream analysis, as we didn't have the time to do any feature selection later. But it was part of the initial EDA. We were not sure if it was okay to include it even though we don't really use it to inform the downstream analysis.
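
If the correlation plot is kept as part of the EDA, the earlier suggestion about the diagonal can be handled by masking the self-correlations, so they can be drawn in a neutral colour. A minimal Python sketch under assumptions: the data is a mapping from column name to a list of numbers, no column is constant, and `pearson` and `masked_correlation` are hypothetical helper names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.
    Assumes neither sequence is constant (otherwise this divides by zero)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def masked_correlation(columns):
    """Correlation matrix as nested dicts, with diagonal entries set to None
    so a plotting step can colour them separately."""
    names = list(columns)
    return {
        a: {b: None if a == b else pearson(columns[a], columns[b]) for b in names}
        for a in names
    }
```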
