[BUG] "Undefined" chart caused by special character period . in column name #175

Closed
ehsong opened this issue Dec 7, 2020 · 2 comments
Labels
bug Something isn't working


ehsong commented Dec 7, 2020

I have a dataset with categorical variables, but Lux is not displaying them properly. Instead of showing the distribution of each categorical variable, the chart lumps every value into 'undefined'. I tried dropping NaNs, but this did not fix the problem.

My data set looks like this:

[Screenshot: the dataset in a Jupyter Notebook]

The toggle result was:

[Screenshot: Lux output after toggling, in a Jupyter Notebook]

I was expecting the first chart to show the categories Evident, Somewhat Evident, Indicator Skipped, etc., but they were all lumped into 'undefined'. The column's data type is non-null object.

[Screenshot: Jupyter Notebook]

jinimukh added the "bug" (Something isn't working) label on Dec 7, 2020
jinimukh changed the title from "I don't understand why occurrence does not display correctly" to "[BUG] Categorical Variables are Undefined" on Dec 7, 2020
@dorisjlee
Member

Hi @ehsong, Thank you for your interest in Lux! I was able to mock up a sample of your dataset and reproduce the problem that you are seeing. The issue is caused by the period in each of the column names (this is a known issue in Altair, which is a library that Lux uses for plotting).
Below is the visualization before and after removing the period:

The easiest fix on your end is to remove every . from the column names in your dataframe. Based on the screenshot that you've posted, it looks like a . is the last character of every column name. So you could do something like this to strip all the periods from the column names:

df.columns = df.columns.str.replace(".", "", regex=False)
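
For reference, here is a minimal, self-contained sketch of the workaround above. The column names are made up for illustration and are not taken from the original dataset. Passing `regex=False` makes pandas treat "." as a literal period rather than a regular-expression wildcard, so the behavior does not depend on the pandas version's default.

```python
import pandas as pd

# Hypothetical columns that end in a period, mimicking the reported dataset.
df = pd.DataFrame(
    {
        "Indicator 1.": ["Evident", "Somewhat Evident", "Indicator Skipped"],
        "Score.": [3, 2, 0],
    }
)

# Remove every literal period from the column names.
df.columns = df.columns.str.replace(".", "", regex=False)
print(list(df.columns))

# Alternative: strip only trailing periods and keep any interior ones.
df2 = pd.DataFrame({"v1.2 rating.": [1], "notes.": ["ok"]})
df2.columns = df2.columns.str.rstrip(".")
print(list(df2.columns))
```

The `rstrip` variant may be preferable when a period inside a name carries meaning (for example a version number), though any period left in a name could still trigger the same display problem.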

[Image: visualization before and after removing the period]

Please let us know if this fixes the "undefined" bar charts that you are seeing.

We will also fix things on our end to better support special characters in column names, as well as long column names (like the ones in your dataset), in the future. Thanks for bringing this issue to our attention!
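
Until such a fix is released, a user-side check along the following lines may help. This is only a sketch, not how Lux handles the problem internally. The helper names `find_special_columns` and `sanitize_columns` are hypothetical, and the character set `.[]` is an assumption based on the characters that Altair's documentation describes as needing escaping in field names.

```python
import pandas as pd

# Assumed set of characters that can confuse field-name lookup in the plotting layer.
SPECIAL_CHARS = ".[]"


def find_special_columns(df):
    """Return the column names that contain any of the special characters."""
    return [c for c in df.columns if any(ch in str(c) for ch in SPECIAL_CHARS)]


def sanitize_columns(df, replacement="_"):
    """Return a renamed copy of df plus the old-to-new name mapping."""
    mapping = {}
    for col in df.columns:
        clean = str(col)
        for ch in SPECIAL_CHARS:
            clean = clean.replace(ch, replacement)
        mapping[col] = clean
    return df.rename(columns=mapping), mapping


# Hypothetical example data.
raw = pd.DataFrame({"a.b": [1], "c[0]": [2], "d": [3]})
flagged = find_special_columns(raw)
clean_df, mapping = sanitize_columns(raw)
print(flagged)
print(list(clean_df.columns))
```

Keeping the mapping makes it possible to restore the original names after plotting. Note that two different names could collapse to the same sanitized name, which this sketch does not guard against.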

dorisjlee changed the title from "[BUG] Categorical Variables are Undefined" to "[BUG] "Undefined" chart caused by special character period . in column name" on Dec 8, 2020
@dorisjlee
Member

Hi @ehsong, We've resolved the issue with the special characters in column names. We've also abbreviated long column names so that they are displayed properly on the charts. We will be incorporating this into the next release. Thanks!
[Image: screenshot attached to the comment]
