Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Several possible improvements for ptable_heatmap plotter #138

Closed
DanielYang59 opened this issue May 3, 2024 · 4 comments · Fixed by #140
Closed

Several possible improvements for ptable_heatmap plotter #138

DanielYang59 opened this issue May 3, 2024 · 4 comments · Fixed by #140
Labels
ptable Periodic table ux User experience

Comments

@DanielYang59
Copy link
Collaborator

DanielYang59 commented May 3, 2024

I noticed several minor issues with ptable_heatmap plotter:

  • The font style of the text in each tile and colorbar title cannot be controlled individually (only a single text_style arg is allowed). (Maybe the new implementation would fix this automatically because they would be separate objects then)

  • The colormap is named as colorscale, maybe rename it to colormap for consistency with matplotlib and other parts of the code?

  • The default setting doesn't format numbers in (-1, 1) range well, see the below example (data for Ag and Zn is 0.93 and -0.43). Manually set fmt fixes the issue though.

    test_plot

  • Should we add an option to allow easy hiding of the Lanthanum and Actinium series (maybe there is already one that I'm not aware of)?

@janosh
Copy link
Owner

janosh commented May 4, 2024

thanks! all great suggestions. 👍

The colormap is named as colorscale, maybe rename it to colormap for consistency with matplotlib and other parts of the code?

it used to be colormap but at some point i renamed to colorscale with ptable_heatmap_plotly. plotly and matplotlib have differing terminology here. i'm leaning towards keeping colorscale but happy to hear other views.

The default setting doesn't format numbers in (0, 1) range well, see the below example (data for Ag and Zn is 0.93 and -0.43). Manually set fmt fixes the issue though.

thanks for reporting, that's terrible default behavior!

Should we add an option to allow easy hiding of the Lanthanum and Actinium series (maybe there is already one that I'm not aware of)?

let's do it!

@janosh janosh added ux User experience ptable Periodic table labels May 4, 2024
janosh added a commit that referenced this issue May 4, 2024
docs ReferenceError: elem is not defined
at file:///home/.../entries/pages/api/_page.svelte.js:444:565
@DanielYang59
Copy link
Collaborator Author

The colormap is named as colorscale, maybe rename it to colormap for consistency with matplotlib and other parts of the code?

it used to be colormap but at some point i renamed to colorscale with ptable_heatmap_plotly. plotly and matplotlib have differing terminology here. i'm leaning towards keeping colorscale but happy to hear other views.

Thanks for the input. I'm not quite sure either. Let's keep it as is for now.

Should we add an option to allow easy hiding of the Lanthanum and Actinium series (maybe there is already one that I'm not aware of)?

let's do it!

I would push a PR to add this soon :)

@janosh
Copy link
Owner

janosh commented May 5, 2024

i think "impute" is being misused here. it usually means inferring a likely value from context/experience. not sure what you mean by "zero: impute with zeros"

pymatviz/pymatviz/ptable.py

Lines 310 to 315 in 492cd53

Infinity would be replaced by vmax(∞) or vmin(-∞).
Missing values would be handled by selected strategy:
- zero: impute with zeros
- mean: impute with mean value

@DanielYang59
Copy link
Collaborator Author

DanielYang59 commented May 6, 2024

Thanks for bringing this up. I used the word "impute" just like the scikit-learn documents do (maybe I misunderstood):

A better strategy is to impute the missing values, i.e., to infer them from the known part of the data. See the glossary entry on imputation.

One type of imputation algorithm is univariate, which imputes values in the i-th feature dimension using only non-missing values in that feature dimension (e.g. SimpleImputer). By contrast, multivariate imputation algorithms use the entire set of available feature dimensions to estimate the missing values (e.g. IterativeImputer).

And answer from ChatGPT:

Q: I want advice on using the word "impute" in data science, can I say "missing values would be imputed with zeros" to describe a case where zeros would placed in place of missing values?

A: Yes, you can certainly use "impute" in that context. In data science, "imputation" refers to the process of replacing missing or null values with substituted values. So, saying "missing values would be imputed with zeros" is a clear and appropriate way to describe replacing missing values with zeros in your dataset. It's a common practice in data preprocessing to handle missing data before analysis or modeling.

Maybe I should clarify the docstring? In that part, I listed two available missing value imputation strategies (Literal["zero", "mean"]), where zero correspond to the missing value imputation strategy where missing values are imputed with zeros.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ptable Periodic table ux User experience
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants