add readme table with links to open notebooks in GH, Colab, Binder
janosh committed Sep 1, 2022
1 parent 21d1264 commit 13df583
Showing 3 changed files with 62 additions and 47 deletions.
91 changes: 48 additions & 43 deletions dataset_exploration/readme.md
@@ -2,7 +2,7 @@

The majority of datasets explored in this directory are from the [`matbench`](https://matbench.materialsproject.org) collection. Others include:

- [`ricci_carrier_transport`](https://hackingmaterials.lbl.gov/matminer/dataset_summary): Electronic Transport Properties by F. Ricci et al.](<https://contribs.materialsproject.org/projects/carrier_transport>) from MPContribs which contains 48,000 DFT Seebeck coefficients ([Paper](https://nature.com/articles/sdata201785)). [[Download link](https://contribs.materialsproject.org/projects/carrier_transport.json.gz) (from [here](https://git.io/JOMwY))].
- [`ricci_carrier_transport`](https://hackingmaterials.lbl.gov/matminer/dataset_summary): [Electronic Transport Properties by F. Ricci et al.][carrier_transport] from MPContribs which contains 48,000 DFT Seebeck coefficients ([Paper](https://nature.com/articles/sdata201785)). [[Download link][carrier_transport.json.gz] (from [here](https://git.io/JOMwY))].
- [`boltztrap_mp`](https://hackingmaterials.lbl.gov/matminer/dataset_summary) which contains ~9000 effective mass and thermoelectric properties calculated by the BoltzTraP software package.
- [`tri_camd_2022`](https://data.matr.io/7): Toyota Research Institute's 2nd active learning crystal discovery dataset from Computational Autonomy for
Materials Discovery (CAMD)
@@ -14,55 +14,60 @@ Materials Discovery (CAMD)

> MatBench is an [ImageNet](http://www.image-net.org) for materials science; a set of 13 supervised, pre-cleaned, ready-to-use ML tasks for benchmarking and fair comparison. The tasks span across the domain of inorganic materials science applications.
To browse these datasets online, go to <https://ml.materialsproject.org> and log in.
To browse these datasets online, go to [ml.materialsproject.org] and log in.
Datasets were originally published in <https://nature.com/articles/s41524-020-00406-3>.

Detailed information about how each dataset was created and prepared for use is available at <https://hackingmaterials.lbl.gov/matminer/dataset_summary.html>

### Full list of the 13 Matbench datasets in v0.1

| task name | target column (unit) | sample count | task type | input | links |
| ------------------------ | ---------------------------- | ------------ | -------------- | ----------- | --------------------------------- |
| `matbench_dielectric` | `n` (unitless) | 4764 | regression | structure | [download][1], [interactive][2] |
| `matbench_expt_gap` | `gap expt` (eV) | 4604 | regression | composition | [download][3], [interactive][4] |
| `matbench_expt_is_metal` | `is_metal` (unitless) | 4921 | classification | composition | [download][5], [interactive][6] |
| `matbench_glass` | `gfa` (unitless) | 5680 | classification | composition | [download][7], [interactive][8] |
| `matbench_jdft2d` | `exfoliation_en` (meV/atom) | 636 | regression | structure | [download][9], [interactive][10] |
| `matbench_log_gvrh` | `log10(G_VRH)` (log(GPa)) | 10987 | regression | structure | [download][11], [interactive][12] |
| `matbench_log_kvrh` | `log10(K_VRH)` (log(GPa)) | 10987 | regression | structure | [download][13], [interactive][14] |
| `matbench_mp_e_form` | `e_form` (eV/atom) | 132752 | regression | structure | [download][15], [interactive][16] |
| `matbench_mp_gap` | `gap pbe` (eV) | 106113 | regression | structure | [download][17], [interactive][18] |
| `matbench_mp_is_metal` | `is_metal` (unitless) | 106113 | classification | structure | [download][19], [interactive][20] |
| `matbench_perovskites` | `e_form` (eV, per unit cell) | 18928 | regression | structure | [download][21], [interactive][22] |
| `matbench_phonons` | `last phdos peak` (1/cm) | 1265 | regression | structure | [download][23], [interactive][24] |
| `matbench_steels` | `yield strength` (MPa) | 312 | regression | composition | [download][25], [interactive][26] |
| task name | target column (unit) | sample count | task type | input | download |
| -------------------------- | ---------------------------- | ------------ | -------------- | ----------- | ------------------------------------------ |
| [`matbench_dielectric`] | `n` (unitless) | 4764 | regression | structure | [download][matbench_dielectric.json.gz] |
| [`matbench_expt_gap`] | `gap expt` (eV) | 4604 | regression | composition | [download][matbench_expt_gap.json.gz] |
| [`matbench_expt_is_metal`] | `is_metal` (unitless) | 4921 | classification | composition | [download][matbench_expt_is_metal.json.gz] |
| [`matbench_glass`] | `gfa` (unitless) | 5680 | classification | composition | [download][matbench_glass.json.gz] |
| [`matbench_jdft2d`] | `exfoliation_en` (meV/atom) | 636 | regression | structure | [download][matbench_jdft2d.json.gz] |
| [`matbench_log_gvrh`] | `log10(G_VRH)` (log(GPa)) | 10987 | regression | structure | [download][matbench_log_gvrh.json.gz] |
| [`matbench_log_kvrh`] | `log10(K_VRH)` (log(GPa)) | 10987 | regression | structure | [download][matbench_log_kvrh.json.gz] |
| [`matbench_mp_e_form`] | `e_form` (eV/atom) | 132752 | regression | structure | [download][matbench_mp_e_form.json.gz] |
| [`matbench_mp_gap`] | `gap pbe` (eV) | 106113 | regression | structure | [download][matbench_mp_gap.json.gz] |
| [`matbench_mp_is_metal`] | `is_metal` (unitless) | 106113 | classification | structure | [download][matbench_mp_is_metal.json.gz] |
| [`matbench_perovskites`] | `e_form` (eV, per unit cell) | 18928 | regression | structure | [download][matbench_perovskites.json.gz] |
| [`matbench_phonons`] | `last phdos peak` (1/cm) | 1265 | regression | structure | [download][matbench_phonons.json.gz] |
| [`matbench_steels`] | `yield strength` (MPa) | 312 | regression | composition | [download][matbench_steels.json.gz] |

[1]: https://ml.materialsproject.org/projects/matbench_dielectric.json.gz
[2]: https://ml.materialsproject.org/projects/matbench_dielectric
[3]: https://ml.materialsproject.org/projects/matbench_expt_gap.json.gz
[4]: https://ml.materialsproject.org/projects/matbench_expt_gap
[5]: https://ml.materialsproject.org/projects/matbench_expt_is_metal.json.gz
[6]: https://ml.materialsproject.org/projects/matbench_expt_is_metal
[7]: https://ml.materialsproject.org/projects/matbench_glass.json.gz
[8]: https://ml.materialsproject.org/projects/matbench_glass
[9]: https://ml.materialsproject.org/projects/matbench_jdft2d.json.gz
[10]: https://ml.materialsproject.org/projects/matbench_jdft2d
[11]: https://ml.materialsproject.org/projects/matbench_log_gvrh.json.gz
[12]: https://ml.materialsproject.org/projects/matbench_log_gvrh
[13]: https://ml.materialsproject.org/projects/matbench_log_kvrh.json.gz
[14]: https://ml.materialsproject.org/projects/matbench_log_kvrh
[15]: https://ml.materialsproject.org/projects/matbench_mp_e_form.json.gz
[16]: https://ml.materialsproject.org/projects/matbench_mp_e_form
[17]: https://ml.materialsproject.org/projects/matbench_mp_gap.json.gz
[18]: https://ml.materialsproject.org/projects/matbench_mp_gap
[19]: https://ml.materialsproject.org/projects/matbench_mp_is_metal.json.gz
[20]: https://ml.materialsproject.org/projects/matbench_mp_is_metal
[21]: https://ml.materialsproject.org/projects/matbench_perovskites.json.gz
[22]: https://ml.materialsproject.org/projects/matbench_perovskites
[23]: https://ml.materialsproject.org/projects/matbench_phonons.json.gz
[24]: https://ml.materialsproject.org/projects/matbench_phonons
[25]: https://ml.materialsproject.org/projects/matbench_steels.json.gz
[26]: https://ml.materialsproject.org/projects/matbench_steels
<!-- markdown-link-check-disable -->
[ml.materialsproject.org]: https://ml.materialsproject.org
[matbench_dielectric.json.gz]: https://ml.materialsproject.org/projects/matbench_dielectric.json.gz
[`matbench_dielectric`]: https://ml.materialsproject.org/projects/matbench_dielectric
[matbench_expt_gap.json.gz]: https://ml.materialsproject.org/projects/matbench_expt_gap.json.gz
[`matbench_expt_gap`]: https://ml.materialsproject.org/projects/matbench_expt_gap
[matbench_expt_is_metal.json.gz]: https://ml.materialsproject.org/projects/matbench_expt_is_metal.json.gz
[`matbench_expt_is_metal`]: https://ml.materialsproject.org/projects/matbench_expt_is_metal
[matbench_glass.json.gz]: https://ml.materialsproject.org/projects/matbench_glass.json.gz
[`matbench_glass`]: https://ml.materialsproject.org/projects/matbench_glass
[matbench_jdft2d.json.gz]: https://ml.materialsproject.org/projects/matbench_jdft2d.json.gz
[`matbench_jdft2d`]: https://ml.materialsproject.org/projects/matbench_jdft2d
[matbench_log_gvrh.json.gz]: https://ml.materialsproject.org/projects/matbench_log_gvrh.json.gz
[`matbench_log_gvrh`]: https://ml.materialsproject.org/projects/matbench_log_gvrh
[matbench_log_kvrh.json.gz]: https://ml.materialsproject.org/projects/matbench_log_kvrh.json.gz
[`matbench_log_kvrh`]: https://ml.materialsproject.org/projects/matbench_log_kvrh
[matbench_mp_e_form.json.gz]: https://ml.materialsproject.org/projects/matbench_mp_e_form.json.gz
[`matbench_mp_e_form`]: https://ml.materialsproject.org/projects/matbench_mp_e_form
[matbench_mp_gap.json.gz]: https://ml.materialsproject.org/projects/matbench_mp_gap.json.gz
[`matbench_mp_gap`]: https://ml.materialsproject.org/projects/matbench_mp_gap
[matbench_mp_is_metal.json.gz]: https://ml.materialsproject.org/projects/matbench_mp_is_metal.json.gz
[`matbench_mp_is_metal`]: https://ml.materialsproject.org/projects/matbench_mp_is_metal
[matbench_perovskites.json.gz]: https://ml.materialsproject.org/projects/matbench_perovskites.json.gz
[`matbench_perovskites`]: https://ml.materialsproject.org/projects/matbench_perovskites
[matbench_phonons.json.gz]: https://ml.materialsproject.org/projects/matbench_phonons.json.gz
[`matbench_phonons`]: https://ml.materialsproject.org/projects/matbench_phonons
[matbench_steels.json.gz]: https://ml.materialsproject.org/projects/matbench_steels.json.gz
[`matbench_steels`]: https://ml.materialsproject.org/projects/matbench_steels
[carrier_transport]: https://contribs.materialsproject.org/projects/carrier_transport
[carrier_transport.json.gz]: https://contribs.materialsproject.org/projects/carrier_transport.json.gz
<!-- markdown-link-check-enable-->
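
The link definitions above follow one URL pattern per task: the project page lives at `https://ml.materialsproject.org/projects/<task name>` and the download at the same path plus `.json.gz`. The following is a minimal sketch that rebuilds both links from a task name. The helper `matbench_links` is hypothetical (it is not part of `matminer` or `pymatviz`), and the URLs it returns are only as current as the definitions listed above.

```python
from typing import Dict

# Base URL taken from the link definitions in this readme.
MATBENCH_BASE_URL = "https://ml.materialsproject.org/projects"

# Task names copied from the table of Matbench v0.1 datasets.
MATBENCH_TASKS = (
    "matbench_dielectric",
    "matbench_expt_gap",
    "matbench_expt_is_metal",
    "matbench_glass",
    "matbench_jdft2d",
    "matbench_log_gvrh",
    "matbench_log_kvrh",
    "matbench_mp_e_form",
    "matbench_mp_gap",
    "matbench_mp_is_metal",
    "matbench_perovskites",
    "matbench_phonons",
    "matbench_steels",
)


def matbench_links(task_name: str) -> Dict[str, str]:
    """Return the project page and download URL for a Matbench task.

    Hypothetical helper for illustration only.
    """
    if task_name not in MATBENCH_TASKS:
        raise ValueError(f"unknown Matbench task: {task_name!r}")
    page = f"{MATBENCH_BASE_URL}/{task_name}"
    return {"interactive": page, "download": f"{page}.json.gz"}
```

Calling `matbench_links("matbench_steels")["download"]` yields the same URL as the `[matbench_steels.json.gz]` definition, so the table's download column could be generated rather than maintained by hand.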

### Leaderboard

17 changes: 13 additions & 4 deletions readme.md
@@ -24,6 +24,19 @@ pip install pymatviz

Check out the Jupyter notebooks under [`examples/`](examples/) to learn how to use `pymatviz`.

| | | | |
| ------------------------------------ | ------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| **`matbench_dielectric_eda.ipynb`** | [![Binder]](https://mybinder.org/v2/gh/janosh/pymatviz/main?labpath=examples/matbench_dielectric_eda.ipynb) | [![View on GitHub]](https://github.com/janosh/pymatviz/blob/main/examples/matbench_dielectric_eda.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/janosh/pymatviz/blob/main/examples/matbench_dielectric_eda.ipynb) |
| **`mp_bimodal_e_form.ipynb`** | [![Binder]](https://mybinder.org/v2/gh/janosh/pymatviz/main?labpath=examples/mp_bimodal_e_form.ipynb) | [![View on GitHub]](https://github.com/janosh/pymatviz/blob/main/examples/mp_bimodal_e_form.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/janosh/pymatviz/blob/main/examples/mp_bimodal_e_form.ipynb) |
| **`matbench_perovskites_eda.ipynb`** | [![Binder]](https://mybinder.org/v2/gh/janosh/pymatviz/main?labpath=examples/matbench_perovskites_eda.ipynb) | [![View on GitHub]](https://github.com/janosh/pymatviz/blob/main/examples/matbench_perovskites_eda.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/janosh/pymatviz/blob/main/examples/matbench_perovskites_eda.ipynb) |
| **`mprester_ptable.ipynb`** | [![Binder]](https://mybinder.org/v2/gh/janosh/pymatviz/main?labpath=examples/mprester_ptable.ipynb) | [![View on GitHub]](https://github.com/janosh/pymatviz/blob/main/examples/mprester_ptable.ipynb) | [![Open in Google Colab]](https://colab.research.google.com/github/janosh/pymatviz/blob/main/examples/mprester_ptable.ipynb) |

[Binder]: https://mybinder.org/badge_logo.svg
[View on GitHub]: https://img.shields.io/badge/View%20on-GitHub-darkblue?logo=github
[Open in Google Colab]: https://colab.research.google.com/assets/colab-badge.svg
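
Each row of the table above derives its three links from the notebook file name alone. As a rough sketch, assuming the repository `janosh/pymatviz`, the `main` branch, and the `examples/` directory seen in those URLs, the links could be generated like this. The function `notebook_links` is a hypothetical helper, not part of `pymatviz`.

```python
from typing import Dict

# Values read off the URLs in the table above.
REPO = "janosh/pymatviz"
BRANCH = "main"
NOTEBOOK_DIR = "examples"


def notebook_links(notebook: str) -> Dict[str, str]:
    """Build Binder, GitHub and Colab URLs for one example notebook.

    Hypothetical helper for illustration only.
    """
    path = f"{NOTEBOOK_DIR}/{notebook}"
    return {
        # Binder takes the notebook path via the `labpath` query parameter.
        "binder": f"https://mybinder.org/v2/gh/{REPO}/{BRANCH}?labpath={path}",
        "github": f"https://github.com/{REPO}/blob/{BRANCH}/{path}",
        "colab": f"https://colab.research.google.com/github/{REPO}/blob/{BRANCH}/{path}",
    }
```

Generating the rows this way keeps the three columns consistent whenever a notebook is added or renamed.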

Opening these notebooks in Google Colab may raise errors: Colab currently only supports Python 3.7, whereas `pymatviz` uses Python 3.8 features like [self-documenting f-strings](https://docs.python.org/3/whatsnew/3.8.html#f-strings-support-for-self-documenting-expressions-and-debugging). You may still be able to use `pymatviz` on Colab by cloning the repo and patching the source code in place [as shown here](https://github.com/janosh/pymatviz/issues/17#issuecomment-1165141311).
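
To check whether a given runtime can parse the Python 3.8 syntax mentioned above, one option is to compile a tiny self-documenting f-string and catch the `SyntaxError` that older interpreters raise. This is only a sketch: the function name `has_self_documenting_fstrings` is made up for illustration, and it tests a single 3.8 feature rather than everything `pymatviz` may rely on.

```python
import sys


def has_self_documenting_fstrings() -> bool:
    """Feature-detect the f-string `=` specifier added in Python 3.8.

    The f-string is passed as source text so that this file itself
    still parses on Python 3.7.
    """
    try:
        compile('f"{sys=}"', "<fstring-check>", "eval")
    except SyntaxError:
        return False
    return True


if not has_self_documenting_fstrings():
    version = ".".join(map(str, sys.version_info[:3]))
    print(f"Python {version} cannot parse self-documenting f-strings")
```

Running this in the first cell of a notebook gives an early, readable hint instead of a `SyntaxError` raised during import.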

## Periodic Table

See [`pymatviz/ptable.py`](pymatviz/ptable.py). Heat maps of the periodic table can be plotted both with `matplotlib` and `plotly`. `plotly` supports displaying additional data on hover or full interactivity through [Dash](https://plotly.com/dash).
@@ -128,10 +141,6 @@ See [`pymatviz/correlation.py`](pymatviz/correlation.py).
2. **Error** `y_err = abs(y_true - y_pred)`: Absolute error between target and model prediction.
3. **Uncertainty** `y_std`: The model's estimate for its error, i.e. how much the model thinks its prediction can be trusted. (`std` for standard deviation.)
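
A small sketch of how the error and uncertainty defined above relate, using plain Python lists. The numbers and the helper name `absolute_errors` are invented for illustration and do not come from `pymatviz`.

```python
from typing import List, Sequence


def absolute_errors(y_true: Sequence[float], y_pred: Sequence[float]) -> List[float]:
    """Element-wise y_err = abs(y_true - y_pred). Hypothetical helper."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [abs(t - p) for t, p in zip(y_true, y_pred)]


# Made-up targets, predictions and model uncertainties.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 1.8, 3.0]
y_std = [0.4, 0.3, 0.1]

y_err = absolute_errors(y_true, y_pred)

# Flag samples whose actual error exceeds the model's own estimate y_std.
overconfident = [err > std for err, std in zip(y_err, y_std)]
```

Samples flagged in `overconfident` are those where the model trusted its prediction more than the observed error justifies.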

## Usage on Google Colab

For the time being, Google Colab only supports Python 3.7. `pymatviz` uses Python 3.8 features like [self-documenting f-strings](https://docs.python.org/3/whatsnew/3.8.html#f-strings-support-for-self-documenting-expressions-and-debugging). You may still be able to use `pymatviz` on Colab by cloning the repo and patching the source code in place [as shown here](https://github.com/janosh/pymatviz/issues/17#issuecomment-1165141311).

[cumulative-error]: https://raw.githubusercontent.com/janosh/pymatviz/main/assets/cumulative-error.svg
[cumulative-residual]: https://raw.githubusercontent.com/janosh/pymatviz/main/assets/cumulative-residual.svg
[density-hexbin-with-hist]: https://raw.githubusercontent.com/janosh/pymatviz/main/assets/density-hexbin-with-hist.svg
1 change: 1 addition & 0 deletions runtime.txt
@@ -0,0 +1 @@
python-3.8
