Provide functions for reference data #312

Merged
merged 23 commits into from
Jul 30, 2024
Changes from 10 commits
a0d5c55
Provide functions for reference data
jan-janssen Jul 29, 2024
dc43179
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2024
86a1645
Add a first set of unit tests
jan-janssen Jul 29, 2024
e141423
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2024
035d69d
update dependencies
jan-janssen Jul 29, 2024
fe58857
Merge remote-tracking branch 'refs/remotes/origin/reference_data' int…
jan-janssen Jul 29, 2024
9f01b57
fix import
jan-janssen Jul 29, 2024
6de6fed
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2024
be84a5d
Add lxml
jan-janssen Jul 29, 2024
a013161
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 29, 2024
8b339c6
Add docstrings and type hints with co-pilot
jan-janssen Jul 30, 2024
7cfd7c9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 30, 2024
d4d94a7
add coderabbit recommendations
jan-janssen Jul 30, 2024
9830680
Merge branch 'reference_data' of github.com:pyiron/atomistics into re…
jan-janssen Jul 30, 2024
d221132
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 30, 2024
e820d3f
Merge pull request #315 from pyiron/main
jan-janssen Jul 30, 2024
fa0929b
Following Sam's suggestion - renamed get_experimental_elastic_propert…
jan-janssen Jul 30, 2024
dcb287a
Support both:
jan-janssen Jul 30, 2024
fe3c0d7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 30, 2024
031632d
Fix wikipedia docstring
jan-janssen Jul 30, 2024
92cf77a
Merge remote-tracking branch 'refs/remotes/origin/reference_data' int…
jan-janssen Jul 30, 2024
d9d4a83
Update environment-old.yml
jan-janssen Jul 30, 2024
c744330
Update environment-old.yml
jan-janssen Jul 30, 2024
19 changes: 11 additions & 8 deletions .ci_support/environment-old.yml
@@ -2,17 +2,20 @@ channels:
 - conda-forge
 dependencies:
 - ase =3.22.1
+- dynaphopy =1.17.5
 - gpaw =20.1.0
+- iprpy-data =2023.07.25
+- jinja2 =2.11.3
+- lammps =2022.06.23
+- lxml =4.9.1
+- mendeleev =0.10.0
 - numpy =1.23.5
+- pandas =1.5.3
+- phonopy =2.20.0
+- pylammpsmpi =0.2.1
+- requests =2.24.0
 - scipy =1.11.1
+- seekpath =1.9.0
 - spglib =2.0.2
-- phonopy =2.20.0
 - structuretoolkit =0.0.10
-- seekpath =1.9.0
-- lammps =2022.06.23
-- pandas =1.5.3
-- pylammpsmpi =0.2.1
-- jinja2 =2.11.3
-- iprpy-data =2023.07.25
-- dynaphopy =1.17.5
 - tqdm =4.44.0
8 changes: 6 additions & 2 deletions .ci_support/environment.yml
@@ -3,10 +3,14 @@ channels:
 dependencies:
 - ase =3.23.0
 - coverage
+- lxml =5.2.2
+- mendeleev =0.17.0
 - numpy =1.26.4
+- pandas =2.2.2
+- phonopy =2.26.6
+- requests =2.32.3
 - scipy =1.14.0
+- seekpath =2.1.0
 - spglib =2.5.0
-- phonopy =2.26.6
 - structuretoolkit =0.0.27
-- seekpath =2.1.0
 - tqdm =4.66.4
17 changes: 17 additions & 0 deletions atomistics/referencedata/__init__.py
@@ -0,0 +1,17 @@
from atomistics.referencedata.wiki import get_experimental_elastic_property_wikipedia

try:
    from atomistics.referencedata.mendeleevdb import (
        get_chemical_information_from_mendeleev,
        get_chemical_information_from_wolframalpha,
    )
except ImportError:
    __all__ = []
else:
    __all__ = [
        get_chemical_information_from_mendeleev,
        get_chemical_information_from_wolframalpha,
    ]


__all__ += [get_experimental_elastic_property_wikipedia]
114 changes: 114 additions & 0 deletions atomistics/referencedata/mendeleevdb.py
@@ -0,0 +1,114 @@
from mendeleev.fetch import fetch_table


def get_chemical_information_from_mendeleev(chemical_symbol: str) -> dict:
    """
    Get information of a given chemical element

    Args:
        chemical_symbol: Chemical Element like Au for Gold

    Returns:
        dict: Dictionary with the following keys
Review comment:

Add error handling for missing chemical symbols.

The function assumes that the chemical symbol will always be found. Add error handling for cases where the symbol is not found.

-    return df[df.symbol == chemical_symbol].squeeze(axis=0).to_dict()
+    try:
+        return df[df.symbol == chemical_symbol].squeeze(axis=0).to_dict()
+    except KeyError:
+        return {}
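
Filtering with an unknown symbol appears to yield an empty DataFrame rather than raise, so the `except KeyError` branch above would probably never run. A minimal sketch of an explicit emptiness check instead, under the assumption that each symbol occurs at most once in the table. `lookup_element` and the two-row `elements` frame are hypothetical stand-ins for the function body and for `fetch_table("elements")`.

```python
import pandas


def lookup_element(df: pandas.DataFrame, chemical_symbol: str) -> dict:
    """Return the row for `chemical_symbol` as a dict, or {} if it is absent."""
    match = df[df["symbol"] == chemical_symbol]
    if match.empty:
        # No row matched: report "not found" explicitly.
        return {}
    # At most one row is expected per symbol; take the first as a Series.
    return match.iloc[0].to_dict()


# Hypothetical two-row stand-in for fetch_table("elements").
elements = pandas.DataFrame(
    {"symbol": ["Al", "Au"], "atomic_number": [13, 79]}
)
print(lookup_element(elements, "Au"))
print(lookup_element(elements, "Xx"))
```

Raising a `ValueError` for an unknown symbol would be a reasonable alternative to returning `{}`; which is preferable depends on how callers are expected to handle typos.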

abundance_crust: Abundance in the Earth’s crust in mg/kg
abundance_sea: Abundance in the seas in mg/L
atomic_number: Atomic number
atomic_radius_rahm: Atomic radius by Rahm et al. in pm
atomic_radius: Atomic radius in pm
atomic_volume: Atomic volume in cm^3/mol
atomic_weight_uncertainty: Atomic weight uncertainty in Da
atomic_weight: Relative atomic weight in Da
block: Block in periodic table
boiling_point: Boiling point in K
c6_gb: C_6 dispersion coefficient according to Gould & Bučko in hartree/bohr^6
c6: C_6 dispersion coefficient in hartree/bohr^6
cas: Chemical Abstracts Service identifier
covalent_radius_bragg: Covalent radius by Bragg in pm
covalent_radius_cordero: Covalent radius by Cordero et al. in pm
covalent_radius_pyykko_double: Double bond covalent radius by Pyykko et al. in pm
covalent_radius_pyykko_triple: Triple bond covalent radius by Pyykko et al. in pm
covalent_radius_pyykko: Single bond covalent radius by Pyykko et al. in pm
cpk_color: Element color in CPK convention
critical_pressure: Critical pressure in MPa
critical_temperature: Critical temperature in K
density: Density at 295K in g/cm^3
description: Short description of the element
dipole_polarizability_unc: Uncertainty of the dipole polarizability in bohr^3
dipole_polarizability: Dipole polarizability in bohr^3
discoverers: The discoverers of the element
discovery_location: The location where the element was discovered
discovery_year: The year the element was discovered
econf: Ground state electronic configuration
electron_affinity: Electron affinity in eV
electronegativity_allen: Allen’s scale of electronegativity in eV
electronegativity_allred_rochow: Allred and Rochow’s scale of electronegativity in e^2/pm^2
electronegativity_cottrell_sutton: Cottrell and Sutton’s scale of electronegativity in e^0.5/pm^0.5
electronegativity_ghosh: Ghosh’s scale of electronegativity in 1/pm
electronegativity_gordy: Gordy’s scale of electronegativity in e/pm
electronegativity_li_xue: Li and Xue’s scale of electronegativity in 1/pm
electronegativity_martynov_batsanov: Martynov and Batsanov’s scale of electronegativity in eV^0.5
electronegativity_mulliken: Mulliken’s scale of electronegativity in eV
electronegativity_nagle: Nagle’s scale of electronegativity in 1/bohr
electronegativity_pauling: Pauling’s scale of electronegativity
electronegativity_sanderson: Sanderson’s scale of electronegativity
electrons: Number of electrons
electrophilicity: Parr’s electrophilicity index
evaporation_heat: Evaporation heat in kJ/mol
fusion_heat: Fusion heat in kJ/mol
gas_basicity: Gas basicity in kJ/mol
geochemical_class: Geochemical classification
glawe_number: Glawe’s number (scale)
goldschmidt_class: Goldschmidt classification
group: Group in the periodic table
hardness: Absolute hardness in eV. Can also be calculated for ions.
heat_of_formation: Heat of formation in kJ/mol
inchi: International Chemical Identifier
ionenergy: See IonizationEnergy class documentation
ionic_radii: See IonicRadius class documentation
is_monoisotopic: Is the element monoisotopic
is_radioactive: Is the element radioactive
isotopes: See Isotope class documentation
jmol_color: Element color in Jmol convention
lattice_constant: Lattice constant
lattice_structure: Lattice structure code
mass_number: Mass number of the most abundant isotope
melting_point: Melting point in K
mendeleev_number: Mendeleev’s number
metallic_radius_c12: Metallic radius with 12 nearest neighbors in pm
metallic_radius: Single-bond metallic radius in pm
molar_heat_capacity: Molar heat capacity @ 25 C, 1 bar in J/mol/K
molcas_gv_color: Element color in MOLCAS GV convention
name_origin: Origin of the name
name: Name in English
neutrons: Number of neutrons
nist_webbook_url: URL for the NIST Chemistry WebBook
nvalence: Number of valence electrons
oxides: Possible oxides based on oxidation numbers
oxistates: See OxidationState class documentation
period: Period in periodic table
pettifor_number: Pettifor scale
proton_affinity: Proton affinity in kJ/mol
protons: Number of protons
sconst: See ScreeningConstant class documentation
series: Series in the periodic table
softness: Absolute softness in 1/eV. Can also be calculated for ions.
sources: Sources of the element
specific_heat_capacity: Specific heat capacity @ 25 C, 1 bar in J/g/K
symbol: Chemical symbol
thermal_conductivity: Thermal conductivity @25 C in W/m/K
triple_point_pressure: Pressure of the triple point in kPa
triple_point_temperature: Temperature of the triple point in K
uses: Main applications of the element
vdw_radius_alvarez: Van der Waals radius according to Alvarez in pm
vdw_radius_batsanov: Van der Waals radius according to Batsanov in pm
vdw_radius_bondi: Van der Waals radius according to Bondi in pm
vdw_radius_dreiding: Van der Waals radius from the DREIDING FF in pm
vdw_radius_mm3: Van der Waals radius from the MM3 FF in pm
vdw_radius_rt: Van der Waals radius according to Rowland and Taylor in pm
vdw_radius_truhlar: Van der Waals radius according to Truhlar in pm
vdw_radius_uff: Van der Waals radius from the UFF in pm
vdw_radius: Van der Waals radius in pm
zeff: Effective nuclear charge
"""
    df = fetch_table("elements")
    return df[df.symbol == chemical_symbol].squeeze(axis=0).to_dict()
Review comment:

Consider caching the fetched data.

Fetching the table every time the function is called might be inefficient. Consider caching the data to improve performance.

from functools import lru_cache

@lru_cache(maxsize=1)
def get_chemical_information_from_mendeleev(chemical_symbol: str) -> dict:
    ...
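
Note that `lru_cache(maxsize=1)` on the per-symbol function keeps only the most recently requested symbol, so alternating between two elements would still refetch the table. A sketch that caches the table load itself, assuming the table does not change during a session. `load_elements_table` and `get_chemical_information` are hypothetical names, and `fetch_elements_stub` stands in for `fetch_table("elements")` so the example runs offline.

```python
from functools import lru_cache

FETCH_CALLS = {"count": 0}


def fetch_elements_stub() -> list:
    """Hypothetical stand-in for fetch_table("elements"); counts its calls."""
    FETCH_CALLS["count"] += 1
    return [
        {"symbol": "Al", "atomic_number": 13},
        {"symbol": "Au", "atomic_number": 79},
    ]


@lru_cache(maxsize=1)
def load_elements_table() -> tuple:
    """Load the table once; later calls reuse the cached tuple."""
    return tuple(fetch_elements_stub())


def get_chemical_information(chemical_symbol: str) -> dict:
    """Look up one element in the cached table; {} if the symbol is unknown."""
    for row in load_elements_table():
        if row["symbol"] == chemical_symbol:
            return dict(row)  # copy so callers cannot mutate the cache
    return {}


get_chemical_information("Au")
get_chemical_information("Al")
print(FETCH_CALLS["count"])
```

Because the cached function takes no arguments, a single cache slot is enough regardless of how many different symbols are requested.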

30 changes: 30 additions & 0 deletions atomistics/referencedata/wiki.py
@@ -0,0 +1,30 @@
import pandas


def get_experimental_elastic_property_wikipedia(chemical_symbol: str) -> dict:
    """
    Looks up elastic properties for a given chemical symbol from the Wikipedia: https://en.wikipedia.org/wiki/Elastic_properties_of_the_elements_(data_page) sourced from webelements.com.

    Args:
        chemical_symbol (str): Chemical symbol of the element.
        property (str): Name of the property to retrieve. Options: youngs_modulus, poissons_ratio, bulk_modulus, shear_modulus

    Returns:
        str: Property value (various types): Value of the property for the given element, if available.
    """
Review comment:

Improve the function documentation.

The return type in the docstring should be dict instead of str to match the actual return type.

-        str: Property value (various types): Value of the property for the given element, if available.
+        dict: Dictionary containing the property values for the given element, if available.

    property_lst = [
        "youngs_modulus",
        "poissons_ratio",
        "bulk_modulus",
        "shear_modulus",
    ]
    df_lst = pandas.read_html(
        "https://en.wikipedia.org/wiki/Elastic_properties_of_the_elements_(data_page)"
    )
    property_dict = {}
    for i, p in enumerate(property_lst):
        df_tmp = df_lst[i]
        property_dict[p] = float(
            df_tmp[df_tmp.symbol == chemical_symbol].squeeze(axis=0).to_dict()["WEL[1]"]
        )
Review comment:

Add error handling for missing chemical symbols.

The function assumes that the chemical symbol will always be found. Add error handling for cases where the symbol is not found.

-        property_dict[p] = float(
-            df_tmp[df_tmp.symbol == chemical_symbol].squeeze(axis=0).to_dict()["WEL[1]"]
-        )
+        try:
+            property_dict[p] = float(
+                df_tmp[df_tmp.symbol == chemical_symbol].squeeze(axis=0).to_dict()["WEL[1]"]
+            )
+        except KeyError:
+            property_dict[p] = None

    return property_dict
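
On the missing-symbol concern raised in the review comment above: an unmatched symbol seems more likely to surface as a `TypeError` from `float()` on an empty result than as a `KeyError`, and a non-numeric table entry would raise a `ValueError`. A minimal sketch that handles both cases per table. `extract_property` is a hypothetical helper, and the `table` frame with its values is an invented stand-in for one entry of the `pandas.read_html(...)` result, not data taken from Wikipedia.

```python
import pandas


def extract_property(df_tmp: pandas.DataFrame, chemical_symbol: str,
                     column: str = "WEL[1]"):
    """Return the value in `column` as a float, or None if unavailable."""
    match = df_tmp[df_tmp["symbol"] == chemical_symbol]
    if match.empty:
        return None  # symbol not listed in this table
    try:
        return float(match.iloc[0][column])
    except (TypeError, ValueError):
        return None  # entry present but not numeric


# Hypothetical stand-in for one table from pandas.read_html(...).
table = pandas.DataFrame(
    {"symbol": ["Au", "Xe"], "WEL[1]": ["78", "no data"]}
)
print(extract_property(table, "Au"))
print(extract_property(table, "Xe"))
print(extract_property(table, "Zz"))
```

Inside the loop, `property_dict[p] = extract_property(df_lst[i], chemical_symbol)` would then leave `None` for any property that cannot be read.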