CBFV Package

A tool to quickly create composition-based feature vectors (CBFVs) from materials data files.

Installation

The source code is currently hosted on GitHub at: https://github.com/kaaiian/CBFV

Binary installers for the latest released version are available from the Python Package Index (PyPI):

# PyPI
pip install CBFV

Making the composition-based feature vector

The CBFV package assumes your data is stored in a pandas dataframe of the following structure:

formula	target
Tc1V1	248.539
Cu1Dy1	66.8444
Cd3N2	91.5034
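
One way to build such a dataframe is sketched below, using the three rows from the table above. Loading the same data with pandas.read_csv should work equally well, as long as the columns are named formula and target.

```python
import pandas as pd

# Rows copied from the example table above
df = pd.DataFrame(
    {
        "formula": ["Tc1V1", "Cu1Dy1", "Cd3N2"],
        "target": [248.539, 66.8444, 91.5034],
    }
)

print(df)
```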

To featurize this data, the generate_features function can be called as follows:

from CBFV import composition

# df is a pandas DataFrame with 'formula' and 'target' columns
X, y, formulae, skipped = composition.generate_features(df)
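
To give a feel for what a composition-based feature vector is, the sketch below computes two simple features by hand: the fraction-weighted average and the range of one element property over a formula. It illustrates the general idea only and is not CBFV's implementation. The helper names parse_formula and toy_features are hypothetical, and atomic number stands in for the much larger element property tables that a real featurization scheme would use.

```python
import re

# Atomic numbers for the elements that appear in the example table
ATOMIC_NUMBER = {"Tc": 43, "V": 23, "Cu": 29, "Dy": 66, "Cd": 48, "N": 7}


def parse_formula(formula):
    """Split a flat formula such as 'Cd3N2' into {element: amount}.

    Parentheses are not handled; a missing amount counts as 1.
    """
    amounts = {}
    pattern = r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?"
    for element, amount in re.findall(pattern, formula):
        amounts[element] = amounts.get(element, 0.0) + (float(amount) if amount else 1.0)
    return amounts


def toy_features(formula, prop=ATOMIC_NUMBER):
    """Return the fraction-weighted average and the range of one property."""
    amounts = parse_formula(formula)
    total = sum(amounts.values())
    values = [prop[element] for element in amounts]
    average = sum(prop[element] * n / total for element, n in amounts.items())
    return {"avg": average, "range": max(values) - min(values)}


print(toy_features("Cd3N2"))
```

For Cd3N2 the weighted average is (3 × 48 + 2 × 7) / 5 = 31.6 and the range is 48 − 7 = 41. A full feature vector repeats this kind of calculation over many element properties.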

Extended Functionality

The featurization scheme can be adjusted by setting the elem_prop parameter. The following featurization schemes are included with CBFV:

jarvis
magpie
mat2vec
oliynyk (default)
onehot
random_200

Duplicate formula handling is controlled by the drop_duplicates parameter. It is set to False by default to preserve data points that vary in something other than their formula, for example heat capacity measurements performed for the same material at different temperatures.
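
Before choosing a value for drop_duplicates, it can help to check whether the data contains repeated formulae at all. The sketch below uses plain pandas for this. It only inspects the dataframe and does not reproduce how CBFV handles duplicates internally; the temp column is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "formula": ["Tc1V1", "Tc1V1", "Cd3N2"],
        "target": [248.539, 66.8444, 91.5034],
        "temp": [373, 473, 273],
    }
)

# Mark every row whose formula occurs more than once
is_repeated = df["formula"].duplicated(keep=False)
repeated_rows = df[is_repeated]

print(repeated_rows)
```

Here the two Tc1V1 rows differ in temp, so dropping duplicates would discard a distinct measurement.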

The extend_features parameter specifies whether columns other than ['formula', 'target'] should be considered during featurization. It is set to False by default to exclude nuisance information. Setting extend_features=True allows additional columns (e.g. ['temperature', 'pressure']) to be preserved.

The sum_feat parameter specifies whether to calculate the sum features when generating the CBFVs for the chemical formulae. It is set to False by default.

For a dataframe with an additional column, such as the one below, generate_features can be called with these parameters as follows:

formula	target	temp
Tc1V1	248.539	373
Tc1V1	66.8444	473
Cd3N2	91.5034	273

from CBFV import composition
X, y, formulae, skipped = composition.generate_features(df,
                                                        elem_prop='magpie',
                                                        drop_duplicates=False,
                                                        extend_features=True,
                                                        sum_feat=True)

