diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index 82d41ff4..15b9af7e 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.9.4","generation_timestamp":"2024-02-01T17:29:36","documenter_version":"1.2.1"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.9.4","generation_timestamp":"2024-03-27T21:56:45","documenter_version":"1.3.0"}}
\ No newline at end of file
diff --git a/dev/api/index.html b/dev/api/index.html
index 32413b50..f2e6ae84 100644
--- a/dev/api/index.html
+++ b/dev/api/index.html
@@ -2,111 +2,124 @@ API · PotentialLearning.jl

API Reference

This page provides a list of all documented types and functions in PotentialLearning.jl.

PotentialLearning.ActiveSubspaceType
ActiveSubspace{T<:Real} <: DimensionReducer
     Q :: Function 
     ∇Q :: Function (gradient of Q)
-    tol :: T

Use the theory of active subspaces, with a given quantity of interest (expressed as the function Q) which takes a Configuration as an input and outputs a real scalar. ∇Q should input a Configuration and output an appropriate gradient. If tol is a float, then the number of components to keep is the smallest n such that the relative fraction of variance explained by the leading n principal components is greater than 1 - tol. If tol is an int, then we return the components corresponding to the tol largest eigenvalues.

source
PotentialLearning.AtomicDataType
AtomicData <: Data

Abstract type declaring the type of information that is unique to a particular atom (instead of a whole configuration).

source
PotentialLearning.ConfigurationMethod
Configuration(data::Union{AtomsBase.FlexibleSystem, ConfigurationData} )

A Configuration is a data struct that contains information unique to a particular configuration of atoms (Energy, LocalDescriptors, ForceDescriptors, and a FlexibleSystem) in a dictionary. Example:

    e = Energy(-0.57, u"eV")
    ld = LocalDescriptors(...)
    c = Configuration(e, ld)

Configurations can be added together, which merges the data dictionaries:

    c1 = Configuration(e)   # contains energy
    c2 = Configuration(f)   # contains forces
    c = c1 + c2             # c <: Configuration, contains energy and forces

source
PotentialLearning.CorrelationMatrixType
CorrelationMatrix 
-    α :: Vector{Float64} # weights

CorrelationMatrix produces a global descriptor that is the correlation matrix of the local descriptors. In other words, it is mean(bi'*bi for bi in B).

source
PotentialLearning.CovariateLinearProblemType

struct CovariateLinearProblem{T<:Real} <: LinearProblem{T}
    e::Vector
    f::Vector{Vector{T}}
    B::Vector{Vector{T}}
    dB::Vector{Matrix{T}}
    β::Vector{T}
    β0::Vector{T}
    σe::Vector{T}
    σf::Vector{T}
    Σ::Symmetric{T,Matrix{T}}
end

A CovariateLinearProblem is a linear problem in which we are fitting energies and forces using both descriptors and their gradients (B and dB, respectively). When this is the case, the solution is not available analytically and must be found using an iterative optimization procedure. In the end, we fit the model coefficients, β, the standard deviations corresponding to energies and forces, σe and σf, and the covariance Σ.

source
PotentialLearning.DBSCANSelectorType
struct DBSCANSelector <: SubsetSelector
+    tol :: T

Use the theory of active subspaces, with a given quantity of interest (expressed as the function Q) which takes a Configuration as an input and outputs a real scalar. ∇Q should input a Configuration and output an appropriate gradient. If tol is a float, then the number of components to keep is the smallest n such that the relative fraction of variance explained by the leading n principal components is greater than 1 - tol. If tol is an int, then we return the components corresponding to the tol largest eigenvalues.

source
PotentialLearning.AtomicDataType
AtomicData <: Data

Abstract type declaring the type of information that is unique to a particular atom (instead of a whole configuration).

source
PotentialLearning.ConfigurationMethod
Configuration(data::Union{AtomsBase.FlexibleSystem, ConfigurationData} )

A Configuration is a data struct that contains information unique to a particular configuration of atoms (Energy, LocalDescriptors, ForceDescriptors, and a FlexibleSystem) in a dictionary. Example:

    e = Energy(-0.57, u"eV")
    ld = LocalDescriptors(...)
    c = Configuration(e, ld)

Configurations can be added together, which merges the data dictionaries:

    c1 = Configuration(e)   # contains energy
    c2 = Configuration(f)   # contains forces
    c = c1 + c2             # c <: Configuration, contains energy and forces

source
PotentialLearning.CorrelationMatrixType
CorrelationMatrix 
+    α :: Vector{Float64} # weights

CorrelationMatrix produces a global descriptor that is the correlation matrix of the local descriptors. In other words, it is mean(bi'*bi for bi in B).

source
PotentialLearning.CovariateLinearProblemType

struct CovariateLinearProblem{T<:Real} <: LinearProblem{T}
    e::Vector
    f::Vector{Vector{T}}
    B::Vector{Vector{T}}
    dB::Vector{Matrix{T}}
    β::Vector{T}
    β0::Vector{T}
    σe::Vector{T}
    σf::Vector{T}
    Σ::Symmetric{T,Matrix{T}}
end

A CovariateLinearProblem is a linear problem in which we are fitting energies and forces using both descriptors and their gradients (B and dB, respectively). When this is the case, the solution is not available analytically and must be found using an iterative optimization procedure. In the end, we fit the model coefficients, β, the standard deviations corresponding to energies and forces, σe and σf, and the covariance Σ.

source
PotentialLearning.DBSCANSelectorType
struct DBSCANSelector <: SubsetSelector
     clusters
     eps
     minpts
     sample_size
-end

Definition of the type DBSCANSelector, a subselector based on the clustering method DBSCAN.

source
PotentialLearning.DBSCANSelectorMethod
function DBSCANSelector(
     ds::DataSet,
     eps,
     minpts,
     sample_size
-)

Constructor of DBSCANSelector based on the atomic configurations in ds, the DBSCAN params eps and minpts, and the sample size sample_size.

source
PotentialLearning.DataSetType
DataSet

Struct that holds a vector of configurations. Most operations in PotentialLearning are built around the DataSet structure.

source
PotentialLearning.DistanceType
Distance
+)

Constructor of DBSCANSelector based on the atomic configurations in ds, the DBSCAN params eps and minpts, and the sample size sample_size.

source
PotentialLearning.DataSetType
DataSet

Struct that holds a vector of configurations. Most operations in PotentialLearning are built around the DataSet structure.

source
PotentialLearning.DistanceType
Distance
 
-A struct of abstract type Distance produces the distance between two `global` descriptors, or features. Not all distances might be compatible with all types of features.
source
PotentialLearning.DotProductType
DotProduct <: Kernel 
+A struct of abstract type Distance produces the distance between two `global` descriptors, or features. Not all distances might be compatible with all types of features.
source
PotentialLearning.DivergenceType
Divergence
+
+A struct of abstract type Divergence produces a measure of discrepancy between two probability distributions. Discrepancies may take as arguments analytical distributions or sets of samples representing empirical distributions.
source
PotentialLearning.DotProductType
DotProduct <: Kernel 
     α :: Power of DotProduct kernel 
 
 
 Computes the dot product kernel between two features, i.e.,
 
-cos(θ) = ( A ⋅ B / (||A||^2||B||^2) )^α
source
PotentialLearning.EnergyType
Energy <: ConfigurationData
     d :: Real
-    u :: Unitful.FreeUnits

Convenience struct that holds energy information (and corresponding units). Default unit is eV

source
PotentialLearning.EuclideanType
Euclidean <: Distance 
+    u :: Unitful.FreeUnits

Convenience struct that holds energy information (and corresponding units). Default unit is eV

source
PotentialLearning.EuclideanType
Euclidean <: Distance 
     Cinv :: Covariance Matrix 
 
-Computes the squared euclidean distance with weight matrix Cinv, the inverse of some covariance matrix.
source
PotentialLearning.FeatureType
Feature

A struct of abstract type Feature represents a function that takes in a set of local descriptors corresponding to some atomic environment and produces a global descriptor.

source
PotentialLearning.ForceType
Force <: AtomicData 
+Computes the squared euclidean distance with weight matrix Cinv, the inverse of some covariance matrix.
source
PotentialLearning.FeatureType
Feature

A struct of abstract type Feature represents a function that takes in a set of local descriptors corresponding to some atomic environment and produces a global descriptor.

source
PotentialLearning.ForceType
Force <: AtomicData 
     f :: Vector{<:Real}
-    u :: Unitful.FreeUnits

Contains the force with (x,y,z)-components in f with units u. Default unit is "eV/Å".

source
PotentialLearning.ForcesType
Forces <: ConfigurationData
-    f :: Vector{force}

Forces is a struct that contains all force information in a configuration.

source
PotentialLearning.ForstnerType
Forstner <: Distance 
+    u :: Unitful.FreeUnits

Contains the force with (x,y,z)-components in f with units u. Default unit is "eV/Å".

source
PotentialLearning.ForcesType
Forces <: ConfigurationData
+    f :: Vector{force}

Forces is a struct that contains all force information in a configuration.

source
PotentialLearning.ForstnerType
Forstner <: Distance 
     α :: Regularization parameter
 
-Computes the squared Forstner distance between two positive semi-definite matrices.
source
PotentialLearning.InverseMultiquadricType
InverseMultiquadric <: Kernel 
+    d :: Distance function 
+    c2 :: Squared constant parameter
+    ℓ :: Length-scale parameter
+
+Computes the inverse multiquadric (IMQ) kernel, i.e.,
+
+ k(A, B) = (c^2 + d(A,B)/β^2)^{-1/2}
source
PotentialLearning.KernelType
Kernel
+
+A struct of abstract type Kernel is a function that takes in two features and produces a semi-definite scalar representing the similarity between the two features.
source
PotentialLearning.KernelSteinDiscrepancyType
KernelSteinDiscrepancy <: Divergence
+    score :: Function
+    knl :: Kernel
 
-A struct of abstract type Kernel is a function that takes in two features and produces a semi-definite scalar representing the similarity between the two features.
source
PotentialLearning.LAMMPSType
struct LAMMPS <: IO
+Computes the kernel Stein discrepancy between distributions p (from which samples are provided) and q (for which the score is provided) based on the RKHS defined by kernel k.
source
PotentialLearning.LearningProblemType

struct LearningProblem{T<:Real} <: AbstractLearningProblem
    ds::DataSet
    logprob::Function
    ∇logprob::Function
    params::Vector{T}
end

Generic LearningProblem that allows the user to pass a logprob(y::params, ds::DataSet) function and its gradient. The gradient should return a vector of derivatives of logprob with respect to its params. If the user does not have a gradient function available, then Flux can provide one (provided that logprob is of the form above).

source
PotentialLearning.LinearProblemMethod

function LinearProblem( ds::DataSet; T = Float64 )

Construct a LinearProblem by detecting if there are energy descriptors and/or force descriptors and construct the appropriate LinearProblem (either Univariate, if only a single type of descriptor, or Covariate, if there are both types).

source
PotentialLearning.PCAType
PCA <: DimensionReducer
-    tol :: Float64

Use SVD to compute the PCA of the design matrix of descriptors. (using Force descriptors TBA)

If tol is a float, then the number of components to keep is the smallest n such that the relative fraction of variance explained by the leading n principal components is greater than 1 - tol. If tol is an int, then we return the components corresponding to the tol largest eigenvalues.

source
PotentialLearning.LearningProblemType

struct LearningProblem{T<:Real} <: AbstractLearningProblem
    ds::DataSet
    logprob::Function
    ∇logprob::Function
    params::Vector{T}
end

Generic LearningProblem that allows the user to pass a logprob(y::params, ds::DataSet) function and its gradient. The gradient should return a vector of derivatives of logprob with respect to its params. If the user does not have a gradient function available, then Flux can provide one (provided that logprob is of the form above).

source
PotentialLearning.LinearProblemMethod

function LinearProblem( ds::DataSet; T = Float64 )

Construct a LinearProblem by detecting if there are energy descriptors and/or force descriptors and construct the appropriate LinearProblem (either Univariate, if only a single type of descriptor, or Covariate, if there are both types).

source
PotentialLearning.PCAType
PCA <: DimensionReducer
+    tol :: Float64

Use SVD to compute the PCA of the design matrix of descriptors. (using Force descriptors TBA)

If tol is a float, then the number of components to keep is the smallest n such that the relative fraction of variance explained by the leading n principal components is greater than 1 - tol. If tol is an int, then we return the components corresponding to the tol largest eigenvalues.

source
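The tol rule described for PCA and ActiveSubspace can be illustrated with a small standalone sketch. The helper name `n_components_sketch` and its input (a plain vector of eigenvalues) are assumptions made for illustration; this is not the package's implementation.

```julia
# Hypothetical helper (not part of PotentialLearning.jl): choose how many
# components to keep from a vector of eigenvalues λ, following the tol rule.
function n_components_sketch(λ::AbstractVector{<:Real}, tol::Real)
    λs = sort(λ; rev = true)                  # largest eigenvalues first
    if tol isa Integer
        # Integer tol: keep the tol largest eigenvalues.
        return min(Int(tol), length(λs))
    end
    # Float tol: smallest n whose explained-variance fraction exceeds 1 - tol.
    explained = cumsum(λs) ./ sum(λs)
    n = findfirst(>(1 - tol), explained)
    return something(n, length(λs))           # fall back to keeping everything
end

n_float = n_components_sketch([4.0, 3.0, 2.0, 1.0], 0.25)
n_int = n_components_sketch([4.0, 3.0, 2.0, 1.0], 2)
```

With eigenvalues 4, 3, 2, 1 the cumulative explained fractions are 0.4, 0.7, 0.9, 1.0, so a float tol of 0.25 keeps three components, while an integer tol of 2 keeps two.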
PotentialLearning.RBFType
RBF <: Kernel 
     d :: Distance function 
-    α :: Reguarlization parameter 
+    α :: Regularization parameter 
     ℓ :: Length-scale parameter
     β :: Scale parameter
 
 
 Computes the squared exponential kernel, i.e.,
 
- k(A, B) = β exp( -\frac{1}{2} d(A,B)/ℓ^2 ) + α δ(A, B)
source
PotentialLearning.RandomSelectorType
struct Random
     num_configs :: Int 
     batch_size  :: Int 
-end

A convenience function that allows the user to randomly select indices uniformly over [1, num_configs].

source
PotentialLearning.UnivariateLinearProblemType

struct UnivariateLinearProblem{T<:Real} <: LinearProblem{T}
    ivdata::Vector
    dvdata::Vector
    β::Vector{T}
    β0::Vector{T}
    σ::Vector{T}
    Σ::Symmetric{T,Matrix{T}}
end

A UnivariateLinearProblem is a linear problem in which there is only 1 type of independent variable / dependent variable. Typically, that means we are either only fitting energies or only fitting forces. When this is the case, the solution is available analytically and the standard deviation, σ, and covariance, Σ, of the coefficients, β, are computable.

source
PotentialLearning.YAMLType
YAML <: IO
+end

A convenience function that allows the user to randomly select indices uniformly over [1, num_configs].

source
PotentialLearning.UnivariateLinearProblemType

struct UnivariateLinearProblem{T<:Real} <: LinearProblem{T}
    ivdata::Vector
    dvdata::Vector
    β::Vector{T}
    β0::Vector{T}
    σ::Vector{T}
    Σ::Symmetric{T,Matrix{T}}
end

A UnivariateLinearProblem is a linear problem in which there is only 1 type of independent variable / dependent variable. Typically, that means we are either only fitting energies or only fitting forces. When this is the case, the solution is available analytically and the standard deviation, σ, and covariance, Σ, of the coefficients, β, are computable.

source
PotentialLearning.kDPPType
struct kDPP
     K :: EllEnsemble
-end

A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a similarity kernel, for which the user must provide a LinearProblem and two functions to compute descriptor (1) diversity and (2) quality.

source
PotentialLearning.kDPPMethod
kDPP(ds::Dataset, f::Feature, k::Kernel)

A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a dataset, a method to compute features, and a kernel. Optional arguments include batch size and type of descriptor (default LocalDescriptors).

source
PotentialLearning.kDPPMethod
kDPP(features::Union{Vector{Vector{T}}, Vector{Symmetric{T, Matrix{T}}}}, k::Kernel)

A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP are features (either a vector of vector features or a vector of symmetric matrix features) and a kernel. The optional argument is batch_size (default length(features)).

source
InteratomicPotentials.compute_local_descriptorsMethod

function compute_local_descriptors( ds::DataSet, basis::BasisSystem; pbar = true )

ds: dataset. basis: basis system (e.g. ACE) pbar: progress bar

Compute local descriptors of a basis system and dataset using threads.

source
PotentialLearning.KernelMatrixMethod
KernelMatrix(ds1::DataSet, ds2::DataSet, F::Feature, k::Kernel)
+end

A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a similarity kernel, for which the user must provide a LinearProblem and two functions to compute descriptor (1) diversity and (2) quality.

source
PotentialLearning.kDPPMethod
kDPP(ds::Dataset, f::Feature, k::Kernel)

A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a dataset, a method to compute features, and a kernel. Optional arguments include batch size and type of descriptor (default LocalDescriptors).

source
PotentialLearning.kDPPMethod
kDPP(features::Union{Vector{Vector{T}}, Vector{Symmetric{T, Matrix{T}}}}, k::Kernel)

A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP are features (either a vector of vector features or a vector of symmetric matrix features) and a kernel. The optional argument is batch_size (default length(features)).

source
InteratomicPotentials.compute_local_descriptorsMethod

function compute_local_descriptors( ds::DataSet, basis::BasisSystem; pbar = true )

ds: dataset. basis: basis system (e.g. ACE) pbar: progress bar

Compute local descriptors of a basis system and dataset using threads.

source
PotentialLearning.KernelMatrixMethod
KernelMatrix(ds1::DataSet, ds2::DataSet, F::Feature, k::Kernel)
 
-Compute nonsymmetric kernel matrix K using features of the datasets ds1 and ds2 calculated using the Feature method F.
source
PotentialLearning.KernelMatrixMethod
KernelMatrix(ds::DataSet, F::Feature, k::Kernel)

Compute symmetric kernel matrix K using features of the dataset ds calculated using the Feature method F.

source
PotentialLearning.calc_centroidMethod
function calc_centroid(
+Compute nonsymmetric kernel matrix K using features of the datasets ds1 and ds2 calculated using the Feature method F.
source
PotentialLearning.KernelMatrixMethod
KernelMatrix(ds::DataSet, F::Feature, k::Kernel)

Compute symmetric kernel matrix K using features of the dataset ds calculated using the Feature method F.

source
PotentialLearning.calc_metricsMethod
calc_metrics(x_pred, x)

x_pred: vector of predicted values of a variable. E.g. energy. x: vector of true values of a variable. E.g. energy.

Returns MAE, RMSE, and RSQ.

source
PotentialLearning.compute_featuresMethod
compute_feature(ds::DataSet, f::Feature; dt = LocalDescriptors)

Computes features of the dataset ds using the feature method F on descriptors dt (default option are the LocalDescriptors, if available).

source
PotentialLearning.calc_metricsMethod
calc_metrics(x_pred, x)

x_pred: vector of predicted values of a variable. E.g. energy. x: vector of true values of a variable. E.g. energy.

Returns MAE, RMSE, and RSQ.

source
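For reference, the three metrics named above can be sketched as follows. This assumes the usual definitions (mean absolute error, root mean squared error, and the coefficient of determination); the `_sketch` function names are hypothetical and the package's own definitions may differ in detail.

```julia
using Statistics

# Hypothetical standalone versions of the metrics, under the usual definitions.
mae_sketch(x_pred, x) = mean(abs.(x_pred .- x))
rmse_sketch(x_pred, x) = sqrt(mean((x_pred .- x) .^ 2))
# Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
rsq_sketch(x_pred, x) = 1 - sum((x .- x_pred) .^ 2) / sum((x .- mean(x)) .^ 2)

x = [1.0, 2.0, 3.0, 4.0]        # true values, e.g. energies
x_pred = [1.0, 2.0, 3.0, 5.0]   # predicted values

metrics = (mae_sketch(x_pred, x), rmse_sketch(x_pred, x), rsq_sketch(x_pred, x))
```

Here only the last prediction is off by one, so the MAE is 0.25, the RMSE is 0.5, and RSQ is 0.8.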
PotentialLearning.compute_featuresMethod
compute_feature(ds::DataSet, f::Feature; dt = LocalDescriptors)

Computes features of the dataset ds using the feature method F on descriptors dt (default option are the LocalDescriptors, if available).

source
PotentialLearning.fitFunction
fit(ds::DataSet, dr::DimensionReducer)

Fits a linear dimension reduction routine using information from DataSet. See individual types of DimensionReducers for specific details.

source
PotentialLearning.fitMethod
fit(ds::DataSet, as::ActiveSubspace)

Fits a linear dimension reduction routine using the eigendirections of the uncentered covariance of the function ∇Q(c::Configuration) over the configurations in ds. Primarily used to reduce the dimension of the descriptors.

source
PotentialLearning.fitMethod
fit(ds::DataSet, pca::PCA)

Fits a linear dimension reduction routine using PCA on the global descriptors in the dataset ds.

source
PotentialLearning.fit_transformMethod
fit_transform(ds::DataSet, dr::DimensionReducer)

Fits a linear dimension reduction routine using information from DataSet and performs dimension reduction on descriptors and force_descriptors (whichever are available). See individual types of DimensionReducers for specific details.

source
PotentialLearning.forceMethod

function force( c::Configuration, nnbp::NNBasisPotential )

c: atomic configuration. nnbp: neural network basis potential.

source
PotentialLearning.fitFunction
fit(ds::DataSet, dr::DimensionReducer)

Fits a linear dimension reduction routine using information from DataSet. See individual types of DimensionReducers for specific details.

source
PotentialLearning.fitMethod
fit(ds::DataSet, as::ActiveSubspace)

Fits a linear dimension reduction routine using the eigendirections of the uncentered covariance of the function ∇Q(c::Configuration) over the configurations in ds. Primarily used to reduce the dimension of the descriptors.

source
PotentialLearning.fitMethod
fit(ds::DataSet, pca::PCA)

Fits a linear dimension reduction routine using PCA on the global descriptors in the dataset ds.

source
PotentialLearning.fit_transformMethod
fit_transform(ds::DataSet, dr::DimensionReducer)

Fits a linear dimension reduction routine using information from DataSet and performs dimension reduction on descriptors and force_descriptors (whichever are available). See individual types of DimensionReducers for specific details.

source
PotentialLearning.forceMethod

function force( c::Configuration, nnbp::NNBasisPotential )

c: atomic configuration. nnbp: neural network basis potential.

source
PotentialLearning.get_batchesMethod
get_batches(n_batches, B_train, B_train_ext, e_train, dB_train, f_train,
-            B_test, B_test_ext, e_test, dB_test, f_test)

n_batches: no. of batches per dataset. B_train: descriptors of the energies used in training. B_train_ext: extended descriptors of the energies used in training. Required to compute forces. e_train: energies used in training. dB_train: derivatives of the energy descriptors used in training. f_train: forces used in training. B_test: descriptors of the energies used in test. B_test_ext: extended descriptors of the energies used in test. Required to compute forces. e_test: energies used in test. dB_test: derivatives of the energy descriptors used in test. f_test: forces used in test.

Returns the data loaders for training and test of energies and forces.

source
PotentialLearning.get_batchesMethod
get_batches(n_batches, B_train, B_train_ext, e_train, dB_train, f_train,
+            B_test, B_test_ext, e_test, dB_test, f_test)

n_batches: no. of batches per dataset. B_train: descriptors of the energies used in training. B_train_ext: extended descriptors of the energies used in training. Required to compute forces. e_train: energies used in training. dB_train: derivatives of the energy descriptors used in training. f_train: forces used in training. B_test: descriptors of the energies used in test. B_test_ext: extended descriptors of the energies used in test. Required to compute forces. e_test: energies used in test. dB_test: derivatives of the energy descriptors used in test. f_test: forces used in test.

Returns the data loaders for training and test of energies and forces.

source
PotentialLearning.get_clustersMethod
function get_clusters(
     ds,
     eps,
     minpts
-)

Computes clusters from the configurations in ds using DBSCAN with parameters eps and minpts.

source
PotentialLearning.get_dpp_modeMethod
get_dpp_mode(dpp::kDPP, batch_size::Int) <: Vector{Int64}

Access an approximate mode of the k-DPP as calculated by a greedy subset algorithm. See Determinantal.jl for details.

source
PotentialLearning.get_inclusion_probMethod
get_inclusion_prob(dpp::kDPP) <: Vector{Float64}

Access an approximation to the inclusion probabilities as calculated by Determinantal.jl (see package for details).

source
PotentialLearning.get_inputMethod
get_input(args)

args: vector of arguments (strings)

Returns an OrderedDict with the arguments. See https://github.com/cesmix-mit/AtomisticComposableWorkflows documentation for information about how to define the input arguments.

source
PotentialLearning.get_metricsMethod
get_metrics( e_train_pred, e_train, f_train_pred, f_train,
+)

Computes clusters from the configurations in ds using DBSCAN with parameters eps and minpts.

source
PotentialLearning.get_dpp_modeMethod
get_dpp_mode(dpp::kDPP, batch_size::Int) <: Vector{Int64}

Access an approximate mode of the k-DPP as calculated by a greedy subset algorithm. See Determinantal.jl for details.

source
PotentialLearning.get_inclusion_probMethod
get_inclusion_prob(dpp::kDPP) <: Vector{Float64}

Access an approximation to the inclusion probabilities as calculated by Determinantal.jl (see package for details).

source
PotentialLearning.get_inputMethod
get_input(args)

args: vector of arguments (strings)

Returns an OrderedDict with the arguments. See https://github.com/cesmix-mit/AtomisticComposableWorkflows documentation for information about how to define the input arguments.

source
PotentialLearning.get_metricsMethod
get_metrics( e_train_pred, e_train, f_train_pred, f_train,
              e_test_pred, e_test, f_test_pred, f_test,
-             B_time, dB_time, time_fitting)

e_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. f_train_pred: vector of predicted training force values. f_train: vector of true training force values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values. f_test_pred: vector of predicted test force values. f_test: vector of true test force values. B_time: elapsed time consumed by descriptors calculation. dB_time: elapsed time consumed by descriptor derivatives calculation. time_fitting: elapsed time consumed by fitting process.

Computes MAE, RMSE, and RSQ for training and testing energies and forces. Also adds the elapsed times of the descriptor and fitting calculations. Returns an OrderedDict with the information above.

source
PotentialLearning.get_metricsMethod
get_metrics( e_train_pred, e_train, e_test_pred, e_test)

e_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values.

Computes MAE, RMSE, and RSQ for training and testing energies. Returns an OrderedDict with the information above.

source
PotentialLearning.get_metricsMethod
get_metrics(
+             B_time, dB_time, time_fitting)

e_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. f_train_pred: vector of predicted training force values. f_train: vector of true training force values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values. f_test_pred: vector of predicted test force values. f_test: vector of true test force values. B_time: elapsed time consumed by descriptors calculation. dB_time: elapsed time consumed by descriptor derivatives calculation. time_fitting: elapsed time consumed by fitting process.

Computes MAE, RMSE, and RSQ for training and testing energies and forces. Also adds the elapsed times of the descriptor and fitting calculations. Returns an OrderedDict with the information above.

source
PotentialLearning.get_metricsMethod
get_metrics( e_train_pred, e_train, e_test_pred, e_test)

e_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values.

Computes MAE, RMSE, and RSQ for training and testing energies. Returns an OrderedDict with the information above.

source
PotentialLearning.get_metricsMethod
get_metrics(
     x_pred,
     x;
     metrics = [mae, rmse, rsq],
     label = "x"
-)

x_pred: vector of predicted forces, x: vector of true forces. metrics: vector of metrics. label: label used as prefix in dictionary keys.

Returns an OrderedDict with different metrics.

source
PotentialLearning.get_random_subsetFunction
get_random_subset(r::Random, batch_size :: Int) <: Vector{Int64}

Access a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.

source
PotentialLearning.get_random_subsetFunction
function get_random_subset(
+)

x_pred: vector of predicted forces, x: vector of true forces. metrics: vector of metrics. label: label used as prefix in dictionary keys.

Returns an OrderedDict with different metrics.

source
PotentialLearning.get_random_subsetFunction
get_random_subset(r::Random, batch_size :: Int) <: Vector{Int64}

Access a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.

source
PotentialLearning.get_random_subsetFunction
function get_random_subset(
     s::DBSCANSelector,
     batch_size = s.sample_size
-)

Returns a random subset of indexes composed of samples of size batch_size ÷ length(s.clusters) from each cluster in s.

source
PotentialLearning.get_random_subsetMethod
get_random_subset(dpp::kDPP, batch_size :: Int) <: Vector{Int64}

Access a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.

source
PotentialLearning.get_systemMethod
get_system(c::Configuration) <: AtomsBase.AbstractSystem

Retrieves the AtomsBase system (if available) in the Configuration c.

source
PotentialLearning.kabschMethod
function kabsch(
+)

Returns a random subset of indexes composed of samples of size batch_size ÷ length(s.clusters) from each cluster in s.

source
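The per-cluster sampling described above (batch_size ÷ length(s.clusters) indices from each cluster) can be sketched without the package. The function name `sample_per_cluster_sketch`, the representation of clusters as a vector of index vectors, and the capping at the cluster size are all assumptions for illustration.

```julia
using Random

# Hypothetical sketch: draw an equal number of indices from every cluster.
# `clusters` is assumed to be a vector of index vectors, one per cluster.
function sample_per_cluster_sketch(clusters::Vector{Vector{Int}}, batch_size::Int;
                                   rng = Random.default_rng())
    n_per = batch_size ÷ length(clusters)   # integer division, as in the docstring
    inds = Int[]
    for c in clusters
        # Shuffle the cluster and take at most n_per indices from it.
        append!(inds, shuffle(rng, c)[1:min(n_per, length(c))])
    end
    return inds
end

clusters = [[1, 2, 3, 4], [5, 6, 7], [8, 9]]
inds = sample_per_cluster_sketch(clusters, 6; rng = MersenneTwister(0))
```

Because of the integer division, a batch_size that is not a multiple of the number of clusters yields slightly fewer indices than requested.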
PotentialLearning.get_random_subsetMethod
get_random_subset(dpp::kDPP, batch_size :: Int) <: Vector{Int64}

Access a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.

source
PotentialLearning.get_systemMethod
get_system(c::Configuration) <: AtomsBase.AbstractSystem

Retrieves the AtomsBase system (if available) in the Configuration c.

source
PotentialLearning.kabschMethod
function kabsch(
     reference::Array{Float64,2},
     coords::Array{Float64,2}
-)

Input: two sets of points, reference and coords, as Nx3 matrices. Returns the optimally rotated matrix.

source
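As a rough illustration of what a Kabsch alignment does, the sketch below centers both point sets, finds the optimal rotation from an SVD of their cross-covariance, and maps coords onto reference. The name `kabsch_sketch` and the choice to return the aligned points shifted to the reference centroid are assumptions; the package's kabsch may differ in conventions.

```julia
using LinearAlgebra, Statistics

# Hypothetical sketch of the Kabsch algorithm for Nx3 point sets
# (rows are points). Not the package's implementation.
function kabsch_sketch(reference::Matrix{Float64}, coords::Matrix{Float64})
    ref_centroid = mean(reference, dims = 1)
    crd_centroid = mean(coords, dims = 1)
    Q = reference .- ref_centroid          # centered reference points
    P = coords .- crd_centroid             # centered points to rotate
    H = P' * Q                             # 3x3 cross-covariance matrix
    F = svd(H)
    d = sign(det(F.V * F.U'))              # guard against improper rotations
    R = F.V * Diagonal([1.0, 1.0, d]) * F.U'
    return P * R' .+ ref_centroid          # rotate, then move onto the reference
end

reference = [0.0 0.0 0.0; 1.0 0.0 0.0; 0.0 2.0 0.0; 0.0 0.0 3.0]
θ = 0.3
Rz = [cos(θ) -sin(θ) 0.0; sin(θ) cos(θ) 0.0; 0.0 0.0 1.0]
coords = reference * Rz' .+ [1.0 -2.0 0.5]   # rotated and translated copy
aligned = kabsch_sketch(reference, coords)
```

Since coords is an exact rigid transformation of reference in this example, the aligned points should coincide with reference up to floating-point error.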
PotentialLearning.learn!Method

function learn!( iap::InteratomicPotentials.LinearBasisPotential, ds::DataSet, args... )

Learning dispatch function, common to ordinary and weighted least squares implementations.

source
PotentialLearning.learn!Method

function learn!( lp::CovariateLinearProblem, α::Real )

Fit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5*(e - A_e*β)'*(e - A_e*β) / σe - 0.5*(f - A_f*β)'*(f - A_f*β) / σf - log(σe) - log(σf)

through an optimization procedure.

source
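The log probability above can be evaluated directly, which may help when checking a fit. In this sketch the function name `logprob_sketch` and the argument layout are hypothetical; e and f are flattened energy and force vectors, and A_e and A_f are the corresponding design matrices.

```julia
using LinearAlgebra

# Hypothetical sketch: evaluate ℓ(β, σe, σf) as written in the docstring.
function logprob_sketch(β, σe, σf, e, f, A_e, A_f)
    re = e - A_e * β     # energy residuals
    rf = f - A_f * β     # force residuals
    return -0.5 * dot(re, re) / σe - 0.5 * dot(rf, rf) / σf - log(σe) - log(σf)
end

A_e = [1.0 0.0; 0.0 1.0]
A_f = [1.0 1.0; 1.0 -1.0; 2.0 0.0]
β = [0.5, -1.5]
e = A_e * β              # synthetic data that the model fits exactly
f = A_f * β
ℓ_exact = logprob_sketch(β, 1.0, 1.0, e, f, A_e, A_f)
ℓ_off = logprob_sketch(β .+ 0.1, 1.0, 1.0, e, f, A_e, A_f)
```

With zero residuals and unit σe and σf every term vanishes, and perturbing β lowers the value.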
PotentialLearning.learn!Method

function learn!( lp::CovariateLinearProblem, ss::SubsetSelector, α::Real; num_steps=100, opt=Flux.Optimise.Adam() )

Fit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5*(e - A_e*β)'*(e - A_e*β) / σe - 0.5*(f - A_f*β)'*(f - A_f*β) / σf - log(σe) - log(σf)

through an iterative batch gradient descent optimization procedure where the batches are provided by the subset selector.

source
PotentialLearning.learn!Method

function learn!( lp::CovariateLinearProblem, ws::Vector, int::Bool )

Fit energies and forces using weighted least squares.

source
PotentialLearning.learn!Method

function learn!( lp::LearningProblem, ss::SubsetSelector; num_steps = 100::Int, opt = Flux.Optimisers.Adam() )

Attempts to fit the parameters lp.params in the learning problem lp using batch gradient descent with the optimizer opt and num_steps number of iterations. Batching is provided by the passed ss::SubsetSelector.

source
PotentialLearning.learn!Method

function learn!( lp::LearningProblem; num_steps=100::Int, opt=Flux.Optimisers.Adam() )

Attempts to fit the parameters lp.params in the learning problem lp using gradient descent with the optimizer opt and num_steps number of iterations.
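
To make the idea of iterating for `num_steps` concrete, here is a minimal sketch that swaps the Flux optimizer for plain fixed-step gradient descent on a linear least squares loss. The function name, the step size `η`, and the loss are assumptions chosen for illustration; the package itself uses the optimizer passed as `opt`.

```julia
# Hypothetical stand-in: fixed-step gradient descent on 0.5 * mean squared residual.
# This is not Adam and not the package's implementation.
function gradient_descent_sketch(A, y; num_steps = 2000, η = 0.1)
    β = zeros(size(A, 2))
    n = length(y)
    for _ in 1:num_steps
        residual = A * β .- y
        grad = A' * residual ./ n      # gradient of 0.5 * mean(residual .^ 2)
        β .-= η .* grad                # one descent step
    end
    return β
end

# Usage: data generated exactly by y = 1 + 2x.
A = [1.0 0.0; 1.0 1.0; 1.0 2.0]
y = [1.0, 3.0, 5.0]
β = gradient_descent_sketch(A, y)
```

With these settings `β` should approach `[1.0, 2.0]`; a smaller `num_steps` may stop short of convergence.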

source
PotentialLearning.learn!Method

function learn!( lp::UnivariateLinearProblem, α::Real )

Fit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via SVD of A'*A (formed iteratively from the design matrix A), where eigenvalues less than α are cut off.
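
The truncation described above can be sketched as follows, assuming that "cut off" means discarding the singular directions of A'*A whose singular values fall below α before solving the normal equations. The name `truncated_svd_solve` is hypothetical, and the sketch forms A'*A in one step rather than iteratively.

```julia
using LinearAlgebra

# Hypothetical sketch: solve (A'A) β = A'y, keeping only singular values above α.
function truncated_svd_solve(A, y, α)
    AtA = A' * A
    Aty = A' * y
    F = svd(AtA)
    keep = F.S .> α                    # directions that survive the cutoff
    # Pseudo-inverse restricted to the retained directions.
    return F.V[:, keep] * ((F.U[:, keep]' * Aty) ./ F.S[keep])
end

# Usage: data generated exactly by y = 1 + 2x, with a cutoff small enough to keep everything.
A = [1.0 0.0; 1.0 1.0; 1.0 2.0]
y = [1.0, 3.0, 5.0]
β = truncated_svd_solve(A, y, 1e-8)
```

A larger α regularizes the fit by discarding poorly determined directions, at the cost of some bias.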

source
PotentialLearning.learn!Method

function learn!( lp::UnivariateLinearProblem, ss::SubsetSelector, α::Real; num_steps = 100, opt = Flux.Optimise.Adam() )

Fit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via batched gradient descent with batches provided by the subset selector and the gradients are calculated using Flux.

source
PotentialLearning.learn!Method

function learn!( lp::UnivariateLinearProblem, ws::Vector, int::Bool )

Fit energies using weighted least squares.
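
For reference, weighted least squares in its textbook form solves (A' W A) β = A' W y with a diagonal weight matrix W. The sketch below shows that form with per-observation weights; the name `wls_sketch` is hypothetical, and how the package interprets `ws` and `int` is not stated here, so this is not a description of its behavior.

```julia
using LinearAlgebra

# Hypothetical sketch of textbook weighted least squares.
function wls_sketch(A, y, w)
    W = Diagonal(w)                    # one weight per observation
    return (A' * W * A) \ (A' * W * y)
end

# Usage: data generated exactly by y = 1 + 2x, so any positive weights recover [1, 2].
A = [1.0 0.0; 1.0 1.0; 1.0 2.0]
y = [1.0, 3.0, 5.0]
w = [1.0, 2.0, 0.5]
β = wls_sketch(A, y, w)
```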

source
PotentialLearning.learn!Method

function learn!( iap::InteratomicPotentials.LinearBasisPotential, ds::DataSet, args... )

Learning dispatch function, common to ordinary and weighted least squares implementations.

source
PotentialLearning.learn!Method

function learn!( lp::CovariateLinearProblem, α::Real )

Fit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5 (e - Ae β)'(e - Ae β) / σe - 0.5 (f - Af β)'(f - Af β) / σf - log(σe) - log(σf)

through an optimization procedure.

source
PotentialLearning.learn!Method

function learn!( lp::CovariateLinearProblem, ss::SubsetSelector, α::Real; num_steps=100, opt=Flux.Optimise.Adam() )

Fit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5 (e - Ae β)'(e - Ae β) / σe - 0.5 (f - Af β)'(f - Af β) / σf - log(σe) - log(σf)

through an iterative batch gradient descent optimization procedure where the batches are provided by the subset selector.

source
PotentialLearning.learn!Method

function learn!( lp::CovariateLinearProblem, ws::Vector, int::Bool )

Fit energies and forces using weighted least squares.

source
PotentialLearning.learn!Method

function learn!( lp::LearningProblem, ss::SubsetSelector; num_steps = 100::Int, opt = Flux.Optimisers.Adam() )

Attempts to fit the parameters lp.params in the learning problem lp using batch gradient descent with the optimizer opt and num_steps number of iterations. Batching is provided by the passed ss::SubsetSelector.

source
PotentialLearning.learn!Method

function learn!( lp::LearningProblem; num_steps=100::Int, opt=Flux.Optimisers.Adam() )

Attempts to fit the parameters lp.params in the learning problem lp using gradient descent with the optimizer opt and num_steps number of iterations.

source
PotentialLearning.learn!Method

function learn!( lp::UnivariateLinearProblem, α::Real )

Fit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via SVD of A'*A (formed iteratively from the design matrix A), where eigenvalues less than α are cut off.

source
PotentialLearning.learn!Method

function learn!( lp::UnivariateLinearProblem, ss::SubsetSelector, α::Real; num_steps = 100, opt = Flux.Optimise.Adam() )

Fit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via batched gradient descent with batches provided by the subset selector and the gradients are calculated using Flux.

source
PotentialLearning.learn!Method

function learn!( lp::UnivariateLinearProblem, ws::Vector, int::Bool )

Fit energies using weighted least squares.

source
PotentialLearning.load_dataMethod
load_data(file::string, yaml::YAML)
 
 Load configurations from a yaml file into a Vector of Flexible Systems, with Energies and Force.
 Returns 
     ds - DataSet
-    t = Vector{Dict} (any miscellaneous info from yaml file)
source
PotentialLearning.load_datasetsMethod
load_datasets(input)

input: OrderedDict with input arguments. See get_defaults_args().

Returns training and test systems, energies, forces, and stresses.

source
PotentialLearning.maeMethod
mae(x_pred, x)

x_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.

Returns mean absolute error.
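
A minimal sketch of the metric, using the usual definition of mean absolute error. The name `mae_sketch` is hypothetical and the numbers are made up for illustration.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical helper; not the package's implementation.
mae_sketch(x_pred, x) = mean(abs.(x_pred .- x))

e_pred = [1.0, 2.0, 4.0]
e_true = [1.0, 3.0, 2.0]
err = mae_sketch(e_pred, e_true)   # (0 + 1 + 2) / 3 = 1.0
```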

source
PotentialLearning.load_datasetsMethod
load_datasets(input)

input: OrderedDict with input arguments. See get_defaults_args().

Returns training and test systems, energies, forces, and stresses.

source
PotentialLearning.maeMethod
mae(x_pred, x)

x_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.

Returns mean absolute error.

source
PotentialLearning.periodic_rmsdMethod
function periodic_rmsd(
     p1::Array{Float64,2},
     p2::Array{Float64,2},
     box_lengths::Array{Float64,1}
-)

Calculates the RMSD between atom positions of two configurations taking into account the periodic boundaries.
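
One common way to account for periodic boundaries is the minimum image convention, where each displacement component is wrapped to the nearest periodic image before the RMSD is taken. The sketch below assumes an orthorhombic box and N×3 position matrices; the name `periodic_rmsd_sketch` is hypothetical and the package may handle the boundaries differently.

```julia
# Hypothetical sketch using the minimum image convention in an orthorhombic box.
function periodic_rmsd_sketch(p1, p2, box_lengths)
    L = reshape(box_lengths, 1, :)     # 1×3 row, broadcasts over all atoms
    d = p1 .- p2
    d = d .- L .* round.(d ./ L)       # wrap each component to the nearest image
    return sqrt(sum(abs2, d) / size(p1, 1))
end

# Usage: two atoms on opposite sides of a periodic boundary are only 1.0 apart.
p1 = [0.5 0.0 0.0]
p2 = [9.5 0.0 0.0]
box = [10.0, 10.0, 10.0]
dist = periodic_rmsd_sketch(p1, p2, box)   # 1.0 rather than 9.0
```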

source
PotentialLearning.rmsdMethod
function rmsd(
+)

Calculates the RMSD between atom positions of two configurations taking into account the periodic boundaries.

source
PotentialLearning.rmsdMethod
function rmsd(
     A::Array{Float64,2},
     B::Array{Float64,2}
-)

Calculate the root mean square deviation of two matrices A, B. See http://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions
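
A minimal sketch of the quantity, assuming A and B are N×3 matrices with one atom per row and that no alignment is performed beforehand. The name `rmsd_sketch` is hypothetical.

```julia
# Hypothetical helper: RMSD over N atoms stored as rows.
rmsd_sketch(A::AbstractMatrix, B::AbstractMatrix) =
    sqrt(sum(abs2, A .- B) / size(A, 1))

# Usage: every atom is displaced by 1.0 along z, so the RMSD is 1.0.
A = [0.0 0.0 0.0; 1.0 0.0 0.0]
B = [0.0 0.0 1.0; 1.0 0.0 1.0]
dev = rmsd_sketch(A, B)
```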

source
PotentialLearning.rmseMethod
rmse(x_pred, x)

x_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.

Returns root mean square error.
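
A minimal sketch of the metric, using the usual definition of root mean square error. The name `rmse_sketch` is hypothetical and the numbers are made up for illustration.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical helper; not the package's implementation.
rmse_sketch(x_pred, x) = sqrt(mean((x_pred .- x) .^ 2))

f_pred = [1.0, 2.0, 3.0, 4.0]
f_true = [2.0, 2.0, 2.0, 6.0]
err = rmse_sketch(f_pred, f_true)   # squared errors 1, 0, 1, 4 give sqrt(1.5)
```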

source
PotentialLearning.rsqMethod
rsq(x_pred, x)

x_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.

Returns R-squared.
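
A minimal sketch, assuming R-squared means the coefficient of determination, 1 - SS_res / SS_tot. The docstring does not state which definition the package uses, and the name `rsq_sketch` is hypothetical.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical helper using the coefficient-of-determination definition.
rsq_sketch(x_pred, x) = 1 - sum((x .- x_pred) .^ 2) / sum((x .- mean(x)) .^ 2)

x = [1.0, 2.0, 3.0]
perfect = rsq_sketch(x, x)                        # exact predictions give 1.0
baseline = rsq_sketch(fill(mean(x), 3), x)        # predicting the mean gives 0.0
```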

source
PotentialLearning.sampleMethod
function sample(
+)

Calculate the root mean square deviation of two matrices A, B. See http://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions

source
PotentialLearning.rmseMethod
rmse(x_pred, x)

x_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.

Returns root mean square error.

source
PotentialLearning.rsqMethod
rsq(x_pred, x)

x_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.

Returns R-squared.

source
PotentialLearning.translate_pointsMethod
function translate_points(
     P::Array{Float64,2},
     Q::Array{Float64,2}
-)

Translate P and Q so that the centroid of each coincides with the origin of the coordinate system.
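
The centering step can be sketched as below, assuming P and Q are N×3 matrices with one point per row and that unweighted centroids are used. The name `translate_points_sketch` and the tuple return value are assumptions.

```julia
using Statistics  # standard library, provides `mean`

# Hypothetical sketch: subtract each set's centroid so both are centered at the origin.
function translate_points_sketch(P, Q)
    P_centered = P .- mean(P, dims=1)
    Q_centered = Q .- mean(Q, dims=1)
    return P_centered, Q_centered
end

# Usage
P = [1.0 2.0 3.0; 3.0 4.0 5.0]
Q = [0.0 0.0 0.0; 2.0 0.0 0.0; 4.0 0.0 0.0]
Pc, Qc = translate_points_sketch(P, Q)
```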

source
+)

Translate P and Q so that the centroid of each coincides with the origin of the coordinate system.

source diff --git a/dev/assets/documenter.js b/dev/assets/documenter.js index f5311607..c6562b55 100644 --- a/dev/assets/documenter.js +++ b/dev/assets/documenter.js @@ -4,7 +4,6 @@ requirejs.config({ 'highlight-julia': 'https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/languages/julia.min', 'headroom': 'https://cdnjs.cloudflare.com/ajax/libs/headroom/0.12.0/headroom.min', 'jqueryui': 'https://cdnjs.cloudflare.com/ajax/libs/jqueryui/1.13.2/jquery-ui.min', - 'minisearch': 'https://cdn.jsdelivr.net/npm/minisearch@6.1.0/dist/umd/index.min', 'katex-auto-render': 'https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/contrib/auto-render.min', 'jquery': 'https://cdnjs.cloudflare.com/ajax/libs/jquery/3.7.0/jquery.min', 'headroom-jquery': 'https://cdnjs.cloudflare.com/ajax/libs/headroom/0.12.0/jQuery.headroom.min', @@ -103,9 +102,10 @@ $(document).on("click", ".docstring header", function () { }); }); -$(document).on("click", ".docs-article-toggle-button", function () { +$(document).on("click", ".docs-article-toggle-button", function (event) { let articleToggleTitle = "Expand docstring"; let navArticleToggleTitle = "Expand all docstrings"; + let animationSpeed = event.noToggleAnimation ? 
0 : 400; debounce(() => { if (isExpanded) { @@ -116,7 +116,7 @@ $(document).on("click", ".docs-article-toggle-button", function () { isExpanded = false; - $(".docstring section").slideUp(); + $(".docstring section").slideUp(animationSpeed); } else { $(this).removeClass("fa-chevron-down").addClass("fa-chevron-up"); $(".docstring-article-toggle-button") @@ -127,7 +127,7 @@ $(document).on("click", ".docs-article-toggle-button", function () { articleToggleTitle = "Collapse docstring"; navArticleToggleTitle = "Collapse all docstrings"; - $(".docstring section").slideDown(); + $(".docstring section").slideDown(animationSpeed); } $(this).prop("title", navArticleToggleTitle); @@ -224,224 +224,465 @@ $(document).ready(function () { }) //////////////////////////////////////////////////////////////////////////////// -require(['jquery', 'minisearch'], function($, minisearch) { - -// In general, most search related things will have "search" as a prefix. -// To get an in-depth about the thought process you can refer: https://hetarth02.hashnode.dev/series/gsoc +require(['jquery'], function($) { -let results = []; -let timer = undefined; +$(document).ready(function () { + let meta = $("div[data-docstringscollapsed]").data(); -let data = documenterSearchIndex["docs"].map((x, key) => { - x["id"] = key; // minisearch requires a unique for each object - return x; + if (meta?.docstringscollapsed) { + $("#documenter-article-toggle-button").trigger({ + type: "click", + noToggleAnimation: true, + }); + } }); -// list below is the lunr 2.1.3 list minus the intersect with names(Base) -// (all, any, get, in, is, only, which) and (do, else, for, let, where, while, with) -// ideally we'd just filter the original list but it's not available as a variable -const stopWords = new Set([ - "a", - "able", - "about", - "across", - "after", - "almost", - "also", - "am", - "among", - "an", - "and", - "are", - "as", - "at", - "be", - "because", - "been", - "but", - "by", - "can", - "cannot", - "could", - 
"dear", - "did", - "does", - "either", - "ever", - "every", - "from", - "got", - "had", - "has", - "have", - "he", - "her", - "hers", - "him", - "his", - "how", - "however", - "i", - "if", - "into", - "it", - "its", - "just", - "least", - "like", - "likely", - "may", - "me", - "might", - "most", - "must", - "my", - "neither", - "no", - "nor", - "not", - "of", - "off", - "often", - "on", - "or", - "other", - "our", - "own", - "rather", - "said", - "say", - "says", - "she", - "should", - "since", - "so", - "some", - "than", - "that", - "the", - "their", - "them", - "then", - "there", - "these", - "they", - "this", - "tis", - "to", - "too", - "twas", - "us", - "wants", - "was", - "we", - "were", - "what", - "when", - "who", - "whom", - "why", - "will", - "would", - "yet", - "you", - "your", -]); - -let index = new minisearch({ - fields: ["title", "text"], // fields to index for full-text search - storeFields: ["location", "title", "text", "category", "page"], // fields to return with search results - processTerm: (term) => { - let word = stopWords.has(term) ? null : term; - if (word) { - // custom trimmer that doesn't strip @ and !, which are used in julia macro and function names - word = word - .replace(/^[^a-zA-Z0-9@!]+/, "") - .replace(/[^a-zA-Z0-9@!]+$/, ""); - } +}) +//////////////////////////////////////////////////////////////////////////////// +require(['jquery'], function($) { - return word ?? null; - }, - // add . as a separator, because otherwise "title": "Documenter.Anchors.add!", would not find anything if searching for "add!", only for the entire qualification - tokenize: (string) => string.split(/[\s\-\.]+/), - // options which will be applied during the search - searchOptions: { - boost: { title: 100 }, - fuzzy: 2, +/* +To get an in-depth about the thought process you can refer: https://hetarth02.hashnode.dev/series/gsoc + +PSEUDOCODE: + +Searching happens automatically as the user types or adjusts the selected filters. 
+To preserve responsiveness, as much as possible of the slow parts of the search are done +in a web worker. Searching and result generation are done in the worker, and filtering and +DOM updates are done in the main thread. The filters are in the main thread as they should +be very quick to apply. This lets filters be changed without re-searching with minisearch +(which is possible even if filtering is on the worker thread) and also lets filters be +changed _while_ the worker is searching and without message passing (neither of which are +possible if filtering is on the worker thread) + +SEARCH WORKER: + +Import minisearch + +Build index + +On message from main thread + run search + find the first 200 unique results from each category, and compute their divs for display + note that this is necessary and sufficient information for the main thread to find the + first 200 unique results from any given filter set + post results to main thread + +MAIN: + +Launch worker + +Declare nonconstant globals (worker_is_running, last_search_text, unfiltered_results) + +On text update + if worker is not running, launch_search() + +launch_search + set worker_is_running to true, set last_search_text to the search text + post the search query to worker + +on message from worker + if last_search_text is not the same as the text in the search field, + the latest search result is not reflective of the latest search query, so update again + launch_search() + otherwise + set worker_is_running to false + + regardless, display the new search results to the user + save the unfiltered_results as a global + update_search() + +on filter click + adjust the filter selection + update_search() + +update_search + apply search filters by looping through the unfiltered_results and finding the first 200 + unique results that match the filters + + Update the DOM +*/ + +/////// SEARCH WORKER /////// + +function worker_function(documenterSearchIndex, documenterBaseURL, filters) { + importScripts( + 
"https://cdn.jsdelivr.net/npm/minisearch@6.1.0/dist/umd/index.min.js" + ); + + let data = documenterSearchIndex.map((x, key) => { + x["id"] = key; // minisearch requires a unique for each object + return x; + }); + + // list below is the lunr 2.1.3 list minus the intersect with names(Base) + // (all, any, get, in, is, only, which) and (do, else, for, let, where, while, with) + // ideally we'd just filter the original list but it's not available as a variable + const stopWords = new Set([ + "a", + "able", + "about", + "across", + "after", + "almost", + "also", + "am", + "among", + "an", + "and", + "are", + "as", + "at", + "be", + "because", + "been", + "but", + "by", + "can", + "cannot", + "could", + "dear", + "did", + "does", + "either", + "ever", + "every", + "from", + "got", + "had", + "has", + "have", + "he", + "her", + "hers", + "him", + "his", + "how", + "however", + "i", + "if", + "into", + "it", + "its", + "just", + "least", + "like", + "likely", + "may", + "me", + "might", + "most", + "must", + "my", + "neither", + "no", + "nor", + "not", + "of", + "off", + "often", + "on", + "or", + "other", + "our", + "own", + "rather", + "said", + "say", + "says", + "she", + "should", + "since", + "so", + "some", + "than", + "that", + "the", + "their", + "them", + "then", + "there", + "these", + "they", + "this", + "tis", + "to", + "too", + "twas", + "us", + "wants", + "was", + "we", + "were", + "what", + "when", + "who", + "whom", + "why", + "will", + "would", + "yet", + "you", + "your", + ]); + + let index = new MiniSearch({ + fields: ["title", "text"], // fields to index for full-text search + storeFields: ["location", "title", "text", "category", "page"], // fields to return with results processTerm: (term) => { let word = stopWords.has(term) ? 
null : term; if (word) { + // custom trimmer that doesn't strip @ and !, which are used in julia macro and function names word = word .replace(/^[^a-zA-Z0-9@!]+/, "") .replace(/[^a-zA-Z0-9@!]+$/, ""); + + word = word.toLowerCase(); } return word ?? null; }, + // add . as a separator, because otherwise "title": "Documenter.Anchors.add!", would not + // find anything if searching for "add!", only for the entire qualification tokenize: (string) => string.split(/[\s\-\.]+/), - }, -}); + // options which will be applied during the search + searchOptions: { + prefix: true, + boost: { title: 100 }, + fuzzy: 2, + }, + }); -index.addAll(data); + index.addAll(data); + + /** + * Used to map characters to HTML entities. + * Refer: https://github.com/lodash/lodash/blob/main/src/escape.ts + */ + const htmlEscapes = { + "&": "&", + "<": "<", + ">": ">", + '"': """, + "'": "'", + }; + + /** + * Used to match HTML entities and HTML characters. + * Refer: https://github.com/lodash/lodash/blob/main/src/escape.ts + */ + const reUnescapedHtml = /[&<>"']/g; + const reHasUnescapedHtml = RegExp(reUnescapedHtml.source); + + /** + * Escape function from lodash + * Refer: https://github.com/lodash/lodash/blob/main/src/escape.ts + */ + function escape(string) { + return string && reHasUnescapedHtml.test(string) + ? string.replace(reUnescapedHtml, (chr) => htmlEscapes[chr]) + : string || ""; + } -let filters = [...new Set(data.map((x) => x.category))]; -var modal_filters = make_modal_body_filters(filters); -var filter_results = []; + /** + * Make the result component given a minisearch result data object and the value + * of the search input as queryString. To view the result object structure, refer: + * https://lucaong.github.io/minisearch/modules/_minisearch_.html#searchresult + * + * @param {object} result + * @param {string} querystring + * @returns string + */ + function make_search_result(result, querystring) { + let search_divider = `
`; + let display_link = + result.location.slice(Math.max(0), Math.min(50, result.location.length)) + + (result.location.length > 30 ? "..." : ""); // To cut-off the link because it messes with the overflow of the whole div + + if (result.page !== "") { + display_link += ` (${result.page})`; + } -$(document).on("keyup", ".documenter-search-input", function (event) { - // Adding a debounce to prevent disruptions from super-speed typing! - debounce(() => update_search(filter_results), 300); + let textindex = new RegExp(`${querystring}`, "i").exec(result.text); + let text = + textindex !== null + ? result.text.slice( + Math.max(textindex.index - 100, 0), + Math.min( + textindex.index + querystring.length + 100, + result.text.length + ) + ) + : ""; // cut-off text before and after from the match + + text = text.length ? escape(text) : ""; + + let display_result = text.length + ? "..." + + text.replace( + new RegExp(`${escape(querystring)}`, "i"), // For first occurrence + '$&' + ) + + "..." + : ""; // highlights the match + + let in_code = false; + if (!["page", "section"].includes(result.category.toLowerCase())) { + in_code = true; + } + + // We encode the full url to escape some special characters which can lead to broken links + let result_div = ` + +
+
${escape(result.title)}
+
${result.category}
+
+

+ ${display_result} +

+
+ ${display_link} +
+
+ ${search_divider} + `; + + return result_div; + } + + self.onmessage = function (e) { + let query = e.data; + let results = index.search(query, { + filter: (result) => { + // Only return relevant results + return result.score >= 1; + }, + }); + + // Pre-filter to deduplicate and limit to 200 per category to the extent + // possible without knowing what the filters are. + let filtered_results = []; + let counts = {}; + for (let filter of filters) { + counts[filter] = 0; + } + let present = {}; + + for (let result of results) { + cat = result.category; + cnt = counts[cat]; + if (cnt < 200) { + id = cat + "---" + result.location; + if (present[id]) { + continue; + } + present[id] = true; + filtered_results.push({ + location: result.location, + category: cat, + div: make_search_result(result, query), + }); + } + } + + postMessage(filtered_results); + }; +} + +// `worker = Threads.@spawn worker_function(documenterSearchIndex)`, but in JavaScript! +const filters = [ + ...new Set(documenterSearchIndex["docs"].map((x) => x.category)), +]; +const worker_str = + "(" + + worker_function.toString() + + ")(" + + JSON.stringify(documenterSearchIndex["docs"]) + + "," + + JSON.stringify(documenterBaseURL) + + "," + + JSON.stringify(filters) + + ")"; +const worker_blob = new Blob([worker_str], { type: "text/javascript" }); +const worker = new Worker(URL.createObjectURL(worker_blob)); + +/////// SEARCH MAIN /////// + +// Whether the worker is currently handling a search. This is a boolean +// as the worker only ever handles 1 or 0 searches at a time. +var worker_is_running = false; + +// The last search text that was sent to the worker. This is used to determine +// if the worker should be launched again when it reports back results. +var last_search_text = ""; + +// The results of the last search. This, in combination with the state of the filters +// in the DOM, is used compute the results to display on calls to update_search. 
+var unfiltered_results = []; + +// Which filter is currently selected +var selected_filter = ""; + +$(document).on("input", ".documenter-search-input", function (event) { + if (!worker_is_running) { + launch_search(); + } }); +function launch_search() { + worker_is_running = true; + last_search_text = $(".documenter-search-input").val(); + worker.postMessage(last_search_text); +} + +worker.onmessage = function (e) { + if (last_search_text !== $(".documenter-search-input").val()) { + launch_search(); + } else { + worker_is_running = false; + } + + unfiltered_results = e.data; + update_search(); +}; + $(document).on("click", ".search-filter", function () { if ($(this).hasClass("search-filter-selected")) { - $(this).removeClass("search-filter-selected"); + selected_filter = ""; } else { - $(this).addClass("search-filter-selected"); + selected_filter = $(this).text().toLowerCase(); } - // Adding a debounce to prevent disruptions from crazy clicking! - debounce(() => get_filters(), 300); + // This updates search results and toggles classes for UI: + update_search(); }); -/** - * A debounce function, takes a function and an optional timeout in milliseconds - * - * @function callback - * @param {number} timeout - */ -function debounce(callback, timeout = 300) { - clearTimeout(timer); - timer = setTimeout(callback, timeout); -} - /** * Make/Update the search component - * - * @param {string[]} selected_filters */ -function update_search(selected_filters = []) { - let initial_search_body = ` -
Type something to get started!
- `; - +function update_search() { let querystring = $(".documenter-search-input").val(); if (querystring.trim()) { - results = index.search(querystring, { - filter: (result) => { - // Filtering results - if (selected_filters.length === 0) { - return result.score >= 1; - } else { - return ( - result.score >= 1 && selected_filters.includes(result.category) - ); - } - }, - }); + if (selected_filter == "") { + results = unfiltered_results; + } else { + results = unfiltered_results.filter((result) => { + return selected_filter == result.category.toLowerCase(); + }); + } let search_result_container = ``; + let modal_filters = make_modal_body_filters(); let search_divider = `
`; if (results.length) { @@ -449,19 +690,23 @@ function update_search(selected_filters = []) { let count = 0; let search_results = ""; - results.forEach(function (result) { - if (result.location) { - // Checking for duplication of results for the same page - if (!links.includes(result.location)) { - search_results += make_search_result(result, querystring); - count++; - } - + for (var i = 0, n = results.length; i < n && count < 200; ++i) { + let result = results[i]; + if (result.location && !links.includes(result.location)) { + search_results += result.div; + count++; links.push(result.location); } - }); + } - let result_count = `
${count} result(s)
`; + if (count == 1) { + count_str = "1 result"; + } else if (count == 200) { + count_str = "200+ results"; + } else { + count_str = count + " results"; + } + let result_count = `
${count_str}
`; search_result_container = `
@@ -490,125 +735,37 @@ function update_search(selected_filters = []) { $(".search-modal-card-body").html(search_result_container); } else { - filter_results = []; - modal_filters = make_modal_body_filters(filters, filter_results); - if (!$(".search-modal-card-body").hasClass("is-justify-content-center")) { $(".search-modal-card-body").addClass("is-justify-content-center"); } - $(".search-modal-card-body").html(initial_search_body); + $(".search-modal-card-body").html(` +
Type something to get started!
+ `); } } /** * Make the modal filter html * - * @param {string[]} filters - * @param {string[]} selected_filters * @returns string */ -function make_modal_body_filters(filters, selected_filters = []) { - let str = ``; - - filters.forEach((val) => { - if (selected_filters.includes(val)) { - str += `${val}`; - } else { - str += `${val}`; - } - }); +function make_modal_body_filters() { + let str = filters + .map((val) => { + if (selected_filter == val.toLowerCase()) { + return `${val}`; + } else { + return `${val}`; + } + }) + .join(""); - let filter_html = ` + return `
Filters: ${str} -
- `; - - return filter_html; -} - -/** - * Make the result component given a minisearch result data object and the value of the search input as queryString. - * To view the result object structure, refer: https://lucaong.github.io/minisearch/modules/_minisearch_.html#searchresult - * - * @param {object} result - * @param {string} querystring - * @returns string - */ -function make_search_result(result, querystring) { - let search_divider = `
`; - let display_link = - result.location.slice(Math.max(0), Math.min(50, result.location.length)) + - (result.location.length > 30 ? "..." : ""); // To cut-off the link because it messes with the overflow of the whole div - - if (result.page !== "") { - display_link += ` (${result.page})`; - } - - let textindex = new RegExp(`\\b${querystring}\\b`, "i").exec(result.text); - let text = - textindex !== null - ? result.text.slice( - Math.max(textindex.index - 100, 0), - Math.min( - textindex.index + querystring.length + 100, - result.text.length - ) - ) - : ""; // cut-off text before and after from the match - - let display_result = text.length - ? "..." + - text.replace( - new RegExp(`\\b${querystring}\\b`, "i"), // For first occurrence - '$&' - ) + - "..." - : ""; // highlights the match - - let in_code = false; - if (!["page", "section"].includes(result.category.toLowerCase())) { - in_code = true; - } - - // We encode the full url to escape some special characters which can lead to broken links - let result_div = ` - -
-
${result.title}
-
${result.category}
-
-

- ${display_result} -

-
- ${display_link} -
-
- ${search_divider} - `; - - return result_div; -} - -/** - * Get selected filters, remake the filter html and lastly update the search modal - */ -function get_filters() { - let ele = $(".search-filters .search-filter-selected").get(); - filter_results = ele.map((x) => $(x).text().toLowerCase()); - modal_filters = make_modal_body_filters(filters, filter_results); - update_search(filter_results); +
`; } }) @@ -635,103 +792,107 @@ $(document).ready(function () { //////////////////////////////////////////////////////////////////////////////// require(['jquery'], function($) { -let search_modal_header = ` - -`; - -let initial_search_body = ` -
Type something to get started!
-`; - -let search_modal_footer = ` - -`; - -$(document.body).append( - ` - diff --git a/dev/index.html b/dev/index.html index f5aca1f0..0c68ff75 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · PotentialLearning.jl

[WIP] PotentialLearning.jl

An open-source Julia library for active learning of interatomic potentials in atomistic simulations of materials. It incorporates elements of Bayesian inference, machine learning, differentiable programming, software composability, and high-performance computing. This package is part of a software suite developed for the CESMIX project.

Specific goals

  • Intelligent data subsampling: iteratively query a large pool of unlabeled data to extract a minimum number of training data that would lead to a supervised ML model with superior accuracy compared to a training model with educated handpicking.
  • Quantity of Interest based dimension reduction through the theory of Active Subspaces.
  • Inference of the optimal values and uncertainties of the model parameters, to propagate them through the atomistic simulation.
    • Interatomic potential hyper-parameter optimization. E.g. estimation of the optimum cutoff radius.
    • Interatomic potential fitting. The potentials addressed in this package are defined in InteratomicPotentials.jl and InteratomicBasisPotentials.jl. E.g. ACE, SNAP, Neural Network Potentials.
  • Measurement of QoI sensitivity to individual parameters.
  • Input data management and post-processing.
    • Process input data so that it is ready for training. E.g. read XYZ file with atomic configurations, linearize energies and forces, split dataset into training and testing, normalize data, transfer data to GPU, define iterators, etc.
    • Post-processing: computation of different metrics (MAE, RSQ, COV, etc), saving results, and plotting.

Leveraging Julia!

  • Software composability through multiple dispatch. A series of composable workflows is guiding our design and development. We analyzed three of the most representative workflows: classical molecular dynamics (MD), Ab initio MD, and classical MD with active learning. In addition, it facilitates the training of new potentials defined by the composition of neural networks with state-of-the-art interatomic potential descriptors.
  • Differentiable programming. Powerful automatic differentiation tools, such as Enzyme or Zygote, help to accelerate the development of new interatomic potentials by automatically calculating loss function gradients and forces.
  • SciML: Open Source Software for Scientific Machine Learning. It provides libraries, such as Optimization.jl, that bring together several optimization packages into one unified Julia interface.
  • Machine learning and HPC abstractions: Flux.jl makes parallel learning simple using the NVIDIA GPU abstractions of CUDA.jl. Mini-batch iterations on heterogeneous data, as required by a loss function based on energies and forces, can be handled by DataLoader.jl.

Examples

See AtomisticComposableWorkflows repository. It aims to gather easy-to-use CESMIX-aligned case studies, integrating the latest developments of the Julia atomistic ecosystem with state-of-the-art tools.

+Home · PotentialLearning.jl

[WIP] PotentialLearning.jl

An open-source Julia library for active learning of interatomic potentials in atomistic simulations of materials. It incorporates elements of Bayesian inference, machine learning, differentiable programming, software composability, and high-performance computing. This package is part of a software suite developed for the CESMIX project.

Specific goals

  • Intelligent data subsampling: iteratively query a large pool of unlabeled data to extract a minimal set of training data that leads to a supervised ML model with superior accuracy compared to a model trained on data handpicked by an expert.
  • Quantity of Interest (QoI) based dimension reduction through the theory of Active Subspaces.
  • Inference of the optimal values and uncertainties of the model parameters, to propagate them through the atomistic simulation.
    • Interatomic potential hyper-parameter optimization. E.g. estimation of the optimum cutoff radius.
    • Interatomic potential fitting. The potentials addressed in this package are defined in InteratomicPotentials.jl and InteratomicBasisPotentials.jl. E.g. ACE, SNAP, Neural Network Potentials.
  • Measurement of QoI sensitivity to individual parameters.
  • Input data management and post-processing.
    • Process input data so that it is ready for training. E.g. read XYZ file with atomic configurations, linearize energies and forces, split dataset into training and testing, normalize data, transfer data to GPU, define iterators, etc.
    • Post-processing: computation of different metrics (MAE, RSQ, COV, etc.), saving results, and plotting.
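
As a rough illustration of the post-processing metrics mentioned in the list above, the following minimal sketch computes MAE, RMSE, and RSQ for a vector of predictions using only Julia's standard library. The function names `mae`, `rmse`, and `rsq` mirror the metric names used in this documentation, but the definitions here are assumptions written for illustration (RSQ is taken to be the coefficient of determination) and may differ from the package's own implementation. The energy values are made up.

```julia
using Statistics  # standard library, provides `mean`

# Mean absolute error between predictions and reference values.
mae(x_pred, x) = mean(abs.(x_pred .- x))

# Root mean squared error.
rmse(x_pred, x) = sqrt(mean(abs2.(x_pred .- x)))

# Coefficient of determination: 1 - SS_res / SS_tot (assumed definition of RSQ).
function rsq(x_pred, x)
    ss_res = sum(abs2.(x .- x_pred))
    ss_tot = sum(abs2.(x .- mean(x)))
    return 1 - ss_res / ss_tot
end

# Made-up reference and predicted energies, for illustration only.
e_true = [-1.0, -2.0, -3.0, -4.0]
e_pred = [-1.1, -1.9, -3.2, -3.8]

println("MAE  = ", mae(e_pred, e_true))
println("RMSE = ", rmse(e_pred, e_true))
println("RSQ  = ", rsq(e_pred, e_true))
```

In practice the same three functions would be applied separately to energies and forces, for both the training and the test split.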

Leveraging Julia!

  • Software composability through multiple dispatch. A series of composable workflows is guiding our design and development. We analyzed three of the most representative workflows: classical molecular dynamics (MD), ab initio MD, and classical MD with active learning. In addition, this composability facilitates the training of new potentials defined by the composition of neural networks with state-of-the-art interatomic potential descriptors.
  • Differentiable programming. Powerful automatic differentiation tools, such as Enzyme or Zygote, help to accelerate the development of new interatomic potentials by automatically calculating loss function gradients and forces.
  • SciML: Open Source Software for Scientific Machine Learning. It provides libraries, such as Optimization.jl, that bring together several optimization packages into one unified Julia interface.
  • Machine learning and HPC abstractions: Flux.jl makes parallel learning simple using the NVIDIA GPU abstractions of CUDA.jl. Mini-batch iterations on heterogeneous data, as required by a loss function based on energies and forces, can be handled by DataLoader.jl.
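
The composability through multiple dispatch mentioned in the list above can be sketched with a toy example. All type and function names below (`AbstractToyPotential`, `LinearToy`, `ScaledToy`, `energy`, `total_energy`) are hypothetical and are not part of PotentialLearning.jl or InteratomicPotentials.jl; the point is only that generic code written against an abstract type works unchanged for any new potential that adds its own `energy` method, including potentials built by wrapping other potentials.

```julia
# Hypothetical abstract interface: every potential implements `energy`.
abstract type AbstractToyPotential end

# A linear toy model: energy is the dot product of coefficients and descriptors.
struct LinearToy <: AbstractToyPotential
    β::Vector{Float64}
end

# A potential defined by composition: it wraps another potential and rescales it.
struct ScaledToy{P<:AbstractToyPotential} <: AbstractToyPotential
    inner::P
    scale::Float64
end

# One `energy` method per potential type; Julia selects the method at call time.
energy(p::LinearToy, d::Vector{Float64}) = sum(p.β .* d)
energy(p::ScaledToy, d::Vector{Float64}) = p.scale * energy(p.inner, d)

# Generic code written once against the abstract type.
total_energy(p::AbstractToyPotential, ds) = sum(energy(p, d) for d in ds)

# Made-up descriptor vectors, for illustration only.
descriptors = [[1.0, 1.0], [0.0, 1.0]]
lin = LinearToy([1.0, 2.0])
scaled = ScaledToy(lin, 2.0)

println(total_energy(lin, descriptors))
println(total_energy(scaled, descriptors))
```

The same pattern is what allows a neural network to be composed with an existing descriptor basis: the wrapper only needs to provide the methods that the generic workflow calls.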

Examples

See the AtomisticComposableWorkflows repository. It aims to gather easy-to-use CESMIX-aligned case studies, integrating the latest developments of the Julia atomistic ecosystem with state-of-the-art tools.

diff --git a/dev/objects.inv b/dev/objects.inv new file mode 100644 index 00000000..64be2f88 Binary files /dev/null and b/dev/objects.inv differ diff --git a/dev/search_index.js b/dev/search_index.js index ddc6e47e..902fc600 100644 --- a/dev/search_index.js +++ b/dev/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"api/#API-Reference","page":"API","title":"API Reference","text":"","category":"section"},{"location":"api/","page":"API","title":"API","text":"This page provides a list of all documented types and functions and in PotentialLearning.jl.","category":"page"},{"location":"api/","page":"API","title":"API","text":"Modules = [PotentialLearning]\nOrder = [:type, :function, :constant]","category":"page"},{"location":"api/#PotentialLearning.ActiveSubspace","page":"API","title":"PotentialLearning.ActiveSubspace","text":"ActiveSubspace{T<:Real} <: DimensionReducer\n Q :: Function \n ∇Q :: Function (gradient of Q)\n tol :: T\n\nUse the theory of active subspaces, with a given quantity of interest (expressed as the function Q) which takes a Configuration as an input and outputs a real scalar. ∇Q should input a Configuration and output an appropriate gradient. If tol is a float then the number of components to keep is determined by the smallest n such that relative percentage of variance explained by keeping the leading n principle components is greater than 1 - tol. 
If tol is an int, then we return the components corresponding to the tol largest eigenvalues.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.AtomicData","page":"API","title":"PotentialLearning.AtomicData","text":"AtomicData <: Data\n\nAbstract type declaring the type of information that is unique to a particular atom (instead of a whole configuration).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Configuration-Tuple{Vararg{Union{ConfigurationData, AtomsBase.FlexibleSystem}}}","page":"API","title":"PotentialLearning.Configuration","text":"Configuration(data::Union{AtomsBase.FlexibleSystem, ConfigurationData} )\n\nA Configuration is a data struct that contains information unique to a particular configuration of atoms (Energy, LocalDescriptors, ForceDescriptors, and a FlexibleSystem) in a dictionary. Example: '''julia e = Energy(-0.57, u\"eV\") ld = LocalDescriptors(...) c = Configuration(e, ld) '''\n\nConfigurations can be added together, which merges the data dictionaries '''julia c1 = Configuration(e) # Contains energy c2 = Configuration(f) # contains forces c = c1 + c2 # c <: Configuration, contains energy and forces '''\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.ConfigurationData","page":"API","title":"PotentialLearning.ConfigurationData","text":"ConfigurationData <: Data\n\nAbstract type declaring the type of data that is unique to a particular configuration (instead of just an atom).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.CorrelationMatrix","page":"API","title":"PotentialLearning.CorrelationMatrix","text":"CorrelationMatrix \n α :: Vector{Float64} # weights\n\nCorrelationMatrix produces a global descriptor that is the correlation matrix of the local descriptors. In other words, it is mean(bi'*bi for bi in B). 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.CovariateLinearProblem","page":"API","title":"PotentialLearning.CovariateLinearProblem","text":"struct CovariateLinearProblem{T<:Real} <: LinearProblem{T} e::Vector f::Vector{Vector{T}} B::Vector{Vector{T}} dB::Vector{Matrix{T}} β::Vector{T} β0::Vector{T} σe::Vector{T} σf::Vector{T} Σ::Symmetric{T,Matrix{T}} end\n\nA CovariateLinearProblem is a linear problem in which we are fitting energies and forces using both descriptors and their gradients (B and dB, respectively). When this is the case, the solution is not available analytically and must be computed using an iterative optimization procedure. In the end, we fit the model coefficients, β, standard deviations corresponding to energies and forces, σe and σf, and the covariance Σ. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DBSCANSelector","page":"API","title":"PotentialLearning.DBSCANSelector","text":"struct DBSCANSelector <: SubsetSelector\n clusters\n eps\n minpts\n sample_size\nend\n\nDefinition of the type DBSCANSelector, a subselector based on the clustering method DBSCAN.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DBSCANSelector-Tuple{DataSet, Any, Any, Any}","page":"API","title":"PotentialLearning.DBSCANSelector","text":"function DBSCANSelector(\n ds::DataSet,\n eps,\n minpts,\n sample_size\n)\n\nConstructor of DBSCANSelector based on the atomic configurations in ds, the DBSCAN params eps and minpts, and the sample size sample_size.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.Data","page":"API","title":"PotentialLearning.Data","text":"Data\n\nAbstract supertype of ConfigurationData.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DataBase","page":"API","title":"PotentialLearning.DataBase","text":"DataBase\n\nAbstract type for DataSets. 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DataSet","page":"API","title":"PotentialLearning.DataSet","text":"DataSet\n\nStruct that holds a vector of configurations. Most operations in PotentialLearning are built around the DataSet structure.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Distance","page":"API","title":"PotentialLearning.Distance","text":"Distance\n\nA struct of abstract type Distance produces the distance between two `global` descriptors, or features. Not all distances might be compatible with all types of features.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DotProduct","page":"API","title":"PotentialLearning.DotProduct","text":"DotProduct <: Kernel \n α :: Power of DotProduct kernel \n\n\nComputes the dot product kernel between two features, i.e.,\n\ncos(θ) = ( A ⋅ B / (||A||^2||B||^2) )^α\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Energy","page":"API","title":"PotentialLearning.Energy","text":"Energy <: ConfigurationData\n d :: Real\n u :: Unitful.FreeUnits\n\nConvenience struct that holds energy information (and corresponding units). Default unit is eV\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Euclidean","page":"API","title":"PotentialLearning.Euclidean","text":"Euclidean <: Distance \n Cinv :: Covariance Matrix \n\nComputes the squared euclidean distance with weight matrix Cinv, the inverse of some covariance matrix.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.ExtXYZ","page":"API","title":"PotentialLearning.ExtXYZ","text":"ExtXYZ <: IO\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Feature","page":"API","title":"PotentialLearning.Feature","text":"Feature\n\nA struct of abstract type Feature represents a function that takes in a set of local descriptors corresponding to some atomic environment and produces a global descriptor. 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Force","page":"API","title":"PotentialLearning.Force","text":"Force <: AtomicData \n f :: Vector{<:Real}\n u :: Unitful.FreeUnits\n\nContains the force with (x,y,z)-components in f with units u. Default unit is \"eV/Å\". \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.ForceDescriptor","page":"API","title":"PotentialLearning.ForceDescriptor","text":"ForceDescriptor <: AtomicData\n b :: Vector{<:Vector{<:Real}}\n\nContains the x,y,z components (out vector) of the force descriptor (inner vector).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.ForceDescriptors","page":"API","title":"PotentialLearning.ForceDescriptors","text":"ForceDescriptors <: ConfigurationData\n b :: Vector{ForceDescriptor}\n\nA container holding all of the ForceDescriptors for all atoms in a configuration.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Forces","page":"API","title":"PotentialLearning.Forces","text":"Forces <: ConfigurationData\n f :: Vector{force}\n\nForces is a struct that contains all force information in a configuration.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Forstner","page":"API","title":"PotentialLearning.Forstner","text":"Forstner <: Distance \n α :: Regularization parameter\n\nComputes the squared Forstner distance between two positive semi-definite matrices.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.GlobalMean","page":"API","title":"PotentialLearning.GlobalMean","text":" GlobalMean{T}\n\nGlobalMean produces the mean of the local descriptors.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.GlobalSum","page":"API","title":"PotentialLearning.GlobalSum","text":" GlobalSum{T}\n\nGlobalSum produces the sum of the local descriptors.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Kernel","page":"API","title":"PotentialLearning.Kernel","text":"Kernel\n\nA 
struct of abstract type Kernel is a function that takes in two features and produces a semi-definite scalar representing the similarity between the two features.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LAMMPS","page":"API","title":"PotentialLearning.LAMMPS","text":"struct LAMMPS <: IO\n elements :: Vector{Symbol}\n boundary_conditions :: Vector\nend\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LearningProblem","page":"API","title":"PotentialLearning.LearningProblem","text":"struct LearningProblem{T<:Real} <: AbstractLearningProblem ds::DataSet logprob::Function ∇logprob::Function params::Vector{T} end\n\nGeneric LearningProblem that allows the user to pass a logprob(y::params, ds::DataSet) function and its gradient. The gradient should return a vector of logprob with respect to its params. If the user does not have a gradient function available, then Flux can provide one for it (provided that logprob is of the form above).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LearningProblem-Union{Tuple{T}, Tuple{DataSet, Function, Vector{T}}} where T","page":"API","title":"PotentialLearning.LearningProblem","text":"function LearningProblem( ds::DataSet, logprob::Function, params::Vector{T} ) where {T}\n\nGeneric LearningProblem constructor.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.LinearProblem","page":"API","title":"PotentialLearning.LinearProblem","text":"abstract type LinearProblem{T<:Real} <: AbstractLearningProblem end\n\nAn abstract type to specify linear potential inference problems. 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LinearProblem-Tuple{DataSet}","page":"API","title":"PotentialLearning.LinearProblem","text":"function LinearProblem( ds::DataSet; T = Float64 )\n\nConstruct a LinearProblem by detecting if there are energy descriptors and/or force descriptors and construct the appropriate LinearProblem (either Univariate, if only a single type of descriptor, or Covariate, if there are both types).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.LocalDescriptor","page":"API","title":"PotentialLearning.LocalDescriptor","text":"LocalDescriptor <: AtomicData\n\nA vector corresponding to the descriptor for a particular atom's neighborhood.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LocalDescriptors","page":"API","title":"PotentialLearning.LocalDescriptors","text":"LocalDescriptors <: ConfigurationData\n\nA vector of LocalDescriptor, which now should represent all local descriptors for atoms in a configuration.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.PCA","page":"API","title":"PotentialLearning.PCA","text":"PCA <: DimensionReducer\n tol :: Float64\n\nUse SVD to compute the PCA of the design matrix of descriptors. (using Force descriptors TBA)\n\nIf tol is a float then the number of components to keep is determined by the smallest n such that the relative percentage of variance explained by keeping the leading n principal components is greater than 1 - tol. 
If tol is an int, then we return the components corresponding to the tol largest eigenvalues.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.RBF","page":"API","title":"PotentialLearning.RBF","text":"RBF <: Kernel \n d :: Distance function \n α :: Regularization parameter \n ℓ :: Length-scale parameter\n β :: Scale parameter\n\n\nComputes the squared exponential kernel, i.e.,\n\n k(A, B) = β \\exp( -\\frac{1}{2} d(A,B)/ℓ^2 ) + α δ(A, B)\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.RandomSelector","page":"API","title":"PotentialLearning.RandomSelector","text":"struct Random\n num_configs :: Int \n batch_size :: Int \nend\n\nA convenience function that allows the user to randomly select indices uniformly over [1, num_configs]. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.UnivariateLinearProblem","page":"API","title":"PotentialLearning.UnivariateLinearProblem","text":"struct UnivariateLinearProblem{T<:Real} <: LinearProblem{T} ivdata::Vector dvdata::Vector β::Vector{T} β0::Vector{T} σ::Vector{T} Σ::Symmetric{T,Matrix{T}} end\n\nA UnivariateLinearProblem is a linear problem in which there is only 1 type of independent variable / dependent variable. Typically, that means we are either only fitting energies or only fitting forces. When this is the case, the solution is available analytically and the standard deviation, σ, and covariance, Σ, of the coefficients, β, are computable. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.YAML","page":"API","title":"PotentialLearning.YAML","text":"YAML <: IO\n energy_units :: Unitful.FreeUnits\n distance_units :: Unitful.FreeUnits\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.kDPP","page":"API","title":"PotentialLearning.kDPP","text":"struct kDPP\n K :: EllEnsemble\nend\n\nA convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. 
All that is required to construct a kDPP is a similarity kernel, for which the user must provide a LinearProblem and two functions to compute descriptor (1) diversity and (2) quality. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.kDPP-Tuple{DataSet, Feature, Kernel}","page":"API","title":"PotentialLearning.kDPP","text":"kDPP(ds::DataSet, f::Feature, k::Kernel)\n\nA convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a dataset, a method to compute features, and a kernel. Optional arguments include batch size and type of descriptor (default LocalDescriptors).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.kDPP-Union{Tuple{T}, Tuple{Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Kernel}} where T","page":"API","title":"PotentialLearning.kDPP","text":"kDPP(features::Union{Vector{Vector{T}}, Vector{Symmetric{T, Matrix{T}}}}, k::Kernel)\n\nA convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP are features (either a vector of vector features or a vector of symmetric matrix features) and a kernel. 
Optional argument is batch_size (default length(features)).\n\n\n\n\n\n","category":"method"},{"location":"api/#InteratomicPotentials.compute_force_descriptors-Tuple{DataSet, InteratomicPotentials.BasisSystem}","page":"API","title":"InteratomicPotentials.compute_force_descriptors","text":"function compute_force_descriptors( ds::DataSet, basis::BasisSystem; pbar = true )\n\nCompute force descriptors of a basis system and dataset using threads.\n\n\n\n\n\n","category":"method"},{"location":"api/#InteratomicPotentials.compute_local_descriptors-Tuple{DataSet, InteratomicPotentials.BasisSystem}","page":"API","title":"InteratomicPotentials.compute_local_descriptors","text":"function compute_local_descriptors( ds::DataSet, basis::BasisSystem; pbar = true )\n\nds: dataset. basis: basis system (e.g. ACE). pbar: progress bar.\n\nCompute local descriptors of a basis system and dataset using threads.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Tuple{DataSet, DataSet, Feature, Kernel}","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(ds1::DataSet, ds2::DataSet, F::Feature, k::Kernel)\n\nCompute nonsymmetric kernel matrix K using features of the datasets ds1 and ds2 calculated using the Feature method F.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Tuple{DataSet, Feature, Kernel}","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(ds::DataSet, F::Feature, k::Kernel)\n\nCompute symmetric kernel matrix K using features of the dataset ds calculated using the Feature method F. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Union{Tuple{T}, Tuple{Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Kernel}} where T","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(F, k::Kernel)\n\nCompute symmetric kernel matrix K where K_{ij} = k(F_i, F_j). 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Union{Tuple{T}, Tuple{Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Kernel}} where T","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(F1, F2, k::Kernel)\n\nCompute non-symmetric kernel matrix K where K_{ij} = k(F1_i, F2_j). \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.calc_centroid-Tuple{Matrix{Float64}}","page":"API","title":"PotentialLearning.calc_centroid","text":"function calc_centroid(\n m::Array{Float64,2}\n)\n\nCalculate a centroid of a matrix.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.calc_metrics-Tuple{Any, Any}","page":"API","title":"PotentialLearning.calc_metrics","text":"calc_metrics(x_pred, x)\n\nx_pred: vector of predicted values of a variable. E.g. energy. x: vector of true values of a variable. E.g. energy.\n\nReturns MAE, RMSE, and RSQ.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_features-Tuple{DataSet, Feature}","page":"API","title":"PotentialLearning.compute_features","text":"compute_features(ds::DataSet, f::Feature; dt = LocalDescriptors)\n\nComputes features of the dataset ds using the feature method F on descriptors dt (default option are the LocalDescriptors, if available).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_kernel-Union{Tuple{T}, Tuple{T, T, RBF}} where T<:Union{LinearAlgebra.Symmetric{<:Real, <:Matrix{<:Real}}, Vector{<:Real}}","page":"API","title":"PotentialLearning.compute_kernel","text":"compute_kernel(A, B, k)\n\nCompute similarity kernel between features A and B using kernel k. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.distance_matrix_kabsch-Tuple{DataSet}","page":"API","title":"PotentialLearning.distance_matrix_kabsch","text":"function distance_matrix_kabsch(\n ds::DataSet\n)\n\nCalculate a matrix of distances between atomic configurations using KABSCH method.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.distance_matrix_periodic-Tuple{DataSet}","page":"API","title":"PotentialLearning.distance_matrix_periodic","text":"function distance_matrix_periodic(\n ds::DataSet\n)\n\nCalculates a matrix of distances between atomic configurations taking into account the periodic boundaries.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.fit","page":"API","title":"PotentialLearning.fit","text":"fit(ds::DataSet, dr::DimensionReducer)\n\nFits a linear dimension reduction routine using information from DataSet. See individual types of DimensionReducers for specific details.\n\n\n\n\n\n","category":"function"},{"location":"api/#PotentialLearning.fit-Tuple{DataSet, ActiveSubspace}","page":"API","title":"PotentialLearning.fit","text":"fit(ds::DataSet, as::ActiveSubspace)\n\nFits a linear dimension reduction routine using the eigendirections of the uncentered covariance of the function ∇Q(c::Configuration) over the configurations in ds. Primarily used to reduce the dimension of the descriptors.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.fit-Tuple{DataSet, PCA}","page":"API","title":"PotentialLearning.fit","text":"fit(ds::DataSet, pca::PCA)\n\nFits a linear dimension reduction routine using PCA on the global descriptors in the dataset ds. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.fit_transform-Tuple{DataSet, DimensionReducer}","page":"API","title":"PotentialLearning.fit_transform","text":"fit_transform(ds::DataSet, dr::DimensionReducer)\n\nFits a linear dimension reduction routine using information from DataSet and performs dimension reduction on descriptors and force_descriptors (whichever are available). See individual types of DimensionReducers for specific details.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.force-Tuple{Configuration, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.force","text":"function force( c::Configuration, bp::BasisPotential )\n\nc: atomic configuration. bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.force-Tuple{Configuration, InteratomicPotentials.NNBasisPotential}","page":"API","title":"PotentialLearning.force","text":"function force( c::Configuration, nnbp::NNBasisPotential )\n\nc: atomic configuration. nnbp: neural network basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_energies-Tuple{DataSet, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.get_all_energies","text":"function get_all_energies(\n ds::DataSet,\n bp::BasisPotential\n)\n\nds: dataset. bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_energies-Tuple{DataSet}","page":"API","title":"PotentialLearning.get_all_energies","text":"function get_all_energies( ds::DataSet )\n\nds: dataset.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_forces-Tuple{DataSet, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.get_all_forces","text":"function get_all_forces( ds::DataSet, bp::BasisPotential )\n\nds: dataset. 
bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_forces-Tuple{DataSet}","page":"API","title":"PotentialLearning.get_all_forces","text":"function get_all_forces(\n ds::DataSet\n)\n\nds: dataset.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_batches-NTuple{11, Any}","page":"API","title":"PotentialLearning.get_batches","text":"get_batches(n_batches, B_train, B_train_ext, e_train, dB_train, f_train,\n B_test, B_test_ext, e_test, dB_test, f_test)\n\nn_batches: no. of batches per dataset. B_train: descriptors of the energies used in training. B_train_ext: extended descriptors of the energies used in training. Required to compute forces. e_train: energies used in training. dB_train: derivatives of the energy descriptors used in training. f_train: forces used in training. B_test: descriptors of the energies used in test. B_test_ext: extended descriptors of the energies used in test. Required to compute forces. e_test: energies used in test. dB_test: derivatives of the energy descriptors used in test. f_test: forces used in test.\n\nReturns the data loaders for training and test of energies and forces.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_clusters-Tuple{Any, Any, Any}","page":"API","title":"PotentialLearning.get_clusters","text":"function get_clusters(\n ds,\n eps,\n minpts\n)\n\nComputes clusters from the configurations in ds using DBSCAN with parameters eps and minpts.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_dpp_mode-Tuple{kDPP}","page":"API","title":"PotentialLearning.get_dpp_mode","text":"get_dpp_mode(dpp::kDPP, batch_size::Int) <: Vector{Int64}\n\nAccess an approximate mode of the k-DPP as calculated by a greedy subset algorithm. 
See Determinantal.jl for details.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_energy-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_energy","text":"get_energy(c::Configuration) <: Energy\n\nRetrieves the energy (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_force_descriptors-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_force_descriptors","text":"get_force_descriptors(c::Configuration) <: ForceDescriptors\n\nRetrieves the force descriptors (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_forces-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_forces","text":"get_forces(c::Configuration) <: Forces\n\nRetrieves the forces (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_inclusion_prob-Tuple{kDPP}","page":"API","title":"PotentialLearning.get_inclusion_prob","text":"get_inclusion_prob(dpp::kDPP) <: Vector{Float64}\n\nAccess an approximation to the inclusion probabilities as calculated by Determinantal.jl (see package for details).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_input-Tuple{Any}","page":"API","title":"PotentialLearning.get_input","text":"get_input(args)\n\nargs: vector of arguments (strings)\n\nReturns an OrderedDict with the arguments. See https://github.com/cesmix-mit/AtomisticComposableWorkflows documentation for information about how to define the input arguments.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_local_descriptors-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_local_descriptors","text":"get_local_descriptors(c::Configuration) <: LocalDescriptors\n\nRetrieves the local descriptors (if available) in the Configuration c. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_metrics-NTuple{11, Any}","page":"API","title":"PotentialLearning.get_metrics","text":"get_metrics( e_train_pred, e_train, f_train_pred, f_train,\n e_test_pred, e_test, f_test_pred, f_test,\n B_time, dB_time, time_fitting)\n\ne_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. f_train_pred: vector of predicted training force values. f_train: vector of true training force values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values. f_test_pred: vector of predicted test force values. f_test: vector of true test force values. B_time: elapsed time consumed by descriptors calculation. dB_time: elapsed time consumed by descriptor derivatives calculation. time_fitting: elapsed time consumed by fitting process.\n\nComputes MAE, RMSE, and RSQ for training and testing energies and forces. Also add elapsed times about descriptors and fitting calculations. Returns an OrderedDict with the information above.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_metrics-NTuple{4, Any}","page":"API","title":"PotentialLearning.get_metrics","text":"get_metrics( e_train_pred, e_train, e_test_pred, e_test)\n\ne_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values.\n\nComputes MAE, RMSE, and RSQ for training and testing energies. Returns an OrderedDict with the information above.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_metrics-Tuple{Any, Any}","page":"API","title":"PotentialLearning.get_metrics","text":"get_metrics(\n x_pred,\n x;\n metrics = [mae, rmse, rsq],\n label = \"x\"\n)\n\nx_pred: vector of predicted forces, x: vector of true forces. metrics: vector of metrics. 
label: label used as prefix in dictionary keys.\n\nReturns an OrderedDict with different metrics.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_positions-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_positions","text":"get_positions(c::Configuration) <: Vector{SVector}\n\nRetrieves the AtomsBase system positions (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_random_subset","page":"API","title":"PotentialLearning.get_random_subset","text":"get_random_subset(r::Random, batch_size :: Int) <: Vector{Int64}\n\nAccess a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.\n\n\n\n\n\n","category":"function"},{"location":"api/#PotentialLearning.get_random_subset-2","page":"API","title":"PotentialLearning.get_random_subset","text":"function get_random_subset(\n s::DBSCANSelector,\n batch_size = s.sample_size\n)\n\nReturns a random subset of indexes composed of samples of size batch_size ÷ length(s.clusters) from each cluster in s.\n\n\n\n\n\n","category":"function"},{"location":"api/#PotentialLearning.get_random_subset-Tuple{kDPP}","page":"API","title":"PotentialLearning.get_random_subset","text":"get_random_subset(dpp::kDPP, batch_size :: Int) <: Vector{Int64}\n\nAccess a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_system-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_system","text":"get_system(c::Configuration) <: AtomsBase.AbstractSystem\n\nRetrieves the AtomsBase system (if available) in the Configuration c. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_values-Tuple{Energy}","page":"API","title":"PotentialLearning.get_values","text":"get_values(e::Energy) <: Real\n\nGet the underlying real value (= e.d)\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_values-Tuple{StaticArraysCore.SVector}","page":"API","title":"PotentialLearning.get_values","text":"get_values(v::SVector)\n\nRemoves units from a position.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.kabsch-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.kabsch","text":"function kabsch(\n reference::Array{Float64,2},\n coords::Array{Float64,2}\n)\n\nInput: two sets of points: reference, coords as Nx3 Matrices (so) Returns optimally rotated matrix \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.kabsch_rmsd-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.kabsch_rmsd","text":"function kabsch_rmsd(\n P::Array{Float64,2},\n Q::Array{Float64,2}\n)\n\nDirectly return RMSD for matrices P, Q for convenience.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{InteratomicPotentials.LinearBasisPotential, DataSet, Vararg{Any}}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( iap::InteratomicPotentials.LinearBasisPotential, ds::DataSet, args... )\n\nLearning dispatch function, common to ordinary and weghted least squares implementations.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.CovariateLinearProblem, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::CovariateLinearProblem, α::Real )\n\nFit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5(e - A_e *β)'(e - Ae * β) / σe - 0.5*(f - Af β)'(f - A_f * β) / σf - log(σe) - log(σf)\n\nthrough an optimization procedure. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.CovariateLinearProblem, SubsetSelector, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::CovariateLinearProblem, ss::SubsetSelector, α::Real; num_steps=100, opt=Flux.Optimise.Adam() )\n\nFit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5(e - A_e *β)'(e - Ae * β) / σe - 0.5*(f - Af β)'(f - A_f * β) / σf - log(σe) - log(σf)\n\nthrough an iterative batch gradient descent optimization proceedure where the batches are provided by the subset selector. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.CovariateLinearProblem, Vector, Bool}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::CovariateLinearProblem, ws::Vector, int::Bool )\n\nFit energies and forces using weighted least squares.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.LearningProblem, SubsetSelector}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::LearningProblem, ss::SubsetSelector; num_steps = 100::Int, opt = Flux.Optimisers.Adam() )\n\nAttempts to fit the parameters lp.params in the learning problem lp using batch gradient descent with the optimizer opt and num_steps number of iterations. Batching is provided by the passed ss::SubsetSelector. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.LearningProblem}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::LearningProblem; num_steps=100::Int, opt=Flux.Optimisers.Adam() )\n\nAttempts to fit the parameters lp.params in the learning problem lp using gradient descent with the optimizer opt and num_steps number of iterations.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.LinearProblem}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::LinearProblem )\n\nDefault learning problem: weighted least squares.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.UnivariateLinearProblem, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::UnivariateLinearProblem, α::Real )\n\nFit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via SVD on the design matrix, A'*A (formed iteratively), where eigenvalues less than α are cut-off. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.UnivariateLinearProblem, SubsetSelector, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::UnivariateLinearProblem, ss::SubsetSelector, α::Real; num_steps = 100, opt = Flux.Optimise.Adam() )\n\nFit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via batched gradient descent with batches provided by the subset selector and the gradients are calculated using Flux. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.UnivariateLinearProblem, Vector, Bool}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::UnivariateLinearProblem, ws::Vector, int::Bool )\n\nFit energies using weighted least squares.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.linearize_forces-Tuple{Any}","page":"API","title":"PotentialLearning.linearize_forces","text":"linearize_forces(forces)\n\nforces: vector of forces per system\n\nReturns a vector with the components of the forces of the systems.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.load_data-Tuple{Any, ExtXYZ}","page":"API","title":"PotentialLearning.load_data","text":"load_data(file::string, extxyz::ExtXYZ)\nLoad configuration from an extxyz file into a DataSet\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.load_data-Tuple{String, YAML}","page":"API","title":"PotentialLearning.load_data","text":"load_data(file::string, yaml::YAML)\n\nLoad configurations from a yaml file into a Vector of Flexible Systems, with Energies and Force.\nReturns \n ds - DataSet\n t = Vector{Dict} (any miscellaneous info from yaml file)\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.load_datasets-Tuple{Any}","page":"API","title":"PotentialLearning.load_datasets","text":"load_datasets(input)\n\ninput: OrderedDict with input arguments. See get_defaults_args().\n\nReturns training and test systems, energies, forces, and stresses.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.mae-Tuple{Any, Any}","page":"API","title":"PotentialLearning.mae","text":"mae(x_pred, x)\n\nx_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. 
DFT energies.\n\nReturns mean absolute error.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.mean_cos-Tuple{Any, Any}","page":"API","title":"PotentialLearning.mean_cos","text":"mean_cos(x_pred, x)\n\nx_pred: vector of predicted forces, x: vector of true forces.\n\nReturns mean cosine.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.periodic_rmsd-Tuple{Matrix{Float64}, Matrix{Float64}, Vector{Float64}}","page":"API","title":"PotentialLearning.periodic_rmsd","text":"function periodic_rmsd(\n p1::Array{Float64,2},\n p2::Array{Float64,2},\n box_lengths::Array{Float64,1}\n)\n\nCalculates the RMSD between atom positions of two configurations taking into account the periodic boundaries.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.potential_energy-Tuple{Configuration, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.potential_energy","text":"function potential_energy( c::Configuration, bp::BasisPotential )\n\nc: atomic configuration. bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.potential_energy-Tuple{Configuration, InteratomicPotentials.NNBasisPotential}","page":"API","title":"PotentialLearning.potential_energy","text":"function potential_energy( c::Configuration, nnbp::NNBasisPotential )\n\nc: atomic configuration. nnbp: neural network basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.rmsd-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.rmsd","text":"function rmsd(\n A::Array{Float64,2},\n B::Array{Float64,2}\n)\n\nCalculate root mean square deviation of two matrices A, B. See http://en.wikipedia.org/wiki/Root-mean-squaredeviationofatomicpositions\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.rmse-Tuple{Any, Any}","page":"API","title":"PotentialLearning.rmse","text":"rmse(x_pred, x)\n\nx_pred: vector of predicted values. E.g. 
predicted energies. x: vector of true values. E.g. DFT energies.\n\nReturns root mean square error.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.rsq-Tuple{Any, Any}","page":"API","title":"PotentialLearning.rsq","text":"rsq(x_pred, x)\n\nx_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.\n\nReturns R-squared.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.sample-Tuple{Any, Any}","page":"API","title":"PotentialLearning.sample","text":"function sample(\n c,\n batch_size\n)\n\nSelect from cluster c a sample of size batch_size.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.to_num-Tuple{Any}","page":"API","title":"PotentialLearning.to_num","text":"to_num(str)\n\nstr: string with a number: integer or float\n\nReturns an integer or float.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.translate_points-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.translate_points","text":"function translate_points(\n P::Array{Float64,2},\n Q::Array{Float64,2}\n)\n\nTranslate P and Q so that both centroids lie at the origin of the coordinate system.\n\n\n\n\n\n","category":"method"},{"location":"#[WIP]-PotentialLearning.jl","page":"Home","title":"[WIP] PotentialLearning.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"An open source Julia library for active learning of interatomic potentials in atomistic simulations of materials. It incorporates elements of Bayesian inference, machine learning, differentiable programming, software composability, and high-performance computing. 
This package is part of a software suite developed for the CESMIX project.","category":"page"},{"location":"#Specific-goals","page":"Home","title":"Specific goals","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Intelligent data subsampling: iteratively query a large pool of unlabeled data to extract a minimum number of training data that would lead to a supervised ML model with superior accuracy compared to a training model with educated handpicking.\nVia DPP, clustering.\nQuantity of Interest based dimension reduction through the theory of Active Subspaces.\nInference of the optimal values and uncertainties of the model parameters, to propagate them through the atomistic simulation.\nInteratomic potential hyper-parameter optimization. E.g. estimation of the optimum cutoff radius.\nInteratomic potential fitting. The potentials addressed in this package are defined in InteratomicPotentials.jl and InteratomicBasisPotentials.jl. E.g. ACE, SNAP, Neural Network Potentials.\nMeasurement of QoI sensitivity to individual parameters. \nInput data management and post-processing.\nProcess input data so that it is ready for training. E.g. read XYZ file with atomic configurations, linearize energies and forces, split dataset into training and testing, normalize data, transfer data to GPU, define iterators, etc.\nPost-processing: computation of different metrics (MAE, RSQ, COV, etc), saving results, and plotting.","category":"page"},{"location":"#Leveraging-Julia!","page":"Home","title":"Leveraging Julia!","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Software composability through multiple dispatch. A series of composable workflows is guiding our design and development. We analyzed three of the most representative workflows: classical molecular dynamics (MD), Ab initio MD, and classical MD with active learning. 
In addition, it facilitates the training of new potentials defined by the composition of neural networks with state-of-the-art interatomic potential descriptors.\nDifferentiable programming. Powerful automatic differentiation tools, such as Enzyme or Zygote, help to accelerate the development of new interatomic potentials by automatically calculating loss function gradients and forces.\nSciML: Open Source Software for Scientific Machine Learning. It provides libraries, such as Optimization.jl, that bring together several optimization packages into one unified Julia interface. \nMachine learning and HPC abstractions: Flux.jl makes parallel learning simple using the NVIDIA GPU abstractions of CUDA.jl. Mini-batch iterations on heterogeneous data, as required by a loss function based on energies and forces, can be handled by DataLoader.jl.","category":"page"},{"location":"#Examples","page":"Home","title":"Examples","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"See AtomisticComposableWorkflows repository. 
It aims to gather easy-to-use CESMIX-aligned case studies, integrating the latest developments of the Julia atomistic ecosystem with state-of-the-art tools.","category":"page"},{"location":"how-to-run-the-examples/#How-to-run-the-examples","page":"How to run the examples","title":"How to run the examples","text":"","category":"section"},{"location":"how-to-run-the-examples/#Add-registries","page":"How to run the examples","title":"Add registries","text":"","category":"section"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Open a Julia REPL ($ julia), type ] to enter the Pkg REPL, and add the following registries:","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" pkg> registry add https://github.com/JuliaRegistries/General\n pkg> registry add https://github.com/cesmix-mit/CESMIX.git \n pkg> registry add https://github.com/JuliaMolSim/MolSim.git\n pkg> registry add https://github.com/ACEsuit/ACEregistry","category":"page"},{"location":"how-to-run-the-examples/#Install-the-dependencies-of-the-examples-folder-project","page":"How to run the examples","title":"Install the dependencies of the examples folder project","text":"","category":"section"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Clone PotentialLearning.jl repository in your working directory.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" $ git clone git@github.com:cesmix-mit/PotentialLearning.jl.git","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Open a Julia REPL activating the examples folder project.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the 
examples","text":" $ julia --project=PotentialLearning.jl/examples","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Type ] to enter the Pkg REPL and instantiate.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" pkg> instantiate","category":"page"},{"location":"how-to-run-the-examples/#Run-an-example","page":"How to run the examples","title":"Run an example","text":"","category":"section"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Access to any folder within PotentialLearning.jl/examples. E.g.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" $ cd PotentialLearning.jl/examples/ACE","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Open a Julia REPL, activate the examples folder project, and define the number of threads.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" $ julia --project=../ --threads=4","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Finally, include the example file.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" julia> include(\"fit-ace.jl\")","category":"page"}] +[{"location":"api/#API-Reference","page":"API","title":"API Reference","text":"","category":"section"},{"location":"api/","page":"API","title":"API","text":"This page provides a list of all documented types and functions and in PotentialLearning.jl.","category":"page"},{"location":"api/","page":"API","title":"API","text":"Modules = 
[PotentialLearning]\nOrder = [:type, :function, :constant]","category":"page"},{"location":"api/#PotentialLearning.ActiveSubspace","page":"API","title":"PotentialLearning.ActiveSubspace","text":"ActiveSubspace{T<:Real} <: DimensionReducer\n Q :: Function \n ∇Q :: Function (gradient of Q)\n tol :: T\n\nUse the theory of active subspaces, with a given quantity of interest (expressed as the function Q) which takes a Configuration as an input and outputs a real scalar. ∇Q should input a Configuration and output an appropriate gradient. If tol is a float then the number of components to keep is determined by the smallest n such that relative percentage of variance explained by keeping the leading n principle components is greater than 1 - tol. If tol is an int, then we return the components corresponding to the tol largest eigenvalues.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.AtomicData","page":"API","title":"PotentialLearning.AtomicData","text":"AtomicData <: Data\n\nAbstract type declaring the type of information that is unique to a particular atom (instead of a whole configuration).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Configuration-Tuple{Vararg{Union{ConfigurationData, AtomsBase.FlexibleSystem}}}","page":"API","title":"PotentialLearning.Configuration","text":"Configuration(data::Union{AtomsBase.FlexibleSystem, ConfigurationData} )\n\nA Configuration is a data struct that contains information unique to a particular configuration of atoms (Energy, LocalDescriptors, ForceDescriptors, and a FlexibleSystem) in a dictionary. Example: '''julia e = Energy(-0.57, u\"eV\") ld = LocalDescriptors(...) 
c = Configuration(e, ld) '''\n\nConfigurations can be added together, which merges the data dictionaries '''julia c1 = Configuration(e) # Contains energy c2 = Configuration(f) # contains forces c = c1 + c2 # c <: Configuration, contains energy and forces '''\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.ConfigurationData","page":"API","title":"PotentialLearning.ConfigurationData","text":"ConfigurationData <: Data\n\nAbstract type declaring the type of data that is unique to a particular configuration (instead of just an atom).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.CorrelationMatrix","page":"API","title":"PotentialLearning.CorrelationMatrix","text":"CorrelationMatrix \n α :: Vector{Float64} # weights\n\nCorrelationMatrix produces a global descriptor that is the correlation matrix of the local descriptors. In other words, it is mean(bi'*bi for bi in B). \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.CovariateLinearProblem","page":"API","title":"PotentialLearning.CovariateLinearProblem","text":"struct CovariateLinearProblem{T<:Real} <: LinearProblem{T} e::Vector f::Vector{Vector{T}} B::Vector{Vector{T}} dB::Vector{Matrix{T}} β::Vector{T} β0::Vector{T} σe::Vector{T} σf::Vector{T} Σ::Symmetric{T,Matrix{T}} end\n\nA CovariateLinearProblem is a linear problem in which we are fitting energies and forces using both descriptors and their gradients (B and dB, respectively). When this is the case, the solution is not available analytically and must be solved using some iterative optimization proceedure. In the end, we fit the model coefficients, β, standard deviations corresponding to energies and forces, σe and σf, and the covariance Σ. 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DBSCANSelector","page":"API","title":"PotentialLearning.DBSCANSelector","text":"struct DBSCANSelector <: SubsetSelector\n clusters\n eps\n minpts\n sample_size\nend\n\nDefinition of the type DBSCANSelector, a subselector based on the clustering method DBSCAN.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DBSCANSelector-Tuple{DataSet, Any, Any, Any}","page":"API","title":"PotentialLearning.DBSCANSelector","text":"function DBSCANSelector(\n ds::DataSet,\n eps,\n minpts,\n sample_size\n)\n\nConstructor of DBSCANSelector based on the atomic configurations in ds, the DBSCAN params eps and minpts, and the sample size sample_size.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.Data","page":"API","title":"PotentialLearning.Data","text":"Data\n\nAbstract supertype of ConfigurationData.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DataBase","page":"API","title":"PotentialLearning.DataBase","text":"DataBase\n\nAbstract type for DataSets. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DataSet","page":"API","title":"PotentialLearning.DataSet","text":"DataSet\n\nStruct that holds vector of configuration. Most operations in PotentialLearning are built around the DataSet structure.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Distance","page":"API","title":"PotentialLearning.Distance","text":"Distance\n\nA struct of abstract type Distance produces the distance between two `global` descriptors, or features. Not all distances might be compatible with all types of features.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Divergence","page":"API","title":"PotentialLearning.Divergence","text":"Divergence\n\nA struct of abstract type Divergence produces a measure of discrepancy between two probability distributions. 
Discrepancies may take as arguments analytical distributions or sets of samples representing empirical distributions.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.DotProduct","page":"API","title":"PotentialLearning.DotProduct","text":"DotProduct <: Kernel \n α :: Power of DotProduct kernel \n\n\nComputes the dot product kernel between two features, i.e.,\n\ncos(θ) = ( A ⋅ B / (||A||^2||B||^2) )^α\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Energy","page":"API","title":"PotentialLearning.Energy","text":"Energy <: ConfigurationData\n d :: Real\n u :: Unitful.FreeUnits\n\nConvenience struct that holds energy information (and corresponding units). Default unit is eV.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Euclidean","page":"API","title":"PotentialLearning.Euclidean","text":"Euclidean <: Distance \n Cinv :: Covariance Matrix \n\nComputes the squared Euclidean distance with weight matrix Cinv, the inverse of some covariance matrix.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.ExtXYZ","page":"API","title":"PotentialLearning.ExtXYZ","text":"ExtXYZ <: IO\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Feature","page":"API","title":"PotentialLearning.Feature","text":"Feature\n\nA struct of abstract type Feature represents a function that takes in a set of local descriptors corresponding to some atomic environment and produces a global descriptor. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Force","page":"API","title":"PotentialLearning.Force","text":"Force <: AtomicData \n f :: Vector{<:Real}\n u :: Unitful.FreeUnits\n\nContains the force with (x,y,z)-components in f with units u. Default unit is \"eV/Å\". 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.ForceDescriptor","page":"API","title":"PotentialLearning.ForceDescriptor","text":"ForceDescriptor <: AtomicData\n b :: Vector{<:Vector{<:Real}}\n\nContains the x,y,z components (out vector) of the force descriptor (inner vector).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.ForceDescriptors","page":"API","title":"PotentialLearning.ForceDescriptors","text":"ForceDescriptors <: ConfigurationData\n b :: Vector{ForceDescriptor}\n\nA container holding all of the ForceDescriptors for all atoms in a configuration.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Forces","page":"API","title":"PotentialLearning.Forces","text":"Forces <: ConfigurationData\n f :: Vector{force}\n\nForces is a struct that contains all force information in a configuration.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Forstner","page":"API","title":"PotentialLearning.Forstner","text":"Forstner <: Distance \n α :: Regularization parameter\n\nComputes the squared Forstner distance between two positive semi-definite matrices.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.GlobalMean","page":"API","title":"PotentialLearning.GlobalMean","text":" GlobalMean{T}\n\nGlobalMean produces the mean of the local descriptors.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.GlobalSum","page":"API","title":"PotentialLearning.GlobalSum","text":" GlobalSum{T}\n\nGlobalSum produces the sum of the local descriptors.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.InverseMultiquadric","page":"API","title":"PotentialLearning.InverseMultiquadric","text":"InverseMultiquadric <: Kernel \n d :: Distance function \n c2 :: Squared constant parameter\n ℓ :: Length-scale parameter\n\nComputes the inverse multiquadric (IMQ) kernel, i.e.,\n\n k(A, B) = (c^2 + 
d(A,B)/β^2)^{-1/2}\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.Kernel","page":"API","title":"PotentialLearning.Kernel","text":"Kernel\n\nA struct of abstract type Kernel is function that takes in two features and produces a semi-definite scalar representing the similarity between the two features.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.KernelSteinDiscrepancy","page":"API","title":"PotentialLearning.KernelSteinDiscrepancy","text":"KernelSteinDiscrepancy <: Divergence\n score :: Function\n knl :: Kernel\n\nComputes the kernel Stein discrepancy between distributions p (from which samples are provided) and q (for which the score is provided) based on the RKHS defined by kernel k.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LAMMPS","page":"API","title":"PotentialLearning.LAMMPS","text":"struct LAMMPS <: IO\n elements :: Vector{Symbol}\n boundary_conditions :: Vector\nend\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LearningProblem","page":"API","title":"PotentialLearning.LearningProblem","text":"struct LearningProblem{T<:Real} <: AbstractLearningProblem ds::DataSet logprob::Function ∇logprob::Function params::Vector{T} end\n\nGeneric LearningProblem that allows the user to pass a logprob(y::params, ds::DataSet) function and its gradient. The gradient should return a vector of logprob with respect to it's params. 
If the user does not have a gradient function available, then Flux can provide one for it (provided that logprob is of the form above).\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LearningProblem-Union{Tuple{T}, Tuple{DataSet, Function, Vector{T}}} where T","page":"API","title":"PotentialLearning.LearningProblem","text":"function LearningProblem( ds::DataSet, logprob::Function, params::Vector{T} ) where {T}\n\nGeneric LearningProblem constructor.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.LinearProblem","page":"API","title":"PotentialLearning.LinearProblem","text":"abstract type LinearProblem{T<:Real} <: AbstractLearningProblem end\n\nAn abstract type to specify linear potential inference problems. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LinearProblem-Tuple{DataSet}","page":"API","title":"PotentialLearning.LinearProblem","text":"function LinearProblem( ds::DataSet; T = Float64 )\n\nConstruct a LinearProblem by detecting if there are energy descriptors and/or force descriptors and construct the appropriate LinearProblem (either Univariate, if only a single type of descriptor, or Covariate, if there are both types).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.LocalDescriptor","page":"API","title":"PotentialLearning.LocalDescriptor","text":"LocalDescriptor <: AtomicData\n\nA vector corresponding to the descriptor for a particular atom's neighborhood.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.LocalDescriptors","page":"API","title":"PotentialLearning.LocalDescriptors","text":"LocalDescriptors <: ConfigurationData\n\nA vector of LocalDescriptor, which now should represent all local descriptors for atoms in a configuration.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.PCA","page":"API","title":"PotentialLearning.PCA","text":"PCA <: DimensionReducer\n tol :: Float64\n\nUse SVD to compute the PCA of the design matrix of 
descriptors. (using Force descriptors TBA)\n\nIf tol is a float then the number of components to keep is determined by the smallest n such that the relative percentage of variance explained by keeping the leading n principal components is greater than 1 - tol. If tol is an int, then we return the components corresponding to the tol largest eigenvalues.\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.RBF","page":"API","title":"PotentialLearning.RBF","text":"RBF <: Kernel \n d :: Distance function \n α :: Regularization parameter \n ℓ :: Length-scale parameter\n β :: Scale parameter\n\n\nComputes the squared exponential kernel, i.e.,\n\n k(A, B) = β \\exp( -\\frac{1}{2} d(A,B)/ℓ^2 ) + α δ(A, B)\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.RandomSelector","page":"API","title":"PotentialLearning.RandomSelector","text":"struct Random\n num_configs :: Int \n batch_size :: Int \nend\n\nA convenience function that allows the user to randomly select indices uniformly over [1, num_configs]. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.UnivariateLinearProblem","page":"API","title":"PotentialLearning.UnivariateLinearProblem","text":"struct UnivariateLinearProblem{T<:Real} <: LinearProblem{T} ivdata::Vector dvdata::Vector β::Vector{T} β0::Vector{T} σ::Vector{T} Σ::Symmetric{T,Matrix{T}} end\n\nA UnivariateLinearProblem is a linear problem in which there is only 1 type of independent variable / dependent variable. Typically, that means we are either only fitting energies or only fitting forces. When this is the case, the solution is available analytically and the standard deviation, σ, and covariance, Σ, of the coefficients, β, are computable. 
\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.YAML","page":"API","title":"PotentialLearning.YAML","text":"YAML <: IO\n energy_units :: Unitful.FreeUnits\n distance_units :: Unitful.FreeUnits\n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.kDPP","page":"API","title":"PotentialLearning.kDPP","text":"struct kDPP\n K :: EllEnsemble\nend\n\nA convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a similarity kernel, for which the user must provide a LinearProblem and two functions to compute descriptor (1) diversity and (2) quality. \n\n\n\n\n\n","category":"type"},{"location":"api/#PotentialLearning.kDPP-Tuple{DataSet, Feature, Kernel}","page":"API","title":"PotentialLearning.kDPP","text":"kDPP(ds::Dataset, f::Feature, k::Kernel)\n\nA convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a dataset, a method to compute features, and a kernel. Optional arguments include batch size and type of descriptor (default LocalDescriptors).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.kDPP-Union{Tuple{T}, Tuple{Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Kernel}} where T","page":"API","title":"PotentialLearning.kDPP","text":"kDPP(features::Union{Vector{Vector{T}}, Vector{Symmetric{T, Matrix{T}}}}, k::Kernel)\n\nA convenience function that allows the user access to a k-Determinantal Point Process through Determinantaljl. All that is required to construct a kDPP are features (either a vector of vector features or a vector of symmetric matrix features) and a kernel. 
Optional argument is batch_size (default length(features)).\n\n\n\n\n\n","category":"method"},{"location":"api/#InteratomicPotentials.compute_force_descriptors-Tuple{DataSet, InteratomicPotentials.BasisSystem}","page":"API","title":"InteratomicPotentials.compute_force_descriptors","text":"function compute_force_descriptors( ds::DataSet, basis::BasisSystem; pbar = true )\n\nCompute force descriptors of a basis system and dataset using threads.\n\n\n\n\n\n","category":"method"},{"location":"api/#InteratomicPotentials.compute_local_descriptors-Tuple{DataSet, InteratomicPotentials.BasisSystem}","page":"API","title":"InteratomicPotentials.compute_local_descriptors","text":"function compute_local_descriptors( ds::DataSet, basis::BasisSystem; pbar = true )\n\nds: dataset. basis: basis system (e.g. ACE). pbar: progress bar.\n\nCompute local descriptors of a basis system and dataset using threads.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Tuple{DataSet, DataSet, Feature, Kernel}","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(ds1::DataSet, ds2::DataSet, F::Feature, k::Kernel)\n\nCompute nonsymmetric kernel matrix K using features of the datasets ds1 and ds2 calculated using the Feature method F.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Tuple{DataSet, Feature, Kernel}","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(ds::DataSet, F::Feature, k::Kernel)\n\nCompute symmetric kernel matrix K using features of the dataset ds calculated using the Feature method F. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Union{Tuple{T}, Tuple{Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Kernel}} where T","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(F, k::Kernel)\n\nCompute symmetric kernel matrix K where K_{ij} = k(F_i, F_j).
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.KernelMatrix-Union{Tuple{T}, Tuple{Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Union{Array{Vector{T}, 1}, Array{LinearAlgebra.Symmetric{T, Matrix{T}}, 1}}, Kernel}} where T","page":"API","title":"PotentialLearning.KernelMatrix","text":"KernelMatrix(F1, F2, k::Kernel)\n\nCompute non-symmetric kernel matrix K where K_{ij} = k(F1_i, F2_j). \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.calc_centroid-Tuple{Matrix{Float64}}","page":"API","title":"PotentialLearning.calc_centroid","text":"function calc_centroid(\n m::Array{Float64,2}\n)\n\nCalculate a centroid of a matrix.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.calc_metrics-Tuple{Any, Any}","page":"API","title":"PotentialLearning.calc_metrics","text":"calc_metrics(x_pred, x)\n\nx_pred: vector of predicted values of a variable. E.g. energy. x: vector of true values of a variable. E.g. energy.\n\nReturns MAE, RMSE, and RSQ.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_distance-Union{Tuple{T}, Tuple{Vector{T}, Vector{T}, Euclidean}} where T<:Real","page":"API","title":"PotentialLearning.compute_distance","text":"compute_distance(A, B, d)\n\nCompute the distance between features A and B using distance metric d.
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_features-Tuple{DataSet, Feature}","page":"API","title":"PotentialLearning.compute_features","text":"compute_features(ds::DataSet, f::Feature; dt = LocalDescriptors)\n\nComputes features of the dataset ds using the feature method f on descriptors dt (the default is LocalDescriptors, if available).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_gradx_distance-Union{Tuple{T}, Tuple{T, T, Euclidean}} where T<:(Vector{<:Real})","page":"API","title":"PotentialLearning.compute_gradx_distance","text":"compute_gradx_distance(A, B, d)\n\nCompute gradient of the distance between features A and B using distance metric d, with respect to the first argument (A). \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_gradx_kernel-Union{Tuple{T}, Tuple{T, T, RBF}} where T<:(Vector{<:Real})","page":"API","title":"PotentialLearning.compute_gradx_kernel","text":"compute_gradx_kernel(A, B, k)\n\nCompute gradient of the kernel between features A and B using kernel k, with respect to the first argument (A). \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_gradxy_distance-Union{Tuple{T}, Tuple{T, T, Euclidean}} where T<:(Vector{<:Real})","page":"API","title":"PotentialLearning.compute_gradxy_distance","text":"compute_gradxy_distance(A, B, d)\n\nCompute second-order cross derivative of the distance between features A and B using distance metric d. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_gradxy_kernel-Union{Tuple{T}, Tuple{T, T, RBF}} where T<:(Vector{<:Real})","page":"API","title":"PotentialLearning.compute_gradxy_kernel","text":"compute_gradxy_kernel(A, B, k)\n\nCompute the second-order cross derivative of the kernel between features A and B using kernel k.
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_grady_distance-Union{Tuple{T}, Tuple{T, T, Euclidean}} where T<:(Vector{<:Real})","page":"API","title":"PotentialLearning.compute_grady_distance","text":"compute_grady_distance(A, B, d)\n\nCompute gradient of the distance between features A and B using distance metric d, with respect to the second argument (B). \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_grady_kernel-Union{Tuple{T}, Tuple{T, T, RBF}} where T<:(Vector{<:Real})","page":"API","title":"PotentialLearning.compute_grady_kernel","text":"compute_grady_kernel(A, B, k)\n\nCompute gradient of the kernel between features A and B using kernel k, with respect to the second argument (B). \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.compute_kernel-Union{Tuple{T}, Tuple{T, T, RBF}} where T<:Union{LinearAlgebra.Symmetric{<:Real, <:Matrix{<:Real}}, Vector{<:Real}}","page":"API","title":"PotentialLearning.compute_kernel","text":"compute_kernel(A, B, k)\n\nCompute similarity kernel between features A and B using kernel k. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.distance_matrix_kabsch-Tuple{DataSet}","page":"API","title":"PotentialLearning.distance_matrix_kabsch","text":"function distance_matrix_kabsch(\n ds::DataSet\n)\n\nCalculate a matrix of distances between atomic configurations using KABSCH method.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.distance_matrix_periodic-Tuple{DataSet}","page":"API","title":"PotentialLearning.distance_matrix_periodic","text":"function distance_matrix_periodic(\n ds::DataSet\n)\n\nCalculates a matrix of distances between atomic configurations taking into account the periodic boundaries.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.fit","page":"API","title":"PotentialLearning.fit","text":"fit(ds::DataSet, dr::DimensionReducer)\n\nFits a linear dimension reduction routine using information from DataSet. See individual types of DimensionReducers for specific details.\n\n\n\n\n\n","category":"function"},{"location":"api/#PotentialLearning.fit-Tuple{DataSet, ActiveSubspace}","page":"API","title":"PotentialLearning.fit","text":"fit(ds::DataSet, as::ActiveSubspace)\n\nFits a linear dimension reduction routine using the eigendirections of the uncentered covariance of the function ∇Q(c::Configuration) over the configurations in ds. Primarily used to reduce the dimension of the descriptors.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.fit-Tuple{DataSet, PCA}","page":"API","title":"PotentialLearning.fit","text":"fit(ds::DataSet, pca::PCA)\n\nFits a linear dimension reduction routine using PCA on the global descriptors in the dataset ds. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.fit_transform-Tuple{DataSet, DimensionReducer}","page":"API","title":"PotentialLearning.fit_transform","text":"fit_transform(ds::DataSet, dr::DimensionReducer)\n\nFits a linear dimension reduction routine using information from DataSet and performs dimension reduction on descriptors and force_descriptors (whichever are available). See individual types of DimensionReducers for specific details.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.force-Tuple{Configuration, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.force","text":"function force( c::Configuration, bp::BasisPotential )\n\nc: atomic configuration. bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.force-Tuple{Configuration, InteratomicPotentials.NNBasisPotential}","page":"API","title":"PotentialLearning.force","text":"function force( c::Configuration, nnbp::NNBasisPotential )\n\nc: atomic configuration. nnbp: neural network basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_energies-Tuple{DataSet, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.get_all_energies","text":"function get_all_energies(\n ds::DataSet,\n bp::BasisPotential\n)\n\nds: dataset. bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_energies-Tuple{DataSet}","page":"API","title":"PotentialLearning.get_all_energies","text":"function get_all_energies( ds::DataSet )\n\nds: dataset.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_forces-Tuple{DataSet, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.get_all_forces","text":"function get_all_forces( ds::DataSet, bp::BasisPotential )\n\nds: dataset.
bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_all_forces-Tuple{DataSet}","page":"API","title":"PotentialLearning.get_all_forces","text":"function get_all_forces(\n ds::DataSet\n)\n\nds: dataset.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_batches-NTuple{11, Any}","page":"API","title":"PotentialLearning.get_batches","text":"get_batches(n_batches, B_train, B_train_ext, e_train, dB_train, f_train,\n B_test, B_test_ext, e_test, dB_test, f_test)\n\nn_batches: no. of batches per dataset. B_train: descriptors of the energies used in training. B_train_ext: extended descriptors of the energies used in training. Required to compute forces. e_train: energies used in training. dB_train: derivatives of the energy descriptors used in training. f_train: forces used in training. B_test: descriptors of the energies used in test. B_test_ext: extended descriptors of the energies used in test. Required to compute forces. e_test: energies used in test. dB_test: derivatives of the energy descriptors used in test. f_test: forces used in test.\n\nReturns the data loaders for training and test of energies and forces.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_clusters-Tuple{Any, Any, Any}","page":"API","title":"PotentialLearning.get_clusters","text":"function get_clusters(\n ds,\n eps,\n minpts\n)\n\nComputes clusters from the configurations in ds using DBSCAN with parameters eps and minpts.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_dpp_mode-Tuple{kDPP}","page":"API","title":"PotentialLearning.get_dpp_mode","text":"get_dpp_mode(dpp::kDPP, batch_size::Int) <: Vector{Int64}\n\nAccess an approximate mode of the k-DPP as calculated by a greedy subset algorithm.
See Determinantal.jl for details.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_energy-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_energy","text":"get_energy(c::Configuration) <: Energy\n\nRetrieves the energy (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_force_descriptors-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_force_descriptors","text":"get_force_descriptors(c::Configuration) <: ForceDescriptors\n\nRetrieves the force descriptors (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_forces-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_forces","text":"get_forces(c::Configuration) <: Forces\n\nRetrieves the forces (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_inclusion_prob-Tuple{kDPP}","page":"API","title":"PotentialLearning.get_inclusion_prob","text":"get_inclusion_prob(dpp::kDPP) <: Vector{Float64}\n\nAccess an approximation to the inclusion probabilities as calculated by Determinantal.jl (see package for details).\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_input-Tuple{Any}","page":"API","title":"PotentialLearning.get_input","text":"get_input(args)\n\nargs: vector of arguments (strings)\n\nReturns an OrderedDict with the arguments. See https://github.com/cesmix-mit/AtomisticComposableWorkflows documentation for information about how to define the input arguments.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_local_descriptors-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_local_descriptors","text":"get_local_descriptors(c::Configuration) <: LocalDescriptors\n\nRetrieves the local descriptors (if available) in the Configuration c. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_metrics-NTuple{11, Any}","page":"API","title":"PotentialLearning.get_metrics","text":"get_metrics( e_train_pred, e_train, f_train_pred, f_train,\n e_test_pred, e_test, f_test_pred, f_test,\n B_time, dB_time, time_fitting)\n\ne_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. f_train_pred: vector of predicted training force values. f_train: vector of true training force values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values. f_test_pred: vector of predicted test force values. f_test: vector of true test force values. B_time: elapsed time consumed by descriptors calculation. dB_time: elapsed time consumed by descriptor derivatives calculation. time_fitting: elapsed time consumed by fitting process.\n\nComputes MAE, RMSE, and RSQ for training and testing energies and forces. Also add elapsed times about descriptors and fitting calculations. Returns an OrderedDict with the information above.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_metrics-NTuple{4, Any}","page":"API","title":"PotentialLearning.get_metrics","text":"get_metrics( e_train_pred, e_train, e_test_pred, e_test)\n\ne_train_pred: vector of predicted training energy values. e_train: vector of true training energy values. e_test_pred: vector of predicted test energy values. e_test: vector of true test energy values.\n\nComputes MAE, RMSE, and RSQ for training and testing energies. Returns an OrderedDict with the information above.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_metrics-Tuple{Any, Any}","page":"API","title":"PotentialLearning.get_metrics","text":"get_metrics(\n x_pred,\n x;\n metrics = [mae, rmse, rsq],\n label = \"x\"\n)\n\nx_pred: vector of predicted forces, x: vector of true forces. metrics: vector of metrics. 
label: label used as prefix in dictionary keys.\n\nReturns an OrderedDict with different metrics.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_positions-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_positions","text":"get_positions(c::Configuration) <: Vector{SVector}\n\nRetrieves the AtomsBase system positions (if available) in the Configuration c. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_random_subset","page":"API","title":"PotentialLearning.get_random_subset","text":"get_random_subset(r::Random, batch_size :: Int) <: Vector{Int64}\n\nAccess a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.\n\n\n\n\n\n","category":"function"},{"location":"api/#PotentialLearning.get_random_subset-2","page":"API","title":"PotentialLearning.get_random_subset","text":"function get_random_subset(\n s::DBSCANSelector,\n batch_size = s.sample_size\n)\n\nReturns a random subset of indexes composed of samples of size batch_size ÷ length(s.clusters) from each cluster in s.\n\n\n\n\n\n","category":"function"},{"location":"api/#PotentialLearning.get_random_subset-Tuple{kDPP}","page":"API","title":"PotentialLearning.get_random_subset","text":"get_random_subset(dpp::kDPP, batch_size :: Int) <: Vector{Int64}\n\nAccess a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_system-Tuple{Configuration}","page":"API","title":"PotentialLearning.get_system","text":"get_system(c::Configuration) <: AtomsBase.AbstractSystem\n\nRetrieves the AtomsBase system (if available) in the Configuration c.
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_values-Tuple{Energy}","page":"API","title":"PotentialLearning.get_values","text":"get_values(e::Energy) <: Real\n\nGet the underlying real value (= e.d)\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.get_values-Tuple{StaticArraysCore.SVector}","page":"API","title":"PotentialLearning.get_values","text":"get_values(v::SVector)\n\nRemoves units from a position.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.kabsch-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.kabsch","text":"function kabsch(\n reference::Array{Float64,2},\n coords::Array{Float64,2}\n)\n\nInput: two sets of points: reference, coords as Nx3 Matrices (so) Returns optimally rotated matrix \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.kabsch_rmsd-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.kabsch_rmsd","text":"function kabsch_rmsd(\n P::Array{Float64,2},\n Q::Array{Float64,2}\n)\n\nDirectly return RMSD for matrices P, Q for convenience.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{InteratomicPotentials.LinearBasisPotential, DataSet, Vararg{Any}}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( iap::InteratomicPotentials.LinearBasisPotential, ds::DataSet, args... )\n\nLearning dispatch function, common to ordinary and weighted least squares implementations.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.CovariateLinearProblem, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::CovariateLinearProblem, α::Real )\n\nFit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σ_e, σ_f) = -0.5 * (e - A_e * β)' * (e - A_e * β) / σ_e - 0.5 * (f - A_f * β)' * (f - A_f * β) / σ_f - log(σ_e) - log(σ_f)\n\nthrough an optimization procedure.
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.CovariateLinearProblem, SubsetSelector, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::CovariateLinearProblem, ss::SubsetSelector, α::Real; num_steps=100, opt=Flux.Optimise.Adam() )\n\nFit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σ_e, σ_f) = -0.5 * (e - A_e * β)' * (e - A_e * β) / σ_e - 0.5 * (f - A_f * β)' * (f - A_f * β) / σ_f - log(σ_e) - log(σ_f)\n\nthrough an iterative batch gradient descent optimization procedure where the batches are provided by the subset selector. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.CovariateLinearProblem, Vector, Bool}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::CovariateLinearProblem, ws::Vector, int::Bool )\n\nFit energies and forces using weighted least squares.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.LearningProblem, SubsetSelector}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::LearningProblem, ss::SubsetSelector; num_steps = 100::Int, opt = Flux.Optimisers.Adam() )\n\nAttempts to fit the parameters lp.params in the learning problem lp using batch gradient descent with the optimizer opt and num_steps number of iterations. Batching is provided by the passed ss::SubsetSelector.
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.LearningProblem}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::LearningProblem; num_steps=100::Int, opt=Flux.Optimisers.Adam() )\n\nAttempts to fit the parameters lp.params in the learning problem lp using gradient descent with the optimizer opt and num_steps number of iterations.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.LinearProblem}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::LinearProblem )\n\nDefault learning problem: weighted least squares.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.UnivariateLinearProblem, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::UnivariateLinearProblem, α::Real )\n\nFit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via SVD on the design matrix, A'*A (formed iteratively), where eigenvalues less than α are cut-off. \n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.UnivariateLinearProblem, SubsetSelector, Real}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::UnivariateLinearProblem, ss::SubsetSelector, α::Real; num_steps = 100, opt = Flux.Optimise.Adam() )\n\nFit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via batched gradient descent with batches provided by the subset selector and the gradients are calculated using Flux. 
\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.learn!-Tuple{PotentialLearning.UnivariateLinearProblem, Vector, Bool}","page":"API","title":"PotentialLearning.learn!","text":"function learn!( lp::UnivariateLinearProblem, ws::Vector, int::Bool )\n\nFit energies using weighted least squares.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.linearize_forces-Tuple{Any}","page":"API","title":"PotentialLearning.linearize_forces","text":"linearize_forces(forces)\n\nforces: vector of forces per system\n\nReturns a vector with the components of the forces of the systems.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.load_data-Tuple{Any, ExtXYZ}","page":"API","title":"PotentialLearning.load_data","text":"load_data(file::string, extxyz::ExtXYZ)\nLoad configuration from an extxyz file into a DataSet\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.load_data-Tuple{String, YAML}","page":"API","title":"PotentialLearning.load_data","text":"load_data(file::string, yaml::YAML)\n\nLoad configurations from a yaml file into a Vector of Flexible Systems, with Energies and Force.\nReturns \n ds - DataSet\n t = Vector{Dict} (any miscellaneous info from yaml file)\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.load_datasets-Tuple{Any}","page":"API","title":"PotentialLearning.load_datasets","text":"load_datasets(input)\n\ninput: OrderedDict with input arguments. See get_defaults_args().\n\nReturns training and test systems, energies, forces, and stresses.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.mae-Tuple{Any, Any}","page":"API","title":"PotentialLearning.mae","text":"mae(x_pred, x)\n\nx_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. 
DFT energies.\n\nReturns mean absolute error.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.mean_cos-Tuple{Any, Any}","page":"API","title":"PotentialLearning.mean_cos","text":"mean_cos(x_pred, x)\n\nx_pred: vector of predicted forces, x: vector of true forces.\n\nReturns mean cosine.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.periodic_rmsd-Tuple{Matrix{Float64}, Matrix{Float64}, Vector{Float64}}","page":"API","title":"PotentialLearning.periodic_rmsd","text":"function periodic_rmsd(\n p1::Array{Float64,2},\n p2::Array{Float64,2},\n box_lengths::Array{Float64,1}\n)\n\nCalculates the RMSD between atom positions of two configurations taking into account the periodic boundaries.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.potential_energy-Tuple{Configuration, InteratomicPotentials.BasisPotential}","page":"API","title":"PotentialLearning.potential_energy","text":"function potential_energy( c::Configuration, bp::BasisPotential )\n\nc: atomic configuration. bp: basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.potential_energy-Tuple{Configuration, InteratomicPotentials.NNBasisPotential}","page":"API","title":"PotentialLearning.potential_energy","text":"function potential_energy( c::Configuration, nnbp::NNBasisPotential )\n\nc: atomic configuration. nnbp: neural network basis potential.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.rmsd-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.rmsd","text":"function rmsd(\n A::Array{Float64,2},\n B::Array{Float64,2}\n)\n\nCalculate root mean square deviation of two matrices A, B. See http://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.rmse-Tuple{Any, Any}","page":"API","title":"PotentialLearning.rmse","text":"rmse(x_pred, x)\n\nx_pred: vector of predicted values. E.g.
predicted energies. x: vector of true values. E.g. DFT energies.\n\nReturns root mean square error.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.rsq-Tuple{Any, Any}","page":"API","title":"PotentialLearning.rsq","text":"rsq(x_pred, x)\n\nx_pred: vector of predicted values. E.g. predicted energies. x: vector of true values. E.g. DFT energies.\n\nReturns R-squared.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.sample-Tuple{Any, Any}","page":"API","title":"PotentialLearning.sample","text":"function sample(\n c,\n batch_size\n)\n\nSelect from cluster c a sample of size batch_size.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.to_num-Tuple{Any}","page":"API","title":"PotentialLearning.to_num","text":"to_num(str)\n\nstr: string with a number: integer or float\n\nReturns an integer or float.\n\n\n\n\n\n","category":"method"},{"location":"api/#PotentialLearning.translate_points-Tuple{Matrix{Float64}, Matrix{Float64}}","page":"API","title":"PotentialLearning.translate_points","text":"function translate_points(\n P::Array{Float64,2},\n Q::Array{Float64,2}\n)\n\nTranslate P and Q so that both centroids lie at the origin of the coordinate system.\n\n\n\n\n\n","category":"method"},{"location":"#[WIP]-PotentialLearning.jl","page":"Home","title":"[WIP] PotentialLearning.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"An open source Julia library for active learning of interatomic potentials in atomistic simulations of materials. It incorporates elements of Bayesian inference, machine learning, differentiable programming, software composability, and high-performance computing.
This package is part of a software suite developed for the CESMIX project.","category":"page"},{"location":"#Specific-goals","page":"Home","title":"Specific goals","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Intelligent data subsampling: iteratively query a large pool of unlabeled data to extract a minimum number of training data that would lead to a supervised ML model with superior accuracy compared to a training model with educated handpicking.\nVia DPP, clustering.\nQuantity of Interest based dimension reduction through the theory of Active Subspaces.\nInference of the optimal values and uncertainties of the model parameters, to propagate them through the atomistic simulation.\nInteratomic potential hyper-parameter optimization. E.g. estimation of the optimum cutoff radius.\nInteratomic potential fitting. The potentials addressed in this package are defined in InteratomicPotentials.jl and InteratomicBasisPotentials.jl. E.g. ACE, SNAP, Neural Network Potentials.\nMeasurement of QoI sensitivity to individual parameters. \nInput data management and post-processing.\nProcess input data so that it is ready for training. E.g. read XYZ file with atomic configurations, linearize energies and forces, split dataset into training and testing, normalize data, transfer data to GPU, define iterators, etc.\nPost-processing: computation of different metrics (MAE, RSQ, COV, etc), saving results, and plotting.","category":"page"},{"location":"#Leveraging-Julia!","page":"Home","title":"Leveraging Julia!","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Software composability through multiple dispatch. A series of composable workflows is guiding our design and development. We analyzed three of the most representative workflows: classical molecular dynamics (MD), Ab initio MD, and classical MD with active learning. 
In addition, it facilitates the training of new potentials defined by the composition of neural networks with state-of-the-art interatomic potential descriptors.\nDifferentiable programming. Powerful automatic differentiation tools, such as Enzyme or Zygote, help to accelerate the development of new interatomic potentials by automatically calculating loss function gradients and forces.\nSciML: Open Source Software for Scientific Machine Learning. It provides libraries, such as Optimization.jl, that bring together several optimization packages into one unified Julia interface. \nMachine learning and HPC abstractions: Flux.jl makes parallel learning simple using the NVIDIA GPU abstractions of CUDA.jl. Mini-batch iterations on heterogeneous data, as required by a loss function based on energies and forces, can be handled by DataLoader.jl.","category":"page"},{"location":"#Examples","page":"Home","title":"Examples","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"See AtomisticComposableWorkflows repository. 
It aims to gather easy-to-use CESMIX-aligned case studies, integrating the latest developments of the Julia atomistic ecosystem with state-of-the-art tools.","category":"page"},{"location":"how-to-run-the-examples/#How-to-run-the-examples","page":"How to run the examples","title":"How to run the examples","text":"","category":"section"},{"location":"how-to-run-the-examples/#Add-registries","page":"How to run the examples","title":"Add registries","text":"","category":"section"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Open a Julia REPL ($ julia), type ] to enter the Pkg REPL, and add the following registries:","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" pkg> registry add https://github.com/JuliaRegistries/General\n pkg> registry add https://github.com/cesmix-mit/CESMIX.git \n pkg> registry add https://github.com/JuliaMolSim/MolSim.git\n pkg> registry add https://github.com/ACEsuit/ACEregistry","category":"page"},{"location":"how-to-run-the-examples/#Install-the-dependencies-of-the-examples-folder-project","page":"How to run the examples","title":"Install the dependencies of the examples folder project","text":"","category":"section"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Clone PotentialLearning.jl repository in your working directory.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" $ git clone git@github.com:cesmix-mit/PotentialLearning.jl.git","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Open a Julia REPL activating the examples folder project.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the 
examples","text":" $ julia --project=PotentialLearning.jl/examples","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Type ] to enter the Pkg REPL and instantiate.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" pkg> instantiate","category":"page"},{"location":"how-to-run-the-examples/#Run-an-example","page":"How to run the examples","title":"Run an example","text":"","category":"section"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Go to any folder within PotentialLearning.jl/examples. E.g.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" $ cd PotentialLearning.jl/examples/ACE","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Open a Julia REPL, activate the examples folder project, and define the number of threads.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" $ julia --project=../ --threads=4","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":"Finally, include the example file.","category":"page"},{"location":"how-to-run-the-examples/","page":"How to run the examples","title":"How to run the examples","text":" julia> include(\"fit-ace.jl\")","category":"page"}] }