API Reference
This page lists all documented types and functions in PotentialLearning.jl.
PotentialLearning.ActiveSubspace
— TypeActiveSubspace{T<:Real} <: DimensionReducer
Q :: Function
∇Q :: Function (gradient of Q)
tol :: T
Use the theory of active subspaces, with a given quantity of interest (expressed as the function Q) which takes a Configuration as an input and outputs a real scalar. ∇Q should take a Configuration as input and output the corresponding gradient. If tol is a float, the number of components to keep is the smallest n such that the relative fraction of variance explained by the leading n principal components is greater than 1 - tol. If tol is an int, the components corresponding to the tol largest eigenvalues are returned.
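The tol rule can be illustrated with a short sketch. The helper name n_components is hypothetical and is not part of PotentialLearning.jl; it assumes the eigenvalues are non-negative and already sorted in descending order.

```julia
# Hypothetical helper, not part of PotentialLearning.jl.
# λ: eigenvalues sorted in descending order (all non-negative).
function n_components(λ::AbstractVector{<:Real}, tol::AbstractFloat)
    explained = cumsum(λ) ./ sum(λ)          # cumulative fraction of variance
    n = findfirst(>(1 - tol), explained)     # smallest n exceeding 1 - tol
    return n === nothing ? length(λ) : n
end

# Integer tol: keep exactly `tol` components (capped at the number available).
n_components(λ::AbstractVector{<:Real}, tol::Integer) = min(tol, length(λ))

λ = [5.0, 3.0, 1.5, 0.5]
n_float = n_components(λ, 0.1)   # cumulative fractions: 0.5, 0.8, 0.95, 1.0
n_int = n_components(λ, 2)
```

Dispatching on the type of tol keeps the two behaviours separate without a runtime branch.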
PotentialLearning.AtomicData
— TypeAtomicData <: Data
Abstract type declaring the type of information that is unique to a particular atom (instead of a whole configuration).
PotentialLearning.Configuration
— MethodConfiguration(data::Union{AtomsBase.FlexibleSystem, ConfigurationData} )
A Configuration is a data struct that contains information unique to a particular configuration of atoms (Energy, LocalDescriptors, ForceDescriptors, and a FlexibleSystem) in a dictionary. Example:
```julia
e = Energy(-0.57, u"eV")
ld = LocalDescriptors(...)
c = Configuration(e, ld)
```
Configurations can be added together, which merges the data dictionaries:
```julia
c1 = Configuration(e)   # contains energy
c2 = Configuration(f)   # contains forces
c = c1 + c2             # c <: Configuration, contains energy and forces
```
PotentialLearning.ConfigurationData
— TypeConfigurationData <: Data
Abstract type declaring the type of data that is unique to a particular configuration (instead of just an atom).
PotentialLearning.CorrelationMatrix
— TypeCorrelationMatrix
α :: Vector{Float64} # weights
CorrelationMatrix produces a global descriptor that is the correlation matrix of the local descriptors. In other words, it is mean(bi'*bi for bi in B).
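A minimal sketch of this feature in plain Julia, not the package's implementation. It assumes each local descriptor is stored as a column vector, so the per-atom term is written as the outer product b * b'; the descriptor values are made up.

```julia
using LinearAlgebra

# B holds one local descriptor (a column vector) per atom; values are made up.
B = [[1.0, 2.0], [3.0, 0.0], [2.0, 2.0]]

# Mean of the outer products, wrapped as Symmetric since b * b' is symmetric.
corr = Symmetric(sum(b * b' for b in B) / length(B))
```

The result is a d×d matrix for descriptors of length d, independent of the number of atoms.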
PotentialLearning.CovariateLinearProblem
— Typestruct CovariateLinearProblem{T<:Real} <: LinearProblem{T}
e::Vector
f::Vector{Vector{T}}
B::Vector{Vector{T}}
dB::Vector{Matrix{T}}
β::Vector{T}
β0::Vector{T}
σe::Vector{T}
σf::Vector{T}
Σ::Symmetric{T,Matrix{T}}
end
A CovariateLinearProblem is a linear problem in which we are fitting energies and forces using both descriptors and their gradients (B and dB, respectively). When this is the case, the solution is not available analytically and must be found using an iterative optimization procedure. In the end, we fit the model coefficients, β, the standard deviations corresponding to energies and forces, σe and σf, and the covariance Σ.
PotentialLearning.DBSCANSelector
— Typestruct DBSCANSelector <: SubsetSelector
clusters
eps
minpts
sample_size
end
Definition of the type DBSCANSelector, a subselector based on the clustering method DBSCAN.
PotentialLearning.DBSCANSelector
— Methodfunction DBSCANSelector(
ds::DataSet,
eps,
minpts,
sample_size
)
Constructor of DBSCANSelector based on the atomic configurations in ds, the DBSCAN parameters eps and minpts, and the sample size sample_size.
PotentialLearning.Data
— TypeData
Abstract supertype of AtomicData and ConfigurationData.
PotentialLearning.DataBase
— TypeDataBase
Abstract type for DataSets.
PotentialLearning.DataSet
— TypeDataSet
Struct that holds a vector of configurations. Most operations in PotentialLearning are built around the DataSet structure.
PotentialLearning.Distance
— TypeDistance
A struct of abstract type Distance produces the distance between two `global` descriptors, or features. Not all distances might be compatible with all types of features.
PotentialLearning.DotProduct
— TypeDotProduct <: Kernel
α :: Power of DotProduct kernel
Computes the dot product kernel between two features, i.e.,
cos(θ) = ( A ⋅ B / (||A||^2||B||^2) )^α
PotentialLearning.Energy
— TypeEnergy <: ConfigurationData
d :: Real
u :: Unitful.FreeUnits
Convenience struct that holds energy information (and corresponding units). The default unit is eV.
PotentialLearning.Euclidean
— TypeEuclidean <: Distance
Cinv :: Covariance Matrix
Computes the squared Euclidean distance with weight matrix Cinv, the inverse of some covariance matrix.
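One natural reading of this definition is the quadratic form (A - B)' * Cinv * (A - B). The sketch below assumes that form and a symmetric positive definite Cinv; the function name sq_weighted_euclidean is hypothetical and the numbers are illustrative.

```julia
using LinearAlgebra

# Squared distance (A - B)' * Cinv * (A - B); Cinv is assumed symmetric positive definite.
sq_weighted_euclidean(A, B, Cinv) = dot(A - B, Cinv * (A - B))

C = [2.0 0.0; 0.0 0.5]          # illustrative covariance matrix
Cinv = inv(C)
d2 = sq_weighted_euclidean([1.0, 1.0], [0.0, 0.0], Cinv)
```

With Cinv equal to the identity this reduces to the ordinary squared Euclidean distance.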
PotentialLearning.ExtXYZ
— TypeExtXYZ <: IO
PotentialLearning.Feature
— TypeFeature
A struct of abstract type Feature represents a function that takes in a set of local descriptors corresponding to some atomic environment and produces a global descriptor.
PotentialLearning.Force
— TypeForce <: AtomicData
f :: Vector{<:Real}
u :: Unitful.FreeUnits
Contains the force with (x,y,z)-components in f with units u. Default unit is "eV/Å".
PotentialLearning.ForceDescriptor
— TypeForceDescriptor <: AtomicData
b :: Vector{<:Vector{<:Real}}
Contains the x,y,z components (outer vector) of the force descriptor (inner vector).
PotentialLearning.ForceDescriptors
— TypeForceDescriptors <: ConfigurationData
b :: Vector{ForceDescriptor}
A container holding all of the ForceDescriptors for all atoms in a configuration.
PotentialLearning.Forces
— TypeForces <: ConfigurationData
f :: Vector{Force}
Forces is a struct that contains all force information in a configuration.
PotentialLearning.Forstner
— TypeForstner <: Distance
α :: Regularization parameter
Computes the squared Forstner distance between two positive semi-definite matrices.
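As an illustration, the squared Forstner distance is commonly defined as the sum of squared logarithms of the generalized eigenvalues of the pair (A, B). The sketch below follows that definition. How the package applies α is not stated here, so adding α*I to both matrices is an assumption, and sq_forstner is a hypothetical name.

```julia
using LinearAlgebra

# Squared Forstner distance: sum of squared logs of the generalized
# eigenvalues of (A, B). Adding α*I is an assumed regularization that makes
# semi-definite inputs strictly positive definite.
function sq_forstner(A::AbstractMatrix, B::AbstractMatrix; α::Real = 0.0)
    Ar = Symmetric(A + α * I)
    Br = Symmetric(B + α * I)
    λ = eigvals(Ar, Br)            # solves Ar * v = λ * Br * v
    return sum(abs2, log.(λ))
end

A = [exp(1.0) 0.0; 0.0 exp(1.0)]
B = [1.0 0.0; 0.0 1.0]
d2 = sq_forstner(A, B)             # both generalized eigenvalues equal e
```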
PotentialLearning.GlobalMean
— Type GlobalMean{T}
GlobalMean produces the mean of the local descriptors.
PotentialLearning.GlobalSum
— Type GlobalSum{T}
GlobalSum produces the sum of the local descriptors.
PotentialLearning.Kernel
— TypeKernel
A struct of abstract type Kernel is a function that takes in two features and produces a semi-definite scalar representing the similarity between the two features.
PotentialLearning.LAMMPS
— Typestruct LAMMPS <: IO
elements :: Vector{Symbol}
boundary_conditions :: Vector
end
PotentialLearning.LearningProblem
— Typestruct LearningProblem{T<:Real} <: AbstractLearningProblem
ds::DataSet
logprob::Function
∇logprob::Function
params::Vector{T}
end
Generic LearningProblem that allows the user to pass a logprob(y::params, ds::DataSet) function and its gradient. The gradient function should return the gradient of logprob with respect to its params as a vector. If the user does not have a gradient function available, then Flux can provide one (provided that logprob is of the form above).
PotentialLearning.LearningProblem
— Methodfunction LearningProblem( ds::DataSet, logprob::Function, params::Vector{T} ) where {T}
Generic LearningProblem constructor.
PotentialLearning.LinearProblem
— Typeabstract type LinearProblem{T<:Real} <: AbstractLearningProblem end
An abstract type to specify linear potential inference problems.
PotentialLearning.LinearProblem
— Methodfunction LinearProblem( ds::DataSet; T = Float64 )
Construct a LinearProblem by detecting whether the dataset contains energy descriptors and/or force descriptors, then build the appropriate LinearProblem (Univariate if there is only a single type of descriptor, Covariate if both types are present).
PotentialLearning.LocalDescriptor
— TypeLocalDescriptor <: AtomicData
A vector corresponding to the descriptor for a particular atom's neighborhood.
PotentialLearning.LocalDescriptors
— TypeLocalDescriptors <: ConfigurationData
A vector of LocalDescriptor objects, representing all local descriptors for the atoms in a configuration.
PotentialLearning.PCA
— TypePCA <: DimensionReducer
tol :: Float64
Use SVD to compute the PCA of the design matrix of descriptors. (using Force descriptors TBA)
If tol is a float, the number of components to keep is the smallest n such that the relative fraction of variance explained by the leading n principal components is greater than 1 - tol. If tol is an int, the components corresponding to the tol largest eigenvalues are returned.
PotentialLearning.RBF
— TypeRBF <: Kernel
d :: Distance function
α :: Regularization parameter
ℓ :: Length-scale parameter
Computes the squared exponential kernel, i.e.,
k(A, B) = β \exp( -\frac{1}{2} d(A,B)/ℓ^2 ) + α δ(A, B)
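The squared exponential kernel above can be sketched in plain Julia. This is not the package's RBF type: rbf_kernel is a hypothetical function, the default d is an assumed squared Euclidean distance, β is treated as an amplitude, and δ(A, B) is assumed to be 1 when A == B and 0 otherwise.

```julia
# Illustrative stand-in for the kernel formula; not the package's RBF type.
# d: distance function, β: amplitude, ℓ: length scale, α: regularization.
# δ(A, B) is assumed to be 1 when A == B and 0 otherwise.
function rbf_kernel(A, B; d = (x, y) -> sum(abs2, x - y), β = 1.0, ℓ = 1.0, α = 1e-8)
    return β * exp(-0.5 * d(A, B) / ℓ^2) + α * (A == B)
end

k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])      # β + α
k_diff = rbf_kernel([0.0, 0.0], [1.0, 1.0])      # exp(-1), since d = 2
```

The α term only affects the diagonal of a kernel matrix, which is why it acts as a regularizer.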
PotentialLearning.RandomSelector
— Typestruct Random
num_configs :: Int
batch_size :: Int
end
A convenience function that allows the user to randomly select indices uniformly over [1, num_configs].
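A minimal sketch of uniform random index selection. Whether the package samples with or without replacement is not stated here, so sampling without replacement is an assumption, and random_indices is a hypothetical name.

```julia
using Random

# Draw batch_size distinct indices uniformly from 1:num_configs.
function random_indices(num_configs::Int, batch_size::Int; rng = Random.default_rng())
    batch_size <= num_configs || throw(ArgumentError("batch_size exceeds num_configs"))
    return randperm(rng, num_configs)[1:batch_size]
end

inds = random_indices(100, 10; rng = MersenneTwister(42))
```

Passing an explicit rng makes the selection reproducible.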
PotentialLearning.UnivariateLinearProblem
— Typestruct UnivariateLinearProblem{T<:Real} <: LinearProblem{T}
ivdata::Vector
dvdata::Vector
β::Vector{T}
β0::Vector{T}
σ::Vector{T}
Σ::Symmetric{T,Matrix{T}}
end
A UnivariateLinearProblem is a linear problem in which there is only 1 type of independent variable / dependent variable. Typically, that means we are either only fitting energies or only fitting forces. When this is the case, the solution is available analytically and the standard deviation, σ, and covariance, Σ, of the coefficients, β, are computable.
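To illustrate what "available analytically" means, here is a sketch of ordinary least squares via the normal equations. It assumes independent Gaussian noise with a single variance and is not the package's solver; ols_fit and the data are hypothetical.

```julia
using LinearAlgebra

# Ordinary least squares on a design matrix X (rows = observations) and targets y.
# A sketch of the closed-form solution, not the package's solver.
function ols_fit(X::AbstractMatrix, y::AbstractVector)
    XtX = Symmetric(X' * X)
    β = XtX \ (X' * y)                       # normal equations
    r = y - X * β                            # residuals
    dof = max(size(X, 1) - size(X, 2), 1)    # degrees of freedom
    σ = sqrt(sum(abs2, r) / dof)             # residual standard deviation
    Σ = Symmetric(σ^2 * inv(XtX))            # covariance of the coefficients
    return β, σ, Σ
end

X = [1.0 0.0; 1.0 1.0; 1.0 2.0; 1.0 3.0]     # intercept column plus one descriptor
y = [1.0, 3.0, 5.0, 7.0]                     # generated from y = 1 + 2x
β, σ, Σ = ols_fit(X, y)
```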
PotentialLearning.YAML
— TypeYAML <: IO
energy_units :: Unitful.FreeUnits
distance_units :: Unitful.FreeUnits
PotentialLearning.kDPP
— Typestruct kDPP
K :: EllEnsemble
end
A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a similarity kernel, for which the user must provide a LinearProblem and two functions to compute descriptor (1) diversity and (2) quality.
PotentialLearning.kDPP
— MethodkDPP(ds::Dataset, f::Feature, k::Kernel)
A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a dataset, a method to compute features, and a kernel. Optional arguments include batch size and type of descriptor (default LocalDescriptors).
PotentialLearning.kDPP
— MethodkDPP(features::Union{Vector{Vector{T}}, Vector{Symmetric{T, Matrix{T}}}}, k::Kernel)
A convenience function that allows the user access to a k-Determinantal Point Process through Determinantal.jl. All that is required to construct a kDPP is a set of features (either a vector of vector features or a vector of symmetric matrix features) and a kernel. The optional argument is batch_size (default length(features)).
InteratomicPotentials.compute_force_descriptors
— Methodfunction compute_force_descriptors( ds::DataSet, basis::BasisSystem; pbar = true, T = Float64 )
Compute force descriptors of a basis system and dataset using threads.
InteratomicPotentials.compute_local_descriptors
— Methodfunction compute_local_descriptors( ds::DataSet, basis::BasisSystem; pbar = true, T = Float64 )
Compute local descriptors of a basis system and dataset using threads.
PotentialLearning.KernelMatrix
— MethodKernelMatrix(ds1::DataSet, ds2::DataSet, F::Feature, k::Kernel)
Compute non-symmetric kernel matrix K using features of the datasets ds1 and ds2 calculated using the Feature method F.
PotentialLearning.KernelMatrix
— MethodKernelMatrix(ds::DataSet, F::Feature, k::Kernel)
Compute symmetric kernel matrix K using features of the dataset ds calculated using the Feature method F.
PotentialLearning.KernelMatrix
— MethodKernelMatrix(F, k::Kernel)
Compute symmetric kernel matrix K where K_{ij} = k(F_i, F_j).
PotentialLearning.KernelMatrix
— MethodKernelMatrix(F1, F2, k::Kernel)
Compute non-symmetric kernel matrix K where K_{ij} = k(F1_i, F2_j).
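The two feature-based forms can be sketched without the package types. Here k is an arbitrary function of two features (a plain dot product, chosen only for illustration), and kernel_matrix is a hypothetical name standing in for KernelMatrix.

```julia
using LinearAlgebra

# k is any function of two features; here a plain dot product for illustration.
k(a, b) = dot(a, b)

# Symmetric case: K[i, j] = k(F[i], F[j])
kernel_matrix(F, k) = Symmetric([k(F[i], F[j]) for i in eachindex(F), j in eachindex(F)])

# Non-symmetric case: K[i, j] = k(F1[i], F2[j])
kernel_matrix(F1, F2, k) = [k(F1[i], F2[j]) for i in eachindex(F1), j in eachindex(F2)]

F1 = [[1.0, 0.0], [0.0, 2.0]]
F2 = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
K_sym = kernel_matrix(F1, k)        # 2×2
K_rect = kernel_matrix(F1, F2, k)   # 2×3
```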
PotentialLearning.calc_centroid
— Methodfunction calc_centroid(
m::Array{Float64,2}
)
Calculate a centroid of a matrix.
PotentialLearning.calc_metrics
— Methodcalc_metrics(x_pred, x)
x_pred: vector of predicted values of a variable, e.g. energy.
x: vector of true values of a variable, e.g. energy.
Returns MAE, RMSE, and RSQ.
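A sketch of how these three metrics are typically computed, not the package's code. RSQ is taken here to be the coefficient of determination, which is an assumption: the package may instead use the squared Pearson correlation. metrics_sketch is a hypothetical name.

```julia
using Statistics

# Hypothetical re-implementation for illustration; x_pred and x have equal length.
function metrics_sketch(x_pred::AbstractVector, x::AbstractVector)
    err = x_pred .- x
    mae = mean(abs, err)                                  # mean absolute error
    rmse = sqrt(mean(abs2, err))                          # root mean squared error
    rsq = 1 - sum(abs2, err) / sum(abs2, x .- mean(x))    # coefficient of determination
    return mae, rmse, rsq
end

mae, rmse, rsq = metrics_sketch([1.0, 2.0, 4.0], [1.0, 2.0, 3.0])
```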
PotentialLearning.compute_features
— Methodcompute_feature(ds::DataSet, f::Feature; dt = LocalDescriptors)
Computes features of the dataset ds using the feature method f on descriptors dt (the default is LocalDescriptors, if available).
PotentialLearning.compute_kernel
— Methodcompute_kernel(A, B, k)
Compute similarity kernel between features A and B using kernel k.
PotentialLearning.distance_matrix_kabsch
— Methodfunction distance_matrix_kabsch(
ds::DataSet
)
Calculate a matrix of distances between atomic configurations using the Kabsch method.
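For reference, the Kabsch algorithm finds the rotation that best superimposes two matched point sets. The sketch below computes the RMSD after optimal alignment for a single pair of configurations. How the package turns this into a distance matrix is not shown here, and kabsch_rmsd is a hypothetical name.

```julia
using LinearAlgebra, Statistics

# P, Q: 3×N matrices whose columns are matched atomic positions.
function kabsch_rmsd(P::AbstractMatrix, Q::AbstractMatrix)
    Pc = P .- mean(P, dims = 2)              # remove translation
    Qc = Q .- mean(Q, dims = 2)
    F = svd(Pc * Qc')                        # cross-covariance H = U * S * V'
    d = sign(det(F.V * F.U'))                # guard against improper rotations
    R = F.V * Diagonal([1.0, 1.0, d]) * F.U' # optimal rotation taking Pc onto Qc
    return sqrt(sum(abs2, R * Pc - Qc) / size(P, 2))
end

θ = π / 3
Rz = [cos(θ) -sin(θ) 0.0; sin(θ) cos(θ) 0.0; 0.0 0.0 1.0]
P = [0.0 1.0 0.0 0.0; 0.0 0.0 1.0 0.0; 0.0 0.0 0.0 1.0]
Q = Rz * P .+ [1.0, -2.0, 0.5]               # rotated and translated copy of P
rmsd = kabsch_rmsd(P, Q)
```

Because Q is a rigid transformation of P, the aligned RMSD is zero up to rounding.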
PotentialLearning.distance_matrix_periodic
— Methodfunction distance_matrix_periodic(
ds::DataSet
)
Calculates a matrix of distances between atomic configurations taking into account the periodic boundaries.
PotentialLearning.fit
— Functionfit(ds::DataSet, dr::DimensionReducer)
Fits a linear dimension reduction routine using information from DataSet. See individual types of DimensionReducers for specific details.
PotentialLearning.fit
— Methodfit(ds::DataSet, as::ActiveSubspace)
Fits a linear dimension reduction routine using the eigendirections of the uncentered covariance of the function ∇Q(c::Configuration) over the configurations in ds. Primarily used to reduce the dimension of the descriptors.
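The eigendirection computation can be sketched as follows, assuming the gradients ∇Q(c) have already been evaluated and collected as plain vectors. The function name active_subspace and the gradient values are made up for illustration.

```julia
using LinearAlgebra

# grads: one gradient vector ∇Q(c) per configuration (made-up values here).
function active_subspace(grads::Vector{<:AbstractVector}, n::Int)
    C = Symmetric(sum(g * g' for g in grads) / length(grads))  # uncentered covariance
    E = eigen(C)                                  # eigenvalues in ascending order
    order = sortperm(E.values; rev = true)
    λ = E.values[order]                           # eigenvalues, largest first
    W = E.vectors[:, order[1:n]]                  # leading n eigendirections
    return λ, W
end

grads = [[2.0, 0.0, 0.0], [-2.0, 0.1, 0.0], [2.0, -0.1, 0.0]]
λ, W = active_subspace(grads, 1)
reduced = [W' * g for g in grads]                 # project onto the active subspace
```

In this toy data the gradients vary almost entirely along the first coordinate, so the leading eigendirection is close to that axis.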
PotentialLearning.fit
— Methodfit(ds::DataSet, pca::PCA)
Fits a linear dimension reduction routine using PCA on the global descriptors in the dataset ds.
PotentialLearning.fit_transform
— Methodfit_transform(ds::DataSet, dr::DimensionReducer)
Fits a linear dimension reduction routine using information from DataSet and performs dimension reduction on descriptors and force_descriptors (whichever are available). See individual types of DimensionReducers for specific details.
PotentialLearning.get_all_energies
— Methodfunction get_all_energies(
ds::DataSet,
lb::LinearBasisPotential
)
PotentialLearning.get_all_energies
— Methodfunction get_all_energies( ds::DataSet )
PotentialLearning.get_all_forces
— Methodfunction get_all_forces( ds::DataSet, lb::LinearBasisPotential )
PotentialLearning.get_all_forces
— Methodfunction get_all_forces(
ds::DataSet
)
PotentialLearning.get_batches
— Methodget_batches(n_batches, B_train, B_train_ext, e_train, dB_train, f_train,
B_test, B_test_ext, e_test, dB_test, f_test)
n_batches: number of batches per dataset.
B_train: descriptors of the energies used in training.
B_train_ext: extended descriptors of the energies used in training. Required to compute forces.
e_train: energies used in training.
dB_train: derivatives of the energy descriptors used in training.
f_train: forces used in training.
B_test: descriptors of the energies used in test.
B_test_ext: extended descriptors of the energies used in test. Required to compute forces.
e_test: energies used in test.
dB_test: derivatives of the energy descriptors used in test.
f_test: forces used in test.
Returns the data loaders for training and test of energies and forces.
PotentialLearning.get_clusters
— Methodfunction get_clusters(
ds,
eps,
minpts
)
Computes clusters from the configurations in ds using DBSCAN with parameters eps and minpts.
PotentialLearning.get_dpp_mode
— Methodget_dpp_mode(dpp::kDPP, batch_size::Int) <: Vector{Int64}
Access an approximate mode of the k-DPP as calculated by a greedy subset algorithm. See Determinantal.jl for details.
PotentialLearning.get_energy
— Methodget_energy(c::Configuration) <: Energy
Retrieves the energy (if available) in the Configuration c.
PotentialLearning.get_force_descriptors
— Methodget_force_descriptors(c::Configuration) <: ForceDescriptors
Retrieves the force descriptors (if available) in the Configuration c.
PotentialLearning.get_forces
— Methodget_forces(c::Configuration) <: Forces
Retrieves the forces (if available) in the Configuration c.
PotentialLearning.get_inclusion_prob
— Methodget_inclusion_prob(dpp::kDPP) <: Vector{Float64}
Access an approximation to the inclusion probabilities as calculated by Determinantal.jl (see package for details).
PotentialLearning.get_input
— Methodget_input(args)
args: vector of arguments (strings)
Returns an OrderedDict with the arguments. See https://github.com/cesmix-mit/AtomisticComposableWorkflows documentation for information about how to define the input arguments.
PotentialLearning.get_local_descriptors
— Methodget_local_descriptors(c::Configuration) <: LocalDescriptors
Retrieves the local descriptors (if available) in the Configuration c.
PotentialLearning.get_metrics
— Methodget_metrics( e_train_pred, e_train, f_train_pred, f_train,
e_test_pred, e_test, f_test_pred, f_test,
B_time, dB_time, time_fitting)
e_train_pred: vector of predicted training energy values.
e_train: vector of true training energy values.
f_train_pred: vector of predicted training force values.
f_train: vector of true training force values.
e_test_pred: vector of predicted test energy values.
e_test: vector of true test energy values.
f_test_pred: vector of predicted test force values.
f_test: vector of true test force values.
B_time: elapsed time consumed by the descriptor calculation.
dB_time: elapsed time consumed by the descriptor derivative calculation.
time_fitting: elapsed time consumed by the fitting process.
Computes MAE, RMSE, and RSQ for training and testing energies and forces. Also adds the elapsed times of the descriptor and fitting calculations. Returns an OrderedDict with the information above.
PotentialLearning.get_metrics
— Methodget_metrics( e_train_pred, e_train, e_test_pred, e_test)
e_train_pred: vector of predicted training energy values.
e_train: vector of true training energy values.
e_test_pred: vector of predicted test energy values.
e_test: vector of true test energy values.
Computes MAE, RMSE, and RSQ for training and testing energies. Returns an OrderedDict with the information above.
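The three error measures named above can be sketched in a few lines. This is only an illustration: the helper names `mae`, `rmse`, and `rsq` are hypothetical and not part of PotentialLearning.jl, and RSQ is assumed here to mean the coefficient of determination, which may differ from what the package computes.

```julia
using Statistics

# Hypothetical helpers (not the package API).
mae(y_pred, y) = mean(abs.(y_pred .- y))            # mean absolute error
rmse(y_pred, y) = sqrt(mean(abs2.(y_pred .- y)))    # root mean square error
# Assumed definition of RSQ: 1 - SS_res / SS_tot.
rsq(y_pred, y) = 1 - sum(abs2.(y .- y_pred)) / sum(abs2.(y .- mean(y)))

# Made-up energies, for illustration only.
e_true = [1.0, 2.0, 3.0, 4.0]
e_pred = [1.5, 2.0, 2.5, 4.0]

e_mae = mae(e_pred, e_true)
e_rmse = rmse(e_pred, e_true)
e_rsq = rsq(e_pred, e_true)
```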
PotentialLearning.get_positions
— Methodget_positions(c::Configuration) <: Vector{SVector}
Retrieves the AtomsBase system positions (if available) in the Configuration c.
PotentialLearning.get_random_subset
— Functionfunction get_random_subset(
s::DBSCANSelector,
batch_size = s.sample_size
)
Returns a random subset of indexes composed of samples of size batch_size ÷ length(s.clusters) from each cluster in s.
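The per-cluster sampling described above might look like the following sketch. It assumes that the clusters are plain vectors of data indexes and that sampling is done without replacement; both are assumptions, and `random_subset_sketch` is a hypothetical name, not the package function.

```julia
using Random

# Hypothetical stand-in for the clusters of a DBSCANSelector:
# each cluster is a vector of data indexes.
clusters = [[1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12]]
batch_size = 6

# Draw batch_size ÷ length(clusters) indexes from each cluster
# (without replacement, which is an assumption of this sketch).
function random_subset_sketch(clusters, batch_size; rng = Random.default_rng())
    n_per_cluster = batch_size ÷ length(clusters)
    inds = Int[]
    for c in clusters
        n = min(n_per_cluster, length(c))   # never ask for more than the cluster holds
        append!(inds, shuffle(rng, c)[1:n])
    end
    return inds
end

subset = random_subset_sketch(clusters, batch_size)
```

Note that integer division means the returned subset can be smaller than batch_size when batch_size is not a multiple of the number of clusters.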
PotentialLearning.get_random_subset
— Functionget_random_subset(r::Random, batch_size :: Int) <: Vector{Int64}
Access a random subset of the data as sampled by the provided Random selector. Returns the indices of the random subset and the subset itself.
PotentialLearning.get_random_subset
— Methodget_random_subset(dpp::kDPP, batch_size :: Int) <: Vector{Int64}
Access a random subset of the data as sampled from the provided k-DPP. Returns the indices of the random subset and the subset itself.
PotentialLearning.get_system
— Methodget_system(c::Configuration) <: AtomsBase.AbstractSystem
Retrieves the AtomsBase system (if available) in the Configuration c.
PotentialLearning.get_values
— Methodget_values(e::Energy) <: Real
Get the underlying real value (= e.d)
PotentialLearning.get_values
— Methodget_values(v::SVector)
Removes units from a position.
PotentialLearning.kabsch
— Methodfunction kabsch(
reference::Array{Float64,2},
coords::Array{Float64,2}
)
Input: two sets of points, reference and coords, as Nx3 matrices. Returns the optimally rotated matrix.
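For readers unfamiliar with the Kabsch algorithm, here is a minimal SVD-based sketch. It assumes both point sets are Nx3 and already centered at the origin, and it returns the rotated copy of coords; `kabsch_sketch` is a hypothetical name and the package implementation may differ in details.

```julia
using LinearAlgebra

# Rotate `coords` (Nx3) onto `reference` (Nx3) with the Kabsch algorithm.
# Assumption: both inputs are already centered at the origin.
function kabsch_sketch(reference::Matrix{Float64}, coords::Matrix{Float64})
    H = coords' * reference           # 3x3 cross-covariance matrix
    F = svd(H)                        # H = U * Diagonal(S) * V'
    d = sign(det(F.V * F.U'))         # -1 would indicate a reflection
    D = Diagonal([1.0, 1.0, d])       # correct it to obtain a proper rotation
    R = F.V * D * F.U'                # optimal rotation for column vectors
    return coords * R'                # apply it to every row of coords
end

# Usage: rotate a centered reference by 30 degrees about z, then align it back.
θ = π / 6
Rz = [cos(θ) -sin(θ) 0.0; sin(θ) cos(θ) 0.0; 0.0 0.0 1.0]
reference = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0; -1.0 -1.0 -1.0]
coords = reference * Rz'
aligned = kabsch_sketch(reference, coords)
```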
PotentialLearning.kabsch_rmsd
— Methodfunction kabsch_rmsd(
P::Array{Float64,2},
Q::Array{Float64,2}
)
Directly return RMSD for matrices P, Q for convenience.
PotentialLearning.learn!
— Methodfunction learn!( iap::InteratomicPotentials.LinearBasisPotential, ds::DataSet, args... )
Learning dispatch function, common to ordinary and weighted least squares implementations.
PotentialLearning.learn!
— Methodfunction learn!( lp::CovariateLinearProblem, α::Real )
Fit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5 * (e - Ae * β)' * (e - Ae * β) / σe - 0.5 * (f - Af * β)' * (f - Af * β) / σf - log(σe) - log(σf), through an optimization procedure.
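To make the objective concrete, the log probability above can be written as a plain Julia function. This is a sketch that only evaluates ℓ and performs no optimization; the function name `loglik_sketch` and the toy data are hypothetical.

```julia
using LinearAlgebra

# Evaluate the log probability ℓ(β, σe, σf) from the formula above.
# e, f: energy and force targets; Ae, Af: the corresponding design matrices.
function loglik_sketch(β, σe, σf, e, Ae, f, Af)
    re = e - Ae * β                    # energy residuals
    rf = f - Af * β                    # force residuals
    return -0.5 * dot(re, re) / σe - 0.5 * dot(rf, rf) / σf - log(σe) - log(σf)
end

# Toy data with an exact fit, so both residual terms vanish.
β = [1.0, 2.0]
Ae = [1.0 0.0; 0.0 1.0]
Af = [1.0 1.0]
e = Ae * β
f = Af * β

ℓ1 = loglik_sketch(β, 1.0, 1.0, e, Ae, f, Af)
ℓ2 = loglik_sketch(β, 2.0, 1.0, e, Ae, f, Af)
```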
PotentialLearning.learn!
— Methodfunction learn!( lp::CovariateLinearProblem, ss::SubsetSelector, α::Real; num_steps=100, opt=Flux.Optimise.Adam() )
Fit a Gaussian distribution by finding the MLE of the following log probability: ℓ(β, σe, σf) = -0.5 * (e - Ae * β)' * (e - Ae * β) / σe - 0.5 * (f - Af * β)' * (f - Af * β) / σf - log(σe) - log(σf), through an iterative batch gradient descent optimization procedure where the batches are provided by the subset selector.
PotentialLearning.learn!
— Methodfunction learn!( lp::CovariateLinearProblem, ws::Vector, int::Bool )
Fit energies and forces using weighted least squares.
PotentialLearning.learn!
— Methodfunction learn!( lp::LearningProblem, ss::SubsetSelector; num_steps = 100::Int, opt = Flux.Optimisers.Adam() )
Attempts to fit the parameters lp.params in the learning problem lp using batch gradient descent with the optimizer opt and num_steps number of iterations. Batching is provided by the passed ss::SubsetSelector.
PotentialLearning.learn!
— Methodfunction learn!( lp::LearningProblem; num_steps=100::Int, opt=Flux.Optimisers.Adam() )
Attempts to fit the parameters lp.params in the learning problem lp using gradient descent with the optimizer opt and num_steps number of iterations.
PotentialLearning.learn!
— Methodfunction learn!( lp::LinearProblem )
Default learning problem: weighted least squares.
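As a reminder of what weighted least squares solves, the sketch below forms and solves the weighted normal equations (A' W A) β = A' W y with a diagonal weight matrix. The name `wls_sketch` and the data are hypothetical; how the package builds A and the weights is not shown here.

```julia
using LinearAlgebra

# Solve the weighted normal equations (A' W A) β = A' W y, with W = Diagonal(w).
function wls_sketch(A, y, w)
    W = Diagonal(w)
    return (A' * W * A) \ (A' * W * y)
end

# Toy problem with noise-free data, so the true coefficients are recovered.
A = [1.0 0.0; 1.0 1.0; 1.0 2.0]
β_true = [0.5, 2.0]
y = A * β_true
w = [1.0, 2.0, 0.5]

β_fit = wls_sketch(A, y, w)
```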
PotentialLearning.learn!
— Methodfunction learn!( lp::UnivariateLinearProblem, α::Real )
Fit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via SVD on the design matrix, A'*A (formed iteratively), where eigenvalues less than α are cut off.
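The eigenvalue cutoff can be illustrated with a small sketch: take the SVD of A'*A, invert only the singular values above α, and apply the resulting truncated pseudo-inverse to A'*y. The name `truncated_lsq_sketch` is hypothetical, A'*A is formed in one step rather than iteratively, and no estimate of σ is computed, so this is a simplification rather than the package routine.

```julia
using LinearAlgebra

# Least squares through a truncated SVD of the Gram matrix A'A.
# Singular values (equal to the eigenvalues here, since A'A is symmetric
# positive semidefinite) that do not exceed α are discarded.
function truncated_lsq_sketch(A, y, α)
    F = svd(A' * A)
    Sinv = [s > α ? 1 / s : 0.0 for s in F.S]   # invert only the retained values
    return F.V * (Sinv .* (F.U' * (A' * y)))
end

A = [1.0 0.0; 0.0 1.0; 1.0 1.0]
β_true = [2.0, -1.0]
y = A * β_true

β_fit = truncated_lsq_sketch(A, y, 1e-8)
```

With a small α nothing is discarded and the ordinary least squares solution is recovered; a larger α trades accuracy for stability when A'*A is ill-conditioned.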
PotentialLearning.learn!
— Methodfunction learn!( lp::UnivariateLinearProblem, ss::SubsetSelector, α::Real; num_steps = 100, opt = Flux.Optimise.Adam() )
Fit a univariate Gaussian distribution for the equation y = Aβ + ϵ, where β are model coefficients and ϵ ∼ N(0, σ). Fitting is done via batched gradient descent with batches provided by the subset selector and the gradients are calculated using Flux.
PotentialLearning.learn!
— Methodfunction learn!( lp::UnivariateLinearProblem, ws::Vector, int::Bool )
Fit energies using weighted least squares.
PotentialLearning.linearize_forces
— Methodlinearize_forces(forces)
forces: vector of forces per system
Returns a vector with the components of the forces of the systems.
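A rough picture of this flattening, assuming that the forces of each system are stored as a vector of 3-component vectors (the actual container types in the package may differ):

```julia
# Hypothetical input: two systems, with two atoms and one atom respectively.
forces = [
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    [[-0.1, 0.0, 0.1]],
]

# Flatten system by system, atom by atom, component by component.
linearized = [comp for sys in forces for f in sys for comp in f]
```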
PotentialLearning.load_data
— Methodload_data(file::string, extxyz::ExtXYZ)
Loads configurations from an extxyz file into a DataSet.
PotentialLearning.load_data
— Methodload_data(file::string, yaml::YAML)
Load configurations from a YAML file into a Vector of Flexible Systems, with Energies and Forces.
Returns
ds: DataSet
t: Vector{Dict} (any miscellaneous info from the YAML file)
PotentialLearning.load_datasets
— Methodload_datasets(input)
input: OrderedDict with input arguments. See get_defaults_args().
Returns training and test systems, energies, forces, and stresses.
PotentialLearning.periodic_rmsd
— Methodfunction periodic_rmsd(
p1::Array{Float64,2},
p2::Array{Float64,2},
box_lengths::Array{Float64,1}
)
Calculates the RMSD between atom positions of two configurations taking into account the periodic boundaries.
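One common way to account for periodic boundaries is the minimum-image convention, sketched below. It assumes Nx3 position matrices and an orthorhombic box; the name `periodic_rmsd_sketch` is hypothetical and the package may handle the boundaries differently.

```julia
# RMSD with the minimum-image convention.
# Assumptions: p1 and p2 are Nx3, and the box is orthorhombic with edge
# lengths given by box_lengths.
function periodic_rmsd_sketch(p1::Matrix{Float64}, p2::Matrix{Float64},
                              box_lengths::Vector{Float64})
    L = box_lengths'                    # 1x3 row, broadcast over the atoms
    Δ = p1 .- p2
    Δ = Δ .- L .* round.(Δ ./ L)        # wrap each displacement to the nearest image
    return sqrt(sum(abs2, Δ) / size(p1, 1))
end

# Two atoms on opposite sides of a periodic boundary are actually 0.2 apart.
p1 = [0.1 0.0 0.0]
p2 = [9.9 0.0 0.0]
d = periodic_rmsd_sketch(p1, p2, [10.0, 10.0, 10.0])
```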
PotentialLearning.rmsd
— Methodfunction rmsd(
A::Array{Float64,2},
B::Array{Float64,2}
)
Calculate root mean square deviation of two matrices A, B. See http://en.wikipedia.org/wiki/Root-mean-square_deviation_of_atomic_positions
PotentialLearning.sample
— Methodfunction sample(
c,
batch_size
)
Selects a sample of size batch_size from cluster c.
PotentialLearning.to_num
— Methodto_num(str)
str: string with a number: integer or float
Returns an integer or float.
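A conversion of this kind could be sketched as follows: try to parse an integer first and fall back to a float. The name `to_num_sketch` is hypothetical, and the package function may behave differently on malformed input.

```julia
# Parse a string as an Int when possible, otherwise as a Float64.
function to_num_sketch(str::AbstractString)
    n = tryparse(Int, str)              # returns nothing if str is not an integer
    return n === nothing ? parse(Float64, str) : n
end

a = to_num_sketch("42")
b = to_num_sketch("3.5")
```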
PotentialLearning.translate_points
— Methodfunction translate_points(
P::Array{Float64,2},
Q::Array{Float64,2}
)
Translates P and Q so that both centroids lie at the origin of the coordinate system.
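Centering both point sets is the usual first step before a Kabsch alignment. A minimal sketch, assuming Nx3 matrices with one point per row (`translate_points_sketch` is a hypothetical name):

```julia
using Statistics

# Subtract each set's centroid so that both are centered at the origin.
function translate_points_sketch(P, Q)
    return P .- mean(P, dims = 1), Q .- mean(Q, dims = 1)
end

P = [1.0 2.0 3.0; 3.0 4.0 5.0]
Q = [0.0 0.0 1.0; 2.0 2.0 1.0]
Pc, Qc = translate_points_sketch(P, Q)
```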