Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Diffusion Map Refactor/Implementation #863

Closed
wants to merge 49 commits into from
Closed
Changes from 1 commit
Commits
Show all changes
49 commits
Select commit Hold shift + click to select a range
c511ead
new analysis feature: diffusion map
Mar 2, 2016
3be892e
description of diffusion map
Mar 2, 2016
3ce705a
diffusionmap minor fix
Mar 2, 2016
328c27a
test for diffusionmap
Mar 2, 2016
d2d4593
diffusionmap faster test
Mar 2, 2016
ed5f4c5
included notes
Mar 11, 2016
2947c0f
initial work for refactor
jdetle May 24, 2016
7e94a58
functional refactor
jdetle May 24, 2016
ab7f93c
updated changes
jdetle May 24, 2016
43144e0
iteration error
jdetle May 29, 2016
18c84ee
refactor of @euhruska diffusion map
jdetle May 29, 2016
0da96fa
removed deco import
jdetle May 29, 2016
2c93f2d
added tests, fixed logger issue
jdetle May 30, 2016
49578a0
style fixes, protection fixes, added docs
jdetle Jun 3, 2016
7ec486a
fix of some broken areas
jdetle Jun 3, 2016
f350eaa
small docs fix
jdetle Jun 4, 2016
245d2e8
Constant epsilon tests
jdetle Jun 6, 2016
17fced2
updated CHANGELOG
jdetle Jun 6, 2016
14b08e4
fixed selections and updated tests accordingly
jdetle Jun 8, 2016
cfcbfe0
switched order of imports [skip ci]
jdetle Jun 8, 2016
07a83d1
Separated calculation of distance matrix from diffusion
jdetle Jun 10, 2016
6faec56
Working refactor
jdetle Jun 10, 2016
48c27a9
Change of direction in API, more work to be done implementing everything
jdetle Jun 16, 2016
fe08866
added timescaling, fixed some parts
jdetle Jun 16, 2016
cafeb68
Automodule added, Changed some object oriented stuff around
jdetle Jun 16, 2016
bb394ec
Changed things around to get embedding working
jdetle Jun 16, 2016
074a950
Docs fixes
jdetle Jun 17, 2016
68f2928
Added tests, found bugs, squashed bugs.
jdetle Jun 17, 2016
98e899b
Eliminated square of distance, random mistake
jdetle Jun 17, 2016
b893244
Now witness the firepower of this fully armed and
jdetle Jun 18, 2016
7b12ece
Satisfying the linter
jdetle Jun 18, 2016
e5a3487
Updated docs
jdetle Jun 18, 2016
cbf685d
Fixed bug in _single_frame()
jdetle Jun 19, 2016
e0ed7b4
Eliminated spectral gap function... for now
jdetle Jun 21, 2016
303b816
Fixed matrix multiplication bug
jdetle Jun 23, 2016
01dc0d5
Better single_frame loop
jdetle Jun 27, 2016
2f2db09
Refactor of code to stop DiffusionMap inheriting from AnalysisBase
jdetle Jun 29, 2016
34b1e7c
Removed epsilon class from pull request, now must supply a constant a…
jdetle Jun 29, 2016
7c56863
Added exception to be thrown for large distance matrices.
jdetle Jun 29, 2016
bd40f38
Updated diffusion map after some test-driven development, added error…
jdetle Jun 29, 2016
420edd0
Fixed some docs [skip ci]
jdetle Jun 30, 2016
c058c6c
Change to warning
jdetle Jun 30, 2016
b6c309a
Updated CHANGELOG
jdetle Jun 30, 2016
8ba1a58
Fixes as suggested to be more pythonic and elegant
jdetle Jul 1, 2016
aa22772
Fixes to docs, Transform function.
jdetle Jul 3, 2016
97c3105
Removed anisotropic kernel creation. Figured out that I was
jdetle Jul 3, 2016
b3b2ce9
Small style change
jdetle Jul 5, 2016
c5d1958
Updated docs to remove references to anisotropy, update tutorial [ski…
jdetle Jul 5, 2016
d7cd454
Added link to jupyter notebook [skip ci]
jdetle Jul 6, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Refactor of code to stop DiffusionMap inheriting from AnalysisBase
Changed code further to make transform a simple iteration.
Provided functionality for doing a diffusion map with just a distance matrix.
jdetle committed Jun 30, 2016
commit 2f2db098bf5bab2adb40d6b5dbafe829f632eb5b
133 changes: 65 additions & 68 deletions package/MDAnalysis/analysis/diffusionmap.py
Original file line number Diff line number Diff line change
@@ -160,8 +160,7 @@ class DistanceMatrix(AnalysisBase):
dist_matrix : array
Array of all possible ij metric distances between frames in trajectory.
This matrix is symmetric with zeros on the diagonal.
calculated : boolean
Boolean indicating if the distance matrix has been calculated.

Methods
-------
save(filename)
@@ -205,7 +204,7 @@ def __init__(self, u, select='all', metric=rmsd, cutoff=1E0-5,
self._metric = metric
self._cutoff = cutoff
self._weights = weights
self.calculated = False
self._calculated = False
# remember that this must be called before referencing self.nframes
self._setup_frames(traj, start, stop, step)

@@ -214,23 +213,22 @@ def _prepare(self):

def _single_frame(self):
iframe = self._ts.frame
i_ref = self.atoms.positions - self.atoms.center_of_mass()
i_ref = self.atoms.positions
# diagonal entries need not be calculated due to metric(x,x) == 0 in
# theory, _ts not updated properly. Possible savings by setting a
# cutoff for significant decimal places to sparsify matrix
for j, ts in enumerate(self._u.trajectory[iframe:self.stop:self.step]):
self._ts = ts
j_ref = self.atoms.positions-self.atoms.center_of_mass()
j_ref = self.atoms.positions
dist = self._metric(i_ref, j_ref, weights=self._weights)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can allow optional arguments for the metric. Because some might want to center as well.

self.dist_matrix[self._index, j+self._index] = (dist if dist >
self._cutoff else 0)
self.dist_matrix[j+self._index, self._index] = (self.dist_matrix[
self._index,
j+self._index])
self.dist_matrix[self._frame_index, j+self._frame_index] = (
dist if dist > self._cutoff else 0)
self.dist_matrix[j+self._frame_index, self._frame_index] = (
self.dist_matrix[self._frame_index, j+self._frame_index])
self._ts = self._u.trajectory[iframe]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make a comment why you are resetting the trajectory


def _conclude(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't good. If I want to change my choice of epsilon I still have to recalculate the complete distance matrix! Or close knowledge about the inner working of this class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all this valuable input @kain88-de! Perhaps you're right and I went the wrong way. I want to allow the function to be tuneable for custom purposes such that the way epsilon is calculated is left up to the user. In addition, it would be nice if someone working in a scripting environment could just write a function in there script and pass it as an argument. Given some explicit instructions, a user passes epsilon=function_written_for_choice_of_epsilon and the arguments necessary to execute the function. (Would this be in the form of **kwargs?, I am not entirely sure of the formalism). Of course, many epsilon functions would take the distance matrix as a parameter. Hopefully that makes sense and indicates why I think this wouldn't require recalculating the complete distance matrix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

write up some usage examples that you are thinking of. They can later be used as documentation as well. Then it becomes clearer what you want to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, good suggestion thanks.

self.calculated = True
self._calculated = True

def save(self, filename):
np.save(filename, self.dist_matrix)
@@ -242,49 +240,53 @@ class Epsilon(object):

Attributes
----------
scaled_matrix : DistanceMatrix object
scaled_matrix : numpy array
A matrix with each term divided by a local scale parameter
calculated : boolean
True after determine epsilon called

Methods
-------
determine_epsilon()
Determine local scale parameters using a chosen algorithm
"""
def __init__(self, DistanceMatrix, **kwargs):
self._dist_matrix = DistMatrix
def __init__(self, dist_matrix):
if dist_matrix._calculated:
# the DistanceMatrix object
self.dist_matrix = dist_matrix
# the actual numpy array containing the data
self.scaled_matrix = dist_matrix.dist_matrix
else:
raise AttributeError('Distance Matrix does not exist, was'
'DistanceMatrix.run() called?')
self._calculated = False

def determine_epsilon(self):
# set self.calculated = True here
pass


class EpsilonConstant(Epsilon):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't use this anymore so I think it can go.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is so weird. I had deleted it in a previous commit but I think I mistakenly brought it back into diffusion when creating a new Epsilon class. Will get rid of it.

"""Premade function for the determination of epsilon based on providing
an arbitrary constant"""

def __init__(self, DistanceMatrix, epsilon):
def __init__(self, dist_matrix, epsilon):
"""
Parameters
----------
epsilon : int
The value of epsilon to be used as a local scale parameter
"""
if DistanceMatrix.calculated:
self.scaled_matrix = DistanceMatrix.dist_matrix
else:
raise AttributeError('Distance Matrix does not exist, was'
'DistanceMatrix.run() called?')

Epsilon.__init__(self, dist_matrix)
self._epsilon = epsilon
self.calculated = False

def determine_epsilon(self):
self.scaled_matrix /= self._epsilon
self.calculated = True
# set self.calculated = True here
if not self._calculated:
self.scaled_matrix /= self._epsilon
self._calculated = True
return


class DiffusionMap(AnalysisBase):
class DiffusionMap(object):
"""Non-linear dimension reduction method

Dimension reduction with diffusion mapping of selected structures in a
@@ -310,59 +312,63 @@ class DiffusionMap(AnalysisBase):
the collective coordinates.
"""

def __init__(self, DistanceMatrix, Epsilon, weights=None, timescale=1):
def __init__(self, dist_matrix, epsilon=1, manifold_density=None, timescale=1):
"""
Parameters
-------------
DistanceMatrix : DistanceMatrix object
dist_matrix : DistanceMatrix object
Distance matrix to be made into a diffusion kernel and perform
an eigenvector decomposition on.
Epsilon : Epsilon object
epsilon : Float or Epsilon object
Specifies the method used for the choice of scale parameter in the
diffusion map. More information in [1], [2] and [3].
weights: list, optional
diffusion map. More information in [1], [2] and [3]. If a float
is given, this will be the scale parameter used for
the kernel.
manifold_density: list, optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is a manifold density? What would be good weights? Why would a user change the weight of a frame? From which paper did you get that this should be done?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a cool result from Lafon's Diffusion Maps Paper, Page 14. He provides a proof to show how diffusion maps can reflect a system that is modeled by the Fokker-Plank equation. (Does that sentence make sense?) If I'm not mistaken MD simulations are one such area, and it is taken into consideration in Dr. Clementi's locally scaled diffusion and free energy landscapes paper. I will gladly admit that these more advanced topics are a bit beyond me.

I can't tell if you want to see this gotten rid of or if I should work on a better description of this in the documentation. I can provide a link to my blog in the docs where I talked about this derivation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah Focker-Plank equation and diffusion are related I know (also good the correspondence can be shown for diffusion maps). Yes MD-simulations can be modeled with Fokker-Plank equations. What I would like to know is what is a manifold_density? Your answer doesn't give me the description. I still don't know what this parameter does from reading the docs nor do I know what sensible values are as a user nor could I guess them. Please explain what the manifold_density is for the diffusion maps and how they influence the result. If we can't explain it think about leaving it out.

Please don't link to blog posts. We don't know how long they will be around and MDAnalysis will likely be around for another 10 years so I rather keep everything in the docs self contained. References to publications are OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right, I will do some research and get back to you with a
satisfying explanation or eliminate it as a parameter.
On Fri, Jul 1, 2016 at 1:18 PM kain88-de notifications@github.com wrote:

In package/MDAnalysis/analysis/diffusionmap.py
#863 (comment):

  • """
  • def init(self, u, epsilon=1, manifold_density=None,
  •             timescale=1, **kwargs):
    
  •    """
    
  •    Parameters
    

  •    u : MDAnalysis Universe or DistanceMatrix object
    
  •        Can be a Universe, in which case one must supply kwargs for the
    
  •        initialization of a DistanceMatrix. Otherwise, this can be a
    
  •        DistanceMatrix already initialized. Either way, this will be made
    
  •        into a diffusion kernel.
    
  •    epsilon : Float
    
  •        Specifies the method used for the choice of scale parameter in the
    
  •        diffusion map. More information in [1], [2] and [3], Default: 1.
    
  •    manifold_density: list, optional
    

Yeah Focker-Plank equation and diffusion are related I know (also good the
correspondence can be shown for diffusion maps). Yes MD-simulations can be
modeled with Fokker-Plank equations. What I would like to know is what
is a manifold_density?
Your answer doesn't give me the description. I
still don't know what this parameter does from reading the docs nor do I
know what sensible values are as a user nor could I guess them. Please
explain what the manifold_density is for the diffusion maps and how they
influence the result. If we can't explain it think about leaving it out.

Please don't link to blog posts. We don't know how long they will be
around and MDAnalysis will likely be around for another 10 years so I
rather keep everything in the docs self contained. References to
publications are OK.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/MDAnalysis/mdanalysis/pull/863/files/b6c309a5dac5b2a5fc816fb0c2eccfd1a4a81b93#r69350871,
or mute the thread
https://github.com/notifications/unsubscribe/AKcARv5Jw8X2-0Ck7TR5MuS9Tcw5abKjks5qRXYagaJpZM4IpYIS
.

Have a wonderful day,
John J. Detlefs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so with 10 minutes of reading another Lafon paper he gives the density at a point x to be equal to $\int d(x,y) Pr{y} dy $ Pr{y} is a probability measure equal to e^U(y), where U is the potential function on the diffusion process. Does this make sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that is a better answer. So you are saying you want to implement a anisotropic algorithm for the diffusion maps. I have so far not found that information in the docstrings for this class.

As I read the paper they provide an equation (11) for p_{epsilon} that can be calculated from the data. And I assume you are referring to the first unnumbered equations in chapter 4. anisotropic diffusion. Then the integral you have written is wrong. Second from the equation in the paper I read $\int d(y) k_{epsilon}(x,y) Pr{y} dy $ that the probability to be in state x is equal to the "sum" of all transitions to x from any point y considering the probability to be in y. Which sounds unnecessary confusing.

Or do you mean equation 13? Even in that case we can hardly give any estimate because the p_{epsilon} is some normalization and not a probability density. Giving this as a user is really hard, because having a good estimate of \mu is hard in high dimensions (more then 3) and would help us solve the complete state equation to a large extend.

I currently think your implementation has algorithms from different papers mixed up. Please once write up in latex the formulas you use. What each symbol stands for and from what paper you have copied the algorithm. Since they all provide test-case data we can compare against that.

Copy link
Member

@kain88-de kain88-de Jul 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we have any more math discussion I think it's the best when you once write down what exactly you are doing on one page in latex. That page will then be used as a reference to check your implementation.

At least to remove the confusion in my head in regard to what you are doing ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I really appreciate the help. When are you leaving for wedding prep by the way? I just posted a copy of the equations I am trying to follow by the way:
anisotropicdiffusion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please prepare the one sheet paper with citations. Then I can say more where I have questions about the details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this certainly the best medium for an easy-to-follow dialogue.

The list has to have the same length as the trajectory.
With 'None' the weight of each frame of the trajectory will be the
same.
timescale: int, optional
The number of steps in the random walk, large t reflects global
structure whereas small t indicates local structure.
"""
# check .run() called
if DistanceMatrix.calculated:
self._dist_matrix = DistanceMatrix
else:
raise AttributeError('Distance Matrix does not exist, was'
'DistanceMatrix.run() called?')

# .run() called in _prepare
self._dist_matrix = dist_matrix
# because we check for .run() in Epsilon class, we must be sure to call
# dist_matrix.run() before determine_epsilon, if a user wishes to
# provide their own epsilon class, dist_matrix.run() must be run
# prior to Diffusion Map initilization
self._epsilon = epsilon
# important for transform function
self._nframes = self._dist_matrix.nframes
# determines length of diffusion process
self._t = timescale
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep names as from the init arguments for consistency and easier understanding of other methods that use the init arguments. (best practice)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use the same names as the input variables. This makes them easier to look up later.


# check determine_epsilon called
if Epsilon.calculated:
self._kernel = Epsilon.scaled_matrix
else:
raise AttributeError('scaled_matrix does not exist, was'
'determine_epsilon() called?')
self._epsilon = Epsilon

if weights is None:
if manifold_density is None:
# weights do not apply to metric but density of data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something you can read about in page 14 of the Lafon Paper or my diffusion maps blog post. The weight quantity is supposed to reflect how crowded points are on the free energy surface in the case of the adk simulation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well good if I'd know that my problem would be solved. The density is what determines the free energy landscape and hard to calculate accurately, especially if you talk about the density in the full conformation space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this is not documented which would be incredible important here since the meaning of weight deviates from the usual meaning of weight inside of MDAnalysis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll go back and have a look at the literature before I make any changes. I'll email @euhruska to see if he has any examples of literature using a density estimate.

self._weights_ker = np.ones((self._nframes,))
else:
if weights.shape[0] != self._nframes:
raise ValueError("The weight should have the same length as "
'the trajectory')
else:
# weights are constructed as relative to the mean
# density weights are constructed as relative to the mean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

self._weights_ker = (np.asarray(weights, dtype=np.float64) /
np.mean(weights))

def decompose_kernel(self):
def run(self):
# will only run if distance matrix not already calculated
self._dist_matrix.run()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't this trigger a rerun of the distmatrix? There is nothing in the AnalysisBase code or DistMatrix that would stop it from being run again. You have to do the check here manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally right, I don't know why these mistakes are popping up, I'll try to be more careful.

# this is the case when user provides no Epsilon class
if isinstance(self._epsilon, float) or isinstance(self._epsilon, int):
self._epsilon = EpsilonConstant(self._dist_matrix,
self._epsilon)

self._epsilon.determine_epsilon()
# this should be a reference to the same object as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure when this comment used to make sense/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relic of Epsilon class, would permanently change dist_matrix.dist_matrix to reflect scaling. There weren't two matrices in memory.

# self.dist_matrix.dist_matrix
# take negative exponent of scaled matrix to create Isotropic kernel
self._kernel = np.exp(-self._kernel)
self._kernel = np.exp(-self._epsilon.scaled_matrix)

# define an anistropic diffusion term q
q_vector = np.zeros((self._nframes, ))
@@ -394,18 +400,7 @@ def decompose_kernel(self):
eg_arg = np.argsort(eigenvals)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can already change the sorting direction here. Since these are indices it would also be nice if the name reflects that.

self.eigenvalues = eigenvals[eg_arg[::-1]]
self.eigenvectors = eigenvectors[eg_arg[::-1], :]

def _prepare(self):
self.diffusion_space = np.zeros((self.eigenvectors.shape[0],
self.num_eigenvectors))

def _single_frame(self):
# The diffusion map embedding takes the ith sample in the
# data matrix and maps it to each of the ith coordinates
# in the set of k-dominant eigenvectors
for k in range(self.num_eigenvectors):
self.diffusion_space[self._index][k] = (self.eigenvectors[k][
self._index])
self._calculated = True

def transform(self, num_eigenvectors):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function doesn't check that decomposition was called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And how would someone perform such a check? (Asking for a friend...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two ways. Either check that variables exists that are only created in decompose or create an is_decomposed boolean in __init__ which is set to True after decompose finishes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noted thanks

""" Embeds a trajectory via the diffusion map
@@ -424,9 +419,11 @@ def transform(self, num_eigenvectors):
between the higher dimensional space and the space spanned by
the eigenvectors.
"""
self.num_eigenvectors = num_eigenvectors
self._setup_frames(self._dist_matrix._u.trajectory,
self._dist_matrix.start, self._dist_matrix.stop,
self._dist_matrix.step)
self.run()
self.diffusion_space = np.zeros((self.eigenvectors.shape[0],
num_eigenvectors))

for i in range(self.nframes):
for j in range(num_eigenvectors):
self.diffusion_space[i][j] = self.eigenvectors[j][i]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a transpose operation. self.diffusionspace = self.eigenvectors.T[:, :num_eigenvectors].
Another issues is that I think you aren't scaling the diffusion space correctly. According to the literature you have to multiply the eigenvector with the eigenvalues to get the "projection" into the diffusion space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're totally right, will fix.


return self.diffusion_space
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to a straight

return self.eigenvectors.T[:, :n_eigenvectors] * self.eigenvalues[:n_eigenvectors]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes total sense, will fix.