-
Notifications
You must be signed in to change notification settings - Fork 657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Diffusion Map Implementation in MDAnalysis #857
Comments
The question got discussed in several threads. The most relevant being #618 and #719. I also invite you to have a look at MDReader from @mnmelo, as it can already do some parallelization. There is room for a lot more discussion on the matter. If I well remember, some of the issues we had are fixed in the new topology scheme or by #831. |
Thanks Jonathan! This is a bit of a rabbit's hole to fall into but I'm doing my best to take notes and take in the vast amount of discussion going on here. Obviously there will be some work to be done. |
So far the threads of interest seem to be: #293, #797, topology improvements multicore by Tyler Reddy seem all useful as well. I am getting started on refactoring as we speak. |
lets get it working first. This can be done in a later PR. The same goes for the memory optimization
This should have a high priority. RMSD is OK but in often other metrics are better suited. |
First priority is getting it working first definitely. On that and On Mon, May 23, 2016 at 8:26 AM kain88-de [email protected] wrote:
|
Ok – I looked into the commits and @euhruska is definitely an author. I will fix this and directly commit to develop. |
Holy crap!!! If I didn't attribute him anywhere it certainly wasn't On Fri, Jul 22, 2016 at 4:09 PM Oliver Beckstein [email protected]
|
I could have sworn he was in CHANGELOG and docs, perhaps I overlooked AUTHORS |
Ditto! Sorry @euhruska, completely unintentional. |
Hi All,
As far as I'm concerned, I'd like to go ahead and refactor @euhruska's work instead of starting off from scratch. So far it looks like he did a good job in implementing all the principle components (see what I did there?) of a diffusion map algorithm, but there are some changes and improvements that can definitely be made:
Refactor to satisfy Bauhaus Style
The current implementation is left as a function and needs to be remade to inherit from BaseAlign,
Parallelize Trajectory Analysis
There are a few routes to go about doing this. I think it would be interesting to try using deco, as it could lay the groundwork for doing this simply in other analysis modules. (Can
_single_frame()
be made easily concurrent? Or is this limited by how our Reader is working?) If that doesn't work quickly, we could move on and work on something using cython and the like, as demonstrated by this.Optimize memory allocation
Find all possible areas to prevent copying of arrays where possible, unfortunately the eigenvalue
problem is one area where we need to load an entire matrix into memory.
Provide API for introduction of new metric
A metric function should be able to be provided as an optional argument in initialization, with rmsd used if none is given. RMSD is a somewhat non-useful metric from the literature I read.
Perform thorough testing on real-world data
The most important part of this work is to make sure it actually works, using @euhruska's PR rather than starting from scratch again gives us more time to validate it against data and make sure everything is performing as it should. Of course, I would also work on covering the code.
The text was updated successfully, but these errors were encountered: