
mongodb file size limitations #26

Closed
thempel opened this issue Mar 20, 2017 · 3 comments · Fixed by #35

@thempel
Member

thempel commented Mar 20, 2017

I just came across a pymongo.errors.DocumentTooLarge error on a very small dataset, probably because I accidentally produced a huge transition matrix. We might need to solve this problem sooner or later, especially because the discrete trajectories become larger and larger with time...
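For context (editor's note, not from the thread): MongoDB caps a single BSON document at 16 MB, which a dense float64 transition matrix can exceed surprisingly quickly. A rough back-of-the-envelope sketch, with illustrative numbers:

```python
# Hypothetical sketch: estimate when a dense float64 transition matrix
# (n_states x n_states, 8 bytes per entry) outgrows MongoDB's 16 MB
# per-document BSON limit. Numbers are illustrative, not from the thread.

MONGO_DOC_LIMIT = 16 * 1024 * 1024  # 16 MiB BSON document cap

def matrix_bytes(n_states: int, itemsize: int = 8) -> int:
    """Raw payload size of a dense n x n matrix of `itemsize`-byte floats."""
    return n_states * n_states * itemsize

def fits_in_document(n_states: int) -> bool:
    """True if the raw matrix payload alone stays under the BSON cap
    (the real encoded document is larger still due to BSON overhead)."""
    return matrix_bytes(n_states) < MONGO_DOC_LIMIT

print(fits_in_document(1000))  # 8 MB payload -> fits
print(fits_in_document(2000))  # 32 MB payload -> DocumentTooLarge territory
```

So anything beyond roughly 1400 states already breaks the single-document approach, even before BSON encoding overhead.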

@jhprinz
Contributor

jhprinz commented Mar 21, 2017

Agreed, I need to think about the best way to do that.

  1. Store the complete file as-is, but then you cannot search it or do the cool stuff you can do with the other objects.
  2. Break the large object down into separate parts. That already works, but you need to know how to use subobjects in the Model you return. This also allows easy access to subparts. Example: we create a DiscretizedTrajectory object and, instead of writing one array of n_traj x length, we write n_traj separate objects and only store references. The parts could still be too large...
  3. Run the picking of frames on the cluster and only return the new frames. That should also already be possible, but you would need to write a function that does it.
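Option 2 could look roughly like the following sketch (editor's illustration, not code from the project; a plain dict stands in for the MongoDB collection and all names are hypothetical):

```python
# Hypothetical sketch of option 2: store each trajectory as its own
# document and keep only a list of references in the parent document.
# A plain dict stands in for a MongoDB collection; with pymongo you
# would use collection.insert_one / find_one and real ObjectIds.
import uuid

fake_collection = {}  # stand-in for a MongoDB collection

def store_discretized_trajectory(dtrajs):
    """Write n_traj child documents plus one parent holding references."""
    refs = []
    for traj in dtrajs:
        child_id = str(uuid.uuid4())
        fake_collection[child_id] = {"_id": child_id, "states": list(traj)}
        refs.append(child_id)
    parent_id = str(uuid.uuid4())
    fake_collection[parent_id] = {"_id": parent_id,
                                  "type": "DiscretizedTrajectory",
                                  "trajectory_refs": refs}
    return parent_id

def load_discretized_trajectory(parent_id):
    """Resolve the references back into a list of trajectories."""
    parent = fake_collection[parent_id]
    return [fake_collection[ref]["states"] for ref in parent["trajectory_refs"]]

pid = store_discretized_trajectory([[0, 1, 1], [2, 0]])
print(load_discretized_trajectory(pid))  # [[0, 1, 1], [2, 0]]
```

Each child document then stays small enough to search individually, at the cost of one extra lookup per trajectory when loading.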

@thempel
Member Author

thempel commented Mar 21, 2017

Hmm. What about option 1, but additionally copying the file into the working directory on the user's machine? I assume it wouldn't be usable within the DB, but it could be loaded into the user's script/notebook as a numpy array.
About option 2, I'm a bit skeptical because, in my experience, there will be a lot of (potentially useless...) MSMs that we don't really need to store. So chopping everything up and storing it in the DB might just artificially blow things up.

@jhprinz
Contributor

jhprinz commented Mar 21, 2017

> What about option 1, but additionally copying the file into the working directory on the user's machine?

That should be no problem, I guess. Good idea. It will require some thought about the implementation; you might not even have to write it to disk.

Actually I just checked. This is really super simple...
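One straightforward route for storing whole files in MongoDB (editor's note; the thread doesn't name a mechanism) is GridFS, which chunks arbitrarily large blobs and so sidesteps the 16 MB per-document limit. The sketch below only runs the serialization round-trip; the `gridfs` calls in the comments assume a live `pymongo` connection:

```python
# Hypothetical sketch: serialize a large object to bytes, the form in
# which it could be handed to GridFS. Only the pickle round-trip runs
# here; the GridFS lines are indicative comments, not tested code.
import pickle

def to_blob(obj) -> bytes:
    """Serialize an arbitrary Python object (e.g. a transition matrix)."""
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

def from_blob(blob: bytes):
    """Inverse of to_blob."""
    return pickle.loads(blob)

# With a live connection it would be roughly:
#   import gridfs
#   fs = gridfs.GridFS(db)
#   file_id = fs.put(to_blob(matrix), filename="transition_matrix")
#   matrix = from_blob(fs.get(file_id).read())

matrix = [[0.9, 0.1], [0.2, 0.8]]
assert from_blob(to_blob(matrix)) == matrix
```

The same bytes could also be written to the user's working directory, which covers the "copy into the working directory" idea from the comment above.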
