RAM consumption #161
1TB might work if you're not doing anything crazy with semi-enzymatic/HLA/etc... I think the largest I have done in a single search was ~2k files. Are they wanting to run a label-free quantitative analysis where MBR is needed, or is it possible that the files could be processed in batches? If you don't need MBR, you could search each file individually or in batches (e.g. 96 at a time) and then use Mokapot or Percolator to control FDR across the experiment. Sage will handle predicting/aligning RTs (used for rescoring PSMs) and global FDR for you, but there aren't any other substantial differences between searching all files at once and searching them separately - and I imagine that the difference between searching 512 files at a time and all 3000 at once will be very minor and within acceptable error. Even with MBR, I would suggest assessing whether moderately sized batches are OK (I would imagine very few peptides are lost going from 3000 files to batches of 512, for example).
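If you go the batched route, a minimal sketch of pooling per-batch results with Mokapot could look like the following. It assumes each batch was searched with Sage configured to write Percolator-format (.pin) output, and the paths are placeholders for however the batches end up organized:

```python
import glob
import mokapot

# Collect the Percolator-format PSM files written by each Sage batch
# (placeholder paths - adjust to the actual batch layout).
pin_files = sorted(glob.glob("batches/batch_*/results.sage.pin"))

# Read all batches into one PSM collection so FDR is controlled
# across the whole experiment rather than per batch.
psms = mokapot.read_pin(pin_files)

# Re-score the pooled PSMs and estimate confidence/q-values.
results, models = mokapot.brew(psms)

# Write confident PSMs and peptides to text files.
results.to_txt(dest_dir="mokapot_out")
```

The key point is simply that the PSMs from all batches are rescored and thresholded together, so the experiment-wide FDR is the same as it would be for a single combined search.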
Hi Michael, I think if possible they would love to do a standard tryptic search, no crazy modifications, with MBR quant. That being said, I would propose that they stay with the 1TB setup for now and just try all 3000 files with MBR. If that crashes, run all files without MBR, then check against both that reference run and the non-MBR quantification whether the 512-file MBR batches are behaving as expected. If so, stitch everything together with additional tools for adjusting FDR. Do you think this would be a viable strategy? I will report back once we have the results, as this might be of interest to others. Best, and thanks again for your time and this cool tool.
I think it would be viable! If it doesn't work, please reach out and I can try to implement some other kind of solution.
Hi Michael
A collaborator has to analyze 3000+ files (DDA Orbitrap, so around 700 MB - 1.5 GB per file).
FragPipe ran out of memory with 1TB on a VM during quant...
I was wondering whether Sage might be an option here... I am confident speed would not be an issue :)
As I understand it, Sage does not write intermediate results, but rather stores everything in memory. Would this cause problems with so many files, or would you recommend a specific procedure for dealing with this?
I quickly went over the intro and the GitHub issues but did not see anything related to RAM requirements for large-scale analyses.
Any help would be appreciated!
Best, Klemens