This is more of a question about how PEMA uses storage during each run. My project has 140 samples with paired-end (PE) sequences, amounting to 14 GB of data.
14G ./my data
196G /pema215_otu
Is it possible to reduce the storage needed for a PEMA run, or is all of the output required?
For example, I have two all_samples.fasta files (one in mainOutput and one in the PEMA folder) and one final_all_samples.fasta. Are all of them necessary?
Also, some intermediate folders, such as linearizedSequences and mergedSequences, take up about as much space as the mydata folder.
I am raising this because in large-scale projects this level of storage use can exceed the disk quota.
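To see which output subfolders account for the space, a small report like the one below may help. This is only a sketch: the helper names `dir_size_bytes` and `size_report` are hypothetical and not part of PEMA, and the output folder path is whatever your run produced.

```python
from pathlib import Path


def dir_size_bytes(path):
    """Sum the sizes of all regular files under `path` (symlinks are skipped)."""
    return sum(
        p.stat().st_size
        for p in Path(path).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def size_report(output_dir):
    """Return (subfolder name, size in bytes) pairs, largest first."""
    subdirs = [d for d in Path(output_dir).iterdir() if d.is_dir()]
    sizes = [(d.name, dir_size_bytes(d)) for d in subdirs]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Calling `size_report` on the run's output folder and printing the first few entries shows where most of the space goes, which makes it easier to decide what is worth keeping.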
Hi @savvas-paragkamian. Thanks for the points.
The all_samples.fasta should be removed from the top output folder.
In general, a feature could be added so that files no longer needed after a given step are removed on the fly.
At the moment PEMA returns everything so that the user can validate the filtering parameters and their effect.
However, an option to remove intermediate files would be useful in cases like this.
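Until such an option exists, intermediate folders could be cleaned up by hand after a run has finished. The sketch below is not a PEMA feature: `remove_intermediate` is a hypothetical helper, and the assumption that linearizedSequences and mergedSequences are safe to delete once the run is complete should be checked against your own workflow first. It defaults to a dry run so nothing is deleted by accident.

```python
import shutil
from pathlib import Path

# Assumption: these folders are no longer needed once the run has finished.
INTERMEDIATE_DIRS = ("linearizedSequences", "mergedSequences")


def remove_intermediate(output_dir, names=INTERMEDIATE_DIRS, dry_run=True):
    """Delete the named subfolders of `output_dir` and return the affected paths.

    With dry_run=True (the default) nothing is deleted; the function only
    reports which folders would be removed.
    """
    affected = []
    for name in names:
        target = Path(output_dir) / name
        if target.is_dir():
            if not dry_run:
                shutil.rmtree(target)
            affected.append(target)
    return affected
```

Running it first with the default `dry_run=True` lists the folders that would be removed; passing `dry_run=False` performs the deletion.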