Error running Epinano_DiffErr.R on 5mer output from Epinano_sumErr.py #125

Open
GeoffLyle opened this issue Aug 12, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@GeoffLyle

I am running into an error when running Epinano_DiffErr.R on the output of Epinano_sumErr.py.

Running Epinano_sumErr.py appears to work:
Epinano_sumErr.py --quality --file NHA_hTERT_DRNA_20220609_self_transcript_aligned.sorted.plus_strand.per.site.5mer.csv --out NHA-hTERT_5mer.sum_err.csv --kmer 5

python3 Epinano_sumErr.py --quality --file full_fq_to_sample_transcripts_output.sorted.plus_strand.per.site.5mer.csv --out DIPG-IV_5mer.sum_err.csv --kmer 5

However, when I pass this output to Epinano_DiffErr.R, I get the following error:
Rscript Epinano_DiffErr.R -k NHA-hTERT_5mer.sum_err.csv -w DIPG-IV_5mer.sum_err.csv -c 30 -d 0.1 -t 3 -o DIPG-IV_NHA-hTERT_5mer_sumErr --feature sum_err3

Error:

Error in merge.data.frame(dat1, dat2, by = "chr_pos") :
negative length vectors are not allowed

This appears to be due to a memory limit issue.
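
One thing that may be worth checking before the merge is whether `chr_pos` is unique within each table. A join emits one row for every pair of matching keys, so repeated keys multiply the size of the result. The sketch below uses made-up toy data frames; only the `chr_pos` column name comes from the script, and the real column layout is an assumption.

```r
# Toy stand-ins for dat1/dat2 (hypothetical values, for illustration only).
dat1 <- data.frame(chr_pos = c("tx1 10", "tx1 10", "tx1 11"),
                   sum_err = c(0.1, 0.2, 0.3))
dat2 <- data.frame(chr_pos = c("tx1 10", "tx1 10", "tx1 12"),
                   sum_err = c(0.4, 0.5, 0.6))

# Count how often each key occurs in each table.
n1 <- table(dat1$chr_pos)
n2 <- table(dat2$chr_pos)

# For every key present in both tables, an inner join emits n1 * n2 rows.
# as.numeric() avoids integer overflow when the counts are large.
shared <- intersect(names(n1), names(n2))
expected_rows <- sum(as.numeric(n1[shared]) * as.numeric(n2[shared]))

print(expected_rows)                            # rows predicted for the inner join
print(nrow(merge(dat1, dat2, by = "chr_pos")))  # rows merge() actually returns here
print(anyDuplicated(dat1$chr_pos) > 0)          # TRUE if dat1 has repeated keys
```

On the real inputs, computing `expected_rows` first (without calling `merge`) would show whether the join result can fit in memory at all.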

Note:
I also tried changing line 126 in Epinano_DiffErr.R:
combine <- merge(dat1, dat2, by="chr_pos")
to:
combine <- dplyr::full_join(dat1, dat2, by="chr_pos")

I thought that the dplyr package could fix the memory limit issue, but I'm getting this error now:

Error: cannot allocate vector of size 127613.3 Gb
Execution halted

These are the sizes of the data frames I want to merge:
[1] "Number of rows in dat1: 3571272"
54.5 Mb
[1] "Number of rows in dat2: 9592079"
146.4 Mb
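
As a rough arithmetic check, the requested allocation can be compared with the two row counts. The sketch below assumes the failing allocation is a vector of 4-byte integers (an assumption, not something the error message states) and uses R's convention that "Gb" means 2^30 bytes.

```r
n1 <- 3571272   # rows in dat1, from the output above
n2 <- 9592079   # rows in dat2, from the output above

# Size of a 4-byte-per-element vector holding one element for every
# (dat1 row, dat2 row) pair, expressed in units of 2^30 bytes.
cross_gb <- n1 * n2 * 4 / 2^30

print(round(cross_gb, 1))   # 127613.3
```

This matches the size in the dplyr error, which would be consistent with the join pairing every row of dat1 with every row of dat2, for example if `chr_pos` held a single repeated value in both tables. That is an inference from the numbers rather than a confirmed diagnosis, but it suggests inspecting `chr_pos` for the 5mer input before looking at memory limits.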

Have you run into this error when running Epinano_DiffErr.R, and if so what was your solution?

PS: It would also be great to be able to pass all 5 sum_err columns at the same time, as was suggested in #122.

enovoa added the enhancement label on Sep 16, 2022
@enovoa
Collaborator

enovoa commented Sep 16, 2022

Hi @GeoffLyle, sorry for the slow reply. Were you able to solve this issue? Also, thanks for your suggestion on using the 5 sum_err columns; we will keep this in mind for future updates.
