Issues of using umap (1. where to download the latest version; 2. potential bug of unify_bowtie.py) #12

Open
kyhkkm opened this issue Jun 20, 2023 · 1 comment

@kyhkkm
kyhkkm commented Jun 20, 2023

Hello, our team is working on a project on genome blacklists for different species, which uses umap to generate the mappability files. I have run into two problems while using umap.

  1. Your latest version is 1.2.0, but I can only download version 1.1.1 from conda or GitHub. How can I download the latest version?
  2. We noticed that after running 'unify_bowtie', the k-mer results for the first chromosome are always lost. For example, the bowtie files cover chr2 to chrY, and all chr1 results are missing.
    My current workaround is to add chrMT as chr0 (before chr1), so that chr0 is the one that gets lost and we obtain results for chr1 to chrY. We are not sure whether this solution is appropriate. Could you provide guidance on this issue?

Below is the corresponding command:

##############################################################################################################################################################################################################
working_dir=/public/home/mkong/Blacklist/03.work/02.mappability
bowtie_bin=/public/home/mkong/anaconda3/envs/Blacklist/bin
bowtie_index_dir=/public/home/mkong/Blacklist/03.work/02.mappability/genome
umap_path=/public/home/mkong/Blacklist/00.soft/umap/umap

# we only consider kmer=50

for i in 50
do
  for j in $(seq 0 2442)
  do
    python ${umap_path}/run_bowtie.py -var_id SGE_TASK_ID -job_id ${j} ${working_dir}/kmers/k${i} ${bowtie_bin} ${bowtie_index_dir} genome.fa
  done
done

for i in 50
do
  for j in $(seq 0 20)
  do
    python ${umap_path}/unify_bowtie.py ${working_dir}/kmers/k${i} ${working_dir}/chrsize.tsv -var_id SGE_TASK_ID -job_id ${j}
  done
done
##############################################################################################################################################################################################################

@mehrankr
Collaborator

mehrankr commented Jul 3, 2023

Dear Mei,

Regarding the version: the v.1.2.0 tag was on Bitbucket.
Currently you can use the main branch on GitHub:
https://github.com/hoffmangroup/umap

This is the latest version and you will be using the correct scripts.

About the error you are describing:
This looks to me like an off-by-one error, caused by some task managers using 1-based indexing and others using 0-based indexing. The example script I have provided doesn't have that issue.
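
To make the failure mode concrete, here is a minimal sketch of how mixing 0-based and 1-based task IDs can drop the first item of a list. It is purely illustrative and is not taken from umap's code: the chromosome list and the function `chromosomes_reached` are hypothetical, and which side of the mismatch applies in this particular setup is not established here.

```python
# Illustrative only: this is not umap's code. The chromosome list and the
# function name are hypothetical.
CHROMOSOMES = ["chr1", "chr2", "chr3", "chrX", "chrY"]


def chromosomes_reached(task_ids, one_based):
    """Return the chromosomes selected by the given task IDs.

    If one_based is True, 1 is subtracted before indexing, as a script
    written for a 1-based scheduler would do. IDs that fall outside the
    list are skipped.
    """
    reached = []
    for task_id in task_ids:
        index = task_id - 1 if one_based else task_id
        if 0 <= index < len(CHROMOSOMES):
            reached.append(CHROMOSOMES[index])
    return reached


# Task IDs and script agree (both 1-based): every chromosome is reached.
print(chromosomes_reached(range(1, 6), one_based=True))

# Task IDs count from 1 but the script indexes from 0: chr1 is never reached.
print(chromosomes_reached(range(1, 6), one_based=False))
```

In the second call the IDs 1 to 5 are used directly as list positions, so position 0 (chr1) is never visited and the last ID points past the end of the list.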

You can investigate the contents of the chrsize_index.tsv file to see why this is happening.
Adjusting that file might be easier than adding a fake chromosome.
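
One possible way to inspect that file is to print every row next to its 0-based and 1-based position, so it is easy to see which row a given job ID would select. This is only a sketch: it assumes nothing about the file beyond it being tab-separated, and the helper names `parse_tsv_lines` and `describe_rows` are made up for this example.

```python
import csv
import sys


def parse_tsv_lines(lines):
    """Split tab-separated lines into rows of fields, skipping empty lines."""
    return [row for row in csv.reader(lines, delimiter="\t") if row]


def describe_rows(rows):
    """Pair each row with its 0-based and 1-based position."""
    return [(i, i + 1, row) for i, row in enumerate(rows)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path to chrsize_index.tsv is supplied by the user on the command line.
    with open(sys.argv[1], newline="") as handle:
        for zero_based, one_based, row in describe_rows(parse_tsv_lines(handle)):
            print(zero_based, one_based, row)
```

Saved under any file name, the script takes the path to chrsize_index.tsv as its only argument. Comparing the printed positions with the job IDs passed to unify_bowtie.py should show whether the first chromosome is ever selected.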

Best,
Mehran
