blastn not seeing nt_blast database #8

Closed
nick-catalog opened this issue Jul 1, 2024 · 5 comments

Comments

@nick-catalog

nick-catalog commented Jul 1, 2024

I am unable to get a full screening to work due to BLAST not finding the database/nt_blast/ database.

The output in my .screening file:

>> STEP 1: Checking for biorisk genes...
		 --> Biorisks: no hits detected, PASS
 STEP 1 completed at 2024-07-01 14:23:38
 >> STEP 2: Checking regulated pathogen proteins...
	...no hits
 STEP 2 completed at 2024-07-01 14:32:57
 >> STEP 3: Checking regulated pathogen nucleotides...
	...no hits to the nr database
BLAST Database error: No alias or index file found for nucleotide database [/datadrive/common-mechanism/database/nt_blast] in search path [/datadrive/common-mechanism::]
	 ERROR: command 'blastn -query /datadrive/common-mechanism/layer_outputs/biosecurity_full_tester.noncoding.fasta -db /datadrive/common-mechanism/database/nt_blast -out /datadrive/common-mechanism/layer_outputs/biosecurity_full_tester.nt.blastn -outfmt 7 qacc stitle sacc staxids evalue bitscore pident qlen qstart qend slen sstart send -max_target_seqs 50 -num_threads 8 -culling_limit 5 -evalue 10' failed

I installed the databases using the scripts recommended in the Wiki installation.

Any ideas??
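
One way to narrow down a "No alias or index file found" error is to look at what actually sits at the path passed to -db. BLAST expects a database prefix there and looks for an alias file (.nal) or an index file (.nin) with that prefix, so a path that points at a directory will fail. The sketch below is a minimal, hypothetical helper: the function names are made up for illustration and are not part of commec or BLAST, and the idea that nt_blast is a directory holding the database files is only an assumption based on the error message.

```python
from pathlib import Path

# File extensions BLAST looks for when opening a nucleotide database:
# ".nal" is an alias file, ".nin" is an index file.
NUCL_DB_SUFFIXES = (".nal", ".nin")


def nucl_db_files(db_prefix):
    """Return the alias/index files that exist for a given -db prefix."""
    found = []
    for suffix in NUCL_DB_SUFFIXES:
        candidate = Path(f"{db_prefix}{suffix}")
        if candidate.is_file():
            found.append(candidate)
    return found


def candidate_prefixes(directory):
    """List prefixes inside a directory that have alias or index files.

    Multi-volume databases also show up per volume (for example "nt.00").
    """
    directory = Path(directory)
    if not directory.is_dir():
        return []
    prefixes = set()
    for suffix in NUCL_DB_SUFFIXES:
        for path in directory.glob(f"*{suffix}"):
            prefixes.add(str(path.with_suffix("")))
    return sorted(prefixes)


if __name__ == "__main__":
    # Path taken from the error message above.
    db = "/datadrive/common-mechanism/database/nt_blast"
    print("files for prefix:", nucl_db_files(db))
    print("prefixes inside directory:", candidate_prefixes(db))
```

If the first list is empty and the second is not, the value passed to -db is probably a directory rather than a database prefix.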

@alexanian alexanian mentioned this issue Jul 2, 2024
2 tasks
@alexanian
Member

alexanian commented Jul 2, 2024

Could you try pulling down https://github.com/ibbis-screening/common-mechanism/releases/tag/v0.1.2 ? I think the changes in there should fix this! (Apologies for the quite stupid typo, and for not pushing out the bugfix sooner.)

@nick-catalog
Author

Hello! I have tried doing a clean pull and build of the commec-dev env and running the following test.

commec screen -d /datadrive/common-mechanism/db/ -o ./full-test example_data/igem_test_queries/BBa_K380009_A_20830_Coding_Protein_A_Z-domain.fasta --fast

This now returns an error I have not seen before. Here is the end of my full-test.screen file:

Internal pipeline statistics summary:
-------------------------------------
Query sequence(s):                         1  (58 residues searched)
Target model(s):                         353  (306998 nodes)
Passed MSV filter:                         0  (0); expected 7.1 (0.02)
Passed bias filter:                        0  (0); expected 7.1 (0.02)
Passed Vit filter:                         0  (0); expected 0.4 (0.001)
Passed Fwd filter:                         0  (0); expected 0.0 (1e-05)
Initial search space (Z):                353  [actual number of targets]
Domain search space  (domZ):               0  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 2724.98
//
[ok]
BLAST Database error: Error: Not a valid version 4 database.

I used update_blastdb.pl --passive --decompress nr and update_blastdb.pl --passive --decompress nt to download databases.

Any ideas?
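
One possible cause of "Not a valid version 4 database" is an older BLAST+ binary reading a database in the newer version 5 format. A quick check is to see which BLAST version the environment actually resolved to. The sketch below parses the output of `blastn -version`; the helper names are hypothetical, and the 2.10.0 floor is an assumption (the release where version 5 databases became the default, as far as I know), not something stated by commec.

```python
import re
import shutil
import subprocess

# Assumption: treat 2.10.0 as the oldest release that is safe for v5 databases.
MIN_BLAST_VERSION = (2, 10, 0)


def parse_blast_version(text):
    """Extract (major, minor, patch) from text such as 'blastn: 2.12.0+'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_blast_version(executable="blastn"):
    """Run '<executable> -version' and parse the result, or return None."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True, check=False
    )
    return parse_blast_version(result.stdout)


def new_enough(version):
    """True if the version meets the assumed floor for v5 databases."""
    return version is not None and version >= MIN_BLAST_VERSION


if __name__ == "__main__":
    version = installed_blast_version()
    print("blastn version:", version, "| new enough:", new_enough(version))
```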

@alexanian
Member

Ah, yep, this is something I've seen before related to setting up my bioconda environment.

The defaults conda channel contains outdated versions of dependencies for both Diamond and BLAST, so if you don't set a strict channel priority with conda-forge at the top, you end up stuck with really old versions of them.

This can be solved by running the following commands in order (they are part of the Bioconda setup):

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
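
The order of those commands matters because `conda config --add` puts the new value at the front of the list, so the channel added last ends up with the highest priority. The snippet below is only an illustrative simulation of that prepend behavior, starting from an empty channel list (an assumption; a real .condarc may already contain entries). It does not call conda itself, and `add_channel` is a made-up name.

```python
def add_channel(channels, name):
    """Mimic `conda config --add channels NAME`: NAME moves to the front."""
    return [name] + [c for c in channels if c != name]


channels = []
for name in ("defaults", "bioconda", "conda-forge"):
    channels = add_channel(channels, name)

# Highest priority first.
print(channels)  # ['conda-forge', 'bioconda', 'defaults']
```

To see the real configuration, `conda config --show channels` and `conda config --show channel_priority` print the current values.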

@alexanian
Member

Just checking in, @nick-catalog: did this resolve your issue?

@nick-catalog
Author

Hello @alexanian! Sorry for the delayed response.

On a clean pull and install of the commec-dev env, I was able to do a full and fast run on example data without issue!

commec screen -d /datadrive/common-mechanism/db/ -o ./fast-test ./example_data/igem_test_queries/BBa_K380009_A_20830_Coding_Protein_A_Z-domain.fasta --fast

commec screen -d /datadrive/common-mechanism/db/ -o ./full-test ./example_data/igem_test_queries/BBa_K380009_A_20830_Coding_Protein_A_Z-domain.fasta

Thanks for the help!
