useCache
Or add reload() and testCache methods (in MMseqsUtils.py) to determine whether to rebuild from scratch.
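One way the reload()/testCache idea could look is sketched below. This is only an illustration: the class name `CachedResource`, the `builder` callback, and the `maxAgeSeconds` parameter are hypothetical and are not part of the existing MMseqsUtils.py API. The sketch treats the cache as valid when the expected file exists, is non-empty, and (optionally) is recent enough.

```python
import os
import time


class CachedResource:
    """Hypothetical helper illustrating testCache()/reload(); not the MMseqsUtils API."""

    def __init__(self, cachePath, builder, maxAgeSeconds=None):
        self._cachePath = cachePath          # file expected to exist when the cache is valid
        self._builder = builder              # callable(cachePath) that rebuilds from scratch
        self._maxAgeSeconds = maxAgeSeconds  # optional staleness limit (assumption)

    def testCache(self):
        """Return True if the cached file exists, is non-empty, and is not stale."""
        if not os.path.isfile(self._cachePath):
            return False
        if os.path.getsize(self._cachePath) == 0:
            return False
        if self._maxAgeSeconds is not None:
            age = time.time() - os.path.getmtime(self._cachePath)
            if age > self._maxAgeSeconds:
                return False
        return True

    def reload(self, useCache=True):
        """Rebuild only when needed; return True if a rebuild actually ran."""
        if useCache and self.testCache():
            return False
        self._builder(self._cachePath)
        return True
```

With something like this in place, callers could pass `useCache=False` to force a rebuild and otherwise rely on `testCache()` to skip redundant work.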
Currently, in etl.build_exdb_resources.UpdateTargetsCofactors, the taxonomy data is downloaded 6 times in a row (https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz):
... running UpdateTargetsCofactors()
2023-09-19 08:57:32,208 - luigi-interface - INFO - [MainThread] - Starting workflow ProteinTargetSequenceExecutionWorkflow (full)
2023-09-19 08:57:33,444 - root - INFO - [MainThread] - Running cacheTaxonomy...
2023-09-19 08:57:35,886 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy primary fetch status (True) using 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
2023-09-19 08:58:16,205 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy lengths name 2528571 node 2528676 merge 73969
...
...
2023-09-19 09:59:32,570 - root - INFO - [MainThread] - Running searchDatabases...
2023-09-19 09:59:39,341 - __name__ - INFO - [MainThread] - convertalis status is True
2023-09-19 09:59:39,697 - __name__ - INFO - [MainThread] - Starting search result with (8531) records
2023-09-19 09:59:39,733 - __name__ - INFO - [MainThread] - Query match count 742
2023-09-19 09:59:39,733 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Query sequences with matches 'sabdab' (742) bitScore filter (False)
2023-09-19 09:59:40,046 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Completed searching sabdab targets (status True) (cutoff=0.95) at 2023 09 19 09:59:40 (7.4758 seconds)
2023-09-19 10:00:05,435 - __name__ - INFO - [MainThread] - convertalis status is True
2023-09-19 10:00:09,308 - __name__ - INFO - [MainThread] - useTaxonomy flag (True)
2023-09-19 10:00:13,438 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy primary fetch status (True) using 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
2023-09-19 10:00:36,127 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy lengths name 2528571 node 2528676 merge 73969
2023-09-19 10:00:36,127 - __name__ - INFO - [MainThread] - Starting search result with (89015) records
2023-09-19 10:00:37,255 - __name__ - INFO - [MainThread] - Query match count 2839
2023-09-19 10:00:37,259 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Query sequences with matches 'card' (2839) bitScore filter (False)
2023-09-19 10:00:42,039 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Completed searching card targets (status True) (cutoff=0.95) at 2023 09 19 10:00:42 (61.9924 seconds)
2023-09-19 10:01:24,274 - __name__ - INFO - [MainThread] - convertalis status is True
2023-09-19 10:01:28,952 - __name__ - INFO - [MainThread] - useTaxonomy flag (True)
2023-09-19 10:01:32,687 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy primary fetch status (True) using 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
2023-09-19 10:02:09,622 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy lengths name 2528571 node 2528676 merge 73969
2023-09-19 10:02:09,622 - __name__ - INFO - [MainThread] - Starting search result with (106156) records
2023-09-19 10:02:11,032 - __name__ - INFO - [MainThread] - Query match count 4180
2023-09-19 10:02:11,036 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Query sequences with matches 'drugbank' (4180) bitScore filter (False)
2023-09-19 10:02:16,465 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Completed searching drugbank targets (status True) (cutoff=0.95) at 2023 09 19 10:02:16 (94.4264 seconds)
2023-09-19 10:03:54,283 - __name__ - INFO - [MainThread] - convertalis status is True
2023-09-19 10:04:02,151 - __name__ - INFO - [MainThread] - useTaxonomy flag (True)
2023-09-19 10:04:05,087 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy primary fetch status (True) using 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
2023-09-19 10:04:45,656 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy lengths name 2528571 node 2528676 merge 73969
2023-09-19 10:04:45,657 - __name__ - INFO - [MainThread] - Starting search result with (173617) records
2023-09-19 10:04:48,781 - __name__ - INFO - [MainThread] - Query match count 6684
2023-09-19 10:04:48,788 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Query sequences with matches 'chembl' (6684) bitScore filter (False)
2023-09-19 10:04:57,883 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Completed searching chembl targets (status True) (cutoff=0.95) at 2023 09 19 10:04:57 (161.4171 seconds)
2023-09-19 10:06:25,748 - __name__ - INFO - [MainThread] - convertalis status is True
2023-09-19 10:06:31,521 - __name__ - INFO - [MainThread] - useTaxonomy flag (True)
2023-09-19 10:06:34,666 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy primary fetch status (True) using 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
2023-09-19 10:07:12,424 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy lengths name 2528571 node 2528676 merge 73969
2023-09-19 10:07:12,426 - __name__ - INFO - [MainThread] - Starting search result with (131516) records
2023-09-19 10:07:14,207 - __name__ - INFO - [MainThread] - Query match count 8840
2023-09-19 10:07:14,214 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Query sequences with matches 'pharos' (8840) bitScore filter (False)
2023-09-19 10:07:20,676 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Completed searching pharos targets (status True) (cutoff=0.95) at 2023 09 19 10:07:20 (142.7927 seconds)
2023-09-19 10:07:45,967 - __name__ - INFO - [MainThread] - convertalis status is True
2023-09-19 10:07:50,188 - __name__ - INFO - [MainThread] - useTaxonomy flag (True)
2023-09-19 10:07:53,175 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy primary fetch status (True) using 'https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz'
2023-09-19 10:08:28,294 - rcsb.utils.taxonomy.TaxonomyProvider - INFO - [MainThread] - Taxonomy lengths name 2528571 node 2528676 merge 73969
2023-09-19 10:08:28,295 - __name__ - INFO - [MainThread] - Starting search result with (89015) records
2023-09-19 10:08:29,210 - __name__ - INFO - [MainThread] - Query match count 2582
2023-09-19 10:08:29,250 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Query sequences with matches 'card' (2582) bitScore filter (True)
2023-09-19 10:08:32,958 - rcsb.workflow.targets.ProteinTargetSequenceWorkflow - INFO - [MainThread] - Completed searching card targets (status True) (cutoff=0.95) at 2023 09 19 10:08:32 (72.2820 seconds)
2023-09-19 10:08:32,959 - root - INFO - [MainThread] - Running buildFeatures...
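The repeated "Taxonomy primary fetch" entries in the log suggest that a new provider is constructed for each search. A possible fix, sketched here under assumptions, is to memoize the provider so that all searches in one run share a single instance. `StubTaxonomyProvider`, `getTaxonomyProvider`, and `searchAll` are hypothetical names for illustration; the stub only counts constructions and does not reproduce the real `TaxonomyProvider` constructor or behavior.

```python
from functools import lru_cache


class StubTaxonomyProvider:
    """Stand-in for the real provider (assumption); counts how often it is built."""

    fetchCount = 0

    def __init__(self, cachePath):
        self.cachePath = cachePath
        # In the real workflow this is where the download and parse would occur.
        StubTaxonomyProvider.fetchCount += 1


@lru_cache(maxsize=None)
def getTaxonomyProvider(cachePath):
    """Return one shared provider per cachePath for the lifetime of the process."""
    return StubTaxonomyProvider(cachePath)


def searchAll(resourceNames, cachePath):
    """Hypothetical search loop: every resource reuses the same provider."""
    results = []
    for name in resourceNames:
        taxP = getTaxonomyProvider(cachePath)  # no new fetch after the first call
        results.append((name, taxP))
    return results
```

Calling `getTaxonomyProvider.cache_clear()` would force a fresh fetch on the next call, which could serve as the "rebuild from scratch" path.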