Updating the database #21

Open
erinyoung opened this issue Apr 13, 2023 · 5 comments

Comments

@erinyoung

Hi! I'd like to use emmtyper on some group A strep, but I'm foggy as to how often the database is updated.

Is there a way to update it on my end?

@Daniel-VM

Hi,

I have added a Python script that:

- Downloads and parses emm sequences from CDC's SFTP server.
- Generates a multi-FASTA file containing all emm sequences.
- Optionally creates a BLAST database from the multi-FASTA file, which can be used as input for emmtyper.

It can be accessed here: https://github.com/Daniel-VM/cdc-utilities

@erinyoung
Author

@Daniel-VM, thank you for your script! Forgive me for taking so long to try it out.

@Daniel-VM

Hi @erinyoung,

I recently discovered that the CDC has uploaded a multifasta file containing all emm sequences, which simplifies things considerably. Now we just need to periodically download the CDC multifasta and build the BLAST database from it. I recommend using their blastdb version included in the Singularity image emmtyper:0.2.0--py_0, available in the Galaxy repository.
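As a rough sketch of the "build the BLAST database" step, something like the following could work. It assumes NCBI BLAST+ (`makeblastdb`) is on your PATH; the function names and file paths are hypothetical and not taken from my script or from emmtyper itself.

```python
import shutil
import subprocess
from pathlib import Path


def makeblastdb_command(fasta: Path, db_prefix: Path) -> list:
    """Return the makeblastdb invocation for a nucleotide database."""
    return [
        "makeblastdb",
        "-in", str(fasta),        # the downloaded multifasta
        "-dbtype", "nucl",        # emm sequences are nucleotide sequences
        "-out", str(db_prefix),   # prefix for the generated database files
    ]


def build_blast_db(fasta: Path, db_prefix: Path) -> None:
    """Run makeblastdb; raises if BLAST+ is missing or the build fails."""
    if shutil.which("makeblastdb") is None:
        raise RuntimeError("makeblastdb not found on PATH (install BLAST+)")
    subprocess.run(makeblastdb_command(fasta, db_prefix), check=True)
```

Calling `build_blast_db(Path("emm.tfa"), Path("db/emm"))` (hypothetical paths) would write the database files under the `db/emm` prefix. Keeping the command construction separate from the call makes it easy to print and inspect the command before running it.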

I hope this helps!

@JamesZlosnik

Hi @Daniel-VM. Could I just confirm that the right multifasta to use is alltrimmed.tfa from https://ftp.cdc.gov/pub/infectious_diseases/biotech/tsemm/, rather than the untrimmed version that the CDC also offers?

Thanks in advance!

@Daniel-VM

Hi @JamesZlosnik,

I recently noticed the file you mentioned. In my opinion, alltrimmed.tfa is the file we should use to build the BLAST database for emmtyper.

I'm planning to update the script I mentioned above with alltrimmed.tfa and run a few tests.
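In the meantime, here is a minimal sketch of the download step with a quick sanity check on the FASTA headers. The full URL is an assumption (the directory linked above joined with the file name alltrimmed.tfa), and the helper names are hypothetical; this is not the updated script.

```python
import urllib.request
from collections import Counter

# Assumption: directory URL from this thread plus the file name alltrimmed.tfa.
CDC_URL = "https://ftp.cdc.gov/pub/infectious_diseases/biotech/tsemm/alltrimmed.tfa"


def download(url: str, dest: str) -> str:
    """Save the file at url to dest and return dest."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as fh:
        fh.write(resp.read())
    return dest


def fasta_ids(text: str) -> list:
    """Return the ID (first token after '>') of every FASTA header in text."""
    return [
        line[1:].split()[0]
        for line in text.splitlines()
        if line.startswith(">") and line[1:].strip()
    ]


def duplicated_ids(ids: list) -> list:
    """Return the sorted IDs that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)
```

A possible use would be `path = download(CDC_URL, "alltrimmed.tfa")`, then reading the file and passing its text to `fasta_ids` to count records and flag duplicated IDs before building the database. I have not checked how the CDC formats its headers, so treat the ID parsing as a guess.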
